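After replacing a traditional cron table (crontab) entry with a hand-written systemd unit file to start a program, the program ran for a while and then died for no apparent reason.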
main process exited, code=killed, status=9/KILL
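The number in status=9/KILL is the signal that ended the process. As a quick sanity check, here is a minimal sketch (assuming a POSIX shell; the variable names are only illustrative) that pulls the number out of the log line and asks the shell for the matching signal name:

```shell
#!/bin/sh
# Log line copied from the message above.
status_line='main process exited, code=killed, status=9/KILL'

# Strip everything up to "status=", then everything from the "/" onward.
signum=${status_line##*status=}
signum=${signum%%/*}

# kill -l <number> prints the signal name that belongs to that number.
signame=$(kill -l "$signum")

echo "signal $signum is SIG$signame"
```

Signal 9 is SIGKILL, which a process cannot catch or ignore, so the program itself has no chance to log who sent it.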
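We can use the Linux commands auditctl and ausearch to find out who sent the kill.
Test environment: CentOS 8 x86_64 (virtual machine).
The program is shown below: a C++ program that appends the current time to a file every 60 seconds.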
[root@localhost ~]# vi test.cpp
#include <fstream>
#include <iostream>
#include <chrono>
#include <ctime>
#include <unistd.h>
using namespace std;

int main() {
    while (true)
    {
        ofstream myFile_Handler;
        // Open the file in append mode
        myFile_Handler.open("/tmp/1.txt", std::ios_base::app);
        auto timenow = chrono::system_clock::to_time_t(chrono::system_clock::now());
        // Write the timestamp; ctime() already ends with '\n', so endl adds a blank line and flushes
        myFile_Handler << ctime(&timenow) << endl;
        // Close the file
        myFile_Handler.close();
        // Sleep for 60 seconds
        sleep(60);
    }
}
Compile it into an executable.
[root@localhost ~]# g++ test.cpp -o test
bash: g++: command not found...
Install package 'gcc-c++' to provide command 'g++'? [N/y] y
Copy the program to /sbin/.
[root@localhost ~]# cp test /sbin/
Now write the systemd unit file (unit files you write yourself go in /etc/systemd/system/, while those shipped by system packages live in /usr/lib/systemd/system).
[root@localhost ~]# vi /etc/systemd/system/test.service
[Unit]
Description=Test Job
[Service]
Type=simple
ExecStart=/sbin/test
[Install]
WantedBy=multi-user.target
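Only three sections are configured:
- [Unit]
Description=Test Job
Description states the purpose of the unit.
- [Service]
Type=simple
ExecStart=/sbin/test
Type=simple – systemd treats the process started by ExecStart as the service's main process; the program must stay in the foreground and not fork into the background.
ExecStart – The command to run.
- [Install]
WantedBy=multi-user.target
WantedBy names the target (systemd's counterpart to a runlevel) that pulls in the service once it is enabled; multi-user.target corresponds to the old runlevel 3.
Enable and start the service.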
[root@localhost ~]# systemctl enable test.service
Created symlink /etc/systemd/system/multi-user.target.wants/test.service → /etc/systemd/system/test.service.
[root@localhost ~]# systemctl start test.service
[root@localhost ~]# systemctl status test.service
● test.service - Test Job
Loaded: loaded (/etc/systemd/system/test.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-11-25 11:02:40 CST; 7s ago
Main PID: 89850 (test)
Tasks: 1 (limit: 49322)
Memory: 292.0K
CGroup: /system.slice/test.service
└─89850 /sbin/test
Nov 25 11:02:40 localhost.localdomain systemd[1]: Started Test Job.
Check that /sbin/test is running properly (it should append the time to /tmp/1.txt every 60 seconds).
[root@localhost ~]# cat /tmp/1.txt
Fri Nov 25 11:02:40 2022

Fri Nov 25 11:03:40 2022

Fri Nov 25 11:04:40 2022

Set up an audit rule that watches the kill system call.
[root@localhost ~]# auditctl -a exit,always -F arch=b64 -S kill -k kill_process
[root@localhost ~]# auditctl -l
-a always,exit -F arch=b64 -S kill -F key=kill_process
- -a [list,action|action,list]
exit – Add a rule to the syscall exit list.
always – Allocate an audit context, always fill it in at syscall entry time, and always write out a record at syscall exit time.
- -F
arch – The CPU architecture of the syscall. Supports 32-bit (b32) and 64-bit (b64).
- -S
Syscall name.
- -k
key – Set a filter key on an audit rule.
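Since the service in question died with signal 9, the rule can be narrowed so that only SIGKILL is recorded. The sketch below is one way to do that, assuming the a1 field filter (the second argument of kill(), which is the signal number). The SIGNAL, KEY, RULE and APPLY variables are hypothetical helpers, not part of auditctl. The script only prints the command unless APPLY=1 is set, because loading a rule requires root.

```shell
#!/bin/sh
# Assumption: we only care about SIGKILL (signal 9).
SIGNAL=9
# Same key as the rule above, so the same ausearch -k query still works.
KEY=kill_process

# a1 is the second syscall argument; for kill(pid, sig) that is the signal number.
RULE="-a always,exit -F arch=b64 -S kill -F a1=$SIGNAL -k $KEY"

echo "auditctl $RULE"

# Load the rule only when explicitly requested (needs root and the audit tools).
if [ "${APPLY:-0}" = "1" ]; then
    # Intentionally unquoted so the shell splits RULE into separate arguments.
    auditctl $RULE
fi
```

With this narrower rule the manual test that follows would have to use kill -9, because a plain kill sends SIGTERM (15). Note also that a SIGKILL sent by the kernel itself, such as by the OOM killer, does not go through the kill syscall, so no rule on -S kill will record it.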
As a test, kill the program manually (a plain kill sends SIGTERM, signal 15).
[root@localhost ~]# ps -aux | grep -i test
root 68586 0.0 0.0 13780 1832 ? Ss 16:54 0:00 /sbin/test
root 68858 0.0 0.0 12136 1156 pts/0 S+ 17:21 0:00 grep --color=auto -i test
[root@localhost ~]# kill 68586
[root@localhost ~]# ps -aux | grep -i test
root 68865 0.0 0.0 12136 1144 pts/0 S+ 17:21 0:00 grep --color=auto -i test
All audit records are written to the following file.
[root@localhost ~]# cat /var/log/audit/audit.log
That file contains far too much information, so we normally search it with ausearch, for example by the key we defined earlier.
[root@localhost ~]# ausearch -k kill_process
time->Fri Mar 31 17:21:11 2023
type=PROCTITLE msg=audit(1680254471.502:45035): proctitle=617564697463746C002D6100657869742C616C77617973002D46006172636800623634002D53006B696C6C002D6B006B696C6C5F70726F63657373
type=SYSCALL msg=audit(1680254471.502:45035): arch=c000003e syscall=44 success=yes exit=1068 a0=4 a1=7fff65ffb4d0 a2=42c a3=0 items=0 ppid=8029 pid=68855 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="auditctl" exe="/usr/sbin/auditctl" key=(null)
type=CONFIG_CHANGE msg=audit(1680254471.502:45035): auid=0 ses=2 op=add_rule key="kill_process" list=4 res=1
----
time->Fri Mar 31 17:21:25 2023
type=PROCTITLE msg=audit(1680254485.165:45036): proctitle="-bash"
type=OBJ_PID msg=audit(1680254485.165:45036): opid=68586 oauid=-1 ouid=0 oses=-1 ocomm="test"
type=SYSCALL msg=audit(1680254485.165:45036): arch=c000003e syscall=62 success=yes exit=0 a0=10bea a1=f a2=0 a3=7f1216e0c280 items=0 ppid=8021 pid=8029 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="bash" exe="/usr/bin/bash" key="kill_process"
Alternatively, search by syscall name.
[root@localhost ~]# ausearch -sc kill
time->Fri Mar 31 17:21:25 2023
type=PROCTITLE msg=audit(1680254485.165:45036): proctitle="-bash"
type=OBJ_PID msg=audit(1680254485.165:45036): opid=68586 oauid=-1 ouid=0 oses=-1 ocomm="test"
type=SYSCALL msg=audit(1680254485.165:45036): arch=c000003e syscall=62 success=yes exit=0 a0=10bea a1=f a2=0 a3=7f1216e0c280 items=0 ppid=8021 pid=8029 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="bash" exe="/usr/bin/bash" key="kill_process"
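The SYSCALL record answers the question "who killed it": pid, comm and exe describe the sender, while a0 and a1 are the hexadecimal arguments passed to kill(), that is, the target PID and the signal number. Here is a rough sketch that decodes them with a POSIX shell. The record string is a shortened copy of the fields from the output above, and the field helper is hypothetical.

```shell
#!/bin/sh
# A subset of the fields from the SYSCALL record shown above.
record='syscall=62 success=yes exit=0 a0=10bea a1=f a2=0 pid=8029 comm="bash" exe="/usr/bin/bash"'

# Print the value of one name=value field from the record.
field() {
    printf '%s\n' "$record" | tr ' ' '\n' | sed -n "s/^$1=//p"
}

target_pid=$(printf '%d' "0x$(field a0)")   # a0: PID that received the signal (hex)
signal=$(printf '%d' "0x$(field a1)")       # a1: signal number (hex)
sender_pid=$(field pid)                     # pid: process that called kill()
sender_exe=$(field exe | tr -d '"')         # exe: sender's executable, quotes removed

echo "PID $sender_pid ($sender_exe) sent signal $signal to PID $target_pid"
```

Here a0=10bea is 68586, the PID of /sbin/test, and a1=f is 15 (SIGTERM), which matches the manual kill from the bash session. ausearch can also interpret numeric fields itself when given the -i option.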
You can delete the audit rules manually, or write them to a file under /etc/audit/rules.d/ to make the rule persistent.
[root@localhost ~]# auditctl -D
No rules
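To make the rule persistent, a rules file can hold the same options that were passed to auditctl. The following is only a sketch. The file name kill.rules and the RULES_DIR variable are assumptions, and the script writes to a scratch directory by default, so it can be tried without root.

```shell
#!/bin/sh
# Assumption: on a real system set RULES_DIR=/etc/audit/rules.d and run as root.
RULES_DIR=${RULES_DIR:-$(mktemp -d)}
# Hypothetical file name; it only needs to end in .rules.
RULES_FILE="$RULES_DIR/kill.rules"

mkdir -p "$RULES_DIR"

# Same rule as the auditctl command in the article, without the command name.
cat > "$RULES_FILE" <<'EOF'
## Record every kill() syscall made by 64-bit processes
-a always,exit -F arch=b64 -S kill -k kill_process
EOF

cat "$RULES_FILE"
```

Once the file is in /etc/audit/rules.d/, running augenrules --load as root should merge it into /etc/audit/audit.rules and load it; auditctl -l can then confirm that the rule is active.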