sadm_sysmon.pl
NAME
sadm_sysmon.pl
Performs the monitoring tests defined in the SysMon configuration file.
SYNOPSIS
# sadm_sysmon.pl
DESCRIPTION
The SADMIN System Monitor (SysMon) is executed at regular intervals from the SADMIN client crontab (/etc/cron.d/sadm_client). It monitors everything specified in its configuration file and generates a new SysMon report file ($SADMIN/dat/rpt/hostname.rpt).
Summary of the steps the System Monitor goes through when it starts:
- First, it checks whether the System Monitor lock file ($SADMIN/sysmon.lock) exists.
  - If it doesn’t, the lock file is created and execution proceeds to the next step.
  - If it already exists, SysMon checks when it was created. If that was more than 30 minutes ago, the lock file is recreated and execution continues. If it was created less than 30 minutes ago, a warning message is issued and the monitor stops.
- Next, the SADMIN configuration file is read (to get the company name and the sysadmin email address).
- Then it tries to open the SysMon configuration file (“$SADMIN/cfg/${HOSTNAME}.smon”).
  - If the file exists, it is loaded in memory and processing begins.
  - If the file doesn’t exist, the template “$SADMIN/cfg/.template.smon” is copied to “$SADMIN/cfg/${HOSTNAME}.smon” and processing begins.
  - If the template file can’t be found, an error message is displayed and execution is aborted.
- The ‘df’ command is run and the result is stored in memory. Any filesystem that is not in the SysMon configuration is added with its ‘Warning’ threshold set to 85% and its ‘Error’ threshold set to 90%.
- An empty System Monitor report file is created (${SADMIN}/dat/rpt/${HOSTNAME}.rpt).
- Each line of the SysMon configuration is then tested, and the SysMon array in memory is updated with the results.
- Any error or warning found is written to the System Monitor report file.
- The last step is to write the updated SysMon array back to the SysMon configuration file and to remove the lock file ($SADMIN/sysmon.lock).
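The lock-file handling described in the first step can be sketched as follows. This is a minimal Python illustration of the documented behaviour, not the code used by sadm_sysmon.pl (which is a Perl script). The names `acquire_lock`, `release_lock` and `STALE_AFTER_SECONDS` are hypothetical, and using the file's modification time as its creation time is an assumption.

```python
import os
import time

STALE_AFTER_SECONDS = 30 * 60  # 30 minutes, as stated in the step list


def acquire_lock(lock_path, now=None):
    """Return True if the monitor may run, False if it should stop.

    Hypothetical helper that only mirrors the documented behaviour.
    """
    now = time.time() if now is None else now
    if os.path.exists(lock_path):
        # Assumption: modification time stands in for creation time.
        age = now - os.path.getmtime(lock_path)
        if age < STALE_AFTER_SECONDS:
            # A recent lock suggests another run is still active.
            print(f"Warning: {lock_path} is only {age:.0f}s old, not running")
            return False
        # Stale lock: remove it so it can be recreated below.
        os.remove(lock_path)
    # Create (or recreate) the lock file.
    with open(lock_path, "w") as handle:
        handle.write(f"{os.getpid()}\n")
    return True


def release_lock(lock_path):
    """Remove the lock file at the end of the run, if it still exists."""
    if os.path.exists(lock_path):
        os.remove(lock_path)
```

A caller would wrap the monitoring work between `acquire_lock(path)` and `release_lock(path)`, and exit early when `acquire_lock` returns `False`.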
Before the System Monitor exits, it writes a report file containing any error or warning detected while processing each line. This report file is named “hostname.rpt” and is created in the “$SADMIN/dat/rpt” directory. It is then transferred to the SADMIN server by sadm_fetch_clients.sh, where it is analyzed and an alert is triggered if needed.
Each test performed by SysMon corresponds to one line in the configuration file; for more information, see the SysMon configuration file page.
Here is a list of all the types of tests that SysMon can perform:
- Verify the system multipath state.
- Check the system load average and notify you if the warning or error threshold is sustained for a certain period of time.
- Monitor CPU usage and notify you if the warning or error threshold is sustained for a certain period of time.
- Verify swap space utilization and alert you if the warning or error threshold is exceeded.
- Monitor filesystem usage and alert you if the warning or error threshold is reached.
- Ping an IP address or a hostname and issue an alert if it doesn’t respond after 3 attempts.
- Verify that a particular service is running and advise you if it isn’t; it can even restart the service if you want.
- Check whether a daemon of your choice is running.
- Check whether a particular web site returns an HTTP response.
- Run your own script and advise you if it terminates with a nonzero exit code.
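Several of the tests listed above come down to comparing a measured value with a warning threshold and an error threshold. The Python sketch below illustrates that comparison only. The function name `classify` and the returned labels are assumptions made for the example, not SysMon's internal names, and treating a value equal to a threshold as having reached it is also an assumption.

```python
def classify(value, warning, error):
    """Return 'ERROR', 'WARNING' or 'OK' for a measured value.

    Assumption: a value equal to a threshold counts as reaching it.
    """
    if value >= error:
        return "ERROR"
    if value >= warning:
        return "WARNING"
    return "OK"


# Default filesystem thresholds from the step list: 85% and 90%.
# The mount points and usage figures here are illustrative only.
for mount, used in {"/": 39, "/wsadmin": 82, "/full": 93}.items():
    print(f"[{classify(used, 85, 90)}] Filesystem {mount} at {used}%")
```

The load average and CPU tests add a time dimension (the threshold must be sustained for a period), which this sketch does not model.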
EXAMPLE
- You can run the System Monitor manually by entering the command “sadm_sysmon.pl”. You can also enter the command “smon”, which runs the System Monitor, shows the content of the SysMon report file, and finally lists the script(s) that terminated with an error on the current system.
- Remember that you can always view the current status of all your systems’ scripts and problems (Error/Warning) on the System Monitor web page, or on the command line by using the “sview” command.
# sadm_sysmon.pl
Creating lock file /sadmin/sysmon.lock
Loading SADMIN configuration file /sadmin/cfg/sadmin.cfg
------------------------------------------------------------------------------
SADMIN SYStem MONitor Tools - Version 2.43
------------------------------------------------------------------------------
O/S Name = linux
Debugging Level = 5
SADM_BASE_DIR = /sadmin
Hostname = holmes
Virtual Server = N
CMD_SSH = /bin/ssh
------------------------------------------------------------------------------
Loading SysMon configuration file /sadmin/cfg/holmes.smon
File /sadmin/cfg/holmes.smon loaded in sysmon_array (274 lines loaded)
Checking for new filesystems ...
No new filesystem detected
Checking CPU Load Average ...
Uptime line: 10:57:23 up 15 days, 3:40, 1 user, load average: 0.18, 0.22, 0.38
Load Average is at 0 - W: 20 E: 35
Checking CPU Usage ...
CPU Usage line: 0 0 279432 3501528 15148 852272 0 0 0 0 629 1477 1 0 99 0 0
CPU User: 1 - System: 0 - Total: 1
- Warning Level: 85 - Error Level: 95
Checking Swap Space ...
Swap Info Line: Swap: 7340024 279432 7060592
Swap size: 7340024 - Usage: 279432 - Percentage use: 3 %
Checking service crond,cron
- service crond status ... [RUNNING]
[OK] Service is running - Total returned (1)
Checking service chronyd
- service chronyd status ... [RUNNING]
[OK] Service is running - Total returned (1)
Checking service ssh,sshd
- service sshd status ... [RUNNING]
[OK] Service is running - Total returned (1)
Checking service postfix
- service postfix status ... [RUNNING]
[OK] Service is running - Total returned (1)
Checking service at,atd
- service atd status ... [RUNNING]
[OK] Service is running - Total returned (1)
Checking Multipath ...Status of Multipath skipped - Command multipathd not present on system
Multipath status is not in use - Code = (1) (1=ok 0=Error)
[OK] Filesystem / at 39% ... Warning: 85 - Error: 90
[OK] Filesystem /usr at 72% ... Warning: 85 - Error: 90
[OK] Filesystem /boot at 56% ... Warning: 85 - Error: 90
[OK] Filesystem /wiki at 5% ... Warning: 85 - Error: 90
[OK] Filesystem /wsadmin at 82% ... Warning: 85 - Error: 90
[OK] Filesystem /backups at 63% ... Warning: 85 - Error: 90
[OK] Filesystem /psadmin at 4% ... Warning: 85 - Error: 90
[OK] Filesystem /storix at 12% ... Warning: 85 - Error: 90
[OK] Filesystem /opt at 71% ... Warning: 85 - Error: 90
[OK] Filesystem /tmp at 1% ... Warning: 85 - Error: 90
[OK] Filesystem /sysadmin at 9% ... Warning: 85 - Error: 90
[OK] Filesystem /linternux at 5% ... Warning: 85 - Error: 90
[OK] Filesystem /var at 31% ... Warning: 85 - Error: 90
[OK] Filesystem /sadmin at 38% ... Warning: 85 - Error: 90
[OK] Filesystem /home at 55% ... Warning: 85 - Error: 90
[OK] Filesystem /install at 6% ... Warning: 85 - Error: 90
[OK] Filesystem /gitrepos at 10% ... Warning: 85 - Error: 90
[OK] Filesystem /history at 1% ... Warning: 85 - Error: 90
-----
Updating SADM Sysmon configuration file (/sadmin/cfg/holmes.smon)
Deleting SYStem MONitor lock file /sadmin/sysmon.lock
#SYSMON 2.43 holmes - Last Boot: 2021-05-28 07:16:27 - Sat Jun 12 10:57:24 2021 - Execution Time 1.00 seconds
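If you want to post-process this output, the filesystem lines have a regular shape that is easy to parse. The Python sketch below extracts the status, mount point, usage and thresholds from lines like those shown above. The regular expression is derived only from this sample, so it may need adjusting for other SysMon versions, and the name `parse_fs_line` is hypothetical.

```python
import re

# Pattern derived from the sample output above (SysMon 2.43).
FS_LINE = re.compile(
    r"^\[(?P<status>\w+)\]\s+Filesystem\s+(?P<mount>\S+)\s+at\s+(?P<used>\d+)%"
    r"\s+\.\.\.\s+Warning:\s+(?P<warning>\d+)\s+-\s+Error:\s+(?P<error>\d+)$"
)


def parse_fs_line(line):
    """Return a dict for a filesystem status line, or None if no match."""
    match = FS_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("used", "warning", "error"):
        fields[key] = int(fields[key])
    return fields


print(parse_fs_line("[OK] Filesystem /usr at 72% ... Warning: 85 - Error: 90"))
```

Lines that are not filesystem status lines, such as the "Checking ..." headers, simply return `None` and can be skipped.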
ENVIRONMENT
- The “$SADMIN” environment variable must be defined and must contain the root directory of the SADMIN tools (normally /opt/sadmin). This should already be done, since the setup script updates the ‘/etc/profile.d/sadmin.sh’ and ‘/etc/environment’ files.
- The SADMIN configuration file is required and is loaded in memory at the beginning of every script. This file should already exist and contains your SADMIN configuration and preference settings.
- Shell scripts use the Shell Library and Python scripts use the Python Library.
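A script that depends on this environment can verify it before doing any work. The check below is a minimal Python sketch and is not part of the SADMIN libraries. The function name `sadmin_root` is hypothetical; it only confirms that `$SADMIN` is set and that `cfg/sadmin.cfg` (the location shown in the sample output above) exists under it.

```python
import os


def sadmin_root(environ=None):
    """Return the SADMIN root directory, or raise RuntimeError.

    Hypothetical helper; pass a dict to check something other than
    the current process environment.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("SADMIN", "")
    if not root:
        raise RuntimeError("The SADMIN environment variable is not defined")
    cfg_file = os.path.join(root, "cfg", "sadmin.cfg")
    if not os.path.isfile(cfg_file):
        raise RuntimeError(f"SADMIN configuration file not found: {cfg_file}")
    return root
```

Calling `sadmin_root()` at the top of a script fails early with a clear message instead of producing path errors later.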
EXIT STATUS
Exit Code | Description |
---|---|
0 | An exit status of zero indicates success. |
1 | Failure is indicated by a nonzero value, typically ‘1’. |
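A wrapper can rely on this convention to decide whether a run succeeded. The Python sketch below runs a command and reports success based on its exit code. The helper name `run_ok` is hypothetical, and nothing here is specific to SysMon beyond the zero/nonzero convention in the table above.

```python
import subprocess


def run_ok(command):
    """Run a command (a list of arguments) and return True on exit code 0."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0
```

For example, assuming `sadm_sysmon.pl` is on the `PATH`, `run_ok(["sadm_sysmon.pl"])` returns `True` only when the monitor exits with status 0. If the command cannot be found, `subprocess.run` raises `FileNotFoundError`, which the caller may want to handle.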
AUTHOR
Jacques Duplessis
Suggestions and bug reports can be submitted at the support page.
COPYRIGHT
Copyright © 2022 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later
This is free software, you are free to change and redistribute it.
There is NO WARRANTY to the extent permitted by law.
SEE ALSO
Link to … | Description |
---|---|
sadm_sysmon_tui.pl | Command line summary of alerts and failed scripts of all your servers. |
sadm_sysmon.pl | Client system monitor |
sadm_fetch_clients.sh | Rsync all .rch/.log/.rpt files from active clients to the SADMIN server |
SysMon configuration file | Client System Monitor configuration file |
sadmin.cfg | SADMIN main configuration file |
System Monitor report file | Output file generated by System Monitor |