sadm_sysmon.pl

NAME

sadm_sysmon.pl
Performs the monitoring tests defined in the SysMon configuration file

SYNOPSIS

# sadm_sysmon.pl 
$SADMIN/bin/sadm_sysmon.pl - v2.43
Posted 2021-06-10 - Updated 2021-06-25
Supported on Linux, AIX, macOS

DESCRIPTION

The SADMIN System Monitor (SysMon) is executed at regular intervals from the SADMIN client crontab (/etc/cron.d/sadm_client). It monitors everything specified in its configuration file and generates a new SysMon report file ($SADMIN/dat/rpt/hostname.rpt).


Summary of the steps the System Monitor goes through when it starts:

  • First, it checks whether the System Monitor lock file exists ($SADMIN/sysmon.lock).
    • If it doesn’t, the lock file is created and execution proceeds to the next step.
    • If it already exists, SysMon checks when it was created. If that was more than 30 minutes ago, the lock file is recreated and execution continues. If it was created less than 30 minutes ago, a warning message is issued and the monitor stops.
  • Next, the SADMIN configuration file is read to get the company name and the sysadmin email address.
  • Then it tries to open the SysMon configuration file (“$SADMIN/cfg/${HOSTNAME}.smon”).
    • If the file exists, it is loaded in memory and processing begins.
    • If the file doesn’t exist, the template “$SADMIN/cfg/.template.smon” is copied to “$SADMIN/cfg/${HOSTNAME}.smon” and processing begins.
    • If the SysMon template file can’t be found either, an error message is shown to the user and execution is aborted.
  • The ‘df’ command is run and its result is stored in memory. Any filesystem that is not in the SysMon configuration is added with the ‘Warning’ threshold set to 85% and the ‘Error’ threshold set to 90%.
  • An empty System Monitor report file is created (${SADMIN}/dat/rpt/${HOSTNAME}.rpt).
  • Each line of the SysMon configuration is then tested, and the results update the SysMon array in memory.
  • Finally, the updated SysMon array is written back to the SysMon configuration file and the lock file ($SADMIN/sysmon.lock) is removed.
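
The lock-file and configuration-loading steps above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not the actual sadm_sysmon.pl code (which is written in Perl). The function names are hypothetical, and the lock age is measured from the file modification time, which is an assumption.

```python
import os
import shutil
import time

LOCK_MAX_AGE_SECONDS = 30 * 60  # 30 minutes, as stated in the steps above


def acquire_lock(lock_path):
    """Return True if the monitor may run, False if a recent lock exists.

    Assumption: the lock age is taken from the file modification time.
    """
    if os.path.exists(lock_path):
        age = time.time() - os.path.getmtime(lock_path)
        if age < LOCK_MAX_AGE_SECONDS:
            print(f"Warning: {lock_path} is only {age:.0f}s old, not running")
            return False
    # No lock file, or a stale one: (re)create it and continue.
    with open(lock_path, "w") as fh:
        fh.write(str(int(time.time())))
    return True


def release_lock(lock_path):
    """Remove the lock file at the end of a run, if it is still there."""
    if os.path.exists(lock_path):
        os.remove(lock_path)


def ensure_config(cfg_path, template_path):
    """Return the configuration lines, seeding cfg_path from the template."""
    if not os.path.exists(cfg_path):
        if not os.path.exists(template_path):
            # Neither the host file nor the template exists: abort.
            raise FileNotFoundError(f"SysMon template not found: {template_path}")
        shutil.copyfile(template_path, cfg_path)
    with open(cfg_path) as fh:
        return fh.read().splitlines()
```

A caller would wrap the run as `if acquire_lock(lock): ... release_lock(lock)`, so that a second run started within 30 minutes exits with a warning instead of overlapping the first.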

Before the System Monitor exits, it writes a report file containing any error or warning detected while processing each line. This report file is named “hostname.rpt” and is created in the “$SADMIN/dat/rpt” directory. It is then transferred to the SADMIN server by sadm_fetch_clients.sh, where it is analyzed and an alert is triggered if needed.

Each test performed by SysMon corresponds to a line in the configuration file. For more information, see the SysMon configuration file page.


Here is a list of all the types of test that SysMon can perform:

  • Verify the system multipath state.
  • Check the system load average and notify you if the warning or error threshold is sustained for a certain period of time.
  • Monitor CPU usage and notify you if the warning or error threshold is sustained for a certain period of time.
  • Verify swap space utilization and alert if the warning or error threshold is exceeded.
  • Monitor filesystem usage and alert if the warning or error threshold is reached.
  • Ping an IP address or a hostname and issue an alert if it doesn’t respond after 3 attempts.
  • Verify that a particular service is running and advise you if it isn’t. It can even restart the service if you want.
  • Check whether a daemon of your choice is running.
  • Check whether a particular web site returns an HTTP response.
  • Run your own script and advise you if it terminates with a nonzero exit code.
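
Most of the tests above compare a measured value against a warning and an error threshold. The Python sketch below illustrates that comparison for filesystem usage, including the 85%/90% defaults applied to filesystems missing from the configuration. The function names and data structures are hypothetical, and treating a value equal to the threshold as "reached" is an assumption.

```python
DEFAULT_WARNING = 85  # percent, default for filesystems not yet configured
DEFAULT_ERROR = 90    # percent


def classify(value, warning, error):
    """Return 'ERROR', 'WARNING' or 'OK' for a measured value.

    Assumption: a value equal to a threshold counts as reaching it.
    """
    if value >= error:
        return "ERROR"
    if value >= warning:
        return "WARNING"
    return "OK"


def check_filesystems(usage, thresholds):
    """Classify each filesystem.

    usage:      {mount_point: percent_used}
    thresholds: {mount_point: (warning, error)}; new mount points are
                added to it with the default thresholds.
    """
    results = {}
    for mount, percent in usage.items():
        warning, error = thresholds.setdefault(
            mount, (DEFAULT_WARNING, DEFAULT_ERROR)
        )
        results[mount] = classify(percent, warning, error)
    return results
```

For example, `check_filesystems({"/": 39, "/new": 91}, {"/": (85, 90)})` reports `/` as OK, reports `/new` as ERROR, and leaves `/new` registered in `thresholds` with the default values.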

EXAMPLE

  • You can run the System Monitor manually by entering the command “sadm_sysmon.pl”. You can also enter the command “smon”, which runs the System Monitor, shows the content of the SysMon report file, and finally lists the script(s) that terminated with an error on the current system.
  • Remember that you can always view the current status of all your systems’ scripts and problems (Error/Warning) on the System Monitor web page, or on the command line with the “sview” command.
# sadm_sysmon.pl

Creating lock file /sadmin/sysmon.lock
Loading SADMIN configuration file /sadmin/cfg/sadmin.cfg
------------------------------------------------------------------------------
SADMIN SYStem MONitor Tools - Version 2.43
------------------------------------------------------------------------------
O/S Name                 = linux
Debugging Level          = 5
SADM_BASE_DIR            = /sadmin
Hostname                 = holmes
Virtual Server           = N
CMD_SSH                  = /bin/ssh
------------------------------------------------------------------------------

Loading SysMon configuration file /sadmin/cfg/holmes.smon
File /sadmin/cfg/holmes.smon loaded in sysmon_array (274 lines loaded)

Checking for new filesystems ...
No new filesystem detected

Checking CPU Load Average ...
Uptime line:  10:57:23 up 15 days,  3:40,  1 user,  load average: 0.18, 0.22, 0.38
Load Average is at 0 - W: 20 E: 35

Checking CPU Usage ...
CPU Usage line:   0  0 279432 3501528  15148 852272    0    0     0     0  629 1477  1  0 99  0  0

CPU User:   1 - System:   0  - Total:   1
 - Warning Level: 85 - Error Level: 95

Checking Swap Space ...
Swap Info Line: Swap:       7340024      279432     7060592
Swap size: 7340024 - Usage: 279432 - Percentage use: 3 %

Checking service crond,cron
  - service crond status ... [RUNNING]
[OK] Service is running - Total returned (1)

Checking service chronyd
  - service chronyd status ... [RUNNING]
[OK] Service is running - Total returned (1)

Checking service ssh,sshd
  - service sshd status ... [RUNNING]
[OK] Service is running - Total returned (1)

Checking service postfix
  - service postfix status ... [RUNNING]
[OK] Service is running - Total returned (1)

Checking service at,atd
  - service atd status ... [RUNNING]
[OK] Service is running - Total returned (1)

Checking Multipath ...Status of Multipath skipped - Command multipathd not present on system
Multipath status is not in use - Code = (1) (1=ok 0=Error)
[OK] Filesystem / at 39% ... Warning: 85 - Error: 90
[OK] Filesystem /usr at 72% ... Warning: 85 - Error: 90
[OK] Filesystem /boot at 56% ... Warning: 85 - Error: 90
[OK] Filesystem /wiki at 5% ... Warning: 85 - Error: 90
[OK] Filesystem /wsadmin at 82% ... Warning: 85 - Error: 90
[OK] Filesystem /backups at 63% ... Warning: 85 - Error: 90
[OK] Filesystem /psadmin at 4% ... Warning: 85 - Error: 90
[OK] Filesystem /storix at 12% ... Warning: 85 - Error: 90
[OK] Filesystem /opt at 71% ... Warning: 85 - Error: 90
[OK] Filesystem /tmp at 1% ... Warning: 85 - Error: 90
[OK] Filesystem /sysadmin at 9% ... Warning: 85 - Error: 90
[OK] Filesystem /linternux at 5% ... Warning: 85 - Error: 90
[OK] Filesystem /var at 31% ... Warning: 85 - Error: 90
[OK] Filesystem /sadmin at 38% ... Warning: 85 - Error: 90
[OK] Filesystem /home at 55% ... Warning: 85 - Error: 90
[OK] Filesystem /install at 6% ... Warning: 85 - Error: 90
[OK] Filesystem /gitrepos at 10% ... Warning: 85 - Error: 90
[OK] Filesystem /history at 1% ... Warning: 85 - Error: 90

-----
Updating SADM Sysmon configuration file (/sadmin/cfg/holmes.smon)
Deleting SYStem MONitor lock file /sadmin/sysmon.lock
#SYSMON 2.43 holmes - Last Boot: 2021-05-28 07:16:27 - Sat Jun 12 10:57:24 2021 - Execution Time 1.00 seconds
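
If you capture this output, the filesystem lines are regular enough to parse. The Python sketch below extracts them with a regular expression written against the sample output shown here. The line format is assumed from this sample only and may differ in other SysMon versions, and the function name is hypothetical.

```python
import re

# Matches lines such as:
# [OK] Filesystem /wsadmin at 82% ... Warning: 85 - Error: 90
FS_LINE = re.compile(
    r"^\[(?P<status>\w+)\] Filesystem (?P<mount>\S+) at (?P<pct>\d+)% "
    r"\.\.\. Warning: (?P<warn>\d+) - Error: (?P<err>\d+)$"
)


def parse_filesystem_lines(text):
    """Return (mount, percent, warning, error, status) for each match."""
    rows = []
    for line in text.splitlines():
        match = FS_LINE.match(line.strip())
        if match:
            rows.append((
                match["mount"],
                int(match["pct"]),
                int(match["warn"]),
                int(match["err"]),
                match["status"],
            ))
    return rows
```

Sorting the rows by `warning - percent` shows which filesystem is closest to its warning threshold. In the run above that is /wsadmin, at 82% with a warning threshold of 85.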

ENVIRONMENT

  • The “$SADMIN” environment variable must be defined and contain the root directory of the SADMIN tools (normally /opt/sadmin). This should already be done, since the setup script updates the ‘/etc/profile.d/sadmin.sh’ and ‘/etc/environment’ files.
  • The SADMIN configuration file is required and is loaded in memory at the beginning of every script. This file should already exist and contains your SADMIN configuration and preference settings.
  • Shell scripts use the Shell Library and Python scripts use the Python Library.
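
As an illustration of how a script might depend on “$SADMIN”, the Python sketch below derives the file locations mentioned on this page from that variable. It does not use the SADMIN Python Library. The function name and the returned dictionary keys are hypothetical, and using the short hostname (without the domain) is an assumption.

```python
import os
import socket


def sadmin_paths(environ=None, hostname=None):
    """Build the SysMon file locations from the SADMIN root directory."""
    environ = os.environ if environ is None else environ
    base = environ.get("SADMIN")
    if not base:
        raise RuntimeError("The SADMIN environment variable is not defined")
    # Assumption: file names use the short hostname (no domain part).
    host = hostname or socket.gethostname().split(".")[0]
    return {
        "sadmin_cfg": os.path.join(base, "cfg", "sadmin.cfg"),
        "sysmon_cfg": os.path.join(base, "cfg", f"{host}.smon"),
        "sysmon_template": os.path.join(base, "cfg", ".template.smon"),
        "report": os.path.join(base, "dat", "rpt", f"{host}.rpt"),
        "lock": os.path.join(base, "sysmon.lock"),
    }
```

Raising an error when the variable is missing makes the dependency on the environment explicit, instead of silently falling back to a guessed directory.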

EXIT STATUS

Exit Code Description
0 An exit status of zero indicates success.
1 Failure is indicated by a nonzero value, typically ‘1’.
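
A wrapper can rely on this convention to decide whether a run succeeded. The Python sketch below is a minimal illustration using the standard `subprocess` module. The function name is hypothetical, and the command is passed in by the caller rather than hard-coded.

```python
import subprocess


def run_check(command):
    """Run a command given as a list of arguments.

    Returns (ok, exit_code), where ok is True only for exit status 0.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # output is discarded in this sketch
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode == 0, completed.returncode
```

Calling `run_check(["sadm_sysmon.pl"])` on a SADMIN client (assuming the script is in your PATH) would return `(True, 0)` on success and `(False, n)` with a nonzero `n` on failure.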

AUTHOR

Jacques Duplessis
Suggestions and bug reports can be submitted at the support page.

Copyright © 2021 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later
This is free software, you are free to change and redistribute it.
There is NO WARRANTY to the extent permitted by law.

SEE ALSO

Link to … Description
sadm_sysmon_tui.pl Command line summary of alerts and failed scripts of all your servers.
sadm_sysmon.pl Client system monitor
sadm_fetch_clients.sh rsync all .rch/.log/.rpt files from active clients to the SADMIN server
SysMon configuration file Client System Monitor configuration file
sadmin.cfg SADMIN main configuration file
System Monitor report file Output file generated by System Monitor