WPS Monitoring

The usual way of identifying the process ID (PID) of a WebSphere server from the java processes it starts is clumsy on Solaris. With this value-added automation shell script, IBM WPS or WebSphere developers, administrators, architects, QA analysts, or a proactive technical manager can save a considerable amount of their own and their teams' time and effort.

Quantification of automation

The current manual process takes approximately 5 minutes to find out which process is causing a CPU spike or memory spike. On production servers in particular, a thread dump or heap dump has to be taken at the time of the issue for further analysis. It also helps to capture these statistics for tuning the performance of the WPS environment.

Believe it or not, this automation saves almost 600 productive hours per year across the Dev, QA, and Production environments. The figure may vary depending on the number of machines in the WPS setup.

The basic statistics come from the standard ps command, but the name of the java instance that started each process is identified with the /usr/ucb/ps command, whose wide output shows the full command line. Using grep and awk/nawk we can pick out the fields the script needs.
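As a minimal sketch of that extraction step, the function below reads ps-style lines on standard input and prints the PID together with the last field, which is the field the wasps script treats as the server name. The function name pid_and_server is made up for illustration. Plain awk is used so the snippet runs anywhere; on Solaris, nawk or /usr/xpg4/bin/awk is the safer choice.

```shell
# Hypothetical helper: read ps-style lines on stdin and print
# "PID<TAB>last-field" for every line that mentions java.
# The bracket in /[j]ava/ keeps this awk command from matching its own
# command line in a live ps listing, which is the job that
# "grep java | grep -v grep" does in a grep-based pipeline.
pid_and_server() {
    awk '/[j]ava/ { print $1 "\t" $NF }'
}
```

On a Solaris box it would be fed from the wide listing, for example: /usr/ucb/ps -axww | pid_and_server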

The 'wasps' Automation Goals

This week we developed a simple shell script for monitoring WebSphere Process Server running on Solaris. The script has the following goals:

1. Display process statistics such as the percentage of CPU load and the percentage of memory used by the WebSphere java processes (Application Servers, Message Servers, NodeAgents, and the DeploymentManager).

2. Show the process ID of each java process, which is then useful for taking thread dumps or heap dumps of troublesome WebSphere java instances.

To use it, create a script file with the following contents and name it "wasps", short for WebSphere Application Server process stats. Make it executable with chmod u+x wasps.

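#!/usr/bin/ksh
# wasps: CPU and memory statistics for WebSphere java processes

# PID and last argument (the server name) of every java process
/usr/ucb/ps -axww | grep java | grep -v grep | nawk '{print $1"\t"$NF}' > file1

# PID, %CPU, %MEM and command path of the current user's java processes
ps -U $LOGNAME -o pid,pcpu,pmem,comm | grep -i java > file2

echo "  PID  CPU  MEM PROCESS_DIR                                         SERVER"
echo "--------------------------------------------------------------------------"

# Join the two files on PID and append the server name to each statistics row
/usr/xpg4/bin/awk 'FNR==NR{a[$1]=$2;next}{print $0, a[$1]}' file1 file2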

 

Result of automation

The output gives exactly the statistics we wanted, as the sample run further down shows. We can extend the script further by:

1. Setting thresholds for CPU and MEM

2. Sending mail when abnormal conditions occur

3. Running the script from crontab or the AutoSys job scheduler
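The threshold idea could be sketched roughly as follows. The function name over_threshold, the limits, the mail address, and the wrapper path in the comments are all placeholders of ours, not part of the original script, and awk -v needs nawk or /usr/xpg4/bin/awk on Solaris.

```shell
# Hypothetical filter: read "PID CPU MEM PROCESS_DIR SERVER" rows on stdin
# and print only the rows whose CPU or MEM percentage exceeds the limits.
# Rows that do not start with a numeric PID (header, rule) are skipped.
# Usage: over_threshold CPU_LIMIT MEM_LIMIT
over_threshold() {
    awk -v cpu="$1" -v mem="$2" '
        $1 ~ /^[0-9]+$/ && ($2 + 0 > cpu + 0 || $3 + 0 > mem + 0)
    '
}

# Example wiring (limits and address are placeholders, not recommendations):
#   alerts=$(wasps | over_threshold 80 50)
#   [ -n "$alerts" ] && echo "$alerts" | mailx -s "WPS alert" admin@example.com
# A crontab entry such as
#   0,15,30,45 * * * * /path/to/wasps_alert
# would run a wrapper containing those two lines every 15 minutes.
```

Capturing the rows in a variable first means no mail is sent when nothing is over the limit.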

 

  PID  CPU  MEM PROCESS_DIR                                         SERVER

--------------------------------------------------------------------------

26763  0.7  4.3 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java dmgr

21425  0.4  3.2 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java myapp.Messaging.zmyserver10Node01.0

 6853  2.1  5.1 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java myapp.AppTarget.zmyserver10Node01.0

16703  0.4  2.1 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java nodeagent
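Since the PID column exists mainly to support thread dumps and heap dumps, a small lookup by server name can save another manual step. This is only a sketch: pid_for_server is a hypothetical helper name, and it assumes the server name is the last column, as in the rows above.

```shell
# Hypothetical helper: read report rows on stdin and print the PID of the
# row whose last column equals the given server name.
# Usage: pid_for_server SERVER_NAME
pid_for_server() {
    awk -v name="$1" '$NF == name { print $1 }'
}
```

For example, wasps | pid_for_server nodeagent would print 16703 for the run shown above. Sending that PID a SIGQUIT with kill -3 is the usual way to ask a JVM for a thread dump; where the dump ends up depends on the JVM and the WebSphere configuration.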

Conclusions

You can adapt this value-added script to your own WPS environment and earn some appreciation from your reporting manager or client. To conclude, the script lets us monitor the CPU and memory percentages used by the WebSphere processes. Some questions that may interest IT managers who follow ITIL remain open. My only concern is that every run of the script generates two files: file1 and file2. Is there any way to avoid using these two files? Do you have any fresh thoughts?
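One possible answer to the two-files question, offered as a sketch rather than a tested Solaris recipe, is to send both ps listings down a single pipe with a separator line between them and let awk do the join in memory. The names join_stats and wasps_nofiles and the "==" separator are our own choices; on Solaris, replace awk with nawk or /usr/xpg4/bin/awk.

```shell
# Hypothetical temp-file-free join. Stdin carries two sections separated by
# a line containing only "==":
#   section 1: "PID SERVER" pairs          (from /usr/ucb/ps)
#   section 2: "PID CPU MEM COMMAND" rows  (from ps -o pid,pcpu,pmem,comm)
join_stats() {
    awk '
        $0 == "==" { second = 1; next }       # separator: switch sections
        !second    { server[$1] = $2; next }  # section 1: remember the server
                   { print $0, server[$1] }   # section 2: append the server
    '
}

# Assumed Solaris wiring that mirrors the two ps calls of the wasps script.
# Defining the function runs nothing; call wasps_nofiles to get the report.
wasps_nofiles() {
    {
        /usr/ucb/ps -axww | awk '/[j]ava/ { print $1, $NF }'
        echo "=="
        ps -U "$LOGNAME" -o pid,pcpu,pmem,comm | grep -i java
    } | join_stats
}
```

The two listings are still taken one after the other, so a process that starts or stops in between can be missing from one side, just as with file1 and file2.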

Keywords

WebSphere, WAS, WPS, WebSphere Process Server, CPU monitoring, Memory monitoring, pid, comm, awk, Sparc, Solaris, nawk