WPS Monitoring
Identifying the process id (PID) of the Java processes started by WebSphere is clumsy on Solaris. With this automation shell script, IBM WPS or WebSphere developers, administrators, architects, QA analysts, or a proactive technical manager can save a considerable amount of their own time and their teams' effort.
Quantification of automation
The current manual process takes approximately 5 minutes to find the process causing a CPU or memory spike. On production servers especially, a thread dump or heap dump has to be taken at the time of the issue for further analysis. It is also useful to capture these statistics for tuning the performance of the WPS environment.
Believe it or not, this automation saves almost 600 productive hours per year across Dev, QA, and Production environments. The figure may vary depending on the number of machines in the WPS setup.
The basic functionality is achieved with the ps command, but the name of the Java instance behind each process is identified with the /usr/ucb/ps command. Using grep and awk/nawk, we can pick out the desired output for the script.
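To illustrate the grep and awk step, here is a small sketch that runs the same filter on one made-up line in the /usr/ucb/ps layout. The sample line, its paths, and the function name extract_pid_and_server are assumptions for illustration only. It relies on the server name being the last argument on the java command line, which is what the script's use of $NF suggests. Plain awk is used so the sketch also runs outside Solaris.

```shell
# Hypothetical line in the style of `/usr/ucb/ps -axww` output for a WebSphere JVM.
sample=' 26763 ?  S  12:34 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java -Xms256m com.ibm.ws.runtime.WsServer /opt/ibm/config cell01 node01 dmgr'

# Keep only java lines, drop the grep process itself, then print
# the PID (first field) and the last field of the command line.
extract_pid_and_server() {
    grep java | grep -v grep | awk '{print $1 "\t" $NF}'
}

printf '%s\n' "$sample" | extract_pid_and_server
```

For the sample line this prints 26763 and dmgr separated by a tab.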
The 'wasps' Automation Goals
This week we developed a simple shell script for monitoring WebSphere Process Server running on Solaris. The script has the following goals:
1. Display process statistics such as the percentage of CPU load and the percentage of memory used by WebSphere Java processes (including Application Servers, Message Servers, NodeAgents, and the DeploymentManager).
2. Report the process id of each Java process, which is useful for taking thread dumps or heap dumps of troublesome WebSphere Java instances.
To use it, create a script file with the following content and name it "wasps", meaning WebSphere Application Server process stats. Make it executable with chmod u+x wasps.
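#!/usr/bin/ksh
# file1: PID and last command-line argument (the server name) of every java process
/usr/ucb/ps -axww | grep java | grep -v grep | nawk '{print $1 "\t" $NF}' > file1
# file2: PID, %CPU, %MEM and executable path of the current user's java processes
ps -U "$LOGNAME" -o pid,pcpu,pmem,comm | grep -i java > file2
echo " PID CPU MEM PROCESS_DIR SERVER "
echo "----------------------------------------------------------------------------------"
# Load file1 into an array keyed by PID, then append the server name to each file2 row
/usr/xpg4/bin/awk 'FNR==NR{a[$1]=$2;next}{ print $0, a[$1]}' file1 file2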
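The last line of the script is a two-file awk join. FNR==NR is true only while the first file is being read. A minimal sketch of that idiom on made-up data follows. The function name demo_join, the sample rows, and the use of mktemp are assumptions for illustration, not part of wasps.

```shell
# Demonstrates the FNR==NR join on two small, made-up files.
demo_join() {
    dir=$(mktemp -d) || return 1

    # First file: PID <tab> server name
    printf '26763\tdmgr\n16703\tnodeagent\n' > "$dir/file1"
    # Second file: PID, %CPU, %MEM, executable path
    printf '26763 0.7 4.3 /opt/java\n16703 0.4 2.1 /opt/java\n' > "$dir/file2"

    # While reading file1 (FNR==NR), remember the server name for each PID.
    # For every line of file2, print the line followed by the remembered name.
    awk 'FNR==NR { name[$1] = $2; next } { print $0, name[$1] }' \
        "$dir/file1" "$dir/file2"

    rm -rf "$dir"
}

demo_join
```

Each row of the second file comes back with the matching server name appended, for example 26763 0.7 4.3 /opt/java dmgr.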
Result of automation
The output gives the desired statistics. We can extend the script further by setting up the following:
1. Thresholds for CPU and MEM
2. Mail alerts on abnormal conditions
3. Running the script from crontab or the AutoSys job scheduler
PID CPU MEM PROCESS_DIR SERVER
-----------------------------------------------------------------------------
26763 0.7 4.3 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java dmgr
21425 0.4 3.2 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java myapp.Messaging.zmyserver10Node01.0
6853 2.1 5.1 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java myapp.AppTarget.zmyserver10Node01.0
16703 0.4 2.1 /opt/ibm/WebSphere/ProcServer/java/bin/sparcv9/java nodeagent
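The threshold and mail extensions listed above could start from a filter like the one below. This is only a sketch: the function name over_threshold, the limits 80 and 70, the sample rows, and the mail address are all assumptions. On Solaris, nawk or /usr/xpg4/bin/awk would be needed for the -v option.

```shell
# Print only the wasps rows whose CPU or MEM percentage exceeds a limit.
# $1 = CPU limit, $2 = MEM limit (both in percent); rows are read from stdin.
# Header and separator lines evaluate to 0 and are therefore dropped.
over_threshold() {
    awk -v cpu="$1" -v mem="$2" '($2 + 0) > (cpu + 0) || ($3 + 0) > (mem + 0)'
}

# Made-up rows in the wasps layout: PID CPU MEM PROCESS_DIR SERVER
printf '%s\n' \
    '26763 0.7 4.3 /opt/java dmgr' \
    '6853 92.1 5.1 /opt/java myapp.AppTarget' |
    over_threshold 80 70

# Possible wiring (assumptions: the limits, the address, mailx being configured):
#   alerts=$(./wasps | over_threshold 80 70)
#   [ -n "$alerts" ] && echo "$alerts" | mailx -s "WPS alert" admin@example.com
```

A wrapper script containing the commented lines could then be scheduled from crontab at whatever interval suits the environment.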
Conclusions
Some interesting questions for IT managers who follow ITIL are listed below:
What are you doing in these areas for your SCA services?
Has monitoring the WPS been discussed in your organization?
How do you evaluate the different Application Servers and NodeAgents? Do you take their word for it? What about performance?
How often do you need to report on performance?
You can adapt this value-added script for your WPS environment and earn appreciation from your reporting manager or client. To conclude, with the above script we can monitor the CPU and MEM percentage used by WebSphere processes. My only concern is that every run of the script generates two files: file1 and file2. Is there any way to avoid using these two files? Do you have any fresh thoughts?
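One possible answer, offered as an untested sketch rather than a finished replacement: send both listings through a single pipe, separated by a marker line, and let awk switch from "remember names" to "print stats" when it sees the marker. The function name merge_streams and the "===" marker are assumptions. The Solaris commands run only where /usr/ucb/ps exists, and there nawk or /usr/xpg4/bin/awk may be the safer choice of awk.

```shell
# Reads two listings from stdin, separated by a line containing only "===".
# Before the marker: remember the last field (server name) for each PID.
# After the marker: print each stats line followed by the remembered name.
merge_streams() {
    awk '$0 == "===" { second = 1; next }
         !second     { name[$1] = $NF; next }
                     { print $0, name[$1] }'
}

# Run the real listings only on hosts that have the BSD-style ps.
if [ -x /usr/ucb/ps ]; then
    {
        /usr/ucb/ps -axww | grep java | grep -v grep
        echo "==="
        ps -U "$LOGNAME" -o pid,pcpu,pmem,comm | grep -i java
    } | merge_streams
fi
```

Because everything flows through one pipe, nothing is written to disk, and two runs at the same time can no longer overwrite each other's file1 and file2.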
Keywords
WebSphere, WAS, WPS, WebSphere Process Server, CPU monitoring, Memory monitoring, pid, comm, awk, Sparc, Solaris, nawk