October 29, 2016

Get notified of long-running PHP processes

If your whole stack is PHP, chances are you also use PHP on the command line, and also have PHP scripts configured to run automatically or periodically. Or you’re just too familiar with PHP, so it’s your go-to tool for all computing tasks.

It’s easy to forget about these automated scripts. Sometimes they get stuck for various reasons, like network problems or a problem in a third-party API, or some corner case you failed to cover. Then one day, you realize a script you programmed to run every midnight, which you expected to finish its work in two hours every time it ran, has been running for twelve days!

There are many ways to prevent this kind of situation. A sensible thing to do would be to set up and configure Monit. But if you’re looking for a more minimalist solution, I’ll tell you what I did: I wrote a very simple bash script that gets the list of running processes using ps, finds the lines containing ‘php’ using grep, and then uses a regex to pick the lines whose ELAPSED time is 10 hours or more. I then added this script to my .bash_profile, so every time I log in to my server, it tells me if there are any such processes. When one shows up, I check its output and source code to figure out what’s wrong, and usually kill it afterwards. Note that this shouldn’t happen too frequently: if your scripts keep getting stuck, you need to check your code.

Here is the whole code. There is a bit of extra work to keep the header line and to output nothing when no process matches.

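#!/bin/bash

# List all processes, keeping only the header line and lines mentioning php.
listWithHeader=$(ps -eo uid,pid,etime,comm,args | egrep "php|COMM")
header=$(echo "$listWithHeader" | head -n 1)
# Keep only the lines whose elapsed time is 10 hours or more.
longProcessList=$(echo "$listWithHeader" | egrep '((-[0-9])|[1-9])[0-9]:([0-9]{2}:?){2}')

# Count the non-empty lines that matched.
lineCount=$(echo "$longProcessList" | grep -v '^$' | wc -l)

# Print the report only when at least one process matched.
if [[ $lineCount -gt 0 ]]; then
    printf "PHP Processes Running Longer Than 10 Hours:\n"
    echo "$header"
    echo "$longProcessList"
    echo ""
fi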
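
For the login-time check, one way to hook the script into .bash_profile is sketched below. The path ~/bin/long-php-check.sh and the variable name long_php_check are assumptions made for this example; use whatever location you saved the script to.

```shell
# Assumed location of the script (hypothetical path; adjust to your setup).
long_php_check="$HOME/bin/long-php-check.sh"

# Run the check only if the script exists and is executable,
# so a missing file never breaks the login.
if [ -x "$long_php_check" ]; then
    "$long_php_check"
fi
```

Guarding the call with the -x test keeps the login quiet if the script is later moved or removed.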

The format of the ELAPSED column is [[DD-]hh:]mm:ss. So if a dash (-) is present, the process has been running for 1 day or longer. With this information, this is the regex to catch 10 hours or more:

((-[0-9])|[1-9])[0-9]:([0-9]{2}:?){2}

This regex will match something like -03:20:12 (one or more days, three hours…) or 12:10:54 (twelve hours…). Shorter values such as 09:59:59 or 59:59 are not matched.
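
If you want to convince yourself before trusting the pattern, you can try it on a few hand-written elapsed times. This is only a quick sketch: the helper name is_long and the sample values are made up for the test, and it uses grep -E, which is the equivalent of egrep.

```shell
# The 10-hours-or-more pattern from the script.
pattern='((-[0-9])|[1-9])[0-9]:([0-9]{2}:?){2}'

# Hypothetical helper: succeeds when the given elapsed time matches.
is_long() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Sample values in the [[DD-]hh:]mm:ss format.
for etime in '59:59' '09:59:59' '10:00:00' '12:10:54' '1-03:20:12'; do
    if is_long "$etime"; then
        echo "$etime  match"
    else
        echo "$etime  no match"
    fi
done
```

The first two values should be reported as no match and the last three as match.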

If you need to check for, say, 5 hours instead of ten, you could modify it like this:

((-[0-9]{2})|0[5-9]|[1-9][0-9]):([0-9]{2}:?){2}

Here 0[5-9] covers hours 05 to 09 and [1-9][0-9] covers hours 10 to 23.

I use this for PHP scripts, but you could use it for anything just by changing the search string you provide to the first egrep:

egrep "php|COMM"

COMM is there so that egrep also matches the header line. If you don’t care about the header, you can get rid of that as well.
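
If you check several kinds of processes, the two filters could be wrapped in a small function that takes the search string as an argument. This is a sketch only: filter_long_running is a made-up name, it expects ps output on standard input, and it skips the header handling.

```shell
# Hypothetical helper: reads `ps -eo uid,pid,etime,comm,args` output on stdin
# and prints the lines that mention $1 and have run for 10 hours or more.
filter_long_running() {
    grep -E "$1" | grep -E '((-[0-9])|[1-9])[0-9]:([0-9]{2}:?){2}'
}

# Example with canned input instead of a live ps call.
printf '%s\n' \
    '  501  4806 12:30:36 /usr/bin/php     /usr/bin/php ./a.php' \
    '  501  4807    05:12 /usr/bin/php     /usr/bin/php ./b.php' \
    '  501  4808 11:00:00 /usr/bin/python  python c.py' \
    | filter_long_running php
```

Only the first sample line should be printed. On a real server you would pipe ps -eo uid,pid,etime,comm,args into the function instead of the canned lines.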

An example output:

PHP Processes Running Longer Than 10 Hours:
  UID   PID  ELAPSED COMM             ARGS
  501  4806 12:30:36 /usr/bin/php     /usr/bin/php ./meRunYouLongTime.php

ahmet@home:$ 

© Ahmet Kun 2019