Thorough look at PHP's pcntl_fork()

26 Apr 2007

Even for die-hard PHP coders, pcntl_fork() can be quite daunting. The trouble is that most PHP users have learnt to code linearly, and mostly only for outputting dynamic content such as web pages.

The pcntl functions (and to some extent the posix functions) are different: they provide the means to program interactive applications that are unsuitable to run in a web environment like Apache. It's quite possible to program your own web server in PHP using these functions.

If you have done some serious programming in languages like C, pcntl_fork() shouldn't hold any secrets for you. It behaves almost identically to fork(). However, to the novice application programmers amongst us (like me), pcntl_fork() will appear highly illogical.


Why fork()?

Simple: PHP does not support any kind of threading, or asynchronous processing. Therefore, you have to fork if you want your application to do two or more things at the same time.

Say you are programming a networked server, which listens to a certain port, accepts connections and keeps them open. That can be done synchronously with while-loops and arrays containing socket resources but that poses one problem: when an action that is taken on one connection takes a while to finish, all other actions are stalled. The program is busy with that one connection. Threading would solve this: you simply launch a thread for each connection (well, kinda).

With pcntl_fork(), this can be done too. However, it's important to know what it actually does.

How does it work?

First, let's take a look at what a process is. Every process has a unique ID, called the PID. This is the reference to that process, which can be used to send it signals and the like. Every process also has a PPID, which refers to the parent process: the process which created it. For a script started from a shell, the PPID is the PID of that shell. When a parent dies before its child, the child is adopted by init (traditionally PID 1). A PPID of 0 is reserved for processes started by the kernel itself, such as init.

What fork() does (and pcntl_fork() too, since it is basically the same) is copy the current process. That's right: copy. It does not create an empty process, nor does it 'call' another program; it just creates an identical copy of the current process. Well, almost identical.

There are three differences to be noted:

  • PID
  • PPID
  • The magic difference

The first difference is logical: it's a new process, so it receives a new PID. The second is logical too: the parent process of the new process is the old process, so the PPID of the copy is the PID of the original. From now on, we'll call the copy 'the child' and the original 'the parent'.

There is a third difference, which can look illogical at first glance: the return-value of pcntl_fork().

The magic difference

pcntl_fork() returns an integer. To the child, it returns 0. To the parent, it returns either -1 or the PID of the new child. Now, this is interesting.

When the return value is -1, something went wrong. The child is not created, since the fork() system call didn't run or returned an error. The rest of the code is processed as if the fork() never happened.

If pcntl_fork() returns 0, you know that this is the child. This is the best way to tell the parent from the child.

On all other values, you know that this is the parent and that the child was successfully created with a new PID. The return value from pcntl_fork() is that PID. This creates a lot of possibilities for the parent: we know that we are the parent, and we know the name of the child.
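To make the three differences tangible, here is a minimal sketch. It assumes the pcntl and posix extensions are available on the command line; the variable names ($parentPid, $ppidMatched) are made up for this illustration. The child reports through its exit code whether its PPID equals the PID the original process had before forking.

```php
<?php
// Remember our PID before forking; the copy inherits this variable.
$parentPid = posix_getpid();

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "Could not fork\n");
    exit(1);
} elseif ($pid === 0) {
    // Child: it has a new PID, and its PPID is the original process.
    printf("child:  PID %d, PPID %d\n", posix_getpid(), posix_getppid());
    // Exit code 0 means "my PPID is the PID of the original".
    exit(posix_getppid() === $parentPid ? 0 : 1);
}

// Parent: pcntl_fork() returned the PID of the child.
printf("parent: PID %d, child PID %d\n", posix_getpid(), $pid);

// Wait for the child and read its verdict from the exit status.
pcntl_waitpid($pid, $status);
$ppidMatched = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
```

The order of the two printed lines is not guaranteed, since both processes run independently after the fork.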

What's next?

Let's look at some code:

  // define some variables
  $somevar1 = "a";
  $somevar2 = "b";
  // fork
  $pid = pcntl_fork();
  if($pid == -1) {
   // Something went wrong (handle errors here)
  } elseif($pid == 0) {
   // This part is only executed in the child
  } else {
   // This part is only executed in the parent
  }
  // this part is executed in both the parent and the child, if they still live
  echo($somevar1);
  echo($somevar2);

What we have here is a program that copies itself once. Both copies have the same defined variables and execute the same functions (both echos). The if-elseif-else part is the main point where you can separate the two processes.

When you want the child to do something altogether different from the parent, you should arrange that using the returned $pid. Remember: everything that is defined before pcntl_fork() (variables, functions, classes) exists in both the parent and the child. This can be undesirable, so it should be handled carefully.
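The "copy" part is easy to check. In the sketch below (the variable names are made up for this illustration), the child changes a variable that was defined before the fork and hands the result back as its exit code. The parent's copy stays untouched. Exit codes only go from 0 to 255, which is enough for a demonstration but not a serious way to pass data around.

```php
<?php
$counter = 1; // defined before the fork, so both processes get their own copy

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "Could not fork\n");
    exit(1);
} elseif ($pid === 0) {
    $counter += 41;  // this only changes the child's copy
    exit($counter);  // the child ends with exit code 42
}

// Parent: wait for the child and compare both values.
pcntl_waitpid($pid, $status);
$childCounter = pcntl_wexitstatus($status);
echo "parent still has $counter, child ended with $childCounter\n";
```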

You can exit each process without consequences for the other. Well, almost.

Parental responsibilities

When a child process dies, its death should be handled by the parent. If it isn't, the child becomes a zombie: it no longer consumes memory or CPU time, but it still occupies a slot in the process table, with a PID and all that. This is undesirable, since most (all?) operating systems have an upper limit on the number of processes they can handle.

When a child dies, a signal is sent to the parent (SIGCHLD). The parent can then handle the death of the child for internal processing. The correct way to unzombie a child is using pcntl_waitpid(). You can use that function to wait until the child dies, or to detect that a child has already died. Use pcntl_wait() when you want to do this for a myriad of children. Look at the relevant section of the PHP manual for more options (including letting the function know not to suspend normal operation).
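As a small sketch of both uses of pcntl_waitpid() (assuming the pcntl extension; the sleep time and exit code are arbitrary): with WNOHANG the call returns 0 right away while the child is still alive, and without it the parent is suspended until the child has died.

```php
<?php
$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "Could not fork\n");
    exit(1);
} elseif ($pid === 0) {
    usleep(200000); // the child pretends to work for 0.2 seconds
    exit(3);
}

// Non-blocking check: returns 0 because the child has not died yet.
$early = pcntl_waitpid($pid, $status, WNOHANG);

// Blocking wait: returns the PID of the child once it has died.
// After this call the child is no longer a zombie.
$reaped = pcntl_waitpid($pid, $status);

if (pcntl_wifexited($status)) {
    echo "child $reaped exited with code " . pcntl_wexitstatus($status) . "\n";
}
```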

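Using SIGCHLD, however, is not always foolproof. When you quickly create many short-lived children, handling SIGCHLD in combination with pcntl_waitpid() might not catch all zombie processes: standard signals are not queued, so several children dying at nearly the same time can result in a single SIGCHLD. I find this way to work best: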

  $children = array(); // Create an array which will contain the PIDs of the children
  while(true) { // Note: this demonstration forks endlessly; stop it with Ctrl-C
   $pid = pcntl_fork();
   if($pid == -1) {
    // Something went wrong (handle errors here)
    die("Could not fork!");
   } elseif($pid == 0) {
    // This part is only executed in the child
    usleep(500);
    exit(); // The child dies after a short while, becoming a zombie
   } else {
    // This part is only executed in the parent
    $children[] = $pid; // Push the PID of the created child into $children
   }
   // When children die, this gets rid of the zombies.
   // WNOHANG makes pcntl_wait() return at once when no child has died yet.
   while(pcntl_wait($status, WNOHANG) > 0) {
    usleep(5000);
   }
   // We want to handle the death of the children, ie: getting them out of the $children array.
   foreach($children as $key => $val) {
    if(!posix_kill($val, 0)) { // This detects if the child is still running or not
     unset($children[$key]);
    }
   }
   $children = array_values($children); // Reindex the array
  }

When we examine the above code, we'll see that the children that are created also have the $children array. That might be desirable, since the children then know their brothers and sisters. Note that $children in the child does not contain its own PID!

When you don't want to use $children in the child, it's a good idea to unset() it. The parent will still have it, but it won't consume extra memory in the child. This goes for all variables defined before pcntl_fork() was called, of course.

Summoning a daemon

In UNIX (and the like), a daemon is a process that runs in the background. It has no (living) parent. It is really easy to daemonize your PHP program using pcntl_fork():

  // define some variables
  $somevar1 = "a";
  $somevar2 = "b";
  // fork
  $pid = pcntl_fork();
  if($pid == -1) {
   // Something went wrong (handle errors here)
  } elseif($pid == 0) {
   // This part is only executed in the child
  } else {
   // This part is only executed in the parent
   exit();
  }
  // this part is executed as the daemon we wanted. The parent is dead.

We need to consider some other things though: a PHP-script usually times out after a while, causing it to exit. For a decent daemon, this is undesirable. Use the following at the start of your program to prevent it:

  ini_set("max_execution_time", "0");
  ini_set("max_input_time", "0");

When you want the program (daemon or not) to handle signals, use this bit of code:

  declare(ticks = 1);
  $sigterm = false;
  $sighup = false;
  function sig_handler($signo) {
   // Without this line, the flags set below would be local to the function
   global $sigterm, $sighup;
   if($signo == SIGUSR1) {
    // Handle SIGUSR1
   } elseif($signo == SIGTERM) {
    // Handle SIGTERM
    $sigterm = true;
   } elseif($signo == SIGHUP) {
    // Handle SIGHUP
    $sighup = true;
   }
  }
  pcntl_signal(SIGUSR1, "sig_handler");
  pcntl_signal(SIGTERM, "sig_handler");
  pcntl_signal(SIGHUP, "sig_handler");

We need the declare part to be sure that each signal is handled when it is received. Read up about ticks and pcntl_signal for more details.
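On newer PHP versions there is an alternative to ticks. The sketch below assumes PHP 7.1 or later, where pcntl_async_signals() exists; it is not what the rest of this article uses. The handler only sets a flag and the main code decides what to do with it. The script sends itself a SIGHUP to show that the handler fires.

```php
<?php
// PHP 7.1+: have signal handlers called without declare(ticks = 1)
pcntl_async_signals(true);

$sighup = false;
pcntl_signal(SIGHUP, function (int $signo) use (&$sighup) {
    $sighup = true; // only set a flag here; act on it in the main loop
});

// Send ourselves a SIGHUP to see the handler at work.
posix_kill(posix_getpid(), SIGHUP);
usleep(1000); // give PHP a moment to run the handler

echo $sighup ? "SIGHUP received\n" : "No signal seen yet\n";
```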

Conclusion and example

We've now seen that pcntl_fork() creates a copy of the current process and what the differences are between the parent and the child. We've covered the basics of separating the two processes and handling the death of the child. We've also learnt how to daemonize our PHP script and how to live forever. And lastly, we've learnt how to handle signals.

All in all, pcntl_fork() is a great and powerful tool for writing PHP scripts that behave as grown-up applications. It still would be a lot better (or at least more flexible) if we could make use of asynchronous processing or threads, but PHP just is not made for that (JavaScript can do that, and for application programming: take a look at Perl or give up and learn C++. I've done neither).

Now, let's write a basic server which daemonizes, accepts connections (from telnet or something) and handles them in a good fashion. Connect to it with telnet on port 2007, say something followed by a return and see what it does. It echoes the PID of the daemon which can be used to send a SIGTERM, which will cause the program to exit in a good fashion, or a SIGHUP which will cause the program to reinit.

  1. # Give us eternity to execute the script. We can always kill -9
  2.  ini_set("max_execution_time", "0");
  3.  ini_set("max_input_time", "0");
  4. # Set up the basic
  5.  declare(ticks = 1);
  6.  $wnull = null;
  7.  $enull = null;
  8.  $max = 30;
  9.  $child = 0;
  10.  $children = array();
  11.  $maxseen = 0;
  12.  $totseen = 0;
  13.  $sigterm = false;
  14.  $sighup = false;
  15.  $started = time();
  16. # Do funky things with signals
  17.  function sig_handler($signo) {
  18.   global $sigterm;
  19.   global $sighup;
  20.   if($signo == SIGTERM) {
  21.    $sigterm = true;
  22.   } elseif($signo == SIGHUP) {
  23.    $sighup = true;
  24.   } else {
  25.    echo("Funny signal!\n");
  26.   }
  27.  }
  28.  pcntl_signal(SIGTERM, "sig_handler");
  29.  pcntl_signal(SIGHUP, "sig_handler");
  30. # Fork and exit (daemonize)
  31.  $pid = pcntl_fork();
  32.  if($pid == -1) {
  33. # Not good.
  34.   die("There is no fork()!");
  35.  } elseif($pid) {
  36.   echo($pid);
  37.   exit();
  38.  }
  39.  $parentpid = posix_getpid();
  40. # And we're off!
  41.  while(!$sigterm) {
  42. # Set up listener
  43.   if(($sock = socket_create_listen(2007, SOMAXCONN)) === false) {
  44.    echo("No sense in creating socket. Reason: " . socket_strerror(socket_last_error()) . "\n");
  45.    $sighup = true;
  46.   }
  47. # Whoop-tee-loop!
  48.   while(!$sighup && !$sigterm) {
  49. # Patiently wait until some of our children die. Make sure we don't use all powers that be.
  50.    while(pcntl_wait($status, WNOHANG) > 0) {
  51.     usleep(5000);
  52.    }
  53.    foreach($children as $key => $val) {
  54.     if(!posix_kill($val, 0)) {
  55.      unset($children[$key]);
  56.      $child = $child - 1;
  57.     }
  58.    }
  59.    $children = array_values($children);
  60.    if($child >= $max) {
  61.     usleep(5000);
  62.     continue;
  63.    }
  64.    $rarray = array($sock); # Wait for somebody to talk to. socket_select() takes its arrays by reference.
  65.    if(socket_select($rarray, $wnull, $enull, 0, 0) <= 0) {
  66.     usleep(5000);
  67.     continue;
  68.    }
  69.    if(($conn = socket_accept($sock)) === false) {
  70.     echo("Miscommunicating. Reason: " . socket_strerror(socket_last_error()) . "\n");
  71.     $sighup = true;
  72.     continue;
  73.    }
  74. # Fork a child.
  75.    $child++;
  76.    $totseen++;
  77.    if($child > $maxseen) {
  78.     $maxseen = $child;
  79.    }
  80.    $pid = pcntl_fork();
  81.    if($pid == -1) {
  82. # Not good.
  83.     die("There is no fork!");
  84.    } elseif($pid) {
  85. # This is the parent. It doesn't do much.
  86.     socket_close($conn);
  87.     $children[] = $pid;
  88.     usleep(5000);
  89.    } else {
  90. # This is a child. It dies, hopefully.
  91.     socket_close($sock);
  92.     $bufsize = 2048;
  93.     while(true) {
  94. # Happy buffer reading!
  95.      if(($tbuf = socket_read($conn, $bufsize, PHP_BINARY_READ)) === false) {
  96.       echo("Misread. Reason: " . socket_strerror(socket_last_error()) . "\n");
  97.       break;
  98.      }
  99.      $rbuf = $tbuf;
  100.      while(strlen($tbuf) == $bufsize) {
  101.       if(($tbuf = socket_read($conn, $bufsize, PHP_BINARY_READ)) === false) {
  102.        echo("Misread. Reason: " . socket_strerror(socket_last_error()) . "\n");
  103.        break;
  104.       }
  105.       $rbuf .= $tbuf;
  106.      }
  107. # Formulating answer
  108.      $wbuf = "\n\nTotal requests: $totseen\nMaximum simultaneous: $maxseen\nCurrently active: " . (count($children) + 1) . "\nRunning since: " . date("D, d M Y H:i:s T", $started) . "\nServer time: " . date("D, d M Y H:i:s T", time()) . "\n\nYour request: $rbuf\n\n";
  109. # Going postal!
  110.      if(socket_write($conn, $wbuf) === false) {
  111.       echo("Miswritten. Reason: " . socket_strerror(socket_last_error()) . "\n");
  112.       break;
  113.      }
  114.      break;
  115.     }
  116. # Let's die!
  117.     socket_close($conn);
  118.     exit();
  119.    }
  120.   }
  121. # Patiently wait until all our children die.
  122.   while(pcntl_wait($status, WNOHANG) > 0) {
  123.    usleep(5000);
  124.   }
  125. # Kill the listener.
  126.   if($sock !== false) {
  127.    socket_close($sock);
  128.   }
  129.   $sighup = false;
  130.   $started = time();
  131.  }
  132. # Finally!
  133.  exit();

Happy coding!




Comments

sonic

The sonic server daemon has a plugin-type architecture, if you don't want to build your own...

By anon (not verified) at Tue, 02/09/2008 - 10:08am

Check if daemon is running?

How can I check if a daemon is running?

By anonymous (not verified) at Wed, 20/08/2008 - 5:15pm

Depends...

That depends on where you want to check from. If you want to check from the parent, you already know the PID of the daemon (it is returned by pcntl_fork). You can then check by doing:

  if(posix_kill($pid, 0)) {
   echo("Daemon is still running!");
  } else {
   echo("Daemon has run away!");
  }

posix_kill will send a given signal (in this case, 0) to a given PID. It returns true when successful and false when not. Signal 0 doesn't actually deliver anything; it only checks whether the process exists and may be signalled, so it is the preferred way to check.

Remember, after doing pcntl_fork there are two processes: the child and the parent. The child will see 0 as the return value of pcntl_fork and the parent will get the PID from the child as the return value.

If you want to check from somewhere other than the parent whether the daemon is running, you should create a pidfile. This is done by writing out the PID to a file with the daemon's name, and is often done to prevent multiple instances of it. The child should write out the pidfile, and it can do so by using posix_getpid.
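A minimal sketch of the pidfile idea follows. The file name and the function daemon_is_running() are made up for this example, and the temp directory is only used to keep it self-contained (a real daemon would typically use a fixed location such as /var/run). For brevity, the same script plays both roles: it writes the pidfile like the daemon would, then checks it like another script would.

```php
<?php
// Hypothetical pidfile location for this example
$pidfile = sys_get_temp_dir() . "/mydaemon-example.pid";

// In the daemon (the child): write out our own PID.
file_put_contents($pidfile, (string) posix_getpid());

// In any other script: read the PID back and probe it with signal 0.
function daemon_is_running(string $pidfile): bool {
    if (!is_readable($pidfile)) {
        return false; // no pidfile, so we assume no daemon
    }
    $pid = (int) trim(file_get_contents($pidfile));
    return $pid > 0 && posix_kill($pid, 0);
}

$running = daemon_is_running($pidfile);
echo $running ? "Daemon is still running!\n" : "Daemon has run away!\n";

// The daemon should remove its pidfile when it exits.
unlink($pidfile);
```

Keep in mind that posix_kill also returns false when the process exists but belongs to another user, so run the check as the same user as the daemon (or as root).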

By FST777 at Wed, 20/08/2008 - 8:47pm

Shared memory.

Hi,
That's a good article!

Now, there is a way to provide IPC (InterProcess Communication) in PHP, to allow communication between the forks (or between 2 PHP scripts).
Please, look at this: http://fr2.php.net/manual/en/ref.sem.php

SEM (Semaphore), which gives a way to control access to resources (to avoid conflicts): http://en.wikipedia.org/wiki/Semaphore_%28programming%29
SHM (SHared Memory segment), which creates a shared place to store variables.
MSG (MeSsaGe queue), which allows sending messages to other forks/threads.

Examples of implementation:
(french links, sorry)
http://www.noisette.ch/wiki/index.php/PHP/Multithread
http://blog.lalex.com/post/2004/06/15/Multi-threading-en-PHP-:-vers-une-solution-MAJ

By Mickaël Menu at Mon, 11/06/2007 - 6:02am

Thanks for the input!

If one is going to do anything serious with multiple processes and intercommunicating processes, those links are indeed useful.

If only I could speak French... ;-)

Note, however, that if you are going to use a background PHP daemon and a front end web page PHP script, I would advise not to use those techniques, but rather communicate through files / databases.

By FST777 at Mon, 11/06/2007 - 6:52am

Class: Thread

This is a class (with English documentation) which implements pseudo-threads in PHP, using IPC: http://www.phpclasses.org/browse/package/1136.html

Look at the code, it's very easy to understand (with the PHP manual at hand). :)

Personally I use declare(ticks = 1) + register_tick_function() to check the Message Queue, and not pcntl signals like the above class.

By Mickaël Menu at Mon, 11/06/2007 - 12:33pm

Summoning a background process with database access

Hi! Very good this post of yours! Thanks!

I still have a question though: I'm designing a data-aware object and one of its methods (a queue) must run in the background. The problem is: as soon as the parent process dies, the child loses the database connection.

How do we avoid this? Do I have to connect to the database from the child method?

By José Luís Carneiro (not verified) at Sat, 09/06/2007 - 9:36pm

First of all: thanks!

First of all: thanks! :-)

I'm not completely sure on this, but I believe it depends on the fact that a connection to a database lasts for as long as there is no call to close the connection or as long as the process lives.

When using mysql_connect() in a "regular web page" from PHP, you don't *have* to call mysql_close(), as long as you are sure the PHP process is not going to last forever. This might be the problem with your situation.

I still need to test this, but I assume that the connection to the database is shared between the parent and its children if the call to open it was done before pcntl_fork(). If that is the case, it might be possible that the connection is implicitly closed as soon as the parent process dies. I wonder what happens when a child dies, though.

If all that is true, you need to reestablish the connection after pcntl_fork(). With all I know and assume now, the best practice would be closing all the existing connections after forking and reopening them for every instance. Alternatively, you could check the resource identifier before each request. Maybe using mysql_pconnect() helps too.

By the way: my first suggestion is always a good design method, just in case the database has a weird per-connection handling of locks or transactions. Besides, I'm hoping that the background process is not called via pcntl_fork() in a PHP script that is called via a web page. Things get nasty when you do it that way. Use things like cron or a daemon that reads out a database / file with request for that purpose.

By FST777 at Sun, 10/06/2007 - 12:04am

Sorry, I forgot...

... to say: to make things worse, although I don't have much PHP practice, I'm using its OOP... The afore-mentioned queue is a method of an object called (surprise!) "queue".

I used pcntl_fork to maintain all the variables in the background... Is it possible to mimic this with exec(), shell_exec() or any other function?

By José Luís Carneiro (not verified) at Fri, 15/06/2007 - 11:13pm

Some ideas...

I think I would go with cron-jobs calling a PHP script. The variables can be stored in a database or flat file (I'd use SQLite, which is magnificent). That also works with the various exec() calls, which might be more suitable for your environment. You can always call the script from a remote machine using some sort of scheduling mechanism.

You could do a foreach routine right before the end of execution to store all data and then read that into a new object when the process is called upon the next time.

It might indeed be tempting to keep the process running, just to maintain data. Storing and retrieving data is indeed overhead, but not much. Look at how much memory a typical PHP process consumes. Without fiddling too much with the settings an empty script can easily take 30 Megs of memory. That's a lot of waste, especially since PHP slows down dramatically when it needs to swap.

On the other hand, I do not know your exact needs. But given what I do know, I'd write a storage routine for the object.

By FST777 at Sat, 16/06/2007 - 12:04am

I have good and bad news...

Hi, I managed to rewrite my process so it's divided in two parts. A daemon-like server (receiving requests through MySQL) and webpage clients...

I'd want to keep the server running (in background) until a specific flag is appended to the table. It works fine! YEAH! Thanks for your help on this!

Now the "bad news" part: if the process keeps doing nothing (waiting for a new record to process) it is automatically killed after 2 minutes... I tried using set_time_limit(0), but had no success... :(

Can you help me again?

By José Luís Carneiro (not verified) at Sun, 17/06/2007 - 12:30am

Two thoughts on this:

The first one:
Add the following to the top of your PHP file:

  ini_set("max_execution_time", "0");
  ini_set("max_input_time", "0");

set_time_limit(0) only does so much. The two ini_set()'s are needed too, provided that the server supports the call.

Second one:
Your webserver does not permit scripts to run this long (this can be configured). In that case, use a script that is called periodically from a remote server. If you have cron access to a Unix / Linux box, you can do this with a cronjob calling wget on the page. Other solutions are conceivable. Tough shit, though :-).

Good luck. Try the first solution; if it doesn't work, I can possibly help you with setting up a remote scheduler, but I will need more info.

By FST777 at Sun, 17/06/2007 - 12:53am

Thanks!

Hi, I tried the first option but didn't succeed. Now, I'm using the second option (a loop), I spawn the background process and keep its PID. Then, every second I check that PID (through ps and grep) and if it's not running, I spawn it again. :)

There's a little side-effect: when I try to stop it (through a flag in the database) it takes time... I think I have to "stop" it many times, one for each "resurrection"... I'll study this a little more...

Thanks! :)

By José Luís Carneiro (not verified) at Sun, 17/06/2007 - 2:03am

You're welcome!

When you do not start a resurrection when the flag is on, it shouldn't be a problem.

I can't fully picture the solution you have, but if it works, good for you. Sounds complicated, to say the least :-)

Good luck!

By FST777 at Sun, 17/06/2007 - 12:58pm

Sounds like your webhost has

Sounds like your webhost has limits set to stop users' processes running for longer than 2 min.
Contact support and ask for some clarification.

By Paul (not verified) at Wed, 14/11/2007 - 5:46am

Alternatives to pcntl_fork... :-(

Hi, I gave you some time to answer before returning, now I have some more doubts... :)

I got rid of the database access problem (opened the connection only inside the spawned process)...

Since the day I saw your article, I've been finding concerns about pcntl_fork from a web process, just like you do (thanks once again), so it must be right. I've been lucky and haven't experienced any problem yet, probably because I only spawn ONE background process (and check to avoid starting any other instance).

Unfortunately, I'll have to find an alternative to pcntl_fork anyway... My webserver offers PHP5 but has disabled pcntl functions... :'(

Since I only need to monitor a table to read, process and flag any new records added to it, I can think of two alternatives:
1. Manually start a queue.php with a while(true) loop; it would bail out if it found a specific record in the table;
2. Put the loop inside a queue.php and spawn it using exec() or shell_exec()...

What do you think? Any other ideas?

By José Luís Carneiro (not verified) at Fri, 15/06/2007 - 11:01pm

Parse error line 67

Hi,

Please note the parse error line 67:
replace it by

if(socket_select($rarray = array($sock), $wnull, $enull, 0, 0)) {

Thanks for this great post.

Best regards,

Nicolas PESTANA

By Nicolas PESTANA (not verified) at Fri, 01/06/2007 - 3:35pm

Fixed, but not as you said :-)

I recently revamped this site, adding GeSHi capabilities (code highlighting), and edited all the pages accordingly. Apparently Drupal doesn't accept < properly. It just skips until the next >.

I'm actually surprised that was the only parse error. Quite a coincidence. Note that a great deal of code was missing before the fix. Nasty business.

Anyway, I did a replace for < and > using &lt; and &gt;. That should do the trick. It might be the case that the other blocks of code are still not valid. I'll look at them later.

Thanks!

