PHP Performance Boosting: __autoloading classes

Having installed memcached to improve the performance of sessions on my web server, I still wanted to squeeze some more performance out of my PHP application.

First, some background: My application framework, iota, contains a huge number of libraries – Database abstraction, PDF generation, image manipulation, various specialised content handlers, access control and so on.  Until yesterday, all these libraries were include()’d on every request, despite the fact that most of them are only rarely used.

Simply dropping the up-front includes and calling include() just before use wasn’t an option: there’s too much legacy code that expects certain libraries to be available.  Instead, I turned to PHP’s class autoloading feature.  When PHP encounters a reference to a class that isn’t declared, it calls the global __autoload($classname) function to try and load it.  This was the behaviour I was after: a way to load things when (and if) they’re needed.
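
As a rough illustration of that mechanism (not the loader from this post), the sketch below registers an autoload callback and lets PHP call it on first use of a class.  It uses spl_autoload_register() rather than a global __autoload() function, because __autoload() was deprecated in PHP 7.2 and removed in PHP 8.0.  The DemoReport class, the $demoMap array and the temporary directory are made up for the example.

```php
<?php
// Create a throwaway class file so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . '/DemoReport.php',
    '<?php class DemoReport { public function title() { return "report"; } }'
);

// Hypothetical map: lower-cased class name => file that declares it.
$demoMap = array('demoreport' => $dir . '/DemoReport.php');

// PHP calls this closure the first time an undeclared class is referenced.
spl_autoload_register(function ($class) use ($demoMap) {
    $key = strtolower($class);
    if (isset($demoMap[$key])) {
        include_once $demoMap[$key];
    }
});

// Passing false to class_exists() checks without consulting the autoloader.
$loadedBefore = class_exists('DemoReport', false);
$report = new DemoReport(); // first use triggers the autoloader
$loadedAfter = class_exists('DemoReport', false);
```

The file is only read when DemoReport is first instantiated; a request that never touches the class never pays for the include.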

The next, and perhaps somewhat larger, part of the challenge was to teach __autoload where each class is defined, so that it knows which file to load when a particular class is requested.  I’m a great believer in not doing things myself when they can be automated, and also in the idea that things should just work.  To this end, I wrote a new library to handle class loading that discovers for itself where classes are declared.

The loader has two modes of operation.  The first is “training mode”.  In training mode, all the libraries are loaded, but the loader makes a note of which classes are defined by each library (by comparing the list of declared classes before and after include()’ing the file).  This “class map” is then stored in a temporary file for subsequent requests.
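
The before-and-after comparison can be tried in isolation.  The following is a minimal sketch under made-up names (DemoAlpha, DemoBeta and the temporary file are not part of iota); it round-trips the map through serialize() as a stand-in for the temporary file.

```php
<?php
// Throwaway library file declaring two classes, so the sketch is self-contained.
$lib = sys_get_temp_dir() . '/training_demo_' . uniqid() . '.php';
file_put_contents($lib, '<?php class DemoAlpha {} class DemoBeta {}');

// Snapshot, include, snapshot again: the difference is what the file declared.
$before = get_declared_classes();
include_once $lib;
$declared = array_values(array_diff(get_declared_classes(), $before));

// Build the class map (lower-cased class name => file) and round-trip it
// through serialize(), as a stand-in for the temporary file on disk.
$trainedMap = array();
foreach ($declared as $class) {
    $trainedMap[strtolower($class)] = $lib;
}
$restored = unserialize(serialize($trainedMap));
```

Class names are lower-cased before being used as keys because PHP class names are case-insensitive, so lookups must not depend on how the calling code happens to spell them.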

If the loader finds a pre-existing class map, it enters dynamic loading mode.  In this mode, requests to load libraries are ignored but an __autoload function is set up to use the class map to load libraries as and when a class that they declare is used.

The implementation is, as ever, not quite as simple as the explanation.  What happens, for instance, where a library loads other libraries? (In this case, the loader associates the class with multiple files and loads them in order).  The solution I used is below.  To use it, one just needs to call iotaLoader::getInstance()->load(FILENAME); instead of include(FILENAME);

<?php

iotaConf::getInstance()->setDefault('dynamic_class_loading', true); // Enable dynamic autoload by default

/**
 * List of files to always load when requested
 * 
 * There is a speed penalty when using this option!  We need to search for full file paths in lib directory before deciding
 * whether we need to load things.  If this array is empty, we have a simple optimisation of not bothering to do so!
 */

iotaConf::getInstance()->setDefault('dynamic_class_forceload', array()); 

class iotaLoader
{
	private static $instance;
	
	public static function getInstance()
	{
		if(!self::$instance)
		{
			$class = __CLASS__;
			self::$instance = new $class();
		}
			
		return self::$instance;
	}
	
	
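	private $autoload = false;
	private $classmap = array();
	private $wd = false;
	private $load_immediate = array(); // Files to load even in autoload mode (from dynamic_class_forceload)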
	
	
	private function __construct()
	{
		$this->wd = getcwd();
		
		$this->load_immediate = iotaConf::getInstance()->dynamic_class_forceload;
		
		// If autoloading is disabled...
		if(iotaConf::getInstance()->dynamic_class_loading == false)
		{
			$this->autoload = false;
			return;
		}
		
		// Otherwise, see whether we CAN use autoloading or whether we need to arrange for it to be set up
		if($this->loadClassMap())
		{
			//echo "Dynamic class loading enabled";
			
			$this->autoload = true;
			
			// Register the autoloader.  spl_autoload_register() is used because the
			// global __autoload() hook was deprecated in PHP 7.2 and removed in PHP 8.0.
			spl_autoload_register(array($this, 'autoload'));
		}
		else
		{
			//echo "Dynamic class loading in training mode";
			
			$this->autoload = false;
			
			register_shutdown_function(array($this, 'storeClassMap'));
		}
	}
	
	/**
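	 * $list is a filename or an array of filenames to load, $from is a directory or an array of directories to look in
	 *
	 * @throws iotaCannotFindModuleException
	 */
	public function load($list, $from='lib')
	{	
		// Accept a single filename as well as a list, so that load(FILENAME) works
		if(!is_array($list))
			$list = array($list);
		
		if(!is_array($from))
			$from = array($from);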
		
		if($this->autoload && count($this->load_immediate) == 0)
			return;
			
		foreach($list as $module)
		{			
			$found = false;
			
			foreach($from as $dir)
			{
				$filename = iotaPath::makepath($dir,$module);
				
				// If we're autoloading, then only load forced modules
				// modules can be forced in config, OR by an autoload request that loads a file which itself contains load() calls
				if($this->autoload && !in_array($filename, $this->load_immediate))
				{
					$found = true;
					continue;
				}
				
				// By this point we have to actually go ahead and load the file
				if(file_exists($filename))
				{
					$classes = $this->loadAndGetClasses($filename);
					$this->setClassMap($filename, $classes);
					$found = true;
					break;
				}
			}
			
			if(!$found)
			{
				throw new iotaCannotFindModuleException("Unable to load $module (file not found in module paths)");	
			}
		}
	}
	
	/**
	 * Load a file/library and return a list of classes that it defined
	 */
	private function loadAndGetClasses($file)
	{
		$initclasses = array_merge(get_declared_classes(), get_declared_interfaces());
		
		include_once($file);
		
		$classes = array_merge(get_declared_classes(), get_declared_interfaces());
		
		$newclasses = array_diff($classes, $initclasses);
		
		return $newclasses;
	}
	
	/**
	 * Get the class map from disk - False if it doesn't exist
	 */
	private function loadClassMap()
	{
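		$host = $_SERVER['SERVER_NAME'];
		$path = $this->wd.'/systemp/classmap_'.$host; // Same absolute path that storeClassMap() writes to
		
		if(!file_exists($path))
			return false;
		
		// Treat a corrupt or empty map as missing, so that we fall back to training mode
		$map = unserialize(file_get_contents($path));
		
		if(!is_array($map) || count($map) == 0)
			return false;
		
		$this->classmap = $map;
		return true;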
	}
	
	/**
	 * Set an entry in the classmap
	 * 
	 * Where calls to load() are nested (eg a library uses iotaloader itself) a single class
	 * may become associated with multiple files at different levels.  If this is the case,
	 * an array of files is created.  Because of the nesting (last load() in the stack returns 
	 * first, LIFO), the first file that needs to be loaded is the LAST entry.
	 */
	private function setClassMap($file, $classes)
	{
		foreach($classes as $c)
		{
			$c = strtolower($c);
			
			// If the class already has an entry, then create a compound entry
			if(array_key_exists($c, $this->classmap))
			{
				if(!is_array($this->classmap[$c]))
					$this->classmap[$c] = array($this->classmap[$c]);
				
				$this->classmap[$c][] = $file;
			}
			else
			{
				$this->classmap[$c] = $file;
			}
		}
	}
	
	/**
	 * Save list of classes for dynamic autoloading
	 */
	public function storeClassMap()
	{
		$host = $_SERVER['SERVER_NAME'];
		
		$f = fopen($this->wd.'/systemp/classmap_'.$host, 'w+');
		fwrite($f, serialize($this->classmap));
		fclose($f);
	}
	
	/**
	 * Autoloader - Uses the cached class->file mapping to load classes as required
	 */
	public function autoload($class)
	{
		$class = strtolower($class);
		
		if(array_key_exists($class, $this->classmap))
		{
			$map = $this->classmap[$class];
			
			if(is_array($map))
			{
				while(($file = array_pop($map)) !== null)
				{			
					include_once($file);
				}
			}
			else
			{
				include_once($map);
			}			
		}
		else
		{
			return false;
		}
	}
}

class iotaCannotFindModuleException extends iotaException{}
class iotaCannotAutoloadClassException extends iotaException{}

?>
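
The ordering rule for nested loads, described in the setClassMap() comment, is easier to see in isolation.  This is a simplified sketch rather than the loader itself: recordClasses() is a hypothetical stand-in for setClassMap(), and the file and class names are made up.

```php
<?php
// Same bookkeeping rule as setClassMap(): a second file recorded for a known
// class turns the entry into a list of files.
function recordClasses(array &$map, $file, array $classes)
{
    foreach ($classes as $class) {
        $key = strtolower($class);
        if (!array_key_exists($key, $map)) {
            $map[$key] = $file;
            continue;
        }
        $map[$key] = (array) $map[$key];
        $map[$key][] = $file;
    }
}

$nestedMap = array();

// Hypothetical scenario: lib/outer.php calls load() on lib/inner.php, which
// declares Widget; outer.php itself declares Gadget.  The inner load() returns
// first, and the outer before/after comparison then sees both classes.
recordClasses($nestedMap, 'lib/inner.php', array('Widget'));
recordClasses($nestedMap, 'lib/outer.php', array('Widget', 'Gadget'));

// Same replay as autoload(): pop from the end, so the outermost file comes first.
$entry = (array) $nestedMap['widget'];
$order = array();
while (($file = array_pop($entry)) !== null) {
    $order[] = $file;
}
```

Here Widget ends up mapped to both files and Gadget to lib/outer.php alone; replaying the Widget entry includes lib/outer.php before lib/inner.php, which is the order the original training run started them in.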

PHP Performance Boosting: memcached sessions

Last night I spent some time trying to improve the speed of my project management system and PHP in general.  My first move was to turn on profiling in PHP’s xdebug extension to determine what was taking the time.  One obvious candidate for improvement was session_start() which was taking about 20% of the time.

Improving session_start() performance is not as obvious as introducing caching on a frequently-used database call.  First I tried introducing more aggressive garbage collection for stale sessions.  This increased performance in some cases, but not to the point where I was happy to stop.  By default, PHP sessions are stored as files on disk, with all their associated overhead and latency.  For performance, it would be better to keep sessions (especially frequently used ones, like those used by my AJAX framework) in memory where they can be accessed with only minimal delay.

Rather wonderfully, there’s a PHP extension in PECL that adds memcached support to PHP (memcached is, as the name suggests, a memory caching server).  In addition to allowing object-oriented access to memcached within PHP scripts, it adds a new session save handler that stores sessions in memcached rather than on disk. (OpenSuSE has a package for it in the PHP Extensions repository.)

Once memcached and the PHP extension are installed, it’s just a case of altering the session save handler in php.ini:

session.save_handler = memcache
session.save_path = "tcp://127.0.0.1:11211"
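
Where php.ini can’t be edited, the same two settings can probably be applied at runtime instead.  This is a sketch only: useMemcacheSessions() is a hypothetical helper, and it assumes the PECL memcache extension and a memcached server on the address shown.

```php
<?php
// Hypothetical helper: switch sessions to memcache at runtime.  Returns false
// (leaving the default handler alone) when the memcache extension is missing.
function useMemcacheSessions($savePath = 'tcp://127.0.0.1:11211')
{
    if (!extension_loaded('memcache')) {
        return false;
    }
    ini_set('session.save_handler', 'memcache');
    ini_set('session.save_path', $savePath);
    return true;
}

// Must run before session_start().
$usingMemcache = useMemcacheSessions();
```

Falling back silently to file-based sessions keeps the application working on a machine without the extension, at the cost of the original disk overhead.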

I also installed a PHP memcached monitor script, phpMemcachedAdmin, which displays detailed memcached usage information.