Optimization is one of the most enjoyable parts of software design, but unfortunately it does not claim a high percentage of development time. Generally speaking, it is not a task to take on until the time spent is justified, which is often (but not always!) toward the end of the development cycle. Still, it is an important step, especially with a product like LiveWhale, which has to perform well under high traffic spikes. I’ve talked about general page caching before, but fine-tuning a PHP application for speed when something is not cached is also important. Here are some thoughts on how to do just that.
At the code level, LiveWhale is a framework, which means the same codebase is hit for many different types of requests. The question is then: how do you achieve high performance with a codebase that has to perform so many tasks and is therefore code-heavy? It makes sense to divide code across a handful of files. The objective here is to load libraries only when you need them. A typical request will use only a tiny percentage of the entire codebase, so there’s no need to read a great deal of code from the filesystem and eat up RAM on every PHP request. Also, with a modular system like LiveWhale, it is not known in advance which modules exist and will need to be loaded. An important optimization is one where only the first request to the server has to perform the logic that determines what to load. The results of this expensive operation are cached, and all subsequent LiveWhale requests enjoy dramatic savings in the module loader.
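As a rough illustration of that idea (a minimal sketch, not LiveWhale’s actual loader), the discovery step can write its result to a small PHP file, and an autoloader can then read a module’s file only when its class is first used. The function names `load_module_map()` and `register_lazy_loader()`, the one-class-per-file layout, and the cache file location are all assumptions made for this example.

```php
<?php
// Hypothetical sketch: names and file layout are illustrative only.

/**
 * Return a map of class name => file path. The module directory is scanned
 * only when no cached map exists yet (i.e. on the first request).
 */
function load_module_map(string $moduleDir, string $cacheFile): array
{
    if (is_file($cacheFile)) {
        // Cheap path: every later request just reads the cached map.
        return include $cacheFile;
    }

    // Expensive path: discover modules on disk, assuming one class per file
    // and that the file name matches the class name.
    $map = [];
    foreach (glob($moduleDir . '/*.php') ?: [] as $file) {
        $map[basename($file, '.php')] = $file;
    }

    // Persist the result as a PHP file that returns the array.
    file_put_contents($cacheFile, '<?php return ' . var_export($map, true) . ';', LOCK_EX);

    return $map;
}

/**
 * Register an autoloader so that a module file is read from disk only when
 * its class is actually referenced.
 */
function register_lazy_loader(array $map): void
{
    spl_autoload_register(function (string $class) use ($map): void {
        if (isset($map[$class])) {
            require $map[$class];
        }
    });
}
```

A request that never touches a given module never pays for reading or compiling its file. The cached map does have to be deleted or rebuilt whenever modules are added or removed.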
A common problem with application frameworks is that they perform two very expensive tasks on every single page load: 1) initializing a session and 2) connecting to a database. Neither should be assumed necessary. In LiveWhale, a session is started only when a component requires it. A database connection is opened only if and when the first query is issued. Making your application sufficiently “smart” about if and when these initializations take place will also lead to dramatic speed improvements.
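One way to get this behavior is to wrap the expensive resource in an object that only remembers how to create it. The sketch below is an assumption-laden illustration rather than LiveWhale’s code: `LazyResource` and `ensure_session()` are hypothetical names, and the factory can be any callable that builds the real connection.

```php
<?php
// Hypothetical sketch: LazyResource and ensure_session() are illustrative names.

final class LazyResource
{
    /** @var callable Builds the real resource when it is first needed. */
    private $factory;

    /** @var mixed|null The resource itself, once it has been created. */
    private $instance = null;

    public function __construct(callable $factory)
    {
        // Only remember how to create the resource; do not create it yet.
        $this->factory = $factory;
    }

    public function isInitialized(): bool
    {
        return $this->instance !== null;
    }

    public function get()
    {
        if ($this->instance === null) {
            // The first caller pays the setup cost; later callers reuse the result.
            $this->instance = ($this->factory)();
        }
        return $this->instance;
    }
}

/**
 * Start a session only if one is not already active, so components that
 * need a session can call this and all other requests skip it entirely.
 */
function ensure_session(): void
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();
    }
}
```

A database handle could then be declared up front as `new LazyResource(function () use ($dsn) { return new PDO($dsn); })`, with the connection made only when some code path calls `get()` to run its first query.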
Less is more. That means you should optimize out any hit to the file system that you can, and remove extraneous database queries. Also, be wary of OO (object-oriented) slowdowns. If something doesn’t need to be a class, don’t use one. Remember that calls to an object’s method are more expensive than global function calls. The same holds true for accessing an object’s properties. An obvious optimization applies when dealing with data from a remote server. Such content should be cached as much as possible, so that rapid hits to your site don’t rely on the performance (or availability) of an external data source. Another tip is to use error suppression (@) sparingly, since it is a known performance hit. There are a number of similar, simple tips available via a quick Google search for “php optimizations”, and there are slideshows available from PHP developers on the subject.
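For the remote-data case, a small file cache with a time-to-live covers both concerns: speed while the copy is fresh, and availability when the source is down. This is a minimal sketch under stated assumptions: `cached_remote()` is a hypothetical helper, the `$fetch` callable stands in for whatever HTTP client you use and is expected to return a string or `null` on failure, and the cache directory must already exist.

```php
<?php
// Hypothetical sketch: cached_remote() and its parameters are illustrative.

/**
 * Return remote content, preferring a local copy that is younger than $ttl
 * seconds. If the remote fetch fails, fall back to a stale copy when one exists.
 */
function cached_remote(string $url, int $ttl, callable $fetch, string $cacheDir): ?string
{
    $file = $cacheDir . '/' . md5($url) . '.cache';

    // Avoid acting on stale stat information within a long-running process.
    clearstatcache(true, $file);
    $hasCopy = is_file($file);

    if ($hasCopy && (time() - filemtime($file)) < $ttl) {
        // Fresh enough: no network traffic at all.
        return file_get_contents($file);
    }

    // Copy is missing or expired: ask the remote source.
    $body = $fetch($url);
    if ($body !== null) {
        file_put_contents($file, $body, LOCK_EX);
        return $body;
    }

    // Remote source is unavailable: serve the stale copy rather than nothing.
    return $hasCopy ? file_get_contents($file) : null;
}
```

Serving a stale copy on failure is a design choice. It suits content like feeds, where slightly old data is better than an error, and it is the wrong choice for anything that must be current.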
Analyze how your code operates. If you have a function that uses a lot of logic (if/then, switch), make sure you understand what the “common case” is. In other words, what’s the most likely condition that function will come across? Then determine whether the function’s logic reaches the fastest possible result under that most common case. For example, I have a function that has to intercept a variety of XML structures and handle them in different ways. One particular XML structure (the simplest possible one) is far and away the most common case. Rather than parsing the XML with simplexml_load_string(), as I do with every other chunk of XML I receive, I first do a preg_match() to determine whether I have the common case. If so, I can immediately return a simple result and skip both the parsing of the XML and the lengthy logic that goes with operating on it.
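The shape of that fast path looks roughly like the sketch below. The `<result>` element, the `handle_xml()` name, and the returned array layout are invented for illustration; the real structures are not shown in this post.

```php
<?php
// Hypothetical sketch: the XML shape and return format are illustrative.

function handle_xml(string $xml): array
{
    // Fast path: the simplest (and most common) structure is recognized with
    // a single regular expression. Entities ("&") are excluded so the raw text
    // is exactly what a real parser would have returned.
    if (preg_match('~^\s*<result>([^<&]*)</result>\s*$~', $xml, $m)) {
        return ['type' => 'simple', 'value' => $m[1]];
    }

    // Slow path: anything else gets a full parse.
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new InvalidArgumentException('Malformed XML');
    }

    // Stand-in for the lengthy per-structure logic.
    $items = [];
    foreach ($doc->children() as $child) {
        $items[$child->getName()] = (string) $child;
    }

    return ['type' => 'complex', 'root' => $doc->getName(), 'items' => $items];
}
```

The pattern has to be strict enough that it never matches something the parser would treat differently. When in doubt, let the input fall through to the slow path.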
Identify “hot” functions. Functions that are called many times, often within a loop, should get the most attention in terms of optimization. First, make sure the function needs to be called at all, as opposed to using inline code; otherwise you incur function-call overhead needlessly. Second, move any redundant logic outside of the hot function. Perhaps a check can be performed just once before the hot code is executed. There are also situations where a function can be called repeatedly with the same input and end up with the same result. It might make sense to declare a static array inside the function to locally cache the results it generates. On a subsequent call, the function can first check the static array to see if a result already exists, and simply return it if so. This sort of fine-grained optimization will only help when there is a significant bottleneck in a particular function, but it can make a huge difference if done wisely.
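Here is a minimal sketch of the static-cache technique. Both function names are hypothetical, and `compute_slug()` is only a stand-in for expensive work; it counts its own invocations in a global so the effect of the cache can be observed.

```php
<?php
// Hypothetical sketch: compute_slug() stands in for any expensive operation.

function compute_slug(string $title): string
{
    // Count invocations so the saving from caching is visible.
    $GLOBALS['compute_slug_calls'] = ($GLOBALS['compute_slug_calls'] ?? 0) + 1;

    // Lowercase, collapse runs of non-alphanumerics to "-", trim stray dashes.
    return trim(preg_replace('~[^a-z0-9]+~', '-', strtolower($title)), '-');
}

function slug_for(string $title): string
{
    // The static array survives between calls within the same request.
    static $cache = [];

    if (!isset($cache[$title])) {
        // First time this input is seen: do the work and remember the result.
        $cache[$title] = compute_slug($title);
    }

    return $cache[$title];
}

// A hot loop now pays for the expensive work once per distinct input.
foreach (['Hello, World!', 'Hello, World!', 'Second Post'] as $title) {
    slug_for($title);
}
```

This only works when the function’s result depends on its arguments alone. The cache also lives for the rest of the request, so a function called with many distinct inputs will trade memory for speed.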
Lastly, use XDebug! XDebug is your friend. Among the many things it can do, it will let you profile your PHP code and generate a report about how much time your script is taking overall and within particular functions. Here’s a mockup of the type of information it provides:
| Function | Line(s) | Calls | Cycles | Time (% of request) |
| --- | --- | --- | --- | --- |
| LiveWhale->foo | 128 | 1 | 1911 | 38.3% |
| LiveWhale->bar | 40 | 1 | 854 | 17.1% |
| do_something | 41 | 1 | 313 | 6.3% |
| do_something_else | 218/218/218/218/218/218/218/21 … | 15 | 252 (17) | 5.1% |
| do_this | 131 | 1 | 118 | 2.4% |
| do_that | 1 | 1 | 72 | 1.4% |
| fast | 886 | 1 | 58 | 1.2% |
| faster | 124 | 1 | 47 | 0.9% |
In this example, the function LiveWhale->foo() clearly takes the largest share of the time for this request. At 38.3% of request time for a single call, it is a good candidate for serious analysis. LiveWhale->bar() is also expensive compared to other operations, and there may be ways to make this function perform better too. The function do_something() takes 6.3% for a single call, on line 41. Another function, do_something_else(), takes nearly the same amount of time (5.1%), but only because it was called 15 times (at roughly 17 CPU cycles each, for 252 cycles total). Knowing why each function takes the time that it does, and why it might be called so many times, will help you maximize the performance of your application.
Familiarity with what makes PHP code perform well will lead to better decisions being made early on, so that XDebug reveals fewer inefficiencies later. In the long run, many of these tips become automatic and part of your coding style, but for the best-quality, production-ready code, a little bit of analysis and thought goes a long way.