Updates from June, 2009

  • Getting more out of PHP

    Andi 16:35 on June 27, 2009

    In this post I will demonstrate some practices that improve the performance and reduce the memory footprint of a PHP script. They are especially useful for a task where PHP is not the traditional choice: transforming large amounts of data.

    PHP can handle this too: it has become considerably faster in recent years and has had a command line interface since 4.3.0. There is another argument for transforming data with PHP: you can reuse your application’s existing PHP interface. If that interface performs well, it is probably not necessary to rewrite parts of it in another language just for the bulk operations.

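    1. PHP arrays have value semantics: an assignment gives you a copy, which PHP duplicates in memory as soon as one side is modified (copy-on-write). If an array is going to be modified, pass and assign it explicitly by reference so that it is not duplicated:
      class RowHolder {
        public $rows;
        // Take the parameter by reference; otherwise $this->rows would
        // only reference the constructor's local copy.
        function __construct(array &$rows) {
          $this->rows = &$rows;
        }
      }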
      
      $manyRows = $Stmt->fetchAll();
      $RowHolder = new RowHolder($manyRows);

      The same goes for iteration:

      foreach ($RowHolder->rows as &$row) {
        ...
      }
      unset($row); // break the reference left behind by the loop
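
      A rough way to see the difference is to compare memory use when a function writes to an array it received by value and to one it received by reference. This is only a sketch: touchByValue() and touchByReference() are hypothetical helpers, and the exact byte counts depend on the PHP version.

      ```php
      <?php
      // Hypothetical helpers, for illustration only.
      function touchByValue(array $rows) {
        $rows[0] = 'changed';        // first write forces a private copy
        return memory_get_usage();   // measured while the copy is alive
      }

      function touchByReference(array &$rows) {
        $rows[0] = 'changed';        // writes into the caller's array
        return memory_get_usage();
      }

      $rows = range(1, 100000);
      $before = memory_get_usage();

      $byValue = touchByValue($rows) - $before;      // roughly the size of one copy
      $byRef   = touchByReference($rows) - $before;  // close to zero

      printf("extra bytes by value:     %d\n", $byValue);
      printf("extra bytes by reference: %d\n", $byRef);
      ```

      After the by-value call the caller's array is untouched; after the by-reference call its first element has changed.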
    2. Use prepared statements. Besides being more efficient, they prevent SQL injection:
      $Stmt = $PDOConnection->prepare(
        'SELECT * FROM table WHERE id = ?'
      );
      
      foreach ($ids as $id) {
        // execute() returns a boolean, so fetch() must be called on $Stmt
        $Stmt->execute(array($id));
        $row = $Stmt->fetch();
        ...
        $Stmt->closeCursor();
      }
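
      For a version you can run as it stands, here is a minimal sketch against an in-memory SQLite database. It assumes the pdo_sqlite extension is available; the table items and its contents are made up for the example.

      ```php
      <?php
      // Assumption: pdo_sqlite is installed. Table and data are hypothetical.
      $pdo = new PDO('sqlite::memory:');
      $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
      $pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
      $pdo->exec("INSERT INTO items (id, name) VALUES (1, 'one')");
      $pdo->exec("INSERT INTO items (id, name) VALUES (2, 'two')");
      $pdo->exec("INSERT INTO items (id, name) VALUES (3, 'three')");

      // Prepare once, execute many times with different parameters.
      $stmt = $pdo->prepare('SELECT name FROM items WHERE id = ?');

      $names = array();
      foreach (array(1, 3) as $id) {
        $stmt->execute(array($id));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        $names[] = $row['name'];
        $stmt->closeCursor();
      }

      print_r($names);
      ```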
    3. Check whether you really need a class to represent a simple data structure that is used in vast numbers. Memory usage is higher and you get longer serialize() strings. Imagine a country border that is represented as an array of geo points. An array of Point objects needs more bytes than a plain array of arrays, like this:
      $Border = new Geometry(
        Geometry::LinearRing,
        array(
          array(53.0749616,87.867913),
          array(53.0719262,87.840664),
          array(53.0706606,87.819640),
          ...
          array(53.0749616,87.867913)
        )
      );

      I came across this when I created a framework for OpenGIS Simple Features. The standard defines a class hierarchy of geometric features. Starting the conventional way, I simply transferred it into a big class tree. But then I switched to an easier way, because all classes share an array of elements which form a geometry. I define only the parent class, Geometry. It is a container for a multi-dimensional array instead of a complex object tree.
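
      The serialize() part of this is easy to check yourself. The sketch below uses a hypothetical Point class and only the coordinates shown above. It compares serialized length only; how much memory objects and arrays need at runtime varies between PHP versions, so measure that on your own setup.

      ```php
      <?php
      // Hypothetical Point class, for comparison only.
      class Point {
        public $lat;
        public $lng;
        function __construct($lat, $lng) {
          $this->lat = $lat;
          $this->lng = $lng;
        }
      }

      $asArrays = array(
        array(53.0749616, 87.867913),
        array(53.0719262, 87.840664),
        array(53.0706606, 87.819640),
        array(53.0749616, 87.867913)
      );

      // The same border as an array of Point objects
      $asObjects = array();
      foreach ($asArrays as $pair) {
        $asObjects[] = new Point($pair[0], $pair[1]);
      }

      $lenArrays  = strlen(serialize($asArrays));
      $lenObjects = strlen(serialize($asObjects));

      printf("arrays:  %d bytes serialized\n", $lenArrays);
      printf("objects: %d bytes serialized\n", $lenObjects);
      ```

      Each serialized object repeats its class name and property names, which is where the extra length comes from.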

    4. Memory management is still PHP’s big issue. Objects with circular references in particular create memory leaks. While rendering a website this is not a problem, because the number of objects is rather small and memory is cleaned up after script execution. But looping through millions of objects makes memory use grow until the script stops (sometimes even without a message). There is no clear way to prevent the leaks. Raising memory_limit will not help a long-running script to complete. I don’t suggest trying to fix the memory leaks completely; with complex structures you will not succeed. Writing a job script in PHP is not elegant, but there is a way to do it.

      I suggest the following solution: Let the script run on the command line until it reaches a memory limit you have defined. If the job is not finished by then, the script saves its state to a temporary file and exits. On the next execution it reloads the state and continues from there. A shell script calls the job script repeatedly. It looks like this:

      #!/bin/sh
      # Re-run the job as long as it exits with status 2
      # (state saved, job not finished yet)
      e=2
      while [ "$e" -eq 2 ]
        do php jobScript.php "$1"
        e=$?
      done

      And this is a scheme I use for jobScript.php:

      <?php
      ini_set('memory_limit', '500M');
      
      /*
      This one is for checking and should be
      significantly lower than the 'memory_limit' ini setting
      */
      $memory_limit = 100 * 1000 * 1000;
      
      /*
      Create State with names of variables,
      which are important for current execution state.
      Example:
      object identifier of current data row
      */
      $State = new State(array(
        'stateVar1',
        'stateVar2',
        ...
      ), $dataset);
      
      // Try to load variables into script
      if (!$State->load()) {
      
        // On initial execution: init state variables
        $stateVar1 = ...;
        $stateVar2 = ...;
        // On repeated execution: $State->load() does the trick
      }
      
      // This is the loop where your data is transformed
      do {
      
        /*
        The job:
        - fetch a chunk of data from the db, depending on the state vars ...
        - compute heavy objects (and create memory leaks)
        - write data to the db
        - change state vars
        */
      
        /*
        Calling unset() or an explicit __destruct()
        here did not prevent the memory leaks
        */
      
        if (memory_get_usage(true) > $memory_limit) {
      
          /*
          If memory usage has risen too high,
          save current script state and exit
          */
          $State->save();
          exit(2);
        }
      } while ($data_exists);
      
      /*
      This time the script finishes - now delete
      the script's state and exit with success
      */
      $State->delete();
      exit(0);

      Pretty, huh? It has worked for me many times. This is the State class for download.

      Don’t try this at home if your job finishes quickly or your script does not create memory leaks ;)
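
      If you just want to try the scheme, here is a hypothetical minimal stand-in with the same four calls used in jobScript.php (constructor, load(), save(), delete()). It is not the downloadable State class. The sketch rests on two assumptions of mine: the state variables live in the global scope, and the state is serialized to a file in the system temp directory whose name is derived from $dataset.

      ```php
      <?php
      // Hypothetical stand-in, NOT the downloadable State class.
      class State {
        private $names;
        private $file;

        public function __construct(array $names, $dataset) {
          $this->names = $names;
          // Assumption: one state file per dataset in the temp directory
          $this->file = sys_get_temp_dir() . '/jobstate_'
            . md5((string) $dataset) . '.ser';
        }

        // Restore saved variables into the global scope.
        // Returns false if no usable state file exists.
        public function load() {
          if (!is_file($this->file)) {
            return false;
          }
          $saved = unserialize(file_get_contents($this->file));
          if (!is_array($saved)) {
            return false;
          }
          foreach ($this->names as $name) {
            if (array_key_exists($name, $saved)) {
              $GLOBALS[$name] = $saved[$name];
            }
          }
          return true;
        }

        // Write the current values of the named globals to the state file.
        public function save() {
          $values = array();
          foreach ($this->names as $name) {
            $values[$name] = isset($GLOBALS[$name]) ? $GLOBALS[$name] : null;
          }
          file_put_contents($this->file, serialize($values));
        }

        // Remove the state file once the job has finished.
        public function delete() {
          if (is_file($this->file)) {
            unlink($this->file);
          }
        }
      }
      ```

      Keep the state variables small (an id, an offset). Serializing heavy objects here would defeat the purpose.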

    Photos are by photocase users complize and mathias the dread

     
  • Adapting a Polyline Encoder to PHP

    Andi 10:22 on March 16, 2008

    For a project I need encoded geographical polylines in Google Maps. A geographical polyline can be used to show a route or an area on a map. A polyline is made of several geographical points. Each point is composed of a latitude and a longitude. Example of a polyline:

    [
    
      // point
      {
        // latitude
        Latitude: 49.75121628642191,
    
        // longitude
        Longitude: 6.6281890869140625
      },
    
      {
        Latitude: 49.76252796566851,
        Longitude: 6.633853912353516
      },
    
      {
        Latitude: 49.757537844205025,
        Longitude: 6.649990081787109
      },
    
      {
        Latitude: 49.749441665946,
        Longitude: 6.642951965332031
      }
    ]
    

    If you load a large number of points from your server, this can take a while. A better solution is to send them in Google Maps’ compact encoded polyline format. In a forum I found only a dead link to a PHP implementation of the encoding algorithm ;(, so I took one implemented in JS and adapted it to PHP. You must take care of one thing: PHP’s chr() is not a drop-in replacement for the JavaScript function String.fromCharCode(), because chr() only produces single bytes and does not support Unicode. On php.net I found unichar(), which handles Unicode. The client side encoder can be tested here. Download the ZIP, which includes a PHP class and the whole thing in JavaScript: polyline.zip
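
    To give an idea of what the encoder does, here is a minimal sketch of the basic point encoding as Google documents it: round each coordinate to five decimals, store the difference to the previous point, and write it in 5-bit chunks offset by 63. It is not the class from polyline.zip, and the function names are hypothetical. In this sketch every emitted character code lies between 63 and 126, so plain chr() is enough here.

    ```php
    <?php
    // Hypothetical helper: encode one signed integer delta.
    function encodeSignedNumber($num) {
      $value = $num << 1;          // make room for the sign bit
      if ($num < 0) {
        $value = ~$value;          // negative numbers are inverted
      }
      $out = '';
      while ($value >= 0x20) {
        // 5 bits per character; 0x20 marks "more chunks follow"
        $out .= chr((0x20 | ($value & 0x1f)) + 63);
        $value >>= 5;
      }
      return $out . chr($value + 63);
    }

    // Hypothetical helper: $points is an array of array(lat, lng) pairs.
    function encodePolyline(array $points) {
      $prevLat = 0;
      $prevLng = 0;
      $out = '';
      foreach ($points as $point) {
        $lat = (int) round($point[0] * 1e5);
        $lng = (int) round($point[1] * 1e5);
        // Only the difference to the previous point is stored
        $out .= encodeSignedNumber($lat - $prevLat);
        $out .= encodeSignedNumber($lng - $prevLng);
        $prevLat = $lat;
        $prevLng = $lng;
      }
      return $out;
    }

    echo encodePolyline(array(
      array(38.5, -120.2),
      array(40.7, -120.95),
      array(43.252, -126.453)
    )), "\n";
    ```

    The three points are the example from Google’s format description and should encode to _p~iF~ps|U_ulLnnqC_mqNvxq`@.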

     
  • Simple PHP debugging

    Andi 02:46 on March 6, 2008

    If you debug your PHP code, you often get too many messages if you just use echo and print_r() or var_dump(). In the last few weeks I have implemented some useful techniques to prevent bunches of debugging messages on my screen:

    1. If you debug a template and want to see the true result without any messages, but you don’t want to remove them or comment them out, define a constant that tells your script whether it is in debug mode, and define your own function for debugging. This function checks whether the constant is true and only then prints your message.
    2. You can make it more comfortable by letting your debug function accept any number of arguments of different types. They are concatenated and printed in a meaningful way: a false won’t be a blank string, and an array is printed via print_r() or var_dump().
    3. Arrange your messages neatly by printing them at the end of your script, or use the killer feature: send them to your Firebug console. Now your page is clean and you get the full featuritis of Firebug, which shows your objects and recursive structures in a much more developer friendly way ;) When you test your site in other browsers, you can change the output mode to HTML: your messages are collected in a box, and that’s it.

    E.g.:

    define('debug', 1);
    define('debugMode', 'js'); // 'html' or 'js'
    ...
    function myMethod($arg1, $arg2) {
      $_ = func_get_args();
      debug(__method__, $_);
    
      ... code ...
    
      debug("\twhat I wanna say", array('that' => 'sucks'));
    }
    
    // print all messages at once
    debugChain();
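
    If you would rather see the idea than download the package, here is a hypothetical minimal version of debug() and debugChain(). It is not the code from the package: it implements only the 'html' mode, it keeps the messages in a small helper class of my own (DebugLog), and debugChain() returns the box as a string instead of printing it.

    ```php
    <?php
    define('debug', 1);            // set to 0 to silence all messages
    define('debugMode', 'html');   // this sketch only knows 'html'

    // Hypothetical container for the collected messages
    class DebugLog {
      public static $messages = array();
    }

    // Turn any value into readable text
    function debugFormat($value) {
      if (is_bool($value)) {
        return $value ? 'true' : 'false';   // a false is no blank string
      }
      if ($value === null) {
        return 'null';
      }
      if (is_array($value) || is_object($value)) {
        return print_r($value, true);
      }
      return (string) $value;
    }

    // Accepts any number of arguments and stores them as one message
    function debug() {
      if (!constant('debug')) {
        return;
      }
      $parts = array_map('debugFormat', func_get_args());
      DebugLog::$messages[] = implode(' ', $parts);
    }

    // Returns all collected messages in one HTML box and clears them
    function debugChain() {
      if (!constant('debug') || !DebugLog::$messages) {
        return '';
      }
      $text = htmlspecialchars(implode("\n", DebugLog::$messages));
      DebugLog::$messages = array();
      return '<pre class="debug">' . $text . '</pre>';
    }

    debug('lookup result:', false, array('that' => 'sucks'));
    echo debugChain();
    ```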

    I don’t want you to do all the work again: download the package, rename the file extension to “.php”, and use it: debugPackage0.1._php

     