PHP iterators in the wild
26 January 2020
Although they were introduced way back in PHP 5, iterators are one of the language's less commonly used features. Almost all articles about PHP iterators seem to resort to one of two fairly contrived examples: reading a file line-by-line, or creating a bespoke range function. In this post I want to take a look at some examples of their use in open source applications, with the hope that this approach will demonstrate how they can solve real-world problems.
In lieu of an introduction, I will point you to the most helpful explanation of iterators I've found, Anthony Ferrara's video "Iterators". He shows how a basic for loop in PHP is analogous to the methods implementable in PHP's Iterator interface, such that
for ($i = 0; $i < count($array); $i++) {
corresponds to
for ($it->rewind(); $it->valid(); $it->next()) {
$key = $it->key();
(where $it->key() in the second example maps to $i in the first example). (These operations, rewind, valid, next, key and (not mentioned above) current, correspond to the operations defined in the Iterator pattern as described by the Gang of Four.)
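To make the mapping concrete, here is a minimal sketch of a class implementing the Iterator interface. ListIterator is a hypothetical name used only for illustration, and the code assumes PHP 8.0 or later (constructor promotion and the mixed return type). Each method is annotated with the part of the plain for loop it stands in for.

```php
<?php

// Hypothetical class for illustration: wraps a list and exposes it
// through the five methods of PHP's Iterator interface.
final class ListIterator implements Iterator
{
    private int $position = 0;

    /** @param list<mixed> $items */
    public function __construct(private array $items)
    {
    }

    public function rewind(): void
    {
        $this->position = 0;                            // like $i = 0
    }

    public function valid(): bool
    {
        return $this->position < count($this->items);   // like $i < count($array)
    }

    public function next(): void
    {
        $this->position++;                              // like $i++
    }

    public function key(): mixed
    {
        return $this->position;                         // like $i
    }

    public function current(): mixed
    {
        return $this->items[$this->position];           // like $array[$i]
    }
}

$it = new ListIterator(['a', 'b', 'c']);

// Driving the iterator by hand.
$explicit = [];
for ($it->rewind(); $it->valid(); $it->next()) {
    $explicit[$it->key()] = $it->current();
}

// foreach calls the same five methods behind the scenes.
$viaForeach = [];
foreach ($it as $key => $value) {
    $viaForeach[$key] = $value;
}
```

Both loops should collect the same keys and values, which is the point: foreach over an object is just sugar for the rewind/valid/current/key/next cycle.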
Example 1: Flysystem
As I said, file operations are a favourite example when introducing iterators. One concrete example is the PHP League's Flysystem package. Its listContents method takes a path string and a boolean to specify whether or not the contents should be listed recursively. If we take a look at the code for this method on GitHub, we'll find that it instantiates an iterator (either a DirectoryIterator or a RecursiveDirectoryIterator depending on the boolean), which then makes it very easy to traverse the contents of the directory with a simple foreach loop. Far simpler than messing about with opendir, readdir, etc. (Note the package also makes use of FilesystemIterator, which allows you to skip . and .. when traversing a directory tree.)
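The idea can be sketched roughly as follows. This is not Flysystem's code: listPaths is a hypothetical helper, and the temporary directory exists only so the example runs on its own. It picks a flat or recursive iterator based on a boolean and uses FilesystemIterator::SKIP_DOTS to leave out . and .. entries.

```php
<?php

// Hypothetical helper (not Flysystem's implementation): list the paths
// under a directory, optionally descending into subdirectories.
function listPaths(string $directory, bool $recursive = false): array
{
    $flags = FilesystemIterator::SKIP_DOTS;

    $iterator = $recursive
        // RecursiveIteratorIterator flattens the tree; SELF_FIRST also
        // yields each directory before its children.
        ? new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($directory, $flags),
            RecursiveIteratorIterator::SELF_FIRST
        )
        : new FilesystemIterator($directory, $flags);

    $paths = [];
    foreach ($iterator as $file) {
        $paths[] = $file->getPathname();
    }
    sort($paths); // directory order is not guaranteed

    return $paths;
}

// Throwaway directory tree so the sketch is self-contained.
$root = sys_get_temp_dir() . '/iterator-demo-' . uniqid();
mkdir($root . '/sub', 0777, true);
touch($root . '/a.txt');
touch($root . '/sub/b.txt');

$flat = listPaths($root);        // a.txt and sub
$deep = listPaths($root, true);  // a.txt, sub and sub/b.txt
```

Either way the calling code is the same foreach loop; only the iterator handed to it changes.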
Example 2: PHPUnit
Another useful iterator is the FilterIterator. By creating a class which extends this iterator, you can easily filter the result of iterators by wrapping them with your new filter iterator. An example of this in the wild is in PHPUnit. You've probably run phpunit --filter, phpunit --group, or phpunit --exclude-group many times. If we look in PHPUnit's GitHub repo we find a method processSuiteFilters. This method makes use of a few classes in the PHPUnit\Runner\Filter namespace, specifically a factory, then a number of iterators which extend FilterIterator (or rather RecursiveFilterIterator, which itself extends FilterIterator). All filter iterators require an accept method. For example, the accept method of the IncludeGroupFilterIterator checks whether the current test (in the traversal) is in the array of groups permitted by the user's filter.
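A minimal sketch of the same pattern, assuming PHP 8.0 or later. GroupFilterIterator and the array shape used for each test are invented for illustration; PHPUnit's own filter classes work on its test suite objects and look different.

```php
<?php

// Hypothetical filter for illustration: keeps only the items that
// belong to at least one of the allowed groups.
final class GroupFilterIterator extends FilterIterator
{
    /** @param list<string> $allowedGroups */
    public function __construct(Iterator $tests, private array $allowedGroups)
    {
        parent::__construct($tests);
    }

    // Called once per element; return true to keep it, false to skip it.
    public function accept(): bool
    {
        $test = $this->getInnerIterator()->current();

        return array_intersect($test['groups'], $this->allowedGroups) !== [];
    }
}

// Made-up test metadata standing in for a real test suite.
$tests = new ArrayIterator([
    ['name' => 'testLogin',  'groups' => ['auth']],
    ['name' => 'testExport', 'groups' => ['slow', 'reports']],
    ['name' => 'testLogout', 'groups' => ['auth', 'smoke']],
]);

$names = [];
foreach (new GroupFilterIterator($tests, ['auth']) as $test) {
    $names[] = $test['name'];
}
// $names is ['testLogin', 'testLogout']
```

Because the filter is itself an iterator, it can be wrapped in further filters without the consuming loop changing at all.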
Example 3: Symfony Finder
In a blog post from 2010, Symfony creator Fabien Potencier said that iterators were "largely underused", and described using them when rewriting the Finder component for Symfony 2.
The Finder component combines the approaches outlined in the previous two examples: the DirectoryIterator and the FilterIterator. Here Potencier implements the IteratorAggregate interface, which requires just one method, getIterator, which returns an external iterator. In the case of Finder, it returns PHP's AppendIterator. So, to adapt the example from the manual:
use Symfony\Component\Finder\Finder;

$finder = new Finder();
$finder->files()->in(__DIR__)->in('/home');
foreach ($finder as $file) {
    // do stuff
}
Here we want the finder instance to focus only on files (files()) and to look in the current and /home directories. Each repeated in call just adds its argument to the class's dirs property. When the iterator is triggered by the foreach statement, then for every directory in the dirs array, it does two things. First, it calls the searchInDirectory method, which basically configures the iterator. Like Flysystem, it uses the RecursiveDirectoryIterator and adds some bespoke Symfony filters depending on Finder's configuration (these filters extend PHP's FilterIterator). Second, having obtained this iterator, it adds it to the AppendIterator it instantiated earlier. Finally it appends any more iterators that have been explicitly provided by the user, and returns the AppendIterator. So here we have an example of PHP's various iterator classes providing a clean way to compose an extensible, flexible API for traversing a file system.
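The composition described above might be sketched like this. PathCollection is a hypothetical class loosely modelled on that description and is not Symfony's implementation; the method names in, append and searchInDirectory simply echo the prose, the file filter is a plain CallbackFilterIterator rather than one of Symfony's bespoke filters, and PHP 8.0 or later is assumed.

```php
<?php

// Hypothetical class for illustration, not Symfony's Finder.
final class PathCollection implements IteratorAggregate
{
    /** @var list<string> */
    private array $dirs = [];

    /** @var list<Iterator> */
    private array $extra = [];

    public function in(string $dir): static
    {
        $this->dirs[] = $dir;   // just remember the directory for later

        return $this;
    }

    public function append(Iterator $iterator): static
    {
        $this->extra[] = $iterator;

        return $this;
    }

    // Called by foreach: chain one iterator per directory, then the extras.
    public function getIterator(): Iterator
    {
        $all = new AppendIterator();
        foreach ($this->dirs as $dir) {
            $all->append($this->searchInDirectory($dir));
        }
        foreach ($this->extra as $iterator) {
            $all->append($iterator);
        }

        return $all;
    }

    private function searchInDirectory(string $dir): Iterator
    {
        $entries = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
        );

        // Keep regular files only (empty directories would otherwise appear).
        return new CallbackFilterIterator(
            $entries,
            static fn (SplFileInfo $file): bool => $file->isFile()
        );
    }
}

// Throwaway directory tree so the sketch is self-contained.
$base = sys_get_temp_dir() . '/finder-demo-' . uniqid();
mkdir($base . '/one/nested', 0777, true);
mkdir($base . '/two', 0777, true);
touch($base . '/one/a.txt');
touch($base . '/one/nested/b.txt');
touch($base . '/two/c.txt');
touch($base . '/d.txt');

$collection = (new PathCollection())
    ->in($base . '/one')
    ->in($base . '/two')
    ->append(new ArrayIterator([new SplFileInfo($base . '/d.txt')]));

$names = [];
foreach ($collection as $file) {
    $names[] = $file->getFilename();
}
sort($names); // directory order is not guaranteed
```

Nothing touches the file system until the foreach starts, and keys from the chained iterators can collide, so the loop collects values only.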
Example 4: CSV
Another use of iterators is as a memory-saving technique. When you are dealing with data sets of very large, or of unknown, size, processing the data iteratively means you do not suffer the performance drawbacks of holding the entire data set in memory. Basically, it can reduce the memory needed from O(n) to O(1), because only the current record has to be held at any one time.
As an example, let's look at another PHP League package, CSV, which makes heavy use of iterators. What happens when you run this code?
use League\Csv\Reader;

$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$records = $csv->getRecords();
First the static createFromPath method creates an instance of the Stream object, which in turn implements PHP's SeekableIterator. This is a simple extension to the common-or-garden iterator, allowing clients to specify the position of the cursor. This instance of the Stream object is set to the document property of the AbstractCsv class which Reader extends.
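To show what SeekableIterator adds, here is a small sketch using PHP's built-in ArrayIterator, which implements that interface. It stands in for the package's Stream class, which is not reproduced here; the row values are made up.

```php
<?php

// ArrayIterator implements SeekableIterator, so it gains seek() on top
// of the usual rewind/valid/current/key/next methods.
$rows = new ArrayIterator(['header', 'first', 'second', 'third']);

// Jump straight to position 2 instead of calling next() twice.
$rows->seek(2);
$current = $rows->current();
$key = $rows->key();

// Seeking past the end raises an OutOfBoundsException.
$outOfRange = false;
try {
    $rows->seek(10);
} catch (OutOfBoundsException $e) {
    $outOfRange = true;
}
```

A reader can use this kind of positioning to fetch a header row at a known offset without walking through every record before it.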
Next, getRecords gives the client this iterator after applying some cleaning to it. It starts by doing two things: first, it uses PHP's CallbackFilterIterator to normalize the data (remove any corrupt or empty rows), before using the package's own MapIterator to remove any BOMs. Next the getRecords method uses another CallbackFilterIterator to skip headers if required. Finally, it returns the return value of combineHeader, a method which takes the iterator produced so far in getRecords and if necessary adds a header to the records using another MapIterator. All of this means that at the end you have an iterator you can use to foreach over the records in a CSV file.
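A rough sketch of such a pipeline using only core PHP classes. This is not the package's code: the MapIterator below is a hypothetical stand-in for the package's class of the same name, the CSV content is invented, and BOM handling is left out. It assumes PHP 8.0 or later.

```php
<?php

// Hypothetical stand-in for the package's MapIterator: applies a
// callable to each value lazily, as the value is requested.
final class MapIterator extends IteratorIterator
{
    /** @var callable */
    private $callable;

    public function __construct(Traversable $iterator, callable $callable)
    {
        parent::__construct($iterator);
        $this->callable = $callable;
    }

    public function current(): mixed
    {
        return ($this->callable)(parent::current());
    }
}

// In-memory file with made-up content, including one blank line.
$file = new SplTempFileObject();
$file->fwrite("name,city\nAda,London\n\nLinus,Helsinki\n");
$file->setFlags(SplFileObject::READ_CSV);
$file->setCsvControl(',', '"', '\\');

// Read the first record to use as the header.
$file->rewind();
$header = $file->current();

// Step 1: drop blank lines and anything that did not parse to a row.
$rows = new CallbackFilterIterator(
    $file,
    static fn ($row): bool => is_array($row) && $row !== [null]
);

// Step 2: skip the header row, identified by its line offset.
$withoutHeader = new CallbackFilterIterator(
    $rows,
    static fn ($row, $offset): bool => $offset !== 0
);

// Step 3: turn each row into an associative array keyed by the header.
$records = new MapIterator(
    $withoutHeader,
    static fn (array $row): array => array_combine($header, $row)
);

$result = [];
foreach ($records as $record) {
    $result[] = $record;
}
```

Each wrapper pulls one row at a time from the one beneath it, so the whole file is never loaded at once; collecting into $result here is only for demonstration and would defeat the memory saving on a large file.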