php: tame arrays with map, filter and reduce

php's arrays are great. until they aren't. it's easy to let arrays get away on you and wind up writing complex, hard-to-test foreach loops to modify or filter your data. but there's a better way to manage our arrays.

php's array_map(), array_filter() and array_reduce() functions allow us to handle arrays in a clean, concise and powerful way by letting us use ideas from functional programming. if you are new to ideas like mapping, filtering and reducing, it may take some time to get used to thinking about array handling in this way, but the end result will be code that is more concise and stable.

the flyover

in this article we're going to do a quick survey of three of php's powerful array manipulation functions:

  • array_map()

  • array_filter()

  • array_reduce()

we'll be looking at some basic examples, but also more complex ones that are a little closer to the sort of useful things we might want our code to do. additionally, we'll go over several ways that we can define the callback functions these array functions require and look a bit at first-class functions.

a basic array_map

the basic purpose of array_map() is to take an array and apply a function to each of the elements in that array. the result of each of those function calls is then returned as an element of a new array. let's look at an example:

$testArray = [1,2,3];
$result = array_map(fn($element) => $element * 2, $testArray);

the purpose of this code is to take the array of integers in $testArray and double each of those numbers. array_map() then returns a new array containing those values, which we assign to $result. this is all done in one line, the call to array_map().

if we look at array_map(), we can see that it takes two arguments. the second argument is our $testArray; the array we want to use as input.

the first argument is the interesting part. this argument is an anonymous function, a 'callable' in fact. this function is what modifies the elements of our input array. the callable itself takes one argument, which is the current element of our input array.

if this is the first time you've seen array_map() or if you've never experimented with any functional programming ideas before, this may seem confusing. it may help, at the beginning, to think of this as just another way of writing a foreach loop. the second argument is the array we iterate over, and the body of the function is the set of commands we would put inside our foreach block.

when array_map() has completed running, the result will be a new array that contains our modified values. in this case [2, 4, 6].
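
to make the foreach comparison concrete, here is a small side-by-side sketch. the variable names $doubledWithLoop and $doubledWithMap are just illustrative; both approaches should produce the same array.

```php
<?php
$testArray = [1, 2, 3];

// the foreach version: we build the new array by hand
$doubledWithLoop = [];
foreach ($testArray as $element) {
    $doubledWithLoop[] = $element * 2;
}

// the array_map() version: the loop body becomes the callback
$doubledWithMap = array_map(fn($element) => $element * 2, $testArray);

var_dump($doubledWithLoop === $doubledWithMap); // bool(true)
```

note that the input array is left untouched in both cases; we always get a new array back.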

different ways to define callbacks

in the example above, the function we passed to array_map() was written as an arrow function. arrow functions are nice and terse, but they do have a major limitation: their bodies can only contain a single expression, with no room for extra statements. that's a drawback.

fortunately, we can create our callbacks a number of different ways. all of these techniques will work for array_map(), array_filter() and array_reduce():

callbacks as full functions

before arrow functions were introduced in php 7.4, all anonymous callbacks had to be defined using the function keyword and had to have an explicit return statement. this syntax still works and has the distinct advantage that we can put as many lines as we want in the function body:

$result = array_map(function($element) {
    // some other commands
    return $element * 2;
}, $testArray);

callbacks as variable functions

in php, functions can be assigned to variable names and treated like any other variable; they can be passed as arguments and even returned from functions.

here we assign our doubling function to the variable $double and then pass that variable as an argument to array_map():

$double = function($element) {
    return $element * 2;
};
$result = array_map($double, $testArray);

callbacks as object methods

we can also use methods in our objects as callbacks. this allows us to apply existing code to our array functions.

passing a method of an object as a callable to array_map() requires us to write the argument as an array. the first element of the array is the object, the second element is a string of the method name:

class MyClass {
    public function double($element) {
        return $element * 2;
    }
}
$myObject = new MyClass();
$result = array_map([$myObject, 'double'],$testArray);
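
the array form is not the only way to point at existing code. as a rough sketch, here are a few other callable forms php accepts. the Calculator class and its methods are made up for this example, and the last form (the first-class callable syntax) assumes php 8.1 or newer.

```php
<?php
// hypothetical class, used only for illustration
class Calculator {
    public function double($element) {
        return $element * 2;
    }
    public static function triple($element) {
        return $element * 3;
    }
}

$testArray = [1, 2, 3];
$calculator = new Calculator();

// a built-in function, named by a string
$upper = array_map('strtoupper', ['a', 'b']); // ['A', 'B']

// a static method, as an array or as a 'Class::method' string
$tripledA = array_map(['Calculator', 'triple'], $testArray); // [3, 6, 9]
$tripledB = array_map('Calculator::triple', $testArray);     // [3, 6, 9]

// first-class callable syntax (php 8.1+)
$doubled = array_map($calculator->double(...), $testArray);  // [2, 4, 6]
```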

callbacks with extra arguments

so far, all of the callable functions that we have passed to array_map() have only taken one argument: the element of the array. however, we will probably want to be able to add more arguments to our callable.

for instance, a callable that doubles our array elements is nice, but we would much prefer to have a more general 'multiplier' function. this requires us to be able to pass the value we want to multiply by as well as the element of the array to our callable.

doing this with arrow functions is easy. arrow functions inherit the scope where they are written. this means they can access variables that are declared outside of their body without them being passed as arguments. let's look:

$multiplier = 4;
$result = array_map(fn($element) => $element * $multiplier, $testArray);

in this example, we declared the variable $multiplier and then, in our arrow function, used it without having to pass it as an argument. this is powerful and convenient!

this doesn't work with full function definitions, however. for instance, if we try to access $multiplier inside an anonymous function defined with the word function, we get a warning and the wrong result.

$multiplier = 4;
$result = array_map(function($element) {
    return $element * $multiplier;
}, $testArray);
// PHP Warning:  Undefined variable $multiplier

if we run this code, php will raise a warning that $multiplier is undefined inside our callable's scope. the undefined variable is treated as null, so every product comes out as 0.

the correct way to pass extra arguments in this case is to use the use() construct:

$multiplier = 4;
$result = array_map(function($element) use($multiplier) {
    return $element * $multiplier;
}, $testArray);

the use() construct here tells php that we want to add the $multiplier variable to the scope of the function we are defining. we can now read $multiplier in our callable function's body.

functions assigned to variables can also employ use():

$multiplier = 4;
$multiply = function($element) use($multiplier){
    return $element * $multiplier;
};
$result = array_map($multiply, $testArray);
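
one detail worth knowing: both use() and arrow functions copy the outer variable's value at the moment the callback is created. a minimal sketch of that behaviour follows; $withUse and $withArrow are illustrative names.

```php
<?php
$testArray = [1, 2, 3];
$multiplier = 4;

$withUse = function($element) use($multiplier) {
    return $element * $multiplier;
};
$withArrow = fn($element) => $element * $multiplier;

// changing the outer variable afterwards does not affect either callback
$multiplier = 10;

$resultUse = array_map($withUse, $testArray);     // [4, 8, 12]
$resultArrow = array_map($withArrow, $testArray); // [4, 8, 12]
```

if we really do want the callback to see later changes, use(&$multiplier) captures the variable by reference instead.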

a more advanced array_map

so far, all we've done with array_map() is double and sometimes quadruple arrays of integers. let's take a look at something that at least vaguely approximates a useful implementation.

in this snippet, we're taking an array of 'items', each one an array itself, and using array_map() to figure out if we need to apply gst to the price. the returned value from array_map() will be a new array of items that includes the after-tax price.

$items = [
    [
        'name' => 'Used album',
        'price_pennies' => 500,
        'country' => 'US',
    ],
    [
        'name' => 'Stolen uranium, 1kg',
        'price_pennies' => 43000000,
        'country' => 'CA',
    ],
];
$gstRate = 0.07;

$result = array_map(function($i) use($gstRate) {
    $tax = $i['country'] == 'CA' ? round($i['price_pennies'] * $gstRate) : 0;
    return [
        'name' => $i['name'],
        'price' => '$'.number_format(($i['price_pennies'] + $tax) / 100, 2),
        'country' => $i['country'],
    ];
}, $items);

we observe here that the callable we're using in array_map() is defined with the function keyword so we can have a multi-line body. we're also using use() to pass in the $gstRate variable.

inside the body of our callable, we determine if gst is applicable (only canadians pay it) and calculate the tax amount. our return value is a new array with the price, including gst for canadians, formatted for display.

if we run this snippet, we will get:

Array
(
    [0] => Array
        (
            [name] => Used album
            [price] => $5.00
            [country] => US
        )

    [1] => Array
        (
            [name] => Stolen uranium, 1kg
            [price] => $460,100.00
            [country] => CA
        )

)

a basic array_filter

as the name implies, the purpose of array_filter() is to "filter" our input array. it returns the elements we want and leaves out the ones we don't.

let's take a look at the most basic example:

$testArray = [1,2,null,3];
$result = array_filter($testArray);

if we run this snippet, our $result array will be:

[1, 2, 3]

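here, array_filter() removed the null value. it "filtered it out". when we don't supply a callback, array_filter() drops every element that is false when cast to a boolean (null, false, 0, '' and so on), and the elements that remain keep their original keys.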

this is some pretty handy functionality, and most of the time i've seen other people use array_filter(), it's to do just this. but array_filter() can do a lot more.
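
before we get to that, two caveats about the callback-free form are easy to miss: it drops every "falsy" value, not just null, and it keeps the original keys. a small sketch, with extra sample values added to the array for illustration:

```php
<?php
$testArray = [1, 2, null, 3, 0, '', '0', false];

// no callback: anything that casts to false is removed
$result = array_filter($testArray);
print_r($result); // keys 0, 1 and 3 survive: [0 => 1, 1 => 2, 3 => 3]

// array_values() gives us a freshly indexed list if the gaps matter
$reindexed = array_values($result); // [1, 2, 3]
```

if 0 or an empty string is legitimate data in our array, the callback-free form is probably not what we want.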

like array_map(), array_filter() can optionally take a callable as an argument. this callable inspects each element of our input array and makes a decision about whether we want to keep it or filter it out. if the callable returns a true, we keep the element. if it returns a false, we don't. for instance, let's say we want to filter our input array so we only keep the even integers:

$testArray = [1,2,3,4];
$result = array_filter($testArray, fn($element) => $element % 2 == 0);

print_r($result); // [1 => 2, 3 => 4] (original keys are preserved)

the first thing we notice here is that the arguments for array_filter() are in a different order than in array_map(); our input array comes first here, and the callable is second. this is one of those minor inconsistencies that we file under "php's just like that" and move on.

if we look at the callable, we notice that it evaluates to a boolean value. the expression $element % 2 == 0 will return either true or false. callables used with array_filter() should evaluate to or return a boolean; if they return anything else, php will cast the return value to a boolean.

we can do any test we like on the array element passed to our callable. all we need to know is that if the callable returns or evaluates to true, the array element is added to our results.
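
as a quick illustration of the boolean cast, here is a sketch that filters the same array two ways. the $readings data is invented for the example.

```php
<?php
$readings = [12, 0, null, 7, null, 0]; // hypothetical data

// an explicit boolean test: remove only the nulls, keep the zeros
$notNull = array_filter($readings, fn($element) => $element !== null);
// [0 => 12, 1 => 0, 3 => 7, 5 => 0]

// a non-boolean return value is cast to a boolean: 0 becomes false
$nonZero = array_filter($readings, fn($element) => (int) $element);
// [0 => 12, 3 => 7]
```

returning a real boolean, as in the first callback, tends to make the intent easier to read.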

a more advanced array_filter

of course, filtering out odd numbers is not really representative of the real work we would do with array_filter(). it's just an exercise to demonstrate a point. let's look at something that more closely approximates something practical.

we have an array of people, with each person having a name, country of residence and birthdate. what we want to get is a list of people who are old enough to go to the bar. of course, this is complicated by the fact that the age of majority is different in different countries, so we will need to accommodate that. let's look at how we would use array_filter() to solve this (admittedly contrived) problem:

/**
 * Minimum ages to enter the bar by country
 */
$barAges = [
    'CA' => 18,
    'US' => 21,
];

/**
 * People with country and birth date
 */
$persons = [
    [
        'name' => 'Jasminder',
        'country' => 'CA',
        'birth_date' => '2004-09-12'
    ],
    [
        'name' => 'Larry',
        'country' => 'US',
        'birth_date' => '2003-12-04'
    ],
    [
        'name' => 'Tabitha',
        'country' => 'US',
        'birth_date' => '1972-05-26'
    ],
];

/**
 * Function to calculate age from birth date
 */
$calculateAge = fn($person) => date_diff(date_create($person['birth_date']), date_create('today'))->y;

/**
 * Get array of people old enough to go to the bar
 */
$results = array_filter($persons,
    function($person) use($barAges, $calculateAge) {
        return $calculateAge($person) >= $barAges[$person['country']];
    });

first, we set our data. the $barAges array is a list of the ages of majority, keyed by country. the $persons array is the array we want to pass to array_filter(); it's the raw data we want to filter so that we are only left with the people old enough to go to the bar in their country.

next, we have the variable function $calculateAge(). this function accepts one element from the $persons array and returns their age in years. we will be using this function inside our call to array_filter().

at last, we get to array_filter(). the interesting part here is our callable function. it accepts one argument, $person, which is the element of our input array that we're testing. it also calls use() to add two more variables to our function's scope so we can use them in the function body.

the first variable we're injecting is $barAges. this is so we can look up the age of majority for the country of the person we're evaluating. the second variable is $calculateAge, the function we wrote to get the age of a person from their birth date. this example shows us two things. first, that we can use use() to inject as many variables as we like into our function's scope and, second, that we can treat a function assigned to a variable like any other variable, even using it as an argument.
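
because the age calculation depends on today's date, the result of this filter changes over time. here is one possible variation that injects the date as well, so the outcome is repeatable. the $ageOn helper, the fixed date and the decision to turn away people from countries missing in $barAges are all assumptions made for this sketch.

```php
<?php
$barAges = ['CA' => 18, 'US' => 21];
$persons = [
    ['name' => 'Jasminder', 'country' => 'CA', 'birth_date' => '2004-09-12'],
    ['name' => 'Larry', 'country' => 'US', 'birth_date' => '2003-12-04'],
    ['name' => 'Tabitha', 'country' => 'US', 'birth_date' => '1972-05-26'],
];

// hypothetical helper: age in whole years on a given date
$ageOn = fn(array $person, DateTimeImmutable $date): int =>
    (new DateTimeImmutable($person['birth_date']))->diff($date)->y;

// a fixed reference date keeps the example repeatable
$today = new DateTimeImmutable('2024-01-01');

$results = array_filter($persons, function($person) use($barAges, $ageOn, $today) {
    // assumption: a country we have no rule for is treated as "not allowed"
    $minimumAge = $barAges[$person['country']] ?? null;
    return $minimumAge !== null && $ageOn($person, $today) >= $minimumAge;
});

print_r(array_column($results, 'name')); // Jasminder and Tabitha
```

on that date larry is 20, one year short of the us limit, so he is filtered out.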

a short digression about first class functions

when we can assign functions to variables and pass those variables as arguments (like we're doing here) or return them as return values, we say that functions are 'first class citizens'. while this is not, in itself, a part of using map, filter and reduce, understanding and using first-class functions is an important part of making your code more 'functional programming like'. if you are incorporating map, filter and reduce into your code with the goal of getting some of the benefits of functional programming, it is very much worth your while to adopt using first-class functions as well.

we can use() variable functions this way with array_filter(), array_map() and array_reduce().
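
returning a function from a function is the part we haven't seen yet, so here is a minimal sketch. $makeMultiplier is a made-up name; it builds a new callback for whatever factor we hand it.

```php
<?php
// hypothetical factory: returns a new callback for each factor
$makeMultiplier = fn($factor) => fn($element) => $element * $factor;

$double = $makeMultiplier(2);
$triple = $makeMultiplier(3);

$testArray = [1, 2, 3];
$doubled = array_map($double, $testArray); // [2, 4, 6]
$tripled = array_map($triple, $testArray); // [3, 6, 9]
```

the inner arrow function picks up $factor from the outer one automatically, so no use() is needed here.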

a basic array_reduce

the array_reduce() function behaves similarly to map and filter, but with one major difference: instead of an array, array_reduce() returns a single value, typically a scalar such as an integer or a string.

this is because the purpose of array_reduce() is to take all the array elements passed to it and "reduce" them down to a single value. let's look at the canonical example: summing up a list of integers.

$testArray = [1,2,3];
$result = array_reduce($testArray, fn($carry, $element) => $carry + $element);

this should look familiar to us by now. we pass two arguments to array_reduce(): our input array as the first argument, and a callback function that acts on each element of our array as our second argument.

the notable thing here is that our callable function itself takes two arguments, not just one. the $element argument is straightforward; it's the current element of our input array.

the $carry argument is the interesting one. this argument is automatically filled by array_reduce() and contains the return value from the previous call to our callable function.

if we think about the process of summing up an array, this makes sense. we need to know the running total that we will be adding the current element to. $carry contains that running total.

if we run the above snippet, array_reduce() will execute our callable function three times. the first time, our $element will be 1 and our $carry will be null. our function will add these together and return the sum, which is 1. on the second execution, $carry will hold the value 1, which is the value that was previously returned. our function will add the next $element, which is 2, to our $carry, and return 3. on the final execution, our $carry will be 3 and our $element will be 3. summing them results in 6, which is our final, reduced, value that gets returned to $result.
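
the null on the first pass is only the default. array_reduce() accepts an optional third argument that is used as the starting $carry, which also decides what we get back for an empty array. a short sketch:

```php
<?php
$sum = fn($carry, $element) => $carry + $element;

// no initial value: the first $carry is null, and an empty array yields null
var_dump(array_reduce([1, 2, 3], $sum)); // int(6)
var_dump(array_reduce([], $sum));        // NULL

// with an initial value of 0 as the third argument
var_dump(array_reduce([1, 2, 3], $sum, 0)); // int(6)
var_dump(array_reduce([], $sum, 0));        // int(0)
```

passing an explicit starting value is usually the safer habit when the input array might be empty.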

a more advanced array_reduce

once we understand how array_reduce() works, we can start to look at how we can expand on it. for instance, we may want to sum up only the even numbers:

$testArray = [1,2,3,4];
$sumEvens = function($carry, $item) {
    if($item % 2 == 0) {
        return $carry + $item;
    }
    return $carry;
};
$result = array_reduce($testArray, $sumEvens);

here we have created a variable function assigned to $sumEvens that tests if a number is even. if it is, we add it to our $carry variable. if it is odd, we simply return $carry without adding anything to it, essentially skipping over the odds.
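
for comparison, the same total can be reached by splitting the work into two steps: filter first, then sum. this is only an alternative sketch, not a replacement for the version above; $isEven is an illustrative name and array_sum() is php's built-in summing function.

```php
<?php
$testArray = [1, 2, 3, 4];

// step one: keep only the even numbers
$isEven = fn($element) => $element % 2 == 0;
$evens = array_filter($testArray, $isEven); // [1 => 2, 3 => 4]

// step two: add up what is left
$result = array_sum($evens); // 6
```

which version reads better is a matter of taste; the two-step form keeps each function doing one small job.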

of course we can also accept as our input array an array of arrays. here we sum up the price of three items:

$items = [
    [
        'name' => 'Used album',
        'price_pennies' => 500,
    ],
    [
        'name' => 'Flowers',
        'price_pennies' => 1350,
    ],
    [
        'name' => 'Tee shirt, plain',
        'price_pennies' => 975,
    ],
];
$result = array_reduce($items, fn($carry, $item) => $carry + $item['price_pennies']);

and we're not limited to numbers. we can modify and concatenate strings.

$testArray = ['one', 'two', 'three'];
$join = function($carry, $item) {
    $item = ucfirst($item);
    if(is_null($carry)) {
        return $item;
    }
    return $carry.','.$item;
};
$result = array_reduce($testArray, $join);

print $result; // One,Two,Three

in this callable function, we modify the element of our array, which is a string, to have an uppercase first letter, then stick it onto the end of a comma-separated list. since, on the first execution of our callable, our $carry will, of course, be null, we just return the item to avoid our final string having a comma in the first position.
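
it's worth seeing that the same string can be built by combining array_map() with php's built-in implode(), which takes care of the separator for us. a minimal sketch of that alternative:

```php
<?php
$testArray = ['one', 'two', 'three'];

// map each word to its capitalised form, then join with commas
$result = implode(',', array_map('ucfirst', $testArray));

print $result; // One,Two,Three
```

one difference from the array_reduce() version: for an empty input array, implode() returns an empty string rather than null.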

there is a lot of power in array_reduce() and, once we get the hang of it, we will quickly start to see the places where it can be applied instead of a clumsy loop.

conclusion

once we get the hang of map, filter and reduce and start using functions as first-class citizens, we will start to see all sorts of places where we can replace awkward or complicated loops with cleaner, terser code. using these techniques will even help us change the way we think about programming in general; we will see programs less as a long procedure to run from top to bottom, and more as a collection of function calls that we use to compose our solution. the end result will be more power and more readability.