Stop returning arrays (use objects instead)

When returning large amounts of data, it is common to reach for a built-in PHP data structure: the array. But in object-oriented code, arrays make poor mechanisms for transferring data between objects. Object-oriented programming is, and should be, focused on objects. That means each array of structured data should be converted into an object, and each list of such data into a collection.

The reason is simple: when one object makes a request to another object, it should generally get back an object unless the response is scalar. For example, asking for a data set from the database ought to generate an object, not an associative array of data.
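
As a minimal sketch of that idea, assuming PHP 7.0 or later for scalar and return type declarations: the `User` class, its `fromRow()` factory, and the `greeting()` function below are hypothetical names chosen for illustration, and a hard-coded row stands in for a real database result.

```php
<?php
declare(strict_types=1);

// Hypothetical value object standing in for one row of a "users" table.
final class User
{
    private $id;
    private $email;

    public function __construct(int $id, string $email)
    {
        $this->id = $id;
        $this->email = $email;
    }

    // Build the object at the boundary, where the raw row is still an array.
    public static function fromRow(array $row): self
    {
        return new self((int) $row['id'], (string) $row['email']);
    }

    public function getId(): int { return $this->id; }
    public function getEmail(): string { return $this->email; }
}

// Callers can hint on User instead of guessing at array keys.
function greeting(User $user): string
{
    return 'Hello, ' . $user->getEmail();
}

// Database drivers often return every column as a string, hence the casts above.
$row = ['id' => '42', 'email' => 'ada@example.com'];
echo greeting(User::fromRow($row)), "\n"; // prints "Hello, ada@example.com"
```

The associative array still exists, but only briefly and only inside the factory; everything past that point deals with a `User`.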

Using objects allows for type hinting and type checking, and adds type safety to your application. Arrays can be changed and have no methods; objects can be made immutable and can have methods that access bits of data. Naked arrays form excellent value objects that can be hinted on, ensuring the fidelity of the data in the future.
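
A rough illustration of immutability and type hinting, assuming PHP 7.0 or later: `Money`, `withCents()`, and `formatMoney()` are hypothetical names, not part of any library.

```php
<?php
declare(strict_types=1);

// Hypothetical immutable value object: private state, getters, no setters.
final class Money
{
    private $cents;
    private $currency;

    public function __construct(int $cents, string $currency)
    {
        $this->cents = $cents;
        $this->currency = $currency;
    }

    public function getCents(): int { return $this->cents; }
    public function getCurrency(): string { return $this->currency; }

    // A "change" produces a new object and leaves the original untouched.
    public function withCents(int $cents): self
    {
        return new self($cents, $this->currency);
    }
}

// The type hint guarantees the function only ever sees a Money object.
function formatMoney(Money $money): string
{
    return $money->getCents() . ' ' . $money->getCurrency();
}

$price = new Money(1999, 'USD');
$sale = $price->withCents(1499);
echo formatMoney($price), ' -> ', formatMoney($sale), "\n"; // prints "1999 USD -> 1499 USD"

try {
    // An array with the same keys is rejected before the function body runs.
    formatMoney(['cents' => 1999, 'currency' => 'USD']);
} catch (TypeError $e) {
    echo "An array is rejected by the type hint\n";
}
```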

Of course, returning an object or a collection can have some unintended side effects. For starters, you can’t return a malformed value object, so you’ll need to throw more exceptions when the object can’t be created due to some error. That will mean additional error handling for specific error conditions in the calling object.
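
One way this might look in practice, sketched under the assumption of PHP 7.0 or later: the hypothetical `EmailAddress` class refuses to exist in a malformed state, and the hypothetical `subscribe()` caller carries the extra error handling.

```php
<?php
declare(strict_types=1);

// Hypothetical value object that cannot be constructed with invalid data.
final class EmailAddress
{
    private $value;

    public function __construct(string $value)
    {
        // filter_var() returns false when validation fails.
        if (filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException("Invalid email address: $value");
        }
        $this->value = $value;
    }

    public function getValue(): string { return $this->value; }
}

// The caller now has to handle the specific failure condition.
function subscribe(string $input): string
{
    try {
        $email = new EmailAddress($input);
    } catch (InvalidArgumentException $e) {
        return 'Rejected: ' . $e->getMessage();
    }

    return 'Subscribed ' . $email->getValue();
}

echo subscribe('ada@example.com'), "\n"; // prints "Subscribed ada@example.com"
echo subscribe('not-an-email'), "\n";    // prints "Rejected: Invalid email address: not-an-email"
```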

When returning a collection of value objects, you’ll want to make sure the data is homogeneous – that is, the value objects are all of the same type. Otherwise, returning an array is perfectly valid. Collections should only contain objects of a single type.
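
A homogeneous collection can be sketched with a typed variadic constructor, assuming PHP 7.0 or later. `Tag` and `TagCollection` are hypothetical names; the interfaces `IteratorAggregate` and `Countable` are built into PHP.

```php
<?php
declare(strict_types=1);

// Hypothetical value object held by the collection.
final class Tag
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string { return $this->name; }
}

// Hypothetical collection that accepts Tag objects and nothing else.
final class TagCollection implements IteratorAggregate, Countable
{
    private $tags;

    // The typed variadic parameter checks every argument against Tag.
    public function __construct(Tag ...$tags)
    {
        $this->tags = $tags;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->tags);
    }

    public function count(): int
    {
        return count($this->tags);
    }
}

$tags = new TagCollection(new Tag('php'), new Tag('oop'));
foreach ($tags as $tag) {
    echo $tag->getName(), "\n"; // prints "php", then "oop"
}

try {
    new TagCollection(new Tag('php'), 'oop'); // a plain string is not a Tag
} catch (TypeError $e) {
    echo "Mixed types are rejected\n";
}
```

Because the check happens in the constructor, code that receives a `TagCollection` can iterate over it without inspecting each element.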

You should also consider returning an array if the call is solely internal (between protected methods, for example); value objects should be used for external calls, but arrays are perfectly suitable for internal calls.

Finally, when it comes to defining your return values, use exceptions to denote errors, and only return when you have a fully formed object or collection to return. This will help improve your application’s error handling, and make explicit the conditions under which failure occurs. In addition, by having a single, consistent return type, you know that when you get a response, the request you made was successful.
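
The following sketch, assuming PHP 7.0 or later, shows a lookup method that either returns a fully formed object or throws. `Product`, `ProductCatalog`, and `ProductNotFound` are hypothetical names, and an in-memory array stands in for real storage.

```php
<?php
declare(strict_types=1);

// Hypothetical domain exception that names the failure condition explicitly.
final class ProductNotFound extends RuntimeException
{
}

final class Product
{
    private $sku;
    private $name;

    public function __construct(string $sku, string $name)
    {
        $this->sku = $sku;
        $this->name = $name;
    }

    public function getSku(): string { return $this->sku; }
    public function getName(): string { return $this->name; }
}

final class ProductCatalog
{
    private $products = [];

    public function add(Product $product)
    {
        $this->products[$product->getSku()] = $product;
    }

    // Returns a Product or throws; never false, null, or an error array.
    public function get(string $sku): Product
    {
        if (!isset($this->products[$sku])) {
            throw new ProductNotFound("No product with SKU $sku");
        }

        return $this->products[$sku];
    }
}

$catalog = new ProductCatalog();
$catalog->add(new Product('A-100', 'Widget'));

// Success path: no need to test the return value before using it.
echo $catalog->get('A-100')->getName(), "\n"; // prints "Widget"

try {
    $catalog->get('Z-999');
} catch (ProductNotFound $e) {
    echo $e->getMessage(), "\n"; // prints "No product with SKU Z-999"
}
```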

Value objects are extremely powerful, and should be used regularly. What are your favorite places to replace arrays with value objects?

Brandon Savage is the author of Mastering Object Oriented PHP and Practical Design Patterns in PHP

Posted on 8/12/2015 at 8:00 am
Categories: PHP

David wrote at 8/12/2015 8:36 am:

“Naked arrays form excellent value objects that can be hinted on, ensuring the fidelity of the data in the future.”

Can you further explain this statement? It seems at odds with everything else discussed in the article? I.e. Value objects are good, arrays are bad but naked arrays form excellent value objects?

Perhaps I am misunderstanding. Thanks!

Brandon Savage (@brandonsavage) wrote at 8/12/2015 8:37 am:

This is definitely a misunderstanding, due to unclear language I used.

Converting a naked array into a value object means creating a value object that would represent the keys of the array, and returning that, instead of returning the array. Any time you have just a naked array of keys, that’s a great opportunity to create a value object.

Alexander Makarov (@sam_dark) wrote at 8/12/2015 9:01 am:

No word about practical reasons to use arrays i.e. performance (instantiation) / memory usage?

Bill Karwin (@billkarwin) wrote at 8/12/2015 10:53 pm:

A reply on Twitter asked you about performance, and you said you hadn’t tested it. So here’s a test:

<?php

$n = 100000000;

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) { $a = array(); }
$end = microtime(true);
echo ($end-$start) . " seconds to create $n arrays.\n";

function myarrayfunc() { return array(); }
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) { $a = myarrayfunc(); }
$end = microtime(true);
echo ($end-$start) . " seconds to call function $n times to create arrays.\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) { $obj = new stdClass(); }
$end = microtime(true);
echo ($end-$start) . " seconds to create $n objects.\n";

function myobjfunc() { return new stdClass(); }
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) { $obj = myobjfunc(); }
$end = microtime(true);
echo ($end-$start) . " seconds to call function $n times to create objects.\n";

Results on my machine:

4.6363360881805 seconds to create 100000000 arrays.
10.022792816162 seconds to call function 100000000 times to create arrays.
10.128829956055 seconds to create 100000000 objects.
14.732341051102 seconds to call function 100000000 times to create objects.

Creating an object is 10.13/4.64 ≈ 2.18 times as costly as creating an array, which means about 118% overhead.
But the proportional overhead diminishes to about 47% (14.73/10.02 ≈ 1.47) when we call a function, because the function call itself has a pretty high cost in PHP.

It's still a good idea to use value objects, but not for the sake of performance.

Ultimately, if one were that concerned about performance, one wouldn't be using PHP anyway.

Brandon Savage (@brandonsavage) wrote at 8/12/2015 11:43 pm:

If asked, I would gladly concede that the performance implications are not as good. My main goal here is to highlight a design concept, rather than a performance concept. That said, different performance characteristics require different approaches. So consider my post amended to say that if performance dictates, use the fastest tool available, even if that’s arrays.

Rasmus Schultz (@mindplaydk) wrote at 8/13/2015 2:05 am:

Regarding performance, the idea that arrays are faster than objects still exists in a lot of people’s minds – but since 5.5 (or 5.4 even? I forget.) objects in fact are faster and require less memory. This is due to an optimization made in that release, in which PHP objects internally began to use a property table instead of storing properties as a key/value hash.

Most recently, I refactored TreeRoute (a router library) from one class using arrays, to a full object model – you can see what that looks like here:

https://github.com/mindplay-dk/timber/commit/56a191161f52d689976abfd5fc6e7ef5713c8dec

In benchmarks, this was slightly faster under 5.5, 5.6 and 7.0b2 – and of course you have IDE support now, more readable code, easier debugging, etc.

There is no reason anyone should choose arrays for anything that isn’t a collection. There hasn’t been for many years.

Rasmus Schultz (@mindplaydk) wrote at 8/13/2015 2:15 am:

PS: a benchmark for the skeptical, comparing the performance of a simple model written using six different approaches:

https://github.com/mindplay-dk/benchpress/blob/master/README.md

This result is under 5.4.7, where as you can see, raw performance for an object with the properties is practically identical to that of arrays. The relative difference varies under different versions of PHP.

Don’t forget, memory usage is also improved. (not shown in this benchmark.)

Onkar (@onkarjanwa) wrote at 8/15/2015 2:56 am:

I mostly use objects when performing DB operations like insert/update/delete.

I return arrays in the case of REST APIs.

I mostly use DTO classes when rendering data to the browser.

Copyright © 2023 by Brandon Savage. All rights reserved.