Code complexity and clean code

Many people wonder how they can improve the overall readability and cleanliness of their code. It seems impossible to understand exactly how to rewrite code in such a way that makes it clean, easy to understand and simple to work with. But clean code is about a few simple principles, one of which is reducing overall code complexity through a series of simple steps.

I’ve been working on a new book on clean code for the last few months. While I know that I’ll never answer all the considerations regarding clean code, I believe that I can make an impact in showing people how to measure their code’s cleanliness, and know exactly what they need to fix. One metric that I’ve been focusing on for the last few weeks is code complexity.

There are many measures of code complexity, and they affect code quality in various ways. Today, I want to focus on two of them: cyclomatic complexity and NPath complexity.

Cyclomatic Complexity and Code Quality

Cyclomatic complexity is a simple measure of complexity in an application or routine. It counts the decision points in the code, each of which opens another path through it. Cyclomatic complexity is never less than 1, because there’s always at least one code path.

Consider the following code:


function fizzBuzz($start, $max) {
    
    $return = array();
    
    // sanity check
    if($max < $start) {
        return false;
    }
    
    for($i = $start; $i < $max; $i++) {
        $result = '';
        if($i % 3 == 0) {
            $result .= 'fizz';
        }
        
        if($i % 5 == 0) {
            $result .= 'buzz';
        }
        
        if(!$result) {
            $result = $i;
        }
        
        $return[] = $result;
    }
    
    // The separator goes first; the legacy (array, separator) order was removed in PHP 8.
    return implode(',', $return);
    
}

Cyclomatic complexity counts the paths through a routine. The function itself is the first entry point, so it counts as 1. Each conditional or loop adds another point. The total cyclomatic complexity of this function is 6: four if statements, one loop, and one for the function itself.
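The counting rule above can be approximated in a few lines of PHP. This is only a rough sketch: estimateCyclomaticComplexity() is a hypothetical helper, not how any particular analysis tool works, and it ignores ternaries and other constructs a real tool would count. It uses PHP's built-in token_get_all() to find branching keywords.

```php
<?php
// Rough sketch (hypothetical helper): estimate cyclomatic complexity by
// counting branching tokens. Ternaries and match arms are deliberately ignored.
function estimateCyclomaticComplexity(string $source): int
{
    $decisionTokens = [
        T_IF, T_ELSEIF, T_FOR, T_FOREACH, T_WHILE,
        T_CASE, T_CATCH, T_BOOLEAN_AND, T_BOOLEAN_OR,
    ];

    $complexity = 1; // the routine's own entry point

    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays: [id, text, line].
        if (is_array($token) && in_array($token[0], $decisionTokens, true)) {
            $complexity++;
        }
    }

    return $complexity;
}

// The fizzBuzz() function from above, as a string to analyze.
$source = <<<'PHP'
<?php
function fizzBuzz($start, $max) {
    $return = array();
    // sanity check
    if($max < $start) {
        return false;
    }
    for($i = $start; $i < $max; $i++) {
        $result = '';
        if($i % 3 == 0) {
            $result .= 'fizz';
        }
        if($i % 5 == 0) {
            $result .= 'buzz';
        }
        if(!$result) {
            $result = $i;
        }
        $return[] = $result;
    }
    return implode(',', $return);
}
PHP;

echo estimateCyclomaticComplexity($source), "\n"; // prints 6
```

The sketch treats the whole string as one routine, so it only gives a meaningful number when it is handed a single function at a time.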

Cyclomatic complexity is one measure of code quality. It tells us how complex a particular routine is, and helps us decide when that routine needs refactoring. For most routines, a cyclomatic complexity of 4 or below is considered good; between 5 and 7 is medium complexity, between 8 and 10 is high complexity, and anything above that is extreme complexity.

We could easily refactor this function to reduce overall cyclomatic complexity. By breaking it into two functions, we can break up the complexity and make both functions easier to test:

function fizzBuzz($start, $max) {
    
    $return = array();
    
    // sanity check
    if($max < $start) {
        return false;
    }
    
    for($i = $start; $i < $max; $i++) {        
        $return[] = determineFizzandBuzz($i);
    }
    
    // The separator goes first; the legacy (array, separator) order was removed in PHP 8.
    return implode(',', $return);
    
}

function determineFizzandBuzz($value) {
    $result = '';
    
    if($value % 3 == 0) {
        $result .= 'fizz';
    }
    
    if($value % 5 == 0) {
        $result .= 'buzz';
    }
    
    if(!$result) {
        $result = $value;
    }
    
    return $result;
}

The new cyclomatic complexity of fizzBuzz() is 3, while the cyclomatic complexity of determineFizzandBuzz() is 4. Both functions are now less complex than the single large function.
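To illustrate why the smaller function is easier to test, here is a minimal sketch that exercises determineFizzandBuzz() on its own, with one check per distinct outcome. The function body is copied from above so the example runs standalone; the choice of inputs and the plain exception-based checks are my own assumptions, and a real project would more likely use a test framework.

```php
<?php
// Copy of determineFizzandBuzz() from the article, so this sketch is self-contained.
function determineFizzandBuzz($value) {
    $result = '';

    if($value % 3 == 0) {
        $result .= 'fizz';
    }

    if($value % 5 == 0) {
        $result .= 'buzz';
    }

    if(!$result) {
        $result = $value;
    }

    return $result;
}

// Input => expected output, one case per distinct outcome.
$cases = [
    3  => 'fizz',      // divisible by 3 only
    5  => 'buzz',      // divisible by 5 only
    15 => 'fizzbuzz',  // divisible by both
    7  => 7,           // divisible by neither: the value comes back unchanged
];

foreach ($cases as $input => $expected) {
    $actual = determineFizzandBuzz($input);
    if ($actual !== $expected) {
        throw new RuntimeException("Unexpected result for input $input");
    }
}

echo "all cases passed\n";
```

None of these checks needs a range, a loop, or string joining, which is the practical payoff of pulling the per-value logic out of fizzBuzz().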

But wait… isn’t the total cyclomatic complexity now 7?

It’s true: the addition of a second function makes the whole file’s cyclomatic complexity score go up by 1, even though the individual functions have lower individual cyclomatic complexities than the one large function did. And this is where cyclomatic complexity can’t be a full measure of code complexity.

Measuring the total number of code paths (and tests required)

There’s another measure, known as NPath complexity. It is calculated slightly differently from cyclomatic complexity. Cyclomatic complexity measures the decision points in a routine; NPath complexity measures all the possible code paths. More often than not, NPath complexity is higher than cyclomatic complexity.

The NPath complexity of the refactored fizzBuzz() is 4 while the NPath complexity of determineFizzandBuzz() is 8; this means you would need approximately 12 tests altogether to effectively test every possible code path in the fizzBuzz process. However, the score of the combined function from before is a whopping 18.
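The scores above can be reproduced with some simple arithmetic. The sketch below uses simplified NPath rules and assumes that no condition contains && or ||: an if without an else contributes its body's paths plus one, a loop contributes its body's paths plus one, and statements in sequence multiply. The helper names are hypothetical and exist only for this illustration.

```php
<?php
// Simplified NPath rules (assumption: conditions contain no && or ||).
// All helper names are hypothetical, for illustration only.

// An if without else: the paths through the body, plus one for skipping it.
function npathIf(int $bodyPaths): int
{
    return $bodyPaths + 1;
}

// A loop: the paths through the body, plus one for never entering it.
function npathLoop(int $bodyPaths): int
{
    return $bodyPaths + 1;
}

// Statements that follow one another multiply their path counts.
function npathSequence(array $parts): int
{
    return (int) array_product($parts);
}

$straightLine = 1; // code without branches has exactly one path

// determineFizzandBuzz(): three ifs in sequence.
$determine = npathSequence([
    npathIf($straightLine),
    npathIf($straightLine),
    npathIf($straightLine),
]);

// Refactored fizzBuzz(): the sanity check, then a loop with a branch-free body.
$refactored = npathSequence([npathIf($straightLine), npathLoop($straightLine)]);

// Original fizzBuzz(): the sanity check, then a loop whose body holds the three ifs.
$original = npathSequence([npathIf($straightLine), npathLoop($determine)]);

echo "determineFizzandBuzz: $determine\n";  // 8
echo "refactored fizzBuzz: $refactored\n";  // 4
echo "original fizzBuzz: $original\n";      // 18
```

The difference comes from nesting: inside the loop, the three ifs multiply with everything around them, while in a separate function they are counted once.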

This makes sense: the first fizzBuzz() function is more complicated than the two functions in the second example. As a result, it would take approximately six more tests overall to effectively test every code path in the first fizzBuzz() example.

How does this relate to code quality?

Humans are not wholly unlike computers; we read code and work through code paths in an if-else sort of way. As a result, high levels of complexity make it hard for us to process the code and understand the code paths.

Highly complex functions are not only difficult to understand. They’re also terribly difficult to test.

Eighteen tests might not seem all that difficult to write, but NPath complexity can measure in the thousands. Some WordPress functions I tested had an NPath complexity over 4,000. You would have to write 4,000+ tests just to effectively test a single routine! Breaking up that routine into smaller, more easily tested routines would dramatically help code quality in that case.
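Scores in the thousands are easier to reach than they sound, because every independent if in a sequence doubles the number of paths. A minimal sketch, using a hypothetical helper and the same simplified counting (ifs without else branches, no && or || in the conditions):

```php
<?php
// Hypothetical helper: NPath of a routine that consists of $count sequential,
// independent if statements without else branches (simplified counting).
function npathOfSequentialIfs(int $count): int
{
    $paths = 1;
    for ($i = 0; $i < $count; $i++) {
        $paths *= 2; // each if can be taken or skipped, doubling the paths
    }
    return $paths;
}

foreach ([3, 8, 12] as $count) {
    echo $count, ' ifs: ', npathOfSequentialIfs($count), " paths\n";
}
// 3 ifs: 8 paths
// 8 ifs: 256 paths
// 12 ifs: 4096 paths
```

Under these assumptions, a routine needs only twelve such ifs in a row to pass the 4,000 mark.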

Personally, I treat an NPath complexity above 140 as a sign that a routine needs to be refactored.

Reducing code complexity improves code cleanliness

By reducing code complexity, the code becomes more readable. It’s easy to reduce complexity: simply breaking apart big functions that have many responsibilities or conditional statements into smaller functions is a great first step. Like a writer who takes a complicated sentence and edits it down into several easily digestible ones, you too can improve overall code quality by breaking apart complicated routines.

Brandon Savage is the author of Mastering Object Oriented PHP and Practical Design Patterns in PHP

Posted on 5/22/2013 at 7:00 am
Categories: Object-Oriented Development, PHP, Clean Code

Copyright © 2023 by Brandon Savage. All rights reserved.