Don't Believe Obvious Performance Improvements
August 2020

Introduction

The story starts with a bug report from a client about an API that returns a list of the content in their account. They complained that, because they have a large amount of content, the API's performance is quite poor, taking at least 20 seconds to serve each request.

First impressions

After confirming the problem (it was horribly slow), I jumped into the code. Below is a simplified example of how the code worked, with comments explaining what each part does.

def get_content_api(limit=1000):
    # Fetch all data from the database
    content = get_all_content()

    # Remove hidden content
    content = list(filter(lambda c: not c["hidden"], content))

    # Apply the result size limit
    content = content[:limit]

    # Prepare the data ready to be returned
    for item in content:
        item["date"] = item["date"].isoformat()
        item["url"] = "%s/content/%s" % (get_domain(), item["id"])

    # Return the data
    return JsonResponse(content)

There is an immediate problem that jumped out at me. The function get_all_content fetches all the content from the database for that client (in this case over 100,000 items), then a filter and page limit cut the result down to just the first 1,000 items.

This is clearly a huge problem: we have to wait while the database fetches all this data and transfers it over the network to the application server, only for 99% of it to be thrown away immediately. Databases are designed to do exactly this kind of filtering and limiting efficiently, so that work should be pushed down into the query. Happily, it's a fairly quick fix.
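
For reference, pushing the filter and limit into the database would look something like the sketch below. It assumes a Django-style ORM with a hypothetical Content model exposing a hidden field; the real data layer may be built quite differently.

from myapp.models import Content  # hypothetical Django-style model

def get_visible_content(limit=1000):
    # Sketch only: the database applies the WHERE clause and the LIMIT,
    # so at most `limit` rows are ever sent to the application server.
    return list(Content.objects.filter(hidden=False)[:limit])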

Benchmarking

But the most important rule of performance optimisation is to benchmark the code first, to make sure the improvement is being applied to the actual cause of the problem. Even if the problem seems obvious.

Especially if the problem seems obvious.
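
One lightweight way to get this kind of per-section timing is a small helper built on time.perf_counter, sketched below; a profiler such as cProfile would work just as well.

import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Print the wall-clock time spent inside the `with` block, in milliseconds.
    start = time.perf_counter()
    yield
    print("%s: %.0fms" % (label, (time.perf_counter() - start) * 1000))

# Usage: wrap each section of get_content_api, for example:
#   with timed("fetch all content"):
#       content = get_all_content()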

Below are the results, with the time spent on each part of the code annotated alongside it.

def get_content_api(limit=1000):
    # Fetch all data from the database (100ms)
    content = get_all_content()

    # Remove hidden content (1ms)
    content = list(filter(lambda c: not c["hidden"], content))

    # Apply the result size limit (0ms)
    content = content[:limit]

    # Prepare the data ready to be returned (20,000ms)
    for item in content:
        item["date"] = item["date"].isoformat()
        item["url"] = "%s/content/%s" % (get_domain(), item["id"])

    # Return the data
    return JsonResponse(content)

Much to my surprise, get_all_content takes only 100ms to fetch 100,000+ items from the database. That could definitely be improved, but clearly something else needs to be looked at first.

The real problem

Once all the data has been fetched, the code does some simple data transformations on each item to prepare it to be returned. Or, as it turns out, not so simple: it takes multiple seconds to run through all the items.

On further inspection the problem became apparent: the call to get_domain(). This function makes a call to the database to look up the domain that the client uses so that it can build the "url" parameter. The amount of data it fetches is tiny, but the network latency multiplied by the number of items being processed is enough to completely ruin the performance: at roughly 20ms per round trip across 1,000 items, that alone accounts for the 20 seconds.

Much to my joy, this is probably the easiest performance enhancement I have ever made. The domain is the same for every item, so get_domain only needs to be called once!

def get_content_api(limit=1000):
    # Fetch all data from the database
    content = get_all_content()

    # Remove hidden content
    content = list(filter(lambda c: not c["hidden"], content))

    # Apply the result size limit
    content = content[:limit]

    # Prepare the data ready to be returned
    domain = get_domain()
    for item in content:
        item["date"] = item["date"].isoformat()
        item["url"] = "%s/content/%s" % (domain, item["id"])

    # Return the data
    return JsonResponse(content)

With that small change, performance went from 20 seconds to just 120 milliseconds.
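
Had get_domain been called from many places rather than one loop, another option would have been to cache its result directly, for example with functools.lru_cache. This is just a sketch of that alternative, not what the production code does.

from functools import lru_cache

@lru_cache(maxsize=1)
def get_domain_cached():
    # The first call hits the database via get_domain; every later call
    # returns the cached value with no network round trip.
    return get_domain()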

Result: Another happy customer.

Conclusion

So the moral of the story? Always check that you are applying the performance improvements to the right part of the code. No matter how sure you are.

And yes, the API still fetches all the data and filters it in code. As much as I'd like to go back and do it properly, there isn't time to go around making performance improvements that save a few milliseconds when everyone is happy to wait 120ms.