How to Profile Python Code With cProfile
Programming is a sequential job: writing code, testing, monitoring (profiling), reviewing the reports and results, and refactoring. After all, there's no guarantee that the first draft of a program is its most optimized form.
In this article, we're going to focus on the third step of that sequence, profiling, using cProfile in Python.
A brief look at profiling
Profiling is a set of techniques for determining which parts and units of a program take a disproportionate amount of time to execute. With the resulting reports and numbers, we can improve the performance and execution time of functions and methods.
It's important to note that profiling is not always the right answer for better performance. It's most useful when we need extra speed from our code, or when a service is under heavy request load and we need to manage the complexity of its components.
If you're already satisfied with your program's execution time, there's no need to bother with cProfile. It earns its keep when execution time becomes a concern and starts affecting other parts of the service.
For instance, suppose your server is experiencing heavy traffic owing to advertising efforts, and users have noticed that it responds slowly. This is where cProfile comes in handy, provided the service is Python-based and certain Python function calls are responsible for the slow responses.
Profiling is a game-changer when it comes to redesigning and restructuring a project's codebase. Most of the time, you don't need to put effort into a full redesign; a bit of research and profiling will do the job.
cProfile in Python
In Python and *nix environments, there are many packages and libraries for evaluating the cost of source code, whether that cost is a security threat or the time it takes a component to execute.
cProfile is part of Python's standard library. In the next section, we'll work through an example, followed by the actual implementation and possible optimization strategies. It's a good illustration of what happens in real-world cases.
Scenario: There's a large text file of novel titles separated by newlines. Some titles appear two or more times. The goal is to read the file and build a local variable holding all the novel names with duplicates removed. The main requirement is to split the solution across multiple functions.
Here is a short sample of what the text file contains.
The Name of the Wind
Dune
The Left Hand of Darkness
Dracula
Foundation
Dune
...
Solution: We need functions for reading the file, ignoring the duplicated values, and a main function for calling them in order.
- read_file(path: str) -> list: Gets the file name/path and returns the list of titles.
- ignore_duplicates(titles: list) -> list: Gets a list of titles (the value returned from read_file()) and returns a duplicate-free list.
- main(): Main function.
Keep in mind that the following code snippet is not perfect. We will come up with solutions with the help of cProfile to optimize the code and improve the execution time of each function.
#!/bin/python

def read_file(name: str) -> list:
    with open(name, 'r') as file:
        titles = file.readlines()
    new_titles = []
    # casting off \n from the end of each title
    for title in titles:
        new_titles.append(title.removesuffix('\n'))
    return new_titles

def ignore_duplicates(titles: list) -> list:
    new_titles = []
    for title in titles:
        if title in new_titles:
            continue
        else:
            new_titles.append(title)
    return new_titles

def main() -> None:
    original_titles = read_file('novels.txt')
    safe_titles = ignore_duplicates(original_titles)
    print(safe_titles)

if __name__ == '__main__':
    main()
Executing the above script prints the safe_titles variable, which contains the unique titles. On a lower-end machine, the same script takes several seconds to run, which is why we're not done yet.
Let’s begin the cProfile part by importing this module first.
At the very bottom of the script, replace the call main() with cProfile.run('main()'). This function takes a string, executes it as a Python statement, and prints the profiling information we need for optimizing.
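With the import added at the top of the file, the tail of the script now looks like this:

import cProfile

# ... function definitions stay the same ...

if __name__ == '__main__':
    cProfile.run('main()')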
321630 function calls in 0.510 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:260(__init__)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:309(__init__)
      308    0.000    0.000    0.001    0.000 <frozen codecs>:319(decode)
        1    0.001    0.001    0.510    0.510 <string>:1(<module>)
        1    0.447    0.447    0.447    0.447 file.py:15(ignore_duplicates)
        1    0.001    0.001    0.509    0.509 file.py:25(main)
        1    0.036    0.036    0.059    0.059 file.py:3(read_file)
      308    0.000    0.000    0.000    0.000 {built-in method _codecs.utf_8_decode}
        1    0.000    0.000    0.510    0.510 {built-in method builtins.exec}
        1    0.001    0.001    0.001    0.001 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000
   161002    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.009    0.009    0.010    0.010 {method 'readlines' of '_io.TextIOWrapper' objects}
   160000    0.009    0.000    0.009    0.000 {method 'removesuffix' of 'str' objects}
The first line tells us there were 321630 function calls completing in 0.51 seconds; on the low-end machine, the same run takes almost 5 seconds. In the table, ncalls is the number of calls, tottime is the time spent in the function itself, and cumtime is the time spent in the function plus all of its sub-calls.
Two rows stand out with roughly 160K calls each. To sort the table, pass a sort=COLUMN argument to cProfile.run().
cProfile.run('main()', sort='ncalls')  # sorting by number of calls
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   161002    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
   160000    0.009    0.000    0.009    0.000 {method 'removesuffix' of 'str' objects}
      308    0.000    0.000    0.000    0.000 {built-in method _codecs.utf_8_decode}
      308    0.000    0.000    0.001    0.000 <frozen codecs>:319(decode)
        1    0.000    0.000    0.000    0.000
        1    0.447    0.447    0.448    0.448 file.py:15(ignore_duplicates)
We can now see which calls are executed the most: two lines with over 160K executions each! That's massive. It's time to fix it.
read_file() optimization: We loop over every line just to strip the \n suffix and append the name to another list, and those are exactly the two calls shown in the first rows of the sorted table.
That's a heavy load on the system. Instead, we can do it more cleanly by returning file.read().splitlines() right inside the with block, as follows. The work is still O(n) overall, but it now happens inside C-level built-ins, replacing 320K+ Python-level calls with just two.
def read_file(name: str) -> list:
    with open(name, 'r') as file:
        return file.read().splitlines()
ignore_duplicates() optimization: Next, it's time to refactor the core function. The original version is O(n²), because the check title in new_titles scans a list once per title. Sets give us O(1) average-time membership and deduplication, bringing the whole job down to O(n).
def ignore_duplicates(titles: list) -> list:
    return list(set(titles))
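One caveat worth noting: sets don't preserve insertion order, so the titles may come back shuffled. If the original order matters, a dict-based variant does the same job in O(n), since dict keys are unique and keep insertion order as of Python 3.7:

def ignore_duplicates(titles: list) -> list:
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(titles))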
Now, our source code should look like the following snippet after all optimizations and refactors are applied.
#!/bin/python
import cProfile

def read_file(name: str) -> list:
    with open(name, 'r') as file:
        return file.read().splitlines()

def ignore_duplicates(titles: list) -> list:
    return list(set(titles))

def main() -> None:
    original_titles = read_file('novels.txt')
    safe_titles = ignore_duplicates(original_titles)
    print(safe_titles)

if __name__ == '__main__':
    cProfile.run('main()')
Let's execute the program and watch the cProfile result come back even faster than before!
15 function calls in 0.017 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008    0.008    0.008
        1    0.005    0.005    0.005    0.005 file2.py:8(ignore_duplicates)
        1    0.001    0.001    0.017    0.017 <string>:1(<module>)
        1    0.001    0.001    0.001    0.001
        1    0.001    0.001    0.001    0.001
        1    0.001    0.001    0.001    0.001
        1    0.000    0.000    0.009    0.009 file2.py:3(read_file)
        1    0.000    0.000    0.017    0.017 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.015    0.015 file2.py:12(main)
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.001    0.001 <frozen codecs>:319(decode)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:309(__init__)
        1    0.000    0.000    0.000    0.000
        1    0.000    0.000    0.000    0.000 <frozen codecs>:260(__init__)
There is a huge performance improvement here. We went from over 320K function calls to only 15. The run also got much faster, from 0.51s to 0.017s, roughly 30 times quicker than before!
With these improvements applied, we get the same functionality roughly 30 times faster, on the low-spec machine as well.
Another way of applying cProfile to individual components is via decorators. By constructing a cProfile.Profile object and pairing it with the io and pstats modules, we can make profiling easier to use and more precise; pstats in particular is the standard companion module for sorting and formatting the statistics cProfile collects.
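Here is a minimal sketch of such a decorator. The names profile_it and the decorated main are just illustrative; cProfile.Profile, pstats.Stats, and io.StringIO are all standard library APIs.

import cProfile
import io
import pstats
from functools import wraps

def profile_it(func):
    # illustrative decorator name; wraps any function with a profiler
    @wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            # collect the stats into a string buffer instead of stdout
            stream = io.StringIO()
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats('cumulative').print_stats()
            print(stream.getvalue())
    return wrapper

@profile_it
def main() -> None:
    ...

With this in place, every call to main() prints a report sorted by cumulative time, without touching the function body itself.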
In the next section, we'll share how to turn the raw information cProfile provides into graphical reports using a third-party library.
cProfile visualization in GUI
There are online services and tools for visualizing the output and reports of a cProfile run. This is handy for those who need to present the reports in meetings and technical events.
A quick library installation is needed first; we'll use SnakeViz. We assume that you've already installed virtualenv and have your environment activated.
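Inside the activated environment, install it with pip:

$ pip install snakeviz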
Now, we need to export the output of cProfile into a .prof file. The following command does so, storing the statistics in a file named program.prof.
$ python -m cProfile -o program.prof file.py
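As a side note, the same command-line interface can also print a sorted report directly to the terminal via its -s flag, with no .prof file needed:

$ python -m cProfile -s ncalls file.py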
As a last step, run the snakeviz command and it will render the report right in your browser.
$ snakeviz program.prof
snakeviz web server started on 127.0.0.1:8080; enter Ctrl-C to exit
http://127.0.0.1:8080/snakeviz/
Open the link in your browser and you'll see an interactive panel visualizing the profile.
You can customize and filter the shown data as you wish. Feel free to work with the utilities that the tool provides.
Conclusion
Redesigning a codebase is costly, but with the help of monitoring and profiling methodologies you can often avoid a full redesign and optimize instead. As the cProfile example discussed in this article shows, profiling gives you plenty of information for optimizing, as well as an accurate report of how your functions behave over time. Your team can hold sessions to analyze the results obtained from modules such as cProfile and pstats for better performance.
Author
Sadra Yahyapour
Sadra is a Python back-end developer who loves the architectural design behind software. He is a GitHub Campus Expert and an open-source contributor who spends his free time writing high-quality technical articles.