Python Subprocess: The Simple Beginner’s Tutorial (2023)
Simply put, everything that happens in a computer is a process. When you open an application or run command-line commands or a Python script, you’re starting a new process. From opening a menu bar to starting a complex application, everything is a process running on your machine.
When you call a Python script from the command line, for example, a process is creating another process. These two processes now have a relationship with each other: the process that creates another process is the parent, while the process created is the child.
In the same sense, a Python script can also start a new process, becoming the parent of that new process. In this article, we’ll see how to use the subprocess module in Python to run different subprocesses during the course of a regular Python script.
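As a small illustration of the parent/child relationship, the sketch below has a script start a child process and compare process IDs. It relies on subprocess.run() with capture_output and text, which are explained later in this article, and it uses sys.executable so that it does not depend on "python" being on the PATH. The variable names are only illustrative.

```python
import os
import subprocess
import sys

# Code for the child: print its own PID and the PID of its parent.
child_code = "import os; print(os.getpid(), os.getppid())"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

child_pid, parent_pid_seen_by_child = (int(x) for x in result.stdout.split())

print("parent PID:", os.getpid())
print("child PID:", child_pid)
print("parent PID as seen by the child:", parent_pid_seen_by_child)
```

On Linux and macOS the first and last numbers should normally match. Some launchers (for example, virtual environments on Windows) may insert an intermediate process, so the numbers can differ there.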
Although this is not an advanced article, basic knowledge of Python might be necessary in order to follow along with the examples and concepts.
The subprocess Module
The subprocess module is part of the Python standard library and is designed to start new processes from within a Python script. It’s the recommended option when you need to call an external program or external command from inside your Python code, including when you want several such processes running side by side.
One advantage of the subprocess module is that it lets you manage the inputs, the outputs, and even the errors of the child process from the Python code. This makes calling subprocesses more powerful and flexible: for instance, you can use the output of a subprocess as a variable throughout the rest of the Python script.
The module was introduced in Python 2.4 as an alternative to older functions such as os.system. Since Python 3.5, the recommended way to use the module is through the run() function, which will be the focus of this article.
Using subprocess.run()
The run function of the subprocess module lets you run a command from inside a script instead of opening a terminal and typing it manually, which makes it a good fit for automating tasks. Note that run() waits for the command to finish before returning, so the parent script pauses until the child process is done.
The main argument received by the function is args, which is usually a list of strings containing the program to run followed by its arguments. To execute another Python script, args should contain the path to the Python executable and the path to the script.
So it would look like this:
import subprocess

subprocess.run(["python", "my_script.py"])
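If a command already exists as a single string, the standard shlex module can split it into the kind of list that run() expects. This is only a sketch: the script name comes from the example above, while the --verbose flag and the file name are made up for illustration, and the command is not actually executed.

```python
import shlex

# A command as you might type it in a POSIX shell (flag and file name are hypothetical).
command_line = "python my_script.py --verbose 'a file.txt'"

# shlex.split follows shell quoting rules, so the quoted file name stays in one piece.
args = shlex.split(command_line)
print(args)
```

The resulting list can be passed to subprocess.run() as its first argument. Building the list this way avoids having to run the command through a shell.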
It’s also possible to write Python code directly to the function instead of passing a .py file. Here’s an example of running such a subprocess:
result = subprocess.run(["/usr/local/bin/python", "-c", "print('This is a subprocess')"])
Inside args we have the following:
- "/usr/local/bin/python": the path to the local Python executable.
- "-c": the Python command-line option that lets you pass the code to run as a string.
- "print('This is a subprocess')": the actual code to be run.
In fact, this is the same as typing /usr/local/bin/python -c "print('This is a subprocess')" at the command line. Most of the code in this article will be in this format because it’s easier to show the features of the run function. However, you can always call another Python script containing the same code.
Also, both times we used the run function in the examples above, the first string in args refers to the path to the Python executable. The first example uses a more generic path, and we’ll keep it like this throughout the article.
However, if you’re having trouble finding the path to the Python executable on your machine, the sys module can do that for you: sys.executable holds the absolute path of the interpreter running the current script. It works well with subprocess, and a good use case is to replace the hard-coded path like this:
import subprocess
import sys

result = subprocess.run([sys.executable, "-c", "print('This is a subprocess')"])
The run function then returns an object of the CompletedProcess class, which represents a finished process.
If we print result, we’ll have this outcome:
CompletedProcess(args=['/usr/bin/python3', '-c', "print('This is a subprocess')"], returncode=0)
As we expect, it’s an instance of the CompletedProcess class that shows the command and returncode=0, which indicates that the command ran successfully.
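The fields shown in that printout are also available as attributes of the returned object. A minimal sketch, assuming the same one-line child program as above:

```python
import subprocess
import sys

result = subprocess.run([sys.executable, "-c", "print('This is a subprocess')"])

# The command that was run, exactly as it was passed in.
print(result.args)

# The exit status of the child: 0 conventionally means success.
print(result.returncode)

# Nothing was captured, so the child's output went straight to the terminal
# and the stdout attribute is None.
print(result.stdout)
```

Because stdout is None here, the text printed by the child cannot be reused in the parent; capturing it requires extra parameters.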
More Parameters
Now that we understand how to run a subprocess using the run function, we should explore options that enable us to make better use of the function.
Controlling the Outputs
Notice that the object returned by the code above shows only the command and the return code, with no other information about the subprocess. However, if we set the capture_output parameter to True (available since Python 3.7), the returned object also carries what the child process wrote, which gives us more control over our code.
result = subprocess.run(["python", "-c", "print('This is a subprocess')"], capture_output=True)
print(result)

CompletedProcess(args=['python', '-c', "print('This is a subprocess')"], returncode=0, stdout=b'This is a subprocess\n', stderr=b'')
Now we also have stdout and stderr in the returned object. We can print them both:
print(result.stdout)
print(result.stderr)

b'This is a subprocess\n'
b''
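For readers who meet older code, capture_output=True is shorthand for passing subprocess.PIPE as both stdout and stderr. The sketch below runs the same child both ways and compares the results; the variable names are illustrative.

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('This is a subprocess')"]

# Short form, available since Python 3.7.
short_form = subprocess.run(cmd, capture_output=True)

# Explicit form, which also works on Python 3.5 and 3.6.
explicit_form = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(short_form.stdout == explicit_form.stdout)
print(short_form.stderr == explicit_form.stderr)
```

The explicit form is also useful when you want to capture only one of the two streams.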
Both are sequences of bytes representing the outputs of the subprocess. However, we can also set the text parameter to True to get these outputs as strings instead.
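A short sketch of the difference, using the same child program with and without text=True:

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('This is a subprocess')"]

as_bytes = subprocess.run(cmd, capture_output=True)
as_text = subprocess.run(cmd, capture_output=True, text=True)

print(type(as_bytes.stdout))  # bytes
print(type(as_text.stdout))   # str

# Bytes can still be decoded by hand when text=True was not used.
decoded = as_bytes.stdout.decode()
print(decoded.strip() == as_text.stdout.strip())
```

With text=True the output is decoded using the default encoding for the platform, which is usually what you want for plain text.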
If there’s an error in the child’s code, however, stdout will contain only what was printed before the failure (nothing, in the example here), and stderr will contain the error message:
result = subprocess.run(["python", "-c", "print(subprocess)"], capture_output=True, text=True)
print('output: ', result.stdout)
print('error: ', result.stderr)

output:
error:  Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'subprocess' is not defined
This makes it possible to condition the rest of your code on the output of a subprocess, to use that output as a variable throughout the code, or simply to keep a record of which subprocesses ran without errors and what they printed.
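One way to condition the parent code on the result is to branch on returncode. The helper below, run_snippet, is a hypothetical name introduced only for this sketch; it is not part of the subprocess module.

```python
import subprocess
import sys


def run_snippet(code):
    """Run a Python snippet in a child process and return (succeeded, message)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True, result.stdout.strip()
    # On failure, keep only the last line of the traceback written to stderr.
    return False, result.stderr.strip().splitlines()[-1]


ok_good, message_good = run_snippet("print(2 + 2)")
ok_bad, message_bad = run_snippet("print(subprocess)")

print(ok_good, message_good)
print(ok_bad, message_bad)
```

The first call succeeds and returns the printed value as a string, while the second fails and returns the NameError line. The exact wording of that line can vary between Python versions.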
Raising Errors
Although we were able to generate an error message in the code above, it’s important to notice that it didn’t stop the parent process, which means the code would still run even if something wrong happened in the child process.
If we want the code to stop when there’s an error in the subprocess, we can use the check parameter of the run function. When this parameter is set to True, a non-zero exit status in the child raises a CalledProcessError in the parent process, which stops the script unless the exception is handled.
Below, we use the same example as in the last section but with check=True :
result = subprocess.run(["python", "-c", "print(subprocess)"], capture_output=True, text=True, check=True)
print('output: ', result.stdout)
print('error: ', result.stderr)
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
in <module>()
----> 1 result = subprocess.run(["python", "-c", "print(subprocess)"], capture_output=True, text=True, check=True)
      2 print('output: ', result.stdout)
      3 print('error: ', result.stderr)

/usr/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    510         if check and retcode:
    511             raise CalledProcessError(retcode, process.args,
--> 512                                      output=stdout, stderr=stderr)
    513     return CompletedProcess(process.args, retcode, stdout, stderr)
    514

CalledProcessError: Command '['python', '-c', 'print(subprocess)']' returned non-zero exit status 1.
The code raised an error stating the subprocess returned a status 1.
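If the parent should react to the failure rather than stop, the exception can be caught. A minimal sketch, assuming the same failing child as above; the exception object carries the exit status and, because capture_output=True was used, the captured streams.

```python
import subprocess
import sys

exit_status = None

try:
    subprocess.run(
        [sys.executable, "-c", "print(subprocess)"],
        capture_output=True,
        text=True,
        check=True,
    )
except subprocess.CalledProcessError as err:
    # returncode and stderr are attributes of CalledProcessError.
    exit_status = err.returncode
    print("child failed with exit status", err.returncode)
    print("last line of its stderr:", err.stderr.strip().splitlines()[-1])
```

This keeps the benefit of check=True, since failures cannot go unnoticed, while still letting the parent decide what to do next.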
Another way of raising an error is the timeout parameter. As the name suggests, this parameter stops the child process and raises a TimeoutExpired exception if the child takes longer than the given number of seconds to run.
For instance, the code in the subprocess below takes five seconds to run:
subprocess.run(["python", "-c", "import time; time.sleep(5)"], capture_output=True, text=True)
But if we set the timeout parameter to less than five, we’ll have an exception:
subprocess.run(["python", "-c", "import time; time.sleep(5)"], capture_output=True, text=True, timeout=2)
---------------------------------------------------------------------------
TimeoutExpired                            Traceback (most recent call last)
in <module>()
----> 1 subprocess.run(["python", "-c", "import time; time.sleep(5)"], capture_output=True, text=True, timeout=2)

/usr/lib/python3.7/subprocess.py in _check_timeout(self, endtime, orig_timeout, stdout_seq, stderr_seq, skip_check_and_raise)
   1009                     self.args, orig_timeout,
   1010                     output=b''.join(stdout_seq) if stdout_seq else None,
-> 1011                     stderr=b''.join(stderr_seq) if stderr_seq else None)
   1012
   1013

TimeoutExpired: Command '['python', '-c', 'import time; time.sleep(5)']' timed out after 2 seconds
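As with CalledProcessError, the timeout can be handled instead of letting it stop the script. The sketch below uses a one-second limit only to keep the example quick; the timed_out flag is an illustrative name.

```python
import subprocess
import sys

timed_out = False

try:
    # The child would sleep for five seconds, but we only allow one.
    subprocess.run(
        [sys.executable, "-c", "import time; time.sleep(5)"],
        timeout=1,
    )
except subprocess.TimeoutExpired as err:
    timed_out = True
    # err.timeout holds the limit that was exceeded, err.cmd the command.
    print(f"gave up after {err.timeout} second(s)")
```

When the limit is exceeded, run() kills the child process before raising the exception, so no sleeping process is left behind.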
Inputting in the Subprocess
We saw earlier that it’s possible to use the output of a child process throughout the rest of the parent code, but the opposite is also true: we can input a value from the parent process to the child using the input parameter.
We use this parameter to send a sequence of bytes (or a string, if text=True) to the subprocess, which receives it on its standard input. Inside the child, sys.stdin.read() reads that data, which can then be assigned to a variable and used like any other variable in the code.
result = subprocess.run(["python", "-c", "import sys; my_input=sys.stdin.read(); print(my_input)"], capture_output=True, text=True, input='my_text')
print(result.stdout)
The code in the subprocess above imports the sys module, uses sys.stdin.read() to assign the input to a variable, and then prints that variable.
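To make the round trip easier to see, the sketch below has the child transform its input before printing it, once with a string (text=True) and once with bytes. The child code and variable names are illustrative.

```python
import subprocess
import sys

# The child reads everything from standard input and prints it in upper case.
child_code = "import sys; print(sys.stdin.read().upper())"

# With text=True, input must be a str and stdout comes back as a str.
text_result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    input="my_text",
)
print(text_result.stdout)

# Without text=True, input must be bytes and stdout comes back as bytes.
bytes_result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    input=b"my_text",
)
print(bytes_result.stdout)
```

Mixing the two, for example passing a str as input without text=True, raises an error, so it helps to decide up front whether the exchange is text or binary.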
However, we can also pass values through args and use sys.argv to read them inside the child code. For instance, let’s say we have the following code in a script called my_script.py:
# ../subprocess/my_script.py
import sys


def sum_two_values(a, b):
    return a + b


if __name__ == "__main__":
    # Command-line arguments arrive as strings, so convert them first.
    print(sum_two_values(int(sys.argv[1]), int(sys.argv[2])))
The script below becomes the parent process of my_script.py. Its args list contains two extra values, which the child reads through sys.argv and then adds up.
# ../subprocess/main.py
import subprocess

result = subprocess.run(["python", "my_script.py", "2", "4"], capture_output=True, text=True)
print(result.stdout)
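The same idea can be tried without creating a separate file by passing the child code with -c, in which case sys.argv[0] is "-c" and the extra values start at sys.argv[1]. This is a sketch under that assumption. It also shows two details that are easy to miss: every item in args must be a string, and the result comes back as text that has to be converted.

```python
import subprocess
import sys

# The child adds up its first two command-line arguments.
child_code = "import sys; print(int(sys.argv[1]) + int(sys.argv[2]))"

a, b = 2, 4

result = subprocess.run(
    # Numbers must be converted to strings before going into args.
    [sys.executable, "-c", child_code, str(a), str(b)],
    capture_output=True,
    text=True,
    check=True,
)

# stdout is a string such as "6\n"; int() ignores the surrounding whitespace.
total = int(result.stdout)
print(total)
```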
Conclusion
The Python subprocess module and the subprocess.run() function are very powerful tools for interacting with other processes while running a Python script.
In this article, we covered the following:
- Processes and subprocesses
- How to run a subprocess using the run() function
- How to access the outputs and the errors generated by the subprocess
- How to raise an error in the parent process if there’s an error in the child
- How to set a timeout for the child process
- How to send inputs to the child process
Otávio Simões Silveira
Otávio is an economist and data scientist from Brazil. In his free time, he writes about Python and Data Science on the internet. You can find him at LinkedIn.