Python 3: List the Contents of a Directory, Including Recursively
This article shows how to list the files and directories inside a directory using Python 3. Throughout this article, we’ll refer to the following example directory structure:
We’ll assume the code examples are saved in script.py above and run from inside the mydir directory, so that the relative path '.' always refers to mydir.
Using pathlib (Python 3.4 and up)
Non-Recursive
iterdir
To list the contents of a directory using Python 3.4 or higher, we can use Path.iterdir() from the standard library’s pathlib module to iterate through the contents. In our example directory, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p)
When we run script.py from inside mydir, we should see output like:
Because iterdir is non-recursive, it only lists the immediate contents of mydir and not the contents of subdirectories (like a1.html).
Note that each item returned by iterdir is also a pathlib.Path, so we can call any pathlib.Path method on the object. For example, to resolve each item as an absolute path, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p.resolve())
This will list the resolved absolute path of each item instead of just the filenames.
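Because each item is a Path, methods such as is_dir() and is_file() can also be used to separate subdirectories from files. A minimal sketch, where split_entries is a hypothetical helper name and not part of pathlib:

```python
from pathlib import Path


def split_entries(directory):
    """Return (dirs, files): sorted names of the immediate subdirectories and files."""
    dirs, files = [], []
    for p in Path(directory).iterdir():
        if p.is_dir():
            dirs.append(p.name)
        elif p.is_file():
            files.append(p.name)
    return sorted(dirs), sorted(files)


dirs, files = split_entries('.')
print('Directories:', dirs)
print('Files:', files)
```

Note that is_dir() and is_file() follow symbolic links, so a link to a directory is reported as a directory.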
Because iterdir returns a generator (meant to be consumed in a loop), we need to pass it to list() if we want to store the results in a list:

    from pathlib import Path

    files = list(Path('.').iterdir())
    print(files)
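One thing to keep in mind is that iterdir yields entries in arbitrary order. If a stable order matters, the generator can be passed to sorted(), which also returns a list. A minimal sketch, where sorted_entries is a hypothetical helper name:

```python
from pathlib import Path


def sorted_entries(directory):
    """Return the directory's immediate entries as a list of Paths, sorted by name."""
    return sorted(Path(directory).iterdir(), key=lambda p: p.name)


for p in sorted_entries('.'):
    print(p)
```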
glob
We can also use pathlib.Path.glob to list all items (the equivalent of iterdir):

    from pathlib import Path

    for p in Path('.').glob('*'):
        print(p)
Filename Pattern Matching with glob
If we want to filter our results using Unix glob command-style pattern matching, glob can handle that too. For example, if we only want to list .html files, we would write in script.py:

    from pathlib import Path

    for p in Path('.').glob('*.html'):
        print(p)
As with iterdir, glob returns a generator object, so we’ll have to use list() if we want to convert it to a list:

    from pathlib import Path

    files = list(Path('.').glob('*.html'))
    print(files)
Recursive
To recursively list the entire directory tree rooted at a particular directory (including the contents of subdirectories), we can use rglob. In script.py, we can write:

    from pathlib import Path

    for p in Path('.').rglob('*'):
        print(p)
This time, when we run script.py from inside mydir, we should see output like:
rglob is the equivalent of calling glob with **/ at the beginning of the pattern, so the following code is equivalent to the rglob code we just saw:

    from pathlib import Path

    for p in Path('.').glob('**/*'):
        print(p)
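Both forms yield directories as well as files. If only files are wanted, each result can be checked with is_file(). A minimal sketch, where list_files_recursive is a hypothetical helper name and the paths are made relative to the starting directory for readability:

```python
from pathlib import Path


def list_files_recursive(directory):
    """Return sorted paths, relative to directory, of every file in the tree."""
    root = Path(directory)
    # rglob('*') yields directories too, so keep only entries that are files
    return sorted(p.relative_to(root) for p in root.rglob('*') if p.is_file())


for p in list_files_recursive('.'):
    print(p)
```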
Filename Pattern Matching with rglob
Just as with glob, rglob also allows glob-style pattern matching, but automatically does so recursively. In our example, to list all *.html files in the directory tree rooted at mydir, we can write in script.py:

    from pathlib import Path

    for p in Path('.').rglob('*.html'):
        print(p)
This should display all and only .html files, including those inside subdirectories:
Since rglob is the same as calling glob with **/, we could also just use glob to achieve the same result:

    from pathlib import Path

    for p in Path('.').glob('**/*.html'):
        print(p)
Not Using pathlib
Non-Recursive
os.listdir
On any version of Python 3, we can use the os module from the standard library to list directory contents. In script.py, we can write:

    import os

    for filename in os.listdir('.'):
        print(filename)
Unlike with pathlib, os.listdir simply returns filenames as strings, so we can’t call methods like .resolve() on the result items. To get full paths, we have to build them manually:

    import os

    root = '.'
    for filename in os.listdir(root):
        relative_path = os.path.join(root, filename)
        absolute_path = os.path.abspath(relative_path)
        print(absolute_path)
Another difference from pathlib is that os.listdir returns a list of strings, so we don’t need to call list() on the result to convert it to a list:

    import os

    files = os.listdir('.')  # files is a list
    print(files)
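Since os.listdir returns bare names, any check on an entry (such as whether it is a file) needs the name joined with the directory first. Otherwise the name is looked up relative to the current working directory. A minimal sketch, where list_files_only is a hypothetical helper name:

```python
import os


def list_files_only(root):
    """Return the sorted names of the regular files directly inside root."""
    return sorted(
        name
        for name in os.listdir(root)
        # Join with root so the check does not depend on the working directory
        if os.path.isfile(os.path.join(root, name))
    )


print(list_files_only('.'))
```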
glob
Also available on all versions of Python 3 is the built-in glob library, which provides Unix glob command-style filename pattern matching.
To list the items in a directory (similar to os.listdir, except that the * wildcard does not match names beginning with a dot), we can write in script.py:

    import glob

    for filename in glob.glob('./*'):
        print(filename)
This will produce output like:
Note that the root directory ('.' in our example) is simply included in the path pattern passed into glob.glob().
Filename Pattern Matching with glob
To list only .html files, we can write in script.py:

    import glob

    for filename in glob.glob('./*.html'):
        print(filename)
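When the root directory is held in a variable rather than written into the pattern, the pattern can be built with os.path.join. A minimal sketch, where html_files is a hypothetical helper name. It also passes the root through glob.escape so that any special characters in the directory name are matched literally:

```python
import glob
import os


def html_files(root):
    """Return the sorted paths of the .html files directly inside root."""
    # Build the pattern from the root directory, e.g. './*.html' when root is '.'
    pattern = os.path.join(glob.escape(root), '*.html')
    return sorted(glob.glob(pattern))


for filename in html_files('.'):
    print(filename)
```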
Recursive
glob.glob only supports recursive matching (through the recursive=True argument and the ** pattern) on Python 3.5 and up, and those versions also have pathlib.Path.rglob (available since Python 3.4), so we’ll skip recursive examples of glob.glob here.
os.walk
On any version of Python 3, we can use os.walk to list all the contents of a directory recursively.
os.walk() returns a generator object that can be used with a for loop. Each iteration yields a 3-tuple that represents a directory in the directory tree:
- current_dir: the path of the directory that the current iteration represents;
- subdirs: list of names (strings) of immediate subdirectories of current_dir; and
- files: list of names (strings) of files inside current_dir.
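Because subdirs and files hold bare names rather than paths, a common step is to join each name with current_dir. A minimal sketch, where walk_files is a hypothetical helper name that collects the path of every file whose name ends with a given suffix:

```python
import os


def walk_files(root, suffix=''):
    """Return sorted paths of all files under root whose names end with suffix."""
    matches = []
    for current_dir, subdirs, files in os.walk(root):
        for filename in files:
            if filename.endswith(suffix):
                # current_dir already includes root, so this is a usable path
                matches.append(os.path.join(current_dir, filename))
    return sorted(matches)


for path in walk_files('.', '.html'):
    print(path)
```

With the default empty suffix, every file in the tree is returned.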
In our example, we can write in script.py:

    import os

    for current_dir, subdirs, files in os.walk('.'):
        # Current iteration directory
        print(current_dir)

        # Directories
        for dirname in subdirs:
            print('\t' + dirname)

        # Files
        for filename in files:
            print('\t' + filename)
This produces the following output: