- Python 3: List the Contents of a Directory, Including Recursively
- Using pathlib (Python 3.4 and up)
- Non-Recursive
- iterdir
- glob
- Filename Pattern Matching with glob
- Recursive
- Filename Pattern Matching with rglob
- Not Using pathlib
- Non-Recursive
- os.listdir
- glob
- Filename Pattern Matching with glob
- Recursive
- os.walk
- Python : How to get list of files in directory and sub directories
- Creating a list of files in directory and sub directories using os.listdir()
- Creating a list of files in directory and sub directories using os.walk()
Python 3: List the Contents of a Directory, Including Recursively
This article shows how to list the files and directories inside a directory using Python 3. Throughout this article, we’ll refer to the following example directory structure:
We’ll assume each code example is saved in script.py above and run from inside the mydir directory, so that the relative path ‘.’ always refers to mydir.
Using pathlib (Python 3.4 and up)
Non-Recursive
iterdir
To list the contents of a directory using Python 3.4 or higher, we can use the built-in pathlib library’s iterdir() to iterate through the contents. In our example directory, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p)

When we run script.py from inside mydir, we should see output like:
Because iterdir is non-recursive, it only lists the immediate contents of mydir and not the contents of subdirectories (like a1.html).
Note that each item returned by iterdir is also a pathlib.Path, so we can call any pathlib.Path method on the object. For example, to resolve each item as an absolute path, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p.resolve())

This will list the resolved absolute path of each item instead of just the filenames.
Because iterdir returns a generator object (meant to be used in loops), if we want to store the results in a list variable, we can write:

    from pathlib import Path

    files = list(Path('.').iterdir())
    print(files)

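Since every item is a pathlib.Path, methods such as is_dir() can be used to separate files from subdirectories. The following is a minimal sketch of that idea; the helper name split_entries is hypothetical and not part of pathlib.

```python
from pathlib import Path


def split_entries(root):
    """Return (files, dirs): sorted names of the entries directly inside root."""
    files, dirs = [], []
    for p in Path(root).iterdir():
        # is_dir() follows symlinks, so a link to a directory counts as a directory
        if p.is_dir():
            dirs.append(p.name)
        else:
            files.append(p.name)
    return sorted(files), sorted(dirs)


if __name__ == "__main__":
    files, dirs = split_entries(".")
    print("files:", files)
    print("dirs:", dirs)
```

Sorting is included because iterdir yields entries in arbitrary order.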
glob
We can also use pathlib.Path.glob with the pattern '*' to list every item in a directory (the equivalent of iterdir):

    from pathlib import Path

    for p in Path('.').glob('*'):
        print(p)

Filename Pattern Matching with glob
If we want to filter our results using Unix glob command-style pattern matching, glob can handle that too. For example, if we only want to list .html files, we would write in script.py:

    from pathlib import Path

    for p in Path('.').glob('*.html'):
        print(p)

As with iterdir, glob returns a generator object, so we’ll have to use list() if we want to convert it to a list:

    from pathlib import Path

    files = list(Path('.').glob('*.html'))
    print(files)

Recursive
To recursively list the entire directory tree rooted at a particular directory (including the contents of subdirectories), we can use rglob. In script.py, we can write:

    from pathlib import Path

    for p in Path('.').rglob('*'):
        print(p)

This time, when we run script.py from inside mydir, we should see output like:
rglob is the equivalent of calling glob with **/ at the beginning of the path, so the following code is equivalent to the rglob code we just saw:

    from pathlib import Path

    for p in Path('.').glob('**/*'):
        print(p)

Filename Pattern Matching with rglob
Just as with glob, rglob also allows glob-style pattern matching, but automatically does so recursively. In our example, to list all *.html files in the directory tree rooted at mydir, we can write in script.py:

    from pathlib import Path

    for p in Path('.').rglob('*.html'):
        print(p)

This should display all and only .html files, including those inside subdirectories:
Since rglob is the same as calling glob with **/, we could also just use glob to achieve the same result:

    from pathlib import Path

    for p in Path('.').glob('**/*.html'):
        print(p)

Not Using pathlib
Non-Recursive
os.listdir
On any version of Python 3, we can use the built-in os library to list directory contents. In script.py, we can write:

    import os

    for filename in os.listdir('.'):
        print(filename)

Unlike with pathlib, os.listdir simply returns filenames as strings, so we can’t call methods like .resolve() on the result items. To get full paths, we have to build them manually:

    import os

    root = '.'
    for filename in os.listdir(root):
        relative_path = os.path.join(root, filename)
        absolute_path = os.path.abspath(relative_path)
        print(absolute_path)

Another difference from pathlib is that os.listdir returns a list of strings, so we don’t need to call list() on the result to convert it to a list:

    import os

    files = os.listdir('.')  # files is a list
    print(files)

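Because os.listdir returns bare names, any check such as os.path.isfile needs the name joined back onto the directory first; otherwise the name is resolved against the current working directory. A minimal sketch of listing only regular files follows (the helper name list_files_only is hypothetical):

```python
import os


def list_files_only(root):
    """Return sorted names of regular files directly inside root."""
    return sorted(
        name
        for name in os.listdir(root)
        # os.listdir returns bare names, so join with root before testing
        if os.path.isfile(os.path.join(root, name))
    )


if __name__ == "__main__":
    print(list_files_only("."))
```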
glob
Also available on all versions of Python 3 is the built-in glob library, which provides Unix glob command-style filename pattern matching.
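To list the items in a directory (similar to os.listdir, except that glob skips names beginning with a dot and returns each name with the directory part of the pattern prepended), we can write in script.py:

    import glob

    for filename in glob.glob('./*'):
        print(filename)
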
This will produce output like:
Note that the root directory (‘.’ in our example) is simply included in the path pattern passed into glob.glob().
Filename Pattern Matching with glob
To list only .html files, we can write in script.py:

    import glob

    for filename in glob.glob('./*.html'):
        print(filename)

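One difference worth keeping in mind is that the '*' pattern in glob.glob does not match names that begin with a dot, whereas os.listdir returns them. The sketch below compares the two for a given directory; the helper name visible_and_all is hypothetical, and glob.escape is used so that special characters in the directory path are not treated as part of the pattern.

```python
import glob
import os


def visible_and_all(root):
    """Return (names matched by glob '*', names from os.listdir), both sorted."""
    pattern = os.path.join(glob.escape(root), "*")
    matched = sorted(os.path.basename(path) for path in glob.glob(pattern))
    everything = sorted(os.listdir(root))
    return matched, everything


if __name__ == "__main__":
    matched, everything = visible_and_all(".")
    # Names present in os.listdir but missing from the glob result
    print(sorted(set(everything) - set(matched)))
```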
Recursive
glob.glob only gained recursive matching (the recursive=True argument together with ** in the pattern) in Python 3.5, and by then pathlib.Path.rglob was already available (since 3.4), so we’ll skip recursive examples of glob.glob here.
os.walk
On any version of Python 3, we can use os.walk to list all the contents of a directory recursively.
os.walk() returns a generator object that can be used with a for loop. Each iteration yields a 3-tuple that represents a directory in the directory tree:
- current_dir: the path of the directory that the current iteration represents;
- subdirs: list of names (strings) of immediate subdirectories of current_dir; and
- files: list of names (strings) of files inside current_dir.
In our example, we can write in script.py:

    import os

    for current_dir, subdirs, files in os.walk('.'):
        # Current iteration directory
        print(current_dir)
        # Directories
        for dirname in subdirs:
            print('\t' + dirname)
        # Files
        for filename in files:
            print('\t' + filename)

This produces the following output:
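os.walk has no built-in pattern matching, but the files list from each iteration can be filtered with the standard fnmatch module to get an effect similar to rglob. This is a sketch, not the only way to do it; the helper name walk_matching is hypothetical.

```python
import fnmatch
import os


def walk_matching(root, pattern):
    """Return sorted paths of files under root whose names match pattern."""
    matches = []
    for current_dir, subdirs, files in os.walk(root):
        # fnmatch.filter keeps only the names that match the glob-style pattern
        for filename in fnmatch.filter(files, pattern):
            matches.append(os.path.join(current_dir, filename))
    return sorted(matches)


if __name__ == "__main__":
    for path in walk_matching(".", "*.html"):
        print(path)
```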
Python : How to get list of files in directory and sub directories
In this article we will discuss different methods to generate a list of all files in a directory tree.
Creating a list of files in directory and sub directories using os.listdir()
Python’s os module provides a function to get the list of files and folders in a directory:

    os.listdir(path='.')

It returns a list of the names of all files and subdirectories in the given path.
To build a complete list of files in a given directory tree, we need to call it recursively for each subdirectory:

    import os


    def getListOfFiles(dirName):
        '''For the given path, get the list of all files in the directory tree.'''
        # Names of the files and subdirectories in the given directory
        listOfFile = os.listdir(dirName)
        allFiles = list()
        # Iterate over all the entries
        for entry in listOfFile:
            # Create full path
            fullPath = os.path.join(dirName, entry)
            # If entry is a directory then get the list of files in this directory
            if os.path.isdir(fullPath):
                allFiles = allFiles + getListOfFiles(fullPath)
            else:
                allFiles.append(fullPath)
        return allFiles

Call the function above to create a list of files in a directory tree:

    dirName = '/home/varun/Downloads'
    # Get the list of all files in directory tree at given path
    listOfFiles = getListOfFiles(dirName)

Creating a list of files in directory and sub directories using os.walk()
Python’s os module provides a function to iterate over a directory tree:

    os.walk(top, topdown=True, onerror=None, followlinks=False)

It iterates over the directory tree at the given path and, for each directory or subdirectory, yields a tuple containing the directory path, the list of its subdirectory names, and the list of its file names:

    (dirpath, dirnames, filenames)

Iterate over the directory tree and generate a list of all the files at the given path:

    # Get the list of all files in directory tree at given path
    listOfFiles = list()
    for (dirpath, dirnames, filenames) in os.walk(dirName):
        listOfFiles += [os.path.join(dirpath, file) for file in filenames]

The complete example is as follows:

    import os


    def getListOfFiles(dirName):
        '''For the given path, get the list of all files in the directory tree.'''
        # Names of the files and subdirectories in the given directory
        listOfFile = os.listdir(dirName)
        allFiles = list()
        # Iterate over all the entries
        for entry in listOfFile:
            # Create full path
            fullPath = os.path.join(dirName, entry)
            # If entry is a directory then get the list of files in this directory
            if os.path.isdir(fullPath):
                allFiles = allFiles + getListOfFiles(fullPath)
            else:
                allFiles.append(fullPath)
        return allFiles


    def main():
        dirName = '/home/varun/Downloads'

        # Get the list of all files in directory tree at given path
        listOfFiles = getListOfFiles(dirName)

        # Print the files
        for elem in listOfFiles:
            print(elem)

        print("****************")

        # Get the list of all files in directory tree at given path
        listOfFiles = list()
        for (dirpath, dirnames, filenames) in os.walk(dirName):
            listOfFiles += [os.path.join(dirpath, file) for file in filenames]

        # Print the files
        for elem in listOfFiles:
            print(elem)


    if __name__ == '__main__':
        main()

    /home/varun/Downloads/temp1.txt
    /home/varun/Downloads/sample/temp2.txt
    /home/varun/Downloads/test/message.txt
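As a sanity check, the two approaches can be compared on the same directory. The sketch below restates both in compact form under hypothetical names (list_with_listdir, list_with_walk) and compares their results as sets, since the order of entries may differ. One assumption to keep in mind: the results should agree for trees without symbolic links to directories, because os.path.isdir follows such links while os.walk does not descend into them by default.

```python
import os


def list_with_listdir(root):
    """Recursively collect file paths using os.listdir."""
    found = []
    for entry in os.listdir(root):
        full_path = os.path.join(root, entry)
        if os.path.isdir(full_path):
            # Recurse into the subdirectory and keep its files
            found.extend(list_with_listdir(full_path))
        else:
            found.append(full_path)
    return found


def list_with_walk(root):
    """Collect file paths using os.walk."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
    ]


if __name__ == "__main__":
    same = set(list_with_listdir(".")) == set(list_with_walk("."))
    print("same files:", same)
```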