Python get folder contents

List of all files in a directory using Python

It can be very handy to know how to programmatically get a list of all files in a folder. For example, you might have a folder full of text files containing useful data that you want to collate into a dataset, or you might just want to find out whether a given file exists in a folder. In this tutorial, we will look at how to get a list of all the files in a folder using Python.

How to get a list of files in a directory?

There are a number of ways to get a list of all files in a directory using Python. You can use the os module’s os.listdir() or the glob module’s glob.glob() functions to list out the contents of a directory.

Let’s demonstrate the usage for each of these methods with the help of some examples. First, let’s look at the directory structure of the directory we want to use for this tutorial.

Directory structure of the current working directory called weather.

The “weather” directory contains one python script, one requirements text file, one README markdown file, and a directory named “data” which stores the data for the project.

1. Using os module

The os module in Python comes with a number of handy functions for file handling. To list out the contents of a directory, you can use the os.listdir() function. It returns a list of the names of all files and directories in a directory, in arbitrary order.

For example, let’s use it to get the list of contents in the current working directory which is the “weather” directory from the tree shown above.

import os
print(os.listdir())
['data', 'README.md', 'requirements.txt', 'train.py']

You can see we get all the files and directories in the current working directory. You can, however, pass a custom directory path to list out its contents instead. For example, let’s list out the contents of the “data” directory present inside the current working directory.

import os
print(os.listdir('./data'))
['chennai.txt', 'data sources', 'delhi.txt', 'kolkata.txt', 'mumbai.txt', 'test_set.csv', 'train_set.csv']

We get a list of all files and folders present in the “data” directory. In this example, we passed a relative path but you can also pass an absolute path and get its contents as well.
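
As a minimal sketch of the absolute-path variant, the snippet below resolves a path with os.path.abspath() and lists its contents. The helper name list_dir_abs and the temporary demo folder are assumptions made for illustration; they are not part of the weather project used in this tutorial.

```python
import os
import tempfile


def list_dir_abs(path):
    """Return the absolute form of `path` and its sorted contents."""
    abs_path = os.path.abspath(path)
    # os.listdir() makes no promise about order, so sort for stable output
    return abs_path, sorted(os.listdir(abs_path))


# Demo on a throwaway folder (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    for name in ("delhi.txt", "mumbai.txt"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "data sources"))
    abs_path, names = list_dir_abs(tmp)
    print(abs_path)
    print(names)  # ['data sources', 'delhi.txt', 'mumbai.txt']
```

Sorting the result is a design choice here: it keeps the output reproducible across operating systems and file systems.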

If you only want to get a list of files and not the directories, you can use the os.path.isfile() function which checks whether a given path is a file or not. For example, let’s list out only the files (and not directories) inside the “data” directory.

import os
from os.path import isfile, join
# set the base path
base_path = './data'
# keep only the entries that are files
file_ls = [f for f in os.listdir(base_path) if isfile(join(base_path, f))]
print(file_ls)
['chennai.txt', 'delhi.txt', 'kolkata.txt', 'mumbai.txt', 'test_set.csv', 'train_set.csv']

You can see that we only get the files and not the directories present inside the “data” folder.

For more on the os module in python, refer to its documentation.

2. Using glob module

You can also use the glob module to get a list of files in a directory. Let’s use it to list out the files in our current directory.

import glob
print(glob.glob("*"))
['data', 'README.md', 'requirements.txt', 'train.py']

You can see that we get all the files and directories in the current working directory. Note that we passed “*” as the pattern to the glob.glob() function, which matches every file and folder in the given directory except hidden ones (names that start with a dot).

You can also specify the types of files you want to get from a path. For example, to only get text files from the “data” folder in our current working directory –

import glob
print(glob.glob("data/*.txt"))
['data\\chennai.txt', 'data\\delhi.txt', 'data\\kolkata.txt', 'data\\mumbai.txt']

We get a list of only the text files present in the “data” folder. Note that the above result is obtained on a Windows machine hence the “\\” in the path.
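
If the separator matters, for instance when the same output is compared across operating systems, one option is to normalise the results with pathlib. The sketch below is only one way to do that; the helper name glob_posix and the demo folder are hypothetical.

```python
import glob
import os
import tempfile
from pathlib import Path


def glob_posix(pattern):
    """Run glob.glob() and return the matches written with forward slashes."""
    return sorted(Path(p).as_posix() for p in glob.glob(pattern))


# Demo on a throwaway "data" folder (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    data_dir = os.path.join(tmp, "data")
    os.mkdir(data_dir)
    for name in ("chennai.txt", "delhi.txt", "train_set.csv"):
        open(os.path.join(data_dir, name), "w").close()
    matches = glob_posix(os.path.join(data_dir, "*.txt"))
    # Keep only the last two path components for display
    short = ["/".join(m.split("/")[-2:]) for m in matches]
    print(short)  # ['data/chennai.txt', 'data/delhi.txt']
```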

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel.

Author

Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

Data Science Parichay is an educational website offering easy-to-understand tutorials on topics in Data Science with the help of clear and fun examples.

Python List Files in a Directory

In this article, we will see how to list all files of a directory in Python. There are multiple ways to do this. We will use the following four functions, and then look at the pathlib module.

  • os.listdir('dir_path') : Returns the list of files and directories present in the specified directory path.
  • os.walk('dir_path') : Recursively gets the list of all files in a directory and its subdirectories.
  • os.scandir('path') : Returns directory entries along with file attribute information.
  • glob.glob('pattern') : Uses the glob module to list files and folders whose names follow a specific pattern.

How to List All Files of a Directory

Getting a list of the files in a directory is easy as pie! Use the listdir() and isfile() functions of the os module to list all files of a directory. Here are the steps.

  1. Import the os module: This module provides functions for interacting with the operating system, including operating system-dependent functionality.
  2. Use the os.listdir() function: The os.listdir('path') function returns a list containing the names of the files and directories present in the directory given by the path.
  3. Iterate over the result: Use a for loop to iterate over the names returned by the listdir() function.
  4. Use the isfile() function: In each loop iteration, use the os.path.isfile('path') function to check whether the current path is a file or a directory. This function returns True if the given path is a file and False otherwise. If it is a file, add it to a list.

Example to List Files of a Directory

Let’s see how to list the files of an ‘account’ folder. The listdir() function returns only the entries directly inside the given directory; it does not descend into subdirectories.

Example 1: List only files from a directory

import os
# folder path (a raw string, so single backslashes are enough)
dir_path = r'E:\account'
# list to store files
res = []
# iterate directory
for path in os.listdir(dir_path):
    # check if current path is a file
    if os.path.isfile(os.path.join(dir_path, path)):
        res.append(path)
print(res)

Here we got three file names.

['profit.txt', 'sales.txt', 'sample.txt']

If you are familiar with generators, you can make the code smaller and simpler by using a generator function, as shown below.

Generator function:

import os
def get_files(path):
    for file in os.listdir(path):
        # yield only the entries that are files
        if os.path.isfile(os.path.join(path, file)):
            yield file

Then simply call it whenever required.

for file in get_files(r'E:\account'):
    print(file)

Example 2: List both files and directories.

Call the listdir('path') function directly to get the contents of a directory.

import os
# folder path
dir_path = r'E:\account'
# list files and directories
res = os.listdir(dir_path)
print(res)

As you can see in the output, ‘reports_2021’ is a directory.

['profit.txt', 'reports_2021', 'sales.txt', 'sample.txt']

os.walk() to list all files in directory and subdirectories

The os.walk() function returns a generator that creates a tuple of values (current_path, directories in current_path, files in current_path).

Note: Using the os.walk() function we can list all directories, subdirectories, and files in a given directory.

It walks the directory tree recursively: each step of the generator moves on to the next directory, until no further subdirectories remain under the initial directory.

For example, os.walk('path') yields one tuple for each directory it visits. Besides the current path, the tuple holds two lists: the first contains the subdirectory names, and the second contains the file names.

Let’s see the example to list all files in directory and subdirectories.

from os import walk
# folder path
dir_path = r'E:\account'
# list to store file names
res = []
# use a separate loop variable so that dir_path is not overwritten
for (current_path, dir_names, file_names) in walk(dir_path):
    res.extend(file_names)
print(res)
['profit.txt', 'sales.txt', 'sample.txt', 'december_2021.txt']
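
The output above contains bare file names, so two files with the same name in different folders would be indistinguishable. A possible variation, sketched here on the assumption that full paths are wanted, joins each name with the directory that os.walk() is currently visiting. The function name walk_files and the demo tree are hypothetical.

```python
import os
import tempfile


def walk_files(top):
    """Yield the full path of every file under `top`, including subfolders."""
    for current_dir, dir_names, file_names in os.walk(top):
        for name in file_names:
            yield os.path.join(current_dir, name)


# Demo on a throwaway tree that mimics the 'account' folder
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "reports_2021"))
    nested = os.path.join("reports_2021", "december_2021.txt")
    for rel in ("profit.txt", nested):
        open(os.path.join(tmp, rel), "w").close()
    # Show paths relative to the demo root so the output is easy to read
    found = sorted(os.path.relpath(p, tmp) for p in walk_files(tmp))
    print(found)
```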

Note: Add a break inside the loop to stop after the first directory, so that subdirectories are not searched.

from os import walk
# folder path
dir_path = r'E:\account'
res = []
for (current_path, dir_names, file_names) in walk(dir_path):
    res.extend(file_names)
    # don't look inside any subdirectory
    break
print(res)
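
An alternative to the loop-and-break pattern, offered here as a sketch rather than as the article's method, is to take only the first tuple from the generator with next(). The helper name top_level_files is hypothetical.

```python
import os
import tempfile


def top_level_files(top):
    """Return the file names directly inside `top`, without descending."""
    # The first tuple yielded by os.walk() describes `top` itself.
    # The default value covers the case where `top` cannot be read,
    # because os.walk() then yields nothing at all.
    _, _, file_names = next(os.walk(top), (top, [], []))
    return sorted(file_names)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "reports_2021"))
    open(os.path.join(tmp, "sales.txt"), "w").close()
    open(os.path.join(tmp, "reports_2021", "december_2021.txt"), "w").close()
    top_files = top_level_files(tmp)
    print(top_files)  # ['sales.txt']

# The temporary folder no longer exists here, so the default is used
missing = top_level_files(tmp)
print(missing)  # []
```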

os.scandir() to get files of a directory

The scandir() function returns directory entries along with file attribute information, giving better performance for many common use cases.

It returns an iterator of os.DirEntry objects. Each entry exposes the file name through its name attribute and offers methods such as is_file() and is_dir().

import os
# get all files inside a specific folder
dir_path = r'E:\account'
for entry in os.scandir(dir_path):
    if entry.is_file():
        print(entry.name)
profit.txt
sales.txt
sample.txt
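
To illustrate the "file attribute information" mentioned above, here is a small sketch that reads each file's size from its os.DirEntry via stat(). The function name file_sizes and the contents of the demo folder are hypothetical.

```python
import os
import tempfile


def file_sizes(path):
    """Map each file name directly inside `path` to its size in bytes."""
    sizes = {}
    # The context manager closes the scandir iterator when the block ends
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sales.txt"), "w") as fh:
        fh.write("12345")  # five ASCII characters, so five bytes
    os.mkdir(os.path.join(tmp, "reports_2021"))
    sizes = file_sizes(tmp)
    print(sizes)  # {'sales.txt': 5}
```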

Glob Module to list Files of a Directory

The Python glob module, part of the Python Standard Library, is used to find the files and folders whose names follow a specific pattern.

For example, to get all files of a directory, we will use the dir_path/*.* pattern. Here, *.* matches any name that contains a dot, which in practice means files with an extension.

Let’s see how to list files from a directory using a glob module.

import glob
# search all files inside a specific folder
# *.* means file name with any extension
dir_path = r'E:\account\*.*'
res = glob.glob(dir_path)
print(res)
['E:\\account\\profit.txt', 'E:\\account\\sales.txt', 'E:\\account\\sample.txt']

Note: If you want to list files from subdirectories as well, use ** in the pattern and set the recursive argument to True.

import glob
# search all files inside a specific folder and its subfolders
# *.* means file name with any extension
# ** matches any number of nested directories when recursive=True
dir_path = r'E:\account\**\*.*'
for file in glob.glob(dir_path, recursive=True):
    print(file)
E:\account\profit.txt
E:\account\sales.txt
E:\account\sample.txt
E:\account\reports_2021\december_2021.txt
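
Because *.* only matches names that contain a dot, it skips files without an extension and can also match a directory whose name contains a dot. One possible workaround, sketched below as an assumption rather than as the article's approach, is to match everything with ** and filter with os.path.isfile(). The function name all_files and the demo tree are hypothetical.

```python
import glob
import os
import tempfile


def all_files(top):
    """Return every non-hidden file under `top`, at any depth."""
    pattern = os.path.join(top, "**", "*")
    # recursive=True lets ** span zero or more directory levels;
    # isfile() drops the directories that the pattern also matches
    return sorted(p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p))


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "reports_2021"))
    nested = os.path.join("reports_2021", "december_2021.txt")
    for rel in ("profit.txt", "README", nested):
        open(os.path.join(tmp, rel), "w").close()
    found = [os.path.relpath(p, tmp) for p in all_files(tmp)]
    print(found)
```

Note that glob skips names starting with a dot by default, so hidden files are still left out.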

Pathlib Module to list files of a directory

From Python 3.4 onwards, we can use the pathlib module, which provides an object-oriented interface to filesystem paths and wraps many of the os functions.

  • Import the pathlib module: The pathlib module offers classes and methods to handle filesystem paths and get data related to files on different operating systems.
  • Next, use pathlib.Path('path') to construct the directory path.
  • Next, use iterdir() to iterate over all entries of the directory.
  • Finally, check whether the current entry is a file using its is_file() method.

import pathlib
# folder path
dir_path = r'E:\account'
# to store file paths
res = []
# construct path object
d = pathlib.Path(dir_path)
# iterate directory
for entry in d.iterdir():
    # check if it is a file
    if entry.is_file():
        res.append(entry)
print(res)
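
pathlib can also match patterns through Path.glob() and Path.rglob(), so it can cover the glob use case as well. The sketch below assumes that only .txt files are wanted; the function name text_files and the demo tree are hypothetical.

```python
import tempfile
from pathlib import Path


def text_files(top, recursive=False):
    """Return the names of .txt files in `top`, optionally including subfolders."""
    root = Path(top)
    # rglob() searches subfolders too, glob() stays in `top`
    matches = root.rglob("*.txt") if recursive else root.glob("*.txt")
    return sorted(p.name for p in matches if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "reports_2021").mkdir()
    (root / "sales.txt").touch()
    (root / "notes.md").touch()
    (root / "reports_2021" / "december_2021.txt").touch()
    flat = text_files(tmp)
    deep = text_files(tmp, recursive=True)
    print(flat)  # ['sales.txt']
    print(deep)  # ['december_2021.txt', 'sales.txt']
```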

About Vishal

I’m Vishal Hule, Founder of PYnative.com. I am a Python developer, and I love to write articles to help students, developers, and learners.