- Handling file and directory Paths
- Linux and Windows Paths
- The current working directory
- Creating new folders
- Absolute vs. Relative paths
- Handling Absolute paths
- Handling Relative paths
- Path and File validity
- Checking if a file/directory exists
- Checking if a path is a file
- Checking if a path is a directory
- Getting a file’s size in bytes
- Listing directories
- Directory file sizes
- Copying files and folders
- Moving and Renaming
- Deleting files and folders
- Walking a Directory Tree
- Python Windows file path
- Referencing a File in Windows
- File Name Shortcuts and CWD (Current Working Directory)
- Finding and Changing CWD
Handling file and directory Paths
There are two main modules in Python that deal with path manipulation. One is the os.path module and the other is the pathlib module.
The `pathlib` module was added in Python 3.4, offering an object-oriented way to handle file system paths.
Linux and Windows Paths
On Windows, paths are written using backslashes ( \ ) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash ( / ) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.
Fortunately, Python provides easy ways to handle this. We will show both: os.path.join and pathlib.Path.joinpath.
Using os.path.join on Windows:
>>> import os
>>> os.path.join('usr', 'bin', 'spam')
# 'usr\\bin\\spam'
Using pathlib on *nix:
>>> from pathlib import Path
>>> print(Path('usr').joinpath('bin').joinpath('spam'))
# usr/bin/spam
pathlib also provides a shortcut to joinpath using the / operator:
>>> from pathlib import Path
>>> print(Path('usr') / 'bin' / 'spam')
# usr/bin/spam
Notice that the path separator differs between Windows and Unix-based operating systems. That is why you should use one of the methods above instead of concatenating strings to join paths.
Joining paths is helpful if you need to create different file paths under the same directory.
Using os.path.join on Windows:
>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> for filename in my_files:
...     print(os.path.join('C:\\Users\\asweigart', filename))
...
# C:\Users\asweigart\accounts.txt
# C:\Users\asweigart\details.csv
# C:\Users\asweigart\invite.docx
Using pathlib on *nix:
>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> home = Path.home()
>>> for filename in my_files:
...     print(home / filename)
...
# /home/asweigart/accounts.txt
# /home/asweigart/details.csv
# /home/asweigart/invite.docx
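To see both separators on a single machine, you can use pathlib's pure path classes, which only manipulate strings and never touch the file system. The following is a minimal sketch; the variable names (parts, hand_built, and so on) are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

parts = ['usr', 'bin', 'spam']

# Pure paths only manipulate strings, so both flavours work on any OS.
windows_path = PureWindowsPath(*parts)
posix_path = PurePosixPath(*parts)

print(windows_path)  # usr\bin\spam
print(posix_path)    # usr/bin/spam

# A hand-built string with a hard-coded separator matches only one flavour.
hand_built = '/'.join(parts)
print(hand_built == str(posix_path))    # True
print(hand_built == str(windows_path))  # False
```

This is the reason to let os.path.join or pathlib pick the separator for you.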
The current working directory
>>> import os
>>> os.getcwd()
# 'C:\\Python34'
>>> os.chdir('C:\\Windows\\System32')
>>> os.getcwd()
# 'C:\\Windows\\System32'

>>> from pathlib import Path
>>> from os import chdir
>>> print(Path.cwd())
# /home/asweigart
>>> chdir('/usr/lib/python3.6')
>>> print(Path.cwd())
# /usr/lib/python3.6
Creating new folders
>>> import os
>>> os.makedirs('C:\\delicious\\walnut\\waffles')

>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir()
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/usr/lib/python3.6/pathlib.py", line 1226, in mkdir
#     self._accessor.mkdir(self, mode)
#   File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
#     return strfunc(str(pathobj), *args)
# FileNotFoundError: [Errno 2] No such file or directory: '/home/asweigart/delicious/walnut/waffles'
Oh no, we got a nasty error! The reason is that the ‘delicious’ directory does not exist, so we cannot make the ‘walnut’ and the ‘waffles’ directories under it. To fix this, do:
>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir(parents=True)
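If you want to try this without cluttering your own folders, here is a small sketch that works inside a temporary directory. It also shows the exist_ok=True argument, which is not covered above: it lets you call mkdir again on a folder that already exists. The variable names are only illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'delicious' / 'walnut' / 'waffles'

    # Without parents=True the missing 'delicious' folder causes an error.
    try:
        target.mkdir()
        failed_without_parents = False
    except FileNotFoundError:
        failed_without_parents = True

    # parents=True creates every missing folder along the way.
    target.mkdir(parents=True)
    created = target.is_dir()

    # exist_ok=True turns a repeated call into a no-op
    # instead of raising FileExistsError.
    target.mkdir(parents=True, exist_ok=True)
    still_there = target.is_dir()

print(failed_without_parents, created, still_there)  # True True True
```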
Absolute vs. Relative paths
There are two ways to specify a file path.
- An absolute path, which always begins with the root folder
- A relative path, which is relative to the program’s current working directory
There are also the dot ( . ) and dot-dot ( .. ) folders. These are not real folders, but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
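As a quick illustration of how the dot and dot-dot names behave, the sketch below cleans up a made-up path. It uses posixpath (the Unix flavour of os.path) and PurePosixPath so the result is the same on every platform; the path itself is hypothetical.

```python
import posixpath
from pathlib import PurePosixPath

messy = '/home/asweigart/./docs/../pics/cat.jpg'

# normpath collapses '.' and '..' purely by string manipulation.
clean = posixpath.normpath(messy)
print(clean)  # /home/asweigart/pics/cat.jpg

# pathlib drops '.' segments but keeps '..', because '..' could
# mean something different if 'docs' were a symbolic link.
kept = PurePosixPath(messy)
print(kept)  # /home/asweigart/docs/../pics/cat.jpg
```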
Handling Absolute paths
To see if a path is an absolute path:
>>> import os
>>> os.path.isabs('/')
# True
>>> os.path.isabs('..')
# False

>>> from pathlib import Path
>>> Path('/').is_absolute()
# True
>>> Path('..').is_absolute()
# False
You can get an absolute path with both os.path and pathlib:
>>> import os
>>> os.getcwd()
# '/home/asweigart'
>>> os.path.abspath('..')
# '/home'

>>> from pathlib import Path
>>> print(Path.cwd())
# /home/asweigart
>>> print(Path('..').resolve())
# /home
Handling Relative paths
You can get a relative path from a starting path to another path.
>>> import os
>>> os.path.relpath('/etc/passwd', '/')
# 'etc/passwd'

>>> from pathlib import Path
>>> print(Path('/etc/passwd').relative_to('/'))
# etc/passwd
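The two approaches differ when the starting path is not an ancestor of the target. The sketch below uses posixpath and PurePosixPath so it behaves the same on any platform; it assumes the default behaviour of relative_to (no extra arguments).

```python
import posixpath
from pathlib import PurePosixPath

# relpath is allowed to climb with '..' to reach the target.
climbing = posixpath.relpath('/etc/passwd', '/home/asweigart')
print(climbing)  # ../../etc/passwd

# relative_to only works when the start is an ancestor of the path.
inside = PurePosixPath('/etc/passwd').relative_to('/etc')
print(inside)  # passwd

try:
    PurePosixPath('/etc/passwd').relative_to('/home/asweigart')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```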
Path and File validity
Checking if a file/directory exists
>>> import os
>>> os.path.exists('.')
# True
>>> os.path.exists('setup.py')
# True
>>> os.path.exists('/etc')
# True
>>> os.path.exists('nonexistentfile')
# False

>>> from pathlib import Path
>>> Path('.').exists()
# True
>>> Path('setup.py').exists()
# True
>>> Path('/etc').exists()
# True
>>> Path('nonexistentfile').exists()
# False
Checking if a path is a file
>>> import os
>>> os.path.isfile('setup.py')
# True
>>> os.path.isfile('/home')
# False
>>> os.path.isfile('nonexistentfile')
# False

>>> from pathlib import Path
>>> Path('setup.py').is_file()
# True
>>> Path('/home').is_file()
# False
>>> Path('nonexistentfile').is_file()
# False
Checking if a path is a directory
>>> import os
>>> os.path.isdir('/')
# True
>>> os.path.isdir('setup.py')
# False
>>> os.path.isdir('/spam')
# False

>>> from pathlib import Path
>>> Path('/').is_dir()
# True
>>> Path('setup.py').is_dir()
# False
>>> Path('/spam').is_dir()
# False
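The three checks can be compared side by side. This sketch builds a throwaway folder with one file in it, so it does not depend on setup.py or /etc existing on your machine; the checks dictionary is just an illustrative way to collect the results.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / 'setup.py'
    file_path.write_text('# placeholder\n')
    missing = folder / 'nonexistentfile'

    # Each tuple holds (exists, is_file, is_dir) for one path.
    checks = {
        'folder': (folder.exists(), folder.is_file(), folder.is_dir()),
        'file': (file_path.exists(), file_path.is_file(), file_path.is_dir()),
        'missing': (missing.exists(), missing.is_file(), missing.is_dir()),
    }

for name, (exists, is_file, is_dir) in checks.items():
    print(f'{name}: exists={exists} is_file={is_file} is_dir={is_dir}')
```

Note that a path that does not exist is neither a file nor a directory, so all three checks return False for it.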
Getting a file’s size in bytes
>>> import os
>>> os.path.getsize('C:\\Windows\\System32\\calc.exe')
# 776192

>>> from pathlib import Path
>>> stat = Path('/bin/python3.6').stat()
>>> print(stat) # stat contains some other information about the file as well
# os.stat_result(st_mode=33261, st_ino=141087, st_dev=2051, st_nlink=2, st_uid=0,
# --snip--
# st_gid=0, st_size=10024, st_atime=1517725562, st_mtime=1515119809, st_ctime=1517261276)
>>> print(stat.st_size) # size in bytes
# 10024
Listing directories
Listing directory contents using os.listdir on Windows:
>>> import os
>>> os.listdir('C:\\Windows\\System32')
# ['0409', '12520437.cpx', '12520850.cpx', '5U877.ax', 'aaclient.dll',
# --snip--
# 'xwtpdui.dll', 'xwtpw32.dll', 'zh-CN', 'zh-HK', 'zh-TW', 'zipfldr.dll']
Listing directory contents using pathlib on *nix:
>>> from pathlib import Path
>>> for f in Path('/usr/bin').iterdir():
...     print(f)
...
# --snip--
# /usr/bin/tiff2rgba
# /usr/bin/iconv
# /usr/bin/ldd
# /usr/bin/cache_restore
# /usr/bin/udiskie
# /usr/bin/unix2dos
# /usr/bin/t1reencode
# /usr/bin/epstopdf
# /usr/bin/idle3
# --snip--
Directory file sizes
Directories themselves also have a size! So, you might want to check whether a path is a file or a directory using the methods discussed in the section above.
Using os.path.getsize() and os.listdir() together on Windows:
>>> import os
>>> total_size = 0
>>> for filename in os.listdir('C:\\Windows\\System32'):
...     total_size = total_size + os.path.getsize(os.path.join('C:\\Windows\\System32', filename))
...
>>> print(total_size)
# 1117846456
Using pathlib on *nix:
>>> from pathlib import Path
>>> total_size = 0
>>> for sub_path in Path('/usr/bin').iterdir():
...     total_size += sub_path.stat().st_size
...
>>> print(total_size)
# 1903178911
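If you only want to count files, you can skip the directories with is_file(). Below is a minimal sketch; total_file_size is a hypothetical helper name, and it only looks at the top level of the folder (it does not descend into subfolders).

```python
import tempfile
from pathlib import Path


def total_file_size(directory):
    """Sum the sizes of the regular files directly inside `directory`."""
    return sum(p.stat().st_size for p in Path(directory).iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'a.bin').write_bytes(b'x' * 10)
    (root / 'b.bin').write_bytes(b'y' * 32)
    (root / 'subfolder').mkdir()  # skipped by the is_file() check
    size = total_file_size(root)

print(size)  # 42
```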
Copying files and folders
The shutil module provides functions for copying files, as well as entire folders.
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
# 'C:\\delicious\\spam.txt'
>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
# 'C:\\delicious\\eggs2.txt'
While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it:
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
# 'C:\\bacon_backup'
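Here is a platform-independent sketch of both calls, run inside a temporary folder. The file and folder names mirror the examples above but are otherwise arbitrary. One detail worth knowing: by default, shutil.copytree() expects the destination folder not to exist yet.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'bacon').mkdir()
    (root / 'bacon' / 'spam.txt').write_text('Hello')
    (root / 'delicious').mkdir()

    # Destination is an existing folder: the copy keeps the file's name.
    copied = Path(shutil.copy(root / 'bacon' / 'spam.txt', root / 'delicious'))

    # copytree creates the destination folder and copies everything into it.
    backup = Path(shutil.copytree(root / 'bacon', root / 'bacon_backup'))

    copied_name = copied.name
    backup_files = sorted(p.name for p in backup.iterdir())
    original_kept = (root / 'bacon' / 'spam.txt').exists()

print(copied_name, backup_files, original_kept)
```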
Moving and Renaming
>>> import shutil
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
# 'C:\\eggs\\bacon.txt'
The destination path can also specify a filename. In the following example, the source file is moved and renamed:
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
# 'C:\\eggs\\new_bacon.txt'
If there is no eggs folder, then move() will rename bacon.txt to a file named eggs:
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
# 'C:\\eggs'
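The difference between moving into a folder and renaming is easy to miss, so here is a small sketch of both cases in a temporary directory. The names toast.txt and ham are invented for the second case.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Case 1: 'eggs' is an existing folder, so the file is moved into it.
    (root / 'eggs').mkdir()
    (root / 'bacon.txt').write_text('crispy')
    moved_into = Path(shutil.move(str(root / 'bacon.txt'), str(root / 'eggs')))
    case1 = moved_into.relative_to(root).as_posix()

    # Case 2: 'ham' does not exist, so the file itself is renamed to 'ham'.
    (root / 'toast.txt').write_text('buttered')
    renamed = Path(shutil.move(str(root / 'toast.txt'), str(root / 'ham')))
    case2 = renamed.relative_to(root).as_posix()
    ham_is_file = renamed.is_file()

print(case1)        # eggs/bacon.txt
print(case2)        # ham
print(ham_is_file)  # True
```

If you always mean "move into this folder", checking that the destination is a directory first avoids the accidental rename.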
Deleting files and folders
- Calling os.unlink(path) or Path.unlink() will delete the file at path.
- Calling os.rmdir(path) or Path.rmdir() will delete the folder at path. This folder must be empty of any files or folders.
- Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be deleted.
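The three calls can be tried safely inside a temporary folder. This is only a sketch with made-up file names; keep in mind that all three delete permanently rather than moving anything to a recycle bin.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    folder = root / 'delicious'
    folder.mkdir()
    file_path = folder / 'spam.txt'
    file_path.write_text('spam')

    # rmdir refuses to delete a folder that still has contents.
    try:
        folder.rmdir()
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    file_path.unlink()  # delete the single file
    folder.rmdir()      # the folder is empty now, so this succeeds
    removed_empty = not folder.exists()

    # rmtree deletes a folder together with everything inside it.
    tree = root / 'bacon' / 'nested'
    tree.mkdir(parents=True)
    (tree / 'eggs.txt').write_text('eggs')
    shutil.rmtree(root / 'bacon')
    removed_tree = not (root / 'bacon').exists()

print(rmdir_failed, removed_empty, removed_tree)  # True True True
```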
Walking a Directory Tree
>>> import os
>>>
>>> for folder_name, subfolders, filenames in os.walk('C:\\delicious'):
...     print(f'The current folder is {folder_name}')
...     for subfolder in subfolders:
...         print(f'SUBFOLDER OF {folder_name}: {subfolder}')
...     for filename in filenames:
...         print(f'FILE INSIDE {folder_name}: {filename}')
...     print('')
...
# The current folder is C:\delicious
# SUBFOLDER OF C:\delicious: cats
# SUBFOLDER OF C:\delicious: walnut
# FILE INSIDE C:\delicious: spam.txt
# The current folder is C:\delicious\cats
# FILE INSIDE C:\delicious\cats: catnames.txt
# FILE INSIDE C:\delicious\cats: zophie.jpg
# The current folder is C:\delicious\walnut
# SUBFOLDER OF C:\delicious\walnut: waffles
# The current folder is C:\delicious\walnut\waffles
# FILE INSIDE C:\delicious\walnut\waffles: butter.txt
`pathlib` provides much more functionality than what is listed above, such as getting a file's name or extension and reading or writing a file without manually opening it. See the official documentation to learn more.
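A common use of os.walk() is collecting every file under a folder. The sketch below builds a small tree in a temporary directory (the folder layout is invented to resemble the one above) and gathers the file paths relative to the top folder. It also shows Path.rglob('*'), a pathlib alternative that visits the same tree.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / 'delicious'
    (root / 'cats').mkdir(parents=True)
    (root / 'walnut' / 'waffles').mkdir(parents=True)
    (root / 'spam.txt').write_text('spam')
    (root / 'cats' / 'catnames.txt').write_text('Zophie')
    (root / 'walnut' / 'waffles' / 'butter.txt').write_text('butter')

    # os.walk yields one (folder, subfolders, filenames) tuple per folder.
    found = []
    for folder_name, subfolders, filenames in os.walk(root):
        for filename in filenames:
            full_path = Path(folder_name) / filename
            found.append(full_path.relative_to(root).as_posix())
    found.sort()

    # rglob('*') yields every path below root; keep only the files.
    via_rglob = sorted(
        p.relative_to(root).as_posix() for p in root.rglob('*') if p.is_file()
    )

print(found)
# ['cats/catnames.txt', 'spam.txt', 'walnut/waffles/butter.txt']
print(found == via_rglob)  # True
```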
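As a taste of those extras, here is a short sketch using a made-up file name in a temporary folder. It shows the name, stem, and suffix attributes, with_suffix(), and the write_text()/read_text() shortcuts.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / 'report.final.txt'

    # write_text and read_text open and close the file for you.
    report.write_text('hello pathlib', encoding='utf-8')
    content = report.read_text(encoding='utf-8')

    name = report.name                        # 'report.final.txt'
    stem = report.stem                        # 'report.final'
    suffix = report.suffix                    # '.txt'
    renamed = report.with_suffix('.md').name  # 'report.final.md'

print(content, name, stem, suffix, renamed)
```

Note that with_suffix() only builds a new path object; it does not rename the file on disk.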
Python Windows file path
As seen in Tutorials #12 and #13, you can refer to a local file in Python using the file’s full path and file name. Below, you are opening up a file for reading:
>>> myfile = open('C:/Users/yourname/Desktop/alice.txt')    # Windows
>>> mytxt = myfile.read()
>>> myfile.close()

>>> myfile = open('/Users/yourname/Desktop/alice.txt')    # Mac and Linux
>>> mytxt = myfile.read()
>>> myfile.close()
In Windows, a full file directory path starts with a drive letter (C:, D:, etc.). In Linux and OS X, it starts with «/», which is called root. Directories are separated by a slash «/». You can look up a file’s full directory path and file name through its «Properties». See how it is done in this FAQ.
Referencing a File in Windows
- Python lets you use OS-X/Linux style slashes «/» even in Windows. Therefore, you can refer to the file as 'C:/Users/yourname/Desktop/alice.txt'. RECOMMENDED.
- If using backslash, because it is a special character in Python, you must remember to escape every instance: 'C:\\Users\\yourname\\Desktop\\alice.txt'
- Alternatively, you can prefix the entire file name string with the raw string marker «r»: r'C:\Users\yourname\Desktop\alice.txt'. That way, every backslash in the string is interpreted as a literal character, and you don’t have to escape each one.
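To check that the three spellings really name the same file, you can compare them with pathlib's PureWindowsPath, which understands Windows paths on any operating system. This is only a sketch; the last two lines use a made-up 'C:\temp' string to show what goes wrong when a backslash is left unescaped.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath('C:/Users/yourname/Desktop/alice.txt')
escaped = PureWindowsPath('C:\\Users\\yourname\\Desktop\\alice.txt')
raw = PureWindowsPath(r'C:\Users\yourname\Desktop\alice.txt')

print(forward == escaped == raw)  # True
print(raw)                        # C:\Users\yourname\Desktop\alice.txt

# Forgetting to escape is easy to miss: '\t' silently becomes a tab.
broken = 'C:\temp'
print(len(broken), len(r'C:\temp'))  # 6 7
```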
File Name Shortcuts and CWD (Current Working Directory)
So, using the full directory path and file name always works; you should be using this method. However, you might have seen files called by their name only, e.g., 'alice.txt' in Python. How is it done?
The concept of Current Working Directory (CWD) is crucial here. You can think of it as the folder your Python is operating inside at the moment. So far we have been using the absolute path, which begins from the topmost directory. But if your file reference does not start from the top (e.g., 'alice.txt', 'ling1330/alice.txt'), Python assumes that it starts in the CWD (a «relative path»).
- In a Python script:
When you execute your script from the folder that contains it, your CWD is the directory where your script is. Therefore, you can refer to a file in a script by its name only, provided that the file and the script are in the same directory. More generally, the CWD is the folder the script was launched from, which is not always the script's own folder. An example, saved as foo.py:
myfile = open('alice.txt')    # alice.txt is in the same dir as foo.py
mytxt = myfile.read()
myfile.close()
- In the Python shell:
The shell's CWD may not be the directory your file is in. To refer to the file by its name only, you can either:
- Change your CWD to the file’s directory, or
- Copy or move your file to your CWD. (Not recommended, since your shell’s CWD may change.)
Finding and Changing CWD
Python's os module provides utilities for displaying and modifying your current working directory. The example below shows how to find your CWD (os.getcwd()) and change it to a different directory (os.chdir()) on Windows:
>>> import os
>>> os.getcwd()
'D:\\Lab'
>>> os.chdir('scripts/gutenberg')    # relative path: scripts dir is under Lab
>>> os.getcwd()
'D:\\Lab\\scripts\\gutenberg'
>>> os.chdir(r'D:\Corpora\corpus_samples')    # absolute path, using \ and r prefix
>>> os.getcwd()
'D:\\Corpora\\corpus_samples'
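The sketch below ties the CWD to bare file names on any operating system. It creates alice.txt in a temporary folder (a stand-in for your own folder), changes into that folder, opens the file by name only, and then restores the original CWD. The try/finally wrapper is just one way to make sure the CWD is put back.

```python
import os
import tempfile
from pathlib import Path

original_cwd = os.getcwd()

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'alice.txt').write_text('Down the rabbit hole', encoding='utf-8')
    try:
        os.chdir(tmp)
        # A bare file name is looked up in the CWD.
        with open('alice.txt', encoding='utf-8') as myfile:
            mytxt = myfile.read()
    finally:
        # Restore the CWD so later code is unaffected (and so the
        # temporary folder can be removed on Windows).
        os.chdir(original_cwd)

restored = os.getcwd() == original_cwd
print(mytxt)     # Down the rabbit hole
print(restored)  # True
```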