How to Use Python's glob Module to Find Files by Pattern

Python's glob module finds files and directories whose paths match a shell-style pattern. It's particularly useful when you need to collect files with a given extension or naming scheme from a directory. This article walks through using glob to locate files by pattern.

Introduction to the glob Module

The glob module lists files and directories using Unix shell-style wildcards. This is useful for tasks such as file searches and batch processing. Its two main functions are glob.glob(), which returns a list of matching paths, and glob.iglob(), which returns an iterator over the same matches.

Basic Usage of glob

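To use the glob module, import it and call glob.glob() with a pattern; it returns a list of matching paths in arbitrary order. Patterns can include wildcards such as * (matches any number of characters, including none) and ? (matches exactly one character). Neither wildcard matches a path separator, and by default neither matches a leading dot, so hidden files such as .gitignore are skipped unless the pattern itself starts with a dot.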
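
As a rough illustration of how these wildcards behave, along with the [...] character set that glob also supports, the sketch below creates a few files in a temporary directory and matches them. The file names and the demo_wildcards helper are hypothetical and exist only for this example.

```python
import glob
import os
import tempfile


def demo_wildcards():
    """Create a few sample files and match them with ?, * and [...]."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names, used only for illustration
        for name in ("log1.txt", "log2.txt", "log10.txt", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()

        def match(pattern):
            # glob.escape() protects any special characters in the directory name
            paths = glob.glob(os.path.join(glob.escape(tmp), pattern))
            # Sort the bare file names because glob's result order is arbitrary
            return sorted(os.path.basename(p) for p in paths)

        return {
            "log?.txt": match("log?.txt"),          # ? matches exactly one character
            "log*.txt": match("log*.txt"),          # * matches any run of characters
            "log[0-9].txt": match("log[0-9].txt"),  # [...] matches one character from the set
        }


results = demo_wildcards()
for pattern, names in results.items():
    print(pattern, "->", names)
```

Here log?.txt and log[0-9].txt skip log10.txt because each allows only a single character after "log", while log*.txt picks up all three log files.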

Examples

Finding Files with a Specific Extension

For example, to find all files with the .txt extension in the current directory, you can use:

import glob

# Find all .txt files in the current directory
txt_files = glob.glob('*.txt')
print(txt_files)
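
Because glob.glob() returns matches in no guaranteed order, it often helps to sort the result when a stable listing matters. The following is a minimal sketch of that idea; the find_txt_files helper and its directory parameter are assumptions made for this example, not part of the glob module.

```python
import glob
import os


def find_txt_files(directory="."):
    """Return the .txt files directly inside `directory`, sorted by path."""
    # glob.escape() keeps special characters in the directory name literal
    pattern = os.path.join(glob.escape(directory), "*.txt")
    # Hidden files such as .notes.txt are not matched by a pattern starting with *
    return sorted(glob.glob(pattern))


print(find_txt_files())
```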

Finding Files in a Subdirectory

To find all files with a specific extension in a subdirectory, specify the subdirectory in the pattern:

import glob

# Find all .jpg files in the 'images' subdirectory
jpg_files = glob.glob('images/*.jpg')
print(jpg_files)
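
A pattern such as the one above matches directories as well as files, and the hard-coded forward slash assumes a particular path style. One possible way to handle both points is sketched below; find_files_in_subdir is a hypothetical helper written for this article, not a glob function.

```python
import glob
import os


def find_files_in_subdir(subdir, extension):
    """Return regular files in `subdir` whose names end with `extension`."""
    # os.path.join() inserts the correct separator for the current platform
    pattern = os.path.join(glob.escape(subdir), "*" + extension)
    # Keep only regular files; a directory named 'old.jpg' would be dropped
    return [path for path in glob.glob(pattern) if os.path.isfile(path)]


print(find_files_in_subdir("images", ".jpg"))
```

The returned paths keep the subdirectory prefix; os.path.basename() can strip it if only the file names are needed.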

Using Wildcards

Wildcards can help you match a broader range of files. For example, to find all text files starting with "report":

import glob

# Find all files starting with 'report' and ending with .txt
report_files = glob.glob('report*.txt')
print(report_files)

Finding Files with Multiple Extensions

glob patterns have no syntax for alternatives (shell-style brace expansion such as *.{txt,md} is not supported), so to find files with several extensions, run one search per extension and concatenate the results:

import glob

# Find all .txt and .md files
files = glob.glob('*.txt') + glob.glob('*.md')
print(files)
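
When the list of extensions grows, the same approach can be wrapped in a small loop. This is a minimal sketch; glob_extensions and its parameters are hypothetical names chosen for this example.

```python
import glob
import os


def glob_extensions(directory, extensions):
    """Return sorted paths in `directory` ending with any of `extensions`."""
    matches = []
    for ext in extensions:
        # One glob call per extension, e.g. '*.txt' and then '*.md'
        pattern = os.path.join(glob.escape(directory), "*" + ext)
        matches.extend(glob.glob(pattern))
    return sorted(matches)


print(glob_extensions(".", [".txt", ".md"]))
```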

Advanced Usage

Beyond single-directory patterns, the glob module can search an entire directory tree by using the ** pattern.

Recursive Search

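To search for files recursively in all subdirectories, use the ** pattern along with the recursive=True argument. With recursive=True, ** matches zero or more directory levels, so the pattern below also finds .py files in the current directory itself. Without recursive=True, ** behaves like a plain *.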

import glob

# Find all .py files in the current directory and subdirectories
py_files = glob.glob('**/*.py', recursive=True)
print(py_files)
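
On large directory trees, building the full list of matches can take a while and use a lot of memory. glob.iglob() accepts the same pattern and recursive argument but yields paths one at a time. The sketch below assumes a hypothetical iter_python_files helper; only glob.iglob(), glob.escape() and os.path.join() come from the standard library.

```python
import glob
import os


def iter_python_files(root="."):
    """Lazily yield every .py file in `root` and its subdirectories."""
    pattern = os.path.join(glob.escape(root), "**", "*.py")
    # iglob returns an iterator instead of building the whole list up front
    yield from glob.iglob(pattern, recursive=True)


if __name__ == "__main__":
    for path in iter_python_files():
        print(path)
```

Because the results arrive lazily, the loop can stop early (for example, after the first match) without scanning the rest of the tree.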

Conclusion

The glob module lets you find files in Python with short, shell-style patterns. Whether you're collecting files of a given type, searching a single subdirectory, or walking a whole directory tree with **, glob handles the task in a line or two of code.