How to Change Hugging Face Transformers Default Cache Directory

Last Updated : 30 Oct, 2024

When working with Hugging Face Transformers, models and tokenizers are downloaded and cached by default under ~/.cache/huggingface/hub (C:\Users\<username>\.cache\huggingface\hub on Windows). This can lead to issues if you run out of space on your main drive or if you want to keep your models organized in a different location. Fortunately, changing the cache directory is straightforward and can be done in several ways.

In this article, we’ll explore how to change the default cache directory for Hugging Face Transformers.

Why Change the Default Cache Directory?

Changing the default cache directory can be beneficial for several reasons:

  • Space Management: If your primary drive is running low on space, redirecting the cache to another drive can prevent interruptions during model downloads.
  • Organization: Keeping your models organized in a dedicated directory can help maintain a cleaner workspace, especially when working with multiple projects.
  • Collaboration: In collaborative environments, a shared cache directory lets team members reuse the same downloaded files instead of each keeping a separate copy.

Methods to Change the Default Cache Directory

1. Change Directory Using Python Code

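You can set the cache directory programmatically within your Python scripts by assigning the TRANSFORMERS_CACHE environment variable. The assignment must come before transformers is imported, because the library reads the variable at import time. This method is flexible, since each project can point to its own directory. Note that recent 4.x releases of Transformers emit a deprecation warning for TRANSFORMERS_CACHE and recommend HF_HOME instead; with HF_HOME, downloads are stored in a hub subdirectory of the path you set.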

Here’s how you can do it:

Python
import os

# Set your desired cache directory before importing transformers
os.environ['TRANSFORMERS_CACHE'] = '/path/to/your/cache/directory'

from transformers import AutoModel, AutoTokenizer

# Load model and tokenizer, which will now use the new cache directory
model = AutoModel.from_pretrained('bert-base-uncased')
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
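
If you would rather not touch the environment at all, from_pretrained also accepts a cache_dir argument that applies only to that call. The following is a minimal sketch of that approach, not the only way to do it; the helper name load_with_cache_dir is hypothetical, and the path is a placeholder you would replace with your own.

```python
from pathlib import Path


def load_with_cache_dir(model_name: str, cache_dir: str):
    """Load a model and tokenizer, storing downloads under cache_dir.

    Hypothetical helper for illustration; only cache_dir itself is a
    real from_pretrained parameter.
    """
    # Imported lazily so the helper can be defined without loading the library.
    from transformers import AutoModel, AutoTokenizer

    # Create the target directory up front so a bad path fails early.
    cache_path = Path(cache_dir).expanduser()
    cache_path.mkdir(parents=True, exist_ok=True)

    # cache_dir affects only these two calls; environment variables are untouched.
    model = AutoModel.from_pretrained(model_name, cache_dir=str(cache_path))
    tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=str(cache_path))
    return model, tokenizer
```

Calling load_with_cache_dir('bert-base-uncased', '/path/to/your/cache/directory') would download the files into that directory on first use. Because the setting is per call, every from_pretrained call that should use the custom location has to pass the argument.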

2. Change Directory Using Command Line

If you prefer to set the cache directory from the command line, you can set the environment variable before running your Python script. The setting lasts only for the current shell session, which makes this method useful for temporary changes.

For Linux/macOS:

export TRANSFORMERS_CACHE=/path/to/your/cache/directory
python your_script.py

For Windows (Command Prompt):

set TRANSFORMERS_CACHE=C:\path\to\your\cache\directory
python your_script.py

For Windows (PowerShell):

$env:TRANSFORMERS_CACHE="C:\path\to\your\cache\directory"
python your_script.py
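
To confirm that the variable actually reached the Python process, a small check like the one below can be run from the same shell. This is a sketch under stated assumptions: the function name describe_cache_setting and the keys of the returned dictionary are hypothetical, and the script only inspects the environment and the file system, without importing transformers.

```python
import os
import shutil
from pathlib import Path


def describe_cache_setting(var_name: str = "TRANSFORMERS_CACHE") -> dict:
    """Report what this Python process sees for the cache variable."""
    value = os.environ.get(var_name)
    if value is None:
        # The variable was not exported in the shell that launched Python.
        return {"variable": var_name, "is_set": False}

    path = Path(value).expanduser()
    info = {
        "variable": var_name,
        "is_set": True,
        "path": str(path),
        "exists": path.is_dir(),
    }
    if path.is_dir():
        # Free space on the drive that holds the cache directory, in GiB.
        info["free_gib"] = round(shutil.disk_usage(path).free / 2**30, 1)
    return info


print(describe_cache_setting())
```

If is_set comes back False, the variable was most likely set in a different shell session than the one running the script.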

3. Change Directory Using Configuration Files

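If you want to make a permanent change that applies to all your projects, you can add the environment variable to your shell configuration file. The change takes effect in new shell sessions, or in the current one after you reload the file (for example, with source ~/.bashrc).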

  • For Bash, add the following line to your ~/.bashrc or ~/.bash_profile:
export TRANSFORMERS_CACHE=/path/to/your/cache/directory
  • For Zsh, add the same line to your ~/.zshrc.
  • For Fish, add the following line to your ~/.config/fish/config.fish:
set -x TRANSFORMERS_CACHE /path/to/your/cache/directory

Conclusion

Changing the default cache directory for Hugging Face Transformers is a simple way to control where downloaded models are stored. Whether you set it in Python code, on the command line, or in a shell configuration file, the result is the same: models and tokenizers are cached in the location you choose, which keeps your workspace organized and avoids running out of space on your main drive.
