A Python tool to transparently encrypt sensitive files in Git repositories using git filters and hooks. Designed to help AI safety researchers and others prevent models from memorizing evaluation data during pre-training, while keeping workflows seamless for contributors.
AI safety researchers often publish evaluation code and data to GitHub. If these repositories are later crawled for model pre-training, the models could memorize solutions to the evaluations, making them less effective at measuring emergent capabilities. eval-crypt ensures sensitive files are encrypted in the repository, but transparently decrypted for local work.
pip install git+https://github.com/DanielPolatajko/eval-crypt.git
- Initialize eval-crypt in your repository:
  eval-crypt init
  This creates a secret key and sets up git filters and hooks.
- Add sensitive files to be encrypted:
  eval-crypt add secret.txt
  eval-crypt add "*.json" # Use quotes for patterns with wildcards
  This updates .gitattributes so git knows to filter these files.
- Use git as normal:
  - When you commit, sensitive files are encrypted in the repo.
  - When you checkout or pull, they are transparently decrypted in your working directory.
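As a rough illustration of what adding a file involves, the sketch below appends a pattern to .gitattributes with the attributes this README describes (filter=eval-crypt diff=eval-crypt merge=eval-crypt). The helper name add_pattern is hypothetical and this is not eval-crypt's actual implementation. It is a minimal sketch that assumes patterns contain no spaces.

```python
from pathlib import Path

# Attribute string taken from this README; the helper itself is hypothetical.
ATTRS = "filter=eval-crypt diff=eval-crypt merge=eval-crypt"


def add_pattern(gitattributes: Path, pattern: str) -> bool:
    """Append `pattern` with the eval-crypt attributes unless it is already listed.

    Returns True if the file was changed, False if the entry already existed.
    Assumption: `pattern` contains no whitespace (git would need it quoted).
    """
    line = f"{pattern} {ATTRS}"
    existing = gitattributes.read_text().splitlines() if gitattributes.exists() else []
    if line in existing:
        return False
    existing.append(line)
    gitattributes.write_text("\n".join(existing) + "\n")
    return True
```

Calling add_pattern(Path(".gitattributes"), "*.json") twice leaves a single entry, which mirrors the idea that adding the same file repeatedly should be harmless.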
- Git filters: eval-crypt registers a clean filter (encrypt on commit) and a smudge filter (decrypt on checkout).
- .gitattributes: Files you add are listed with filter=eval-crypt diff=eval-crypt merge=eval-crypt.
- Key management: A secret key is generated and stored locally (not committed).
- Hooks: Pre-commit and post-merge hooks ensure files are always in the right state.
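To make the clean/smudge idea concrete, here is a toy pair of functions that round-trip bytes the way a git filter pair must. This README does not say which cipher or file format eval-crypt uses, so everything here is an assumption chosen for illustration: the MAGIC header, the HMAC-based keystream, and the function signatures. The construction is unauthenticated and unvetted, so treat it as a sketch of the data flow only.

```python
import hashlib
import hmac

# Hypothetical header marking encrypted content; NOT eval-crypt's real format.
MAGIC = b"EVALCRYPT-SKETCH\0"
NONCE_LEN = 16


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce plus a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def clean(plaintext: bytes, key: bytes) -> bytes:
    """Working tree -> repository: encrypt, deterministically."""
    if plaintext.startswith(MAGIC):
        return plaintext  # already encrypted, pass through unchanged
    # Nonce derived from the content, so equal input gives equal output.
    nonce = hmac.new(key, b"nonce" + plaintext, hashlib.sha256).digest()[:NONCE_LEN]
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    return MAGIC + nonce + body


def smudge(blob: bytes, key: bytes) -> bytes:
    """Repository -> working tree: decrypt, or pass plaintext through."""
    if not blob.startswith(MAGIC):
        return blob
    nonce = blob[len(MAGIC):len(MAGIC) + NONCE_LEN]
    body = blob[len(MAGIC) + NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

The sketch makes clean deterministic on purpose: git re-runs the clean filter to decide whether a file has changed, so a filter that produced different output for the same input would make files appear modified after every checkout.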
eval-crypt init
# Add a file to be encrypted
echo "my secret" > secret.txt
eval-crypt add secret.txt
git add .gitattributes secret.txt
git commit -m "Add encrypted secret.txt"
# secret.txt is now encrypted in the repo, but plaintext locally
- File not decrypted?
  - Make sure .gitattributes lists the file with filter=eval-crypt.
  - Check that your .git/config has the filter registered (run eval-crypt init again if needed).
  - Ensure eval-crypt is in your PATH and the key file exists.
- Manual decryption:
  cat secret.txt | eval-crypt smudge
- Manual encryption:
  cat secret.txt | eval-crypt clean
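The troubleshooting checks above can be scripted. The sketch below uses standard git commands (git check-attr and git config --get) and assumes the filter driver is named eval-crypt, as the filter=eval-crypt attribute suggests. The function names are hypothetical. The key file check is left out because this README does not state where the key is stored.

```python
import shutil
import subprocess

FILTER_NAME = "eval-crypt"  # driver name inferred from filter=eval-crypt


def parse_check_attr(output: str) -> str:
    """Extract the value from `git check-attr` output of the form 'path: attr: value'."""
    return output.strip().rsplit(": ", 2)[-1]


def _git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(["git", *args], capture_output=True, text=True)
    return result.stdout


def diagnose(path: str) -> list[str]:
    """Return a list of human-readable problems found for `path` (empty if none)."""
    problems = []
    # 1. Does .gitattributes assign the filter to this file?
    if parse_check_attr(_git("check-attr", "filter", "--", path)) != FILTER_NAME:
        problems.append(f"{path} is not listed with filter={FILTER_NAME} in .gitattributes")
    # 2. Is the filter driver registered in the git config?
    for direction in ("clean", "smudge"):
        if not _git("config", "--get", f"filter.{FILTER_NAME}.{direction}").strip():
            problems.append(f"filter.{FILTER_NAME}.{direction} is not set; try eval-crypt init")
    # 3. Is the executable reachable?
    if shutil.which("eval-crypt") is None:
        problems.append("eval-crypt is not on PATH")
    return problems
```

Running diagnose("secret.txt") from the repository root returns an empty list when all three checks pass.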
- Anyone with access to the key and the repo can decrypt the files.
- The tool is designed for research and collaboration, not for high-security use cases.
- Unlike traditional encryption tools, it is acceptable to commit and share the secret key in this project. The security model assumes that LLMs will not be able to use the key during pretraining, even if it is available in the repository.
Contributions are welcome! Please open issues or pull requests.
This project was developed as part of the MARS programme by Daniel Polatajko, Qi Guo, and Matan Shtepel, with mentorship from Justin Olive.