As a data scientist, I often get to look at other people's code. Not just teammates' code, but also code from other teams within my company, as well as code from packages that I find interesting.
One of the things I notice is how much of a difference in quality it makes when teams enforce a couple of simple checks at the gate, that is, before they commit any code.
Today we'll talk about pre-commit, or as I like to call it: the gatekeeper of good code quality.
What is pre-commit?
Pre-commit is a package manager for pre-commit hooks. This means that it can run commands (hooks) before you commit something.
We use it to run Black (for code formatting), Pylint (for code smells and enforcing certain standards), Mypy (for static type checking), isort (for sorting imports) as well as our unit tests before committing, but there are other use cases as well.
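To make the idea of a hook a bit more concrete: Git runs an executable called pre-commit from the .git/hooks folder before each commit, and aborts the commit when that executable exits with a non-zero status. The sketch below mimics that behaviour in plain Python. It is only an illustration of the principle, not how the pre-commit package is implemented, and run_checks as well as the two stand-in commands are names I made up for this example.

```python
import subprocess
import sys


def run_checks(commands):
    """Run every command in order; return 0 only if all of them succeed."""
    exit_code = 0
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"check failed: {' '.join(command)}", file=sys.stderr)
            exit_code = 1
    return exit_code


# Stand-ins for real checks such as ["black", "--check", "src/"].
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(3)"]
```

A hook script built on this would end with sys.exit(run_checks([...])), so that a single failing check is enough to block the commit.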
Why is it useful?
Find out about issues with your code before you commit
Problems with your code should be dealt with before a change gets pushed to master. It is better to handle them right away than to postpone them until someone runs into them during a code review.
Save time waiting on failed pipelines
If you also run these checks in CI, making sure they are green before you commit saves you from waiting on pipelines that fail on code quality checks.
Cleaner repo makes for happier readers
Your repo will be cleaner and easier to read for anyone who needs to read your code and get up to speed.
Code reviewers will thank you for it
Code reviewers can now focus on the meat of the code change, not on your code style. Also, by making a hook for your tests, you can be sure that you won't forget to make all tests pass before you send your work off for code review. This will save you some 'oops I forgot' moments.
No more manual checks
We have on occasion run these checks manually (or with Makefiles), but this gets really tedious. You may forget to run one or two checks, or you may not have the same version of the check as you have in the CI, causing mixups.
How to set it up?
Fortunately, setting it up is easy.
Simply install pre-commit using pip, poetry or brew. E.g.:
pip install pre-commit
Make sure it's installed by running:
pre-commit --version
This should print the version of pre-commit that you just installed.
Configure your hooks in the .pre-commit-config.yaml. Pro tip: running the following will give you a very basic starter template (you'll have to paste it into the .pre-commit-config.yaml file yourself though):
pre-commit sample-config
This is an example of how we've set up our pre-commit hooks in my team:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: check-merge-conflict
      - id: no-commit-to-branch
        name: Don't commit to master
      - id: trailing-whitespace
      - id: mixed-line-ending
      - id: end-of-file-fixer
      - id: debug-statements
      - id: check-ast
      - id: check-toml
      - id: check-yaml
        args: [--allow-multiple-documents]
  - repo: local
    hooks:
      - id: black
        name: Black
        entry: black
        language: system
        require_serial: true
        types: [python]
      - id: mypy
        name: mypy
        entry: mypy src/
        language: system
        types: [python]
        pass_filenames: false
      - id: isort
        name: isort
        entry: isort --profile black src/
        language: system
        types: [python]
      - id: pylint
        name: pylint
        entry: pylint src/
        language: system
        types: [python]
The repo argument specifies where a hook comes from: either the URL of a repository to fetch it from, or local to use a package that is installed in your own environment. If the package is local, make sure to add it to your (dev) dependencies!
With the entry parameter, you specify how to execute the hook. This often involves the actual command to run (e.g. pylint), and in addition you may often want to specify in which folders you want the hooks to run, for instance only in the src/ directory.
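It helps to know what the command in entry actually receives. Unless you set pass_filenames: false, pre-commit appends the staged files that match the hook's types to the command, and a non-zero exit status marks the hook as failed. As a rough sketch of what a home-grown local hook could look like under those assumptions: the marker string, the function names and the script name check_marker.py below are all hypothetical, they are not part of our setup.

```python
import sys
from pathlib import Path

MARKER = "DO NOT COMMIT"  # hypothetical marker, chosen for this example


def find_marker(paths):
    """Return (path, line_number) for every line that contains MARKER."""
    hits = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if MARKER in line:
                hits.append((str(path), number))
    return hits


def main(argv):
    """argv holds the file names that pre-commit appends to the entry command."""
    hits = find_marker(argv)
    for path, number in hits:
        print(f"{path}:{number}: found '{MARKER}'", file=sys.stderr)
    return 1 if hits else 0


# When saved as a script, finish with: sys.exit(main(sys.argv[1:]))
```

Saved as check_marker.py, this could be wired up as a local hook with entry: python check_marker.py and language: system, in the same way as the Black hook in our config.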
Run the following to install the git hook scripts. This is also the command you want to run if you are cloning someone else's repo to enable the pre-commit hooks that are configured there.
pre-commit install
When you've installed new hooks, it is good practice to run the hooks against your entire codebase, instead of just the changed files.
You can do so with:
pre-commit run --all-files
On the rare occasion that you need to quickly commit something and can't spend the few minutes fixing your code after running pre-commit, there is a solution for that too:
For skipping individual hooks, use the following:
SKIP=black git commit -m "foo"
To skip all hooks at once, you can run git commit with the --no-verify flag:
git commit -m 'foo' --no-verify
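The names in SKIP refer to the id values in your config, and you can list several of them separated by commas. To show the effect, here is a small sketch of that filtering in Python. It is an illustration under my own assumptions, not pre-commit's actual code, and hooks_to_run is a made-up helper.

```python
def hooks_to_run(configured_ids, environ):
    """Drop every hook id that is listed in the comma-separated SKIP variable."""
    raw = environ.get("SKIP", "")
    skipped = {name.strip() for name in raw.split(",") if name.strip()}
    return [hook_id for hook_id in configured_ids if hook_id not in skipped]


configured = ["black", "mypy", "isort", "pylint"]
print(hooks_to_run(configured, {"SKIP": "black,pylint"}))  # ['mypy', 'isort']
print(hooks_to_run(configured, {}))  # ['black', 'mypy', 'isort', 'pylint']
```

The --no-verify flag is the blunt version of this: Git does not call the pre-commit hook at all, so none of the checks run.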
That's it! Now you can test it by committing some code and verifying whether it runs the pre-commit hooks.
You should see each hook listed in the output with a status such as Passed, Failed or Skipped.