We are always looking for contributions! You can find below some relevant information and
standards for databooks.
After cloning the repo, make sure to set up the environment.
We use Poetry for both managing environments and packaging. That means you need to install Poetry; from there, you can use it to create the environment.

```
pip install poetry==1.1.12
poetry install  # installs prod and dev dependencies
```

Remember that to use the environment you can prefix commands with `poetry run <COMMAND>` or
initialize a shell with `poetry shell`. For example, if you want to create the
coverage report you could run

```
poetry run pytest --cov=databooks tests/
```

or alternatively

```
poetry shell
pytest --cov=databooks tests/
```

We welcome new features, bugfixes and enhancements (whether to code or docs). There are a few standards we adhere to that are required for new features.
We use type hints! Not only that, they are enforced and checked (with Mypy). This is actually the reason for supporting Python 3.8+. There are a couple of reasons for using type hints, mainly:
- Better code coverage (avoid errors during runtime)
- Improve code understanding
- As `databooks` uses both Typer and Pydantic, type hints are not just developer hints: they are also used to cast notebook (JSON) values to the correct types, as well as user inputs in the CLI
If you are not familiar with type hints and Mypy, a good starting point is watching the Type-checked Python in the real world - PyCon 2018 talk.
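As a minimal sketch of what this buys us (the function below is hypothetical and not part of `databooks`), hints let Mypy catch type errors before runtime, and they remain introspectable for libraries like Typer and Pydantic:

```python
from typing import Dict, List, Optional, get_type_hints

def count_code_cells(cells: List[Dict[str, str]], cell_type: Optional[str] = "code") -> int:
    """Hypothetical helper: count notebook cells of a given type."""
    return sum(1 for cell in cells if cell.get("cell_type") == cell_type)

# Mypy checks the hints statically; libraries can also read them at runtime
hints = get_type_hints(count_code_cells)  # {'cells': ..., 'cell_type': ..., 'return': <class 'int'>}
```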
Regarding code quality, we maintain a consistent "style" and uphold the same standards across the codebase. For that, we use a couple of linting tools:
As for documentation, the databooks documentation "lives" both on the code itself and
supporting documentation (markdown) files.
Code docs include type hint annotations as well as function docstrings. For those, we use a reStructuredText-like format.
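For illustration, a docstring in that style might look like the following (the function itself is hypothetical, shown only to demonstrate the format):

```python
def trim_cells(cells: list, keep: int) -> list:
    """
    Keep only the first `keep` notebook cells (hypothetical example).

    :param cells: List of notebook cells
    :param keep: Number of cells to keep
    :return: Trimmed list of cells
    """
    return cells[:keep]
```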
Providing docstrings not only gives a clear way to document the code, but the docstrings are also
picked up by MkDocs.
MkDocs gives a simple way to write markdown files that get
rendered as HTML (under a certain theme) and served as documentation. We use MkDocs
with different extensions. We use mkdocstrings to
link function docstrings with the existing documentation.
You can check the generated documentation by running, from the project root:

```
mkdocs serve
```

We also use the mike MkDocs plugin to publish and keep different versions of the documentation.
We use cog to dynamically generate parts of the
documentation. That way, code changes trigger markdown changes as well.
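For instance, a markdown file could embed a cog block roughly like this (illustrative only; the actual generator code in the `databooks` docs may differ):

```markdown
<!-- [[[cog
import cog
cog.outl("`databooks --help` output would be generated here")
]]] -->
`databooks --help` output would be generated here
<!-- [[[end]]] -->
```

Running cog re-executes the Python between the markers and rewrites the text below them, so the rendered docs stay in sync with the code.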
Pre-commit is the tool that automates everything, eases the
workflow, and runs checks in CI/CD. We highly recommend installing pre-commit and the
hooks during development.
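A typical setup, assuming the repo ships a `.pre-commit-config.yaml`, would be:

```shell
pip install pre-commit
pre-commit install          # register the git hooks in this clone
pre-commit run --all-files  # run every hook once against the whole repo
```

After `pre-commit install`, the hooks run automatically on every `git commit`.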
We use unit tests to ensure that our package works as expected. We use
pytest for testing and
Pytest-cov for checking how much of
the code is covered in our tests.
The tests should mimic the package directory structure. The tests are also written to serve as an example of how to use the classes and methods and expected outputs.
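A test under `tests/` might then look like the following sketch (the tested function is hypothetical, shown only to illustrate the "tests as usage examples" style):

```python
# tests/test_utils.py (hypothetical path, mirroring a module in the package)

def concat_sources(sources: list) -> str:
    """Stand-in for a function that would live in the package itself."""
    return "".join(sources)

def test_concat_sources() -> None:
    """Tests double as usage examples: explicit inputs and expected outputs."""
    assert concat_sources(["a\n", "b"]) == "a\nb"
    assert concat_sources([]) == ""
```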
The coverage is also added to the documentation. For that, we use the
MkDocs Coverage Plugin, which needs an `htmlcov/` directory generated by pytest-cov. To create it, run from the project root:

```
pytest --cov-report html --cov=databooks tests/
```

Publishing to PyPI is done automatically via GitHub Actions. After publishing, a new tag and release are created. A new docs version is also published if all previous steps are successful.
databooks was created by Murilo Cunha and is
maintained by dataroots.
Special thanks to: