Feature/model autodiscovery #173
Changes from all commits
947c803
2adf862
04c6e84
1dc3990
67dd962
c77dc4c
d6d4ac5
98caf2f
f0bc46c
96ee86d
a66f1fe
94970de
d04f626
131cf28
9c7e873
b2fd840
cc1ec43
af8e7fb
4cefa07
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -193,47 +193,73 @@ def cleanup(self) -> None: | |
| """Placeholder function for freshwater model cleanup.""" | ||
| ``` | ||
|
|
||
| ## Including a configuration schema | ||
| ## Writing a schema for the module configuration | ||
|
|
||
| A detailed description of the configuration system can be found | ||
| [here](../virtual_rainforest/core/config.md). The key thing to note is that a `JSON` | ||
| schema file should be saved within your model folder. This file should have a name of | ||
| the format "{MODEL_NAME}_schema.json". In order for this schema to be generally | ||
| accessible, it needs to be registered in the model's `__init__.py` (i.e. the | ||
| `__init__.py` in the model folder). This means that when the model is imported, its | ||
| schema is automatically added to the schema registry. | ||
| The root module directory **must** also contain a [JSONSchema](https://json-schema.org/) | ||
| document that defines the configuration options for the model. A detailed description | ||
| of how the configuration system works can be found | ||
| [here](../virtual_rainforest/core/config.md) but the schema definition is used to | ||
| validate configuration files for a Virtual Rainforest simulation that uses your model. | ||
|
|
||
| ```python | ||
| from virtual_rainforest.core.config import register_schema | ||
| So the example model used here would need to provide the file: | ||
| `virtual_rainforest/models/freshwater/freshwater_schema.json` | ||
|
|
||
| @register_schema("freshwater") | ||
| def schema() -> dict: | ||
| """Defines the schema that the freshwater module configuration should conform to.""" | ||
| Writing JSONSchema documents can be very tedious. The following tools may be of use: | ||
|
|
||
| schema_file = Path(__file__).parent.resolve() / "freshwater_schema.json" | ||
| * [https://www.jsonschema.net/app](https://www.jsonschema.net/app): this is a web | ||
| application that takes a data document - which is what the configuration file is - and | ||
| automatically generates a JSON schema to validate it. You will need to then edit it | ||
| but you'll be starting with a valid schema! | ||
| * [https://jsonschemalint.com/](https://jsonschemalint.com/) works the other way. It | ||
| takes a data document and a schema and checks whether the data is compliant. This can | ||
| be useful for checking errors. | ||
|
|
||
| with schema_file.open() as f: | ||
| config_schema = json.load(f) | ||
| Both of those tools take data documents formatted as JSON as inputs, where we use TOML | ||
| configuration files, but there are lots of web tools to convert TOML to JSON and back. | ||
|
|
||
| return config_schema | ||
| ``` | ||
| ## Setting up the model `__init__.py` file | ||
|
|
||
| All model directories need to include an `__init__.py` file. The simple presence of the | ||
| `__init__.py` file tells Python that the directory content should be treated as a module, | ||
| but then the file needs to contain code to do two things: | ||
|
|
||
| 1. The `__init__.py` file needs to register the JSONSchema file for the module. The | ||
| {meth}`~virtual_rainforest.core.config.register_schema` function takes a module name | ||
| and the path to the schema file and then, after checking the file can be loaded and is | ||
| valid, adds the schema to the schema registry | ||
| {data}`~virtual_rainforest.core.config.SCHEMA_REGISTRY`. | ||
|
|
||
| ## Ensuring that schema and models are always added to the registry | ||
| 1. It also needs to import the main BaseModel subclass. So for example, it should import | ||
| `FreshWaterModel` from the `virtual_rainforest.models.freshwater.freshwater_model` | ||
| module. This gives a shorter reference for a commonly used object | ||
| (`virtual_rainforest.models.freshwater.FreshWaterModel`) but it also means | ||
| that the BaseModel class is always imported when the model module | ||
| (`virtual_rainforest.models.freshwater`) is imported. This is used when the package | ||
| is loaded to automatically discover all the BaseModel classes and register them. | ||
|
|
||
| At the moment, a configuration schema only gets added to the schema registry when the | ||
| model it belongs to is imported, and a `Model` class only gets added to the registry | ||
| when the class itself is imported. This is a problem because the script that runs the | ||
| main Virtual Rainforest simulation does not import these things directly. To circumvent | ||
| this, these imports need to be placed in the top level `__init__.py` file (the one in | ||
| the same folder as `main.py`). This won't pass the `pre-commit` checks unless `flake8` | ||
| checks are turned off for the relevant lines. It's only strictly necessary to import the | ||
| `Model` class, as this implicitly entails importing the specific model as a whole. | ||
| However, for the sake of clarity we currently include both imports. | ||
| The class is not going to actually be used within the file, so needs `# noqa: F401` | ||
| to suppress a `flake8` error. | ||
|
|
||
| The resulting `__init__.py` file should then look something like this: | ||
|
|
||
| ```python | ||
| # Import all module schema here to ensure that they are added to the registry | ||
| from virtual_rainforest.models.freshwater import schema # noqa | ||
| """This is the freshwater model module. The module level docstring should contain a | ||
| short description of the overall model design and purpose. | ||
| """ # noqa: D204, D415 | ||
|
|
||
| from importlib import resources | ||
|
|
||
| # Import models here so that they also end up in the registry | ||
| from virtual_rainforest.models.freshwater.model import FreshWaterModel # noqa | ||
| from virtual_rainforest.core.config import register_schema | ||
| from virtual_rainforest.models.freshwater.freshwater_model import ( | ||
| FreshWaterModel | ||
| ) # noqa: F401 | ||
|
|
||
| with resources.path( | ||
| "virtual_rainforest.models.freshwater", "freshwater_schema.json" | ||
| ) as schema_file_path: | ||
| register_schema(module_name="freshwater", schema_file_path=schema_file_path) | ||
| ``` | ||
|
Collaborator
Is this code the same for all models, except for the name of the schema file? If so, you could remove some of the boilerplate and just do something like this in all the

    from virtual_rainforest.core.config import default_schema_setup
    schema = default_schema_setup()

You'll probably need to examine the stack for this.
Collaborator
Author
It is the same for all models. That's a good thought. |
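A minimal sketch of how the suggestion above might work, under stated assumptions. `default_schema_setup` is not part of this PR; here it is a hypothetical helper that inspects the call stack to find the file that called it, takes the containing directory name as the model name, and loads `{MODEL_NAME}_schema.json` from that directory. `SCHEMA_REGISTRY` below is a plain dictionary standing in for the real registry, and no JSONSchema validation is attempted.

```python
import inspect
import json
from pathlib import Path

# Stand-in for virtual_rainforest.core.config.SCHEMA_REGISTRY (an assumption
# made for this sketch; the real registry is not reproduced here).
SCHEMA_REGISTRY = {}


def default_schema_setup() -> dict:
    """Hypothetical helper: register the calling module's schema by convention."""
    # Entry 0 of the stack is this function; entry 1 is whatever called it.
    caller = inspect.stack()[1]
    caller_file = Path(caller.filename).resolve()

    # When called from .../models/freshwater/__init__.py, the model name is
    # the name of the containing directory: "freshwater".
    model_name = caller_file.parent.name
    schema_path = caller_file.parent / f"{model_name}_schema.json"

    # json.load raises if the file content is not valid JSON; open() raises
    # if the file is missing.
    with schema_path.open() as f:
        schema = json.load(f)

    if model_name in SCHEMA_REGISTRY:
        raise ValueError(f"A schema for {model_name!r} is already registered")

    SCHEMA_REGISTRY[model_name] = schema
    return schema
```

With a helper along these lines, each model `__init__.py` would only need the two lines from the comment above. The trade-off is that stack inspection hides the schema path from the reader, so the explicit `register_schema` call may still be preferable for clarity.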
||
|
|
||
| When the `virtual_rainforest` package is loaded, it will automatically import | ||
| `virtual_rainforest.models.freshwater`. That will cause both the model and the schema to | ||
| be loaded and registered. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,13 +1,18 @@ | ||
| import importlib.metadata | ||
| import pkgutil | ||
| from importlib import import_module | ||
|
|
||
| # Import all module schema here to ensure that they are added to the registry | ||
| from virtual_rainforest.core import schema # noqa | ||
| from virtual_rainforest.models.abiotic import schema # noqa | ||
| from virtual_rainforest.models.abiotic.abiotic_model import AbioticModel # noqa | ||
| from virtual_rainforest.models.plants import schema # noqa | ||
| from virtual_rainforest.models.soil import schema # noqa | ||
| import virtual_rainforest.models as vfm | ||
|
|
||
| # Import models here so that they also end up in the registry | ||
| from virtual_rainforest.models.soil.soil_model import SoilModel # noqa | ||
| # Import the core module schema to register the schema | ||
| import_module("virtual_rainforest.core") | ||
|
|
||
| # Autodiscover models in the models module to add their schema and BaseModel subclass | ||
| # All modules within virtual_rainforest.models are expected to: | ||
| # - import their BaseModel subclass to the module root, for example: | ||
| # from virtual_rainforest.models.soil.soil_model import SoilModel # noqa: F401 | ||
| # - register their configuration schema using core.config.register_schema | ||
| for module_info in pkgutil.iter_modules(vfm.__path__): | ||
| import_module(f"virtual_rainforest.models.{module_info.name}") | ||
|
Collaborator
...alternatively, you could check here whether the module has an attribute called
Collaborator
Author
One thing that crossed my mind was having a |
||
|
|
||
| __version__ = importlib.metadata.version("virtual_rainforest") | ||
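The autodiscovery loop in the diff above can be tried in isolation. The following is a sketch only: it builds a throwaway package called `demo_models` (a hypothetical name, as are `REGISTRY`, `build_demo_package` and `discover`) in a temporary directory, where each submodule registers itself as a side effect of being imported, and then uses the same `pkgutil.iter_modules` plus `import_module` pattern to import them all.

```python
import pkgutil
import sys
import tempfile
from importlib import import_module, invalidate_caches
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a throwaway package 'demo_models' with two model subpackages."""
    pkg = root / "demo_models"
    pkg.mkdir()
    # The registry lives in the package root; each model adds itself on import.
    (pkg / "__init__.py").write_text("REGISTRY = {}\n")
    for name in ("soil", "freshwater"):
        sub = pkg / name
        sub.mkdir()
        (sub / "__init__.py").write_text(
            "from demo_models import REGISTRY\n"
            f"REGISTRY[{name!r}] = 'registered on import'\n"
        )


def discover(package_name: str) -> list:
    """Import every direct submodule of a package and return their names."""
    package = import_module(package_name)
    found = []
    for module_info in pkgutil.iter_modules(package.__path__):
        # Importing the submodule runs its __init__.py, which registers it.
        import_module(f"{package_name}.{module_info.name}")
        found.append(module_info.name)
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    invalidate_caches()  # make sure the freshly written files are visible
    try:
        discovered = discover("demo_models")
        registry = dict(import_module("demo_models").REGISTRY)
    finally:
        sys.path.remove(tmp)
```

After this runs, `discovered` lists both submodules and `registry` has an entry for each, even though neither submodule was imported by name. This is the property the PR relies on: adding a new model directory is enough for it to be picked up.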