Conversation
Codecov Report

❌ Patch coverage report. Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           develop    #1093      +/-  ##
===========================================
- Coverage    94.60%   94.52%    -0.08%
===========================================
  Files           89       90        +1
  Lines         7651     7856      +205
===========================================
+ Hits          7238     7426      +188
- Misses         413      430       +17
```

☔ View full report in Codecov by Sentry.
@dalonsoa Could you have a quick look at this to get your thoughts on the design question in the PR description? The question hinges on this line in the new code.
I don't have it clear at all how you can type annotate that. One option could be to use, in a somewhat hacky way, Generics. It could look something like this (not tested):

```python
from typing import Callable, TypeVar

T = TypeVar("T")

...

def get_subconfiguration(self, name: str, _: Callable[..., T]) -> T:
    try:
        return getattr(self, name)
    except AttributeError:
        raise AttributeError(f"Model configuration for {name} not loaded")
```

And then using it as:

```python
plants_config: PlantsConfiguration = configuration.get_subconfiguration(
    "plants", PlantsConfiguration
)
```

It is a bit hacky because the only purpose of the second input argument is to indicate the return type to mypy. I might be missing something, though.
@dalonsoa - indeed you're right and it doesn't work. Running it through into the models still leads to a mypy error. In my prototyping this afternoon, I've simply been using:

```python
model_configuration: AbioticConfiguration = configuration.get_subconfiguration(
    "abiotic"
)  # type: ignore[assignment]
```

Yup - I couldn't see anything either. My strong feeling is that we do want to keep this flexibility, and I think the only way to keep it is to accept some kind of a workaround. I then think it's a toss up between your solution - many thanks! - or my `# type: ignore` approach above. I guess, would you agree that the dynamic system is what we want and that one of these two solutions is acceptable to stop mypy complaining?
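As an aside on the two workarounds above: `typing.cast` is a third standard way to express the same thing without a `# type: ignore` comment. This is a sketch, not from the thread, and the classes are illustrative stand-ins:

```python
from typing import Any, cast


class AbioticConfiguration:
    """Illustrative stand-in for the model specific configuration class."""

    def __init__(self) -> None:
        self.constants = {"air_temperature": 25.0}


class Configuration:
    """Illustrative stand-in for the dynamically generated configuration."""

    def __init__(self) -> None:
        self.abiotic = AbioticConfiguration()

    def get_subconfiguration(self, name: str) -> Any:
        return getattr(self, name)


configuration = Configuration()
# cast() asserts the concrete type to mypy; it is a no-op at runtime and
# performs no validation, so it carries the same trust as `# type: ignore`.
model_configuration = cast(
    AbioticConfiguration, configuration.get_subconfiguration("abiotic")
)
print(model_configuration.constants["air_temperature"])  # 25.0
```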
dalonsoa left a comment
The changes look sensible. As you said, it does not look that different from the existing config. I've just spotted something that I would recommend changing, if not now, then in the future, so approving.
```rst
.. code-block:: python

    config_object = ConfigurationLoader(...).get_configuration()
```
You can equally do:

```diff
- config_object = ConfigurationLoader(...).get_configuration()
+ config_object = get_configuration(ConfigurationLoader(...).data)
```
And save an unnecessary method.
Indeed, this kind of usage of classes, where the generated object is ditched directly after its creation having used just one method or attribute of it, is often considered an anti-pattern. Why create an object that you are not using anywhere else?
I would suggest you re-write this using normal functions, passing arguments as needed.
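A hypothetical sketch of that function-based rewrite, with the loading logic stubbed out — the function names follow the suggestion above but are not the package's actual API:

```python
from typing import Any


def load_config_data(paths: list[str]) -> dict[str, Any]:
    """Load and merge the config fragments into one dictionary (stubbed)."""
    merged: dict[str, Any] = {}
    for path in paths:
        # Real code would parse each file; here we just record the source.
        merged.setdefault("sources", []).append(path)
    return merged


def get_configuration(data: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the pydantic validation step on the merged data."""
    if not data:
        raise ValueError("No configuration data loaded")
    return data


# No throwaway loader object: the data flows through plain functions.
config_object = get_configuration(load_config_data(["plants.toml"]))
```

The point is that nothing holds state between the load and the validation, so a class adds ceremony without benefit.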
There was a problem hiding this comment.
Definitely true on the `get_configuration` method - will make that change. I can definitely see what you mean about the `ConfigurationLoader` class being a poltergeist. I'm going to make that a new issue - I'd like to get the new system pushed through so would prefer to park that for now.
Description
This PR:
- Adds `core.config_builder`, which is a rework of `core.config` but using the new pydantic configuration classes. It retains - with some minor alterations - the main helper functions from `core.config` for loading and merging dictionaries.
- It adds the `generate_configuration` method, which takes a compiled config dictionary and does two things.
- The existing home-brew `Config` class, which is a `dict` of config settings with a bunch of loading methods, is converted to `ConfigurationLoader`. Creating a class instance functions almost identically to `Config` and builds a single compiled configuration dictionary, but this is no longer the endpoint. The `ConfigurationLoader.get_configuration()` method is used to pass the loaded data into `generate_configuration` and return the validated dataclass. This step now applies all of the validation that was previously handled through JSON Schema.
- The `ve_run` function is extended to insert this new mechanism alongside the existing config loader. That turned out to work right out of the box - the API for handling the inputs is basically the same, but now we can turn them into an integrated config model.
- The PR updates the names of the `model_name.model_config.ModelConfiguration` classes to be model specific, using a simple expected name pattern from the model name: `model_name` --> `ModelNameConfiguration`.
- Tests - basically just a restructure of
test_config.pyfor the new architecture.LATE BREAKING DESIGN QUESTION
This needs resolving before a more wholesale review of the PR
Rolling out this change into the next steps of the configuration meta-issue #1085 has brought up a problem with using a dynamically generated configuration class. It definitely feels like the right thing to do to validate the configuration for the whole simulation once and to do that early - that's what the code currently does.
But... because the resulting Pydantic model instance is generated dynamically, the code does not explicitly define the class fields. We have to use a generic high level typing (`Configuration`) and then mypy makes it unusable because the nested fields are unknown. For example, if you then try to use `configuration.plants.constants` in `PlantsModel.from_config(..., configuration, ...)`, mypy throws all its toys out of the cot, because it doesn't know that the `plants` attribute exists.

Ways forward:
- We can type all of the configuration stuff as `Any`. That will shut mypy up, but equally we lose type checking on a core system. This is bad.
- We could validate early as is, and then say basically "OK - we know the data is good, so we can pass around the `ConfigurationLoader` and extract the model specific dictionary from `data` to get the specific model configuration instances inside models". That seems repetitive to me - we could use pydantic's `BaseModel.model_construct()` method to bypass validation, but it is still wasteful.
- We could drop the early validation and basically do the same as above, but with validation only occurring at the Model level. It is possible that some models then need to load multiple sections and so validation gets repeated. This seems kinda clunky though and defers finding configuration problems until late in the startup. One of the key advantages of switching to `pydantic` is to be able to get the configuration validated right out of the gate.

I think my preferred solution is that we create a specific Pydantic class to use as the base class for the compiled `Configuration` that provides an accessor for the first layer of fields.
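A hedged sketch of what such a base class and its use in a `from_config` method might look like — all classes and names here are illustrative stand-ins, not the package's actual implementation:

```python
from typing import Any


class ConfigurationBase:
    """Hypothetical base class for the compiled configuration."""

    def get_model_configuration(self, name: str) -> Any:
        # Accessor for the first layer of fields; fails loudly if missing.
        try:
            return getattr(self, name)
        except AttributeError:
            raise AttributeError(f"Model configuration for {name} not loaded")


class PlantsConfiguration:
    """Stand-in for the model specific plants configuration class."""

    def __init__(self) -> None:
        self.constants = {"growth_rate": 0.1}


class Configuration(ConfigurationBase):
    """Stand-in for the dynamically built, already validated configuration."""

    def __init__(self) -> None:
        self.plants = PlantsConfiguration()


class PlantsModel:
    growth_rate: float

    @classmethod
    def from_config(cls, configuration: ConfigurationBase) -> "PlantsModel":
        # Annotating with the concrete class is the one place the type is
        # stated; mypy then knows every nested field, and nothing is
        # re-validated here.
        plants_config: PlantsConfiguration = configuration.get_model_configuration(
            "plants"
        )
        model = cls()
        model.growth_rate = plants_config.constants["growth_rate"]
        return model


model = PlantsModel.from_config(Configuration())
```

The accessor returns `Any`, so the annotation at the call site narrows it without a `# type: ignore`, and validation still happens only once, up front.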
Then, in the `from_config` methods, we can use that accessor. Obviously you have to correctly type the output from that method with the actual `ModelConfiguration` class, but then mypy knows about all the class internals and we've only validated once.

I think this option also requires a change to the
`ModelConfiguration` naming style. Before I spotted this issue, I'd intended that people could just use `configuration.plants.etc`, and so the different implementations between models of `plants.model_config.ModelConfiguration` and `abiotic.model_config.ModelConfiguration` could be invisible. If we need to use those classes for typing within model code, having them all called the same thing is going to get awfully confusing, and we're going to have to `import ... as PlantsConfiguration` if a model needs access to configuration settings from other models. So, I think it would be clearer if the classes actually have model specific names (as @dalonsoa actually originally suggested for config discovery).

TODO
Fixes #1087
Fixes #1088
Type of change
Key checklist
- `pre-commit` checks: `$ pre-commit run -a`
- `$ poetry run pytest`

Further checks