Alternative implementation of variables.py#1296
Conversation
dalonsoa
left a comment
There was a problem hiding this comment.
I love it. Way cleaner and robust that what I coded back in the day. Very well done. I have a few comments and suggestions, none of them blockers.
| for model in models: | ||
| for var in model.vars_populated_by_init: | ||
| if var in runtime_variables: | ||
| raise ValueError( |
There was a problem hiding this comment.
Maybe collect first all the things that are wrong and then raise the exception, as done in _check_model_variables_are_known
There was a problem hiding this comment.
Yeah - I think this would be good. I think the easiest way to do it is to have these functions return a list of error messages. If we concatenate those at the bottom of setup_variables then it is either [] (hooray!) or not, in which case we log all the contents and raise a general error.
However, I'm going to kick that down the road!
| def load_known_variables() -> dict[str, VariableMetadata]: | ||
| """Loads the known variables from the variable database. | ||
|
|
||
| The pydantic classes handle validation of the input. | ||
| """ | ||
|
|
||
| with open( | ||
| str(resources.files("virtual_ecosystem") / "data_variables.toml"), "rb" | ||
| ) as f: | ||
| known_vars = tomllib.load(f) |
There was a problem hiding this comment.
Rather than hard coding this path here, make it an input argument, defaulting to this value, so that an alternative can be provided if needed, eg, via tests. Moreover, it could be useful to provide the known_vars object as input as well, so they can be loaded from elsewhere combining multiple variable files, if needed at some point via configuration. This will be one step to make VE extensible with new models without changing the base source code.
Something like:
| def load_known_variables() -> dict[str, VariableMetadata]: | |
| """Loads the known variables from the variable database. | |
| The pydantic classes handle validation of the input. | |
| """ | |
| with open( | |
| str(resources.files("virtual_ecosystem") / "data_variables.toml"), "rb" | |
| ) as f: | |
| known_vars = tomllib.load(f) | |
| def load_known_variables(known_vars: dict[str, Any] | None = None, var_file: str | None = None) -> dict[str, VariableMetadata]: | |
| """Loads the known variables from the variable database. | |
| The pydantic classes handle validation of the input. | |
| """ | |
| if var_file is None: | |
| var_file = str(resources.files("virtual_ecosystem") / "data_variables.toml") | |
| if known_vars is None: | |
| with open(var_file) as f: | |
| known_vars = tomllib.load(f) |
There was a problem hiding this comment.
I've added the filepath option - but I've stopped there for now because it opens further questions about how to resolve duplication and how to add new models. That is something I think we need to do - it shouldn't just be to add them into the package source - but I don't want to get deep into the weeds second guessing myself 😄
Or at least not today...
|
Thanks @dalonsoa - I'll shift the testing from variables over to the new system and work through your comments. Once everything is passing, I'll send it back over, but if the main structure is sound then it shouldn't be a big change from this. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1296 +/- ##
===========================================
- Coverage 95.00% 94.97% -0.04%
===========================================
Files 71 71
Lines 7394 7358 -36
===========================================
- Hits 7025 6988 -37
- Misses 369 370 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@dalonsoa Could you give the new module a final check?
|
dalonsoa
left a comment
There was a problem hiding this comment.
All good from my side. Much, much nicer, indeed.
Description
I was trying to implement a couple of things:
xarray.DatasetinDatafrom theKNOWN_VARIABLESwhen theData.__set_item__method ran. Then they're dumped automatically. You could also add them right before outputting them...KNOWN_VARIABLEScan be added toData- it turns out this is quite important for checking that developers do actually update thedata_variables.toml(or are alerted to changes in names or deprecated variables)!There's a problem though - because
KNOWN_VARIABLESis a registry, it has a state that is dependent on the particular sequence of code execution. That's really hard in tests - thetest_variables.pyexplicitly has a bunch of fixtures to reset the registries constantly because of this. If we want to useKNOWN_VARIABLESoutside of just the variable setup (which I think we do) then all of the tests suffer the same issue.I think the answer is to discard the registry approach for
KNOWN_VARIABLESandRUNTIME_VARIABLESand make these explicit variables within the model.I've added
ve.core.variables_new.py:variables.rstfile.Variableand theVariableMetadata: they were very similar butVariablehad the post init fields for the model vars andVariableMetadatausedpydanticfor validation. We now have a single pydantic dataclass that loads and validatesdata_variables.tomland that we can use for tracking variable usage in modelsVariableMetadatavalidation now handles the validation of the axes data - unique names that appear in the VALIDATORS registry.load_known_variablesfunction that returnsdict[str, VariableMetadata]. I've used this in the main code to populate a newData.known_variablesattribute which seemed like a sensible home for it. This replacesregister_all_variables.setup_variablesand the variousvariables._collect...functions:_check_model_variables_are_known) - less repetition, separation of concerns.known_variablesand use that as necessary to add variables to a secondruntime_variablesdictionary, which is then used to track variable usage by models.setup_variablesfunction now returns a dictionary of variables with populated model usage attributes.@dalonsoa Does this seem ok to you - it seems less brittle to me and allows us to use the variable database more widely. If so I'll polish up the rough edges and take it out of draft
Fixes #1295
Type of change
Key checklist
pre-commitchecks:$ pre-commit run -a$ poetry run pytestFurther checks