Added separate bacteria and fungi pools#762
Conversation
…ated the functions accordingly)
…p nutrients, produce enzymes, grow, etc
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #762 +/- ##
===========================================
+ Coverage 94.69% 94.72% +0.02%
===========================================
Files 74 75 +1
Lines 5149 5197 +48
===========================================
+ Hits 4876 4923 +47
- Misses 273 274 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
dalonsoa
left a comment
There was a problem hiding this comment.
I've made a small suggestion, but otherwise it looks good.
I think there are a couple of places where there's code duplication, applying the same thing to bacteria and fungi, that could be replaced with a loop, but I'm not sure if it is worth the effort at this time or if the calculation is exactly the same, or if it would be the same in the future, so I would leave it as it is, for now. We can revisit that later, if needed.
| @dataclass | ||
| class MicrobialGroup: |
There was a problem hiding this comment.
If I get this right, this dataclass is mean to contain the constants that each group (bacteria and fungi) requires, and is not meant to change over the course of the simulation, right?
If that's the case, I'd make this a frozen dataclass, to ensure that, and maybe add Constants to the class name to make it even more clear:
| @dataclass | |
| class MicrobialGroup: | |
| @dataclass(frozen=True) | |
| class MicrobialGroupConstants: |
There was a problem hiding this comment.
Yep these constants should never change in the course of a simulation, I'll make it a frozen class and rename!
There was a problem hiding this comment.
At the moment you already have all of this data packaged in SoilConsts but you are packaging it up into a consistent wrapper and leaving the door open to add more groups? I think as it stands its between two stools - your code allows multiple groups but the config has got you pinned at only one of each.
I think you've basically got microbial functional types here and it would make sense to adopt something like the framework used in the Plants Model update. Rather than having a long list of of parameters which (currently) only allow one of each of fungi and bacteria, then have an array of tables in TOML (or package in a CSV file as Plants does).
[[soil.microbial_group]]
type = "fungal"
max_uptake_rate_labile_C = 0.1
etc = "etc"If you add type to your dataclass, you can then simply load an instance using MicrobialGroup(**args_from_config) and hence ditch make_bacterial_functional_group and make_fungal_functional_group. The only thing you then need to do is to divide what you get from loading the config into the fungal and bacterial subgroups in make_full_set_of_microbial_groups.
There was a problem hiding this comment.
That does make sense, I guess my main concern would be how these constants will get documented because the majority of "constants" used in the soil model in reality vary between functional groups.
Would they be documented as attributes in MicrobialGroup rather than in the constants submodule?
There was a problem hiding this comment.
So you have a consistent set of attributes across both fungi and bacteria but the values of those attributes differ between the two (and also could vary between different subgroups within those two, but currently don't).
So, I think the attribute itself should be easy to document in the constants submodule - it plays a consistent role - but then the actual values for each group are harder. Ultimately this is becoming more like data - as a user, you'd want to document what sources were used to describe those groups but that isn't something that we document within the code?
I guess you'd also need to handle the case where users omit one of the two! Could you run the simulation without one of them or should it break?
There was a problem hiding this comment.
It should break if one of the two is omitted. It's not implemented yet but the idea is that each functional groups interacts with the soil pools (somewhat) differently. With the basic idea being the functional groups are defined at a coarse enough level that they represent really meaningfully different actors, as the model is not really designed to simulate competition between similar organisms.
I really don't want to allow users to define arbitrary functional groups, because this would essentially entail spending a lot of development effort for a feature we would tell people not to use (because the results would be nonsensical, like if two bacterial functional groups were added one would be guaranteed to go extinct).
There was a problem hiding this comment.
I do think it's worth adopting your approach though as it will save me time and effort (and avoid cluttering constants) when I get around to split fungi into multiple groups
There was a problem hiding this comment.
I really don't want to allow users to define arbitrary functional groups
Is that then a simple check that the types are unique on the loaded set of groups?
spending a lot of development effort for a feature we would tell people not to use
But we do want them to use it - it's a less cluttered way of defining the constants. We just don't want them to use it wrong 😄 It also opens up using a CSV for the parameterisation, which is much more user friendly than TOML for this kind of configuration.
davidorme
left a comment
There was a problem hiding this comment.
I think it makes more sense to go all the way and refactor the config to support this. It'll end up much cleaner and I don't think it's too hard to do.
| @dataclass | ||
| class MicrobialGroup: |
There was a problem hiding this comment.
At the moment you already have all of this data packaged in SoilConsts but you are packaging it up into a consistent wrapper and leaving the door open to add more groups? I think as it stands its between two stools - your code allows multiple groups but the config has got you pinned at only one of each.
I think you've basically got microbial functional types here and it would make sense to adopt something like the framework used in the Plants Model update. Rather than having a long list of of parameters which (currently) only allow one of each of fungi and bacteria, then have an array of tables in TOML (or package in a CSV file as Plants does).
[[soil.microbial_group]]
type = "fungal"
max_uptake_rate_labile_C = 0.1
etc = "etc"If you add type to your dataclass, you can then simply load an instance using MicrobialGroup(**args_from_config) and hence ditch make_bacterial_functional_group and make_fungal_functional_group. The only thing you then need to do is to divide what you get from loading the config into the fungal and bacterial subgroups in make_full_set_of_microbial_groups.
…e generated from toml files rather than SoilConsts
|
@davidorme I've changed the code so that |
davidorme
left a comment
There was a problem hiding this comment.
This looks much cleaner to me. I think it's the right direction! We could add loading from CSV but with the current two groups, putting this in TOML isn't anything like the fuss it is for plant functional types. That's something to park, maybe until you add the different fungal types.
I've made a suggestion on the validation code - just easier to read and doesn't mask an error in the exception.
| expected_groups = ["fungi", "bacteria"] | ||
|
|
||
| if "soil" not in config: | ||
| msg = "Model configuration for soil model not found." | ||
| LOGGER.critical(msg) | ||
| raise ConfigurationError(msg) | ||
|
|
||
| defined_groups = [ | ||
| group["name"] for group in config["soil"]["microbial_group_definition"] | ||
| ] | ||
|
|
||
| if set(defined_groups) != set(expected_groups): | ||
| if not set(expected_groups).issubset(set(defined_groups)): | ||
| msg = ( | ||
| "The following expected soil microbial groups are not defined: " | ||
| f"{', '.join(set(expected_groups) - set(defined_groups))}" | ||
| ) | ||
| LOGGER.critical(msg) | ||
| if not set(defined_groups).issubset(set(expected_groups)): | ||
| msg = ( | ||
| "The following microbial groups are not valid: " | ||
| f"{', '.join(set(defined_groups) - set(expected_groups))}" | ||
| ) | ||
| LOGGER.critical(msg) | ||
| raise ConfigurationError(msg) |
There was a problem hiding this comment.
I think the code below is easier to read - also the current code masks the first error in the exception reporting if both occur.
| expected_groups = ["fungi", "bacteria"] | |
| if "soil" not in config: | |
| msg = "Model configuration for soil model not found." | |
| LOGGER.critical(msg) | |
| raise ConfigurationError(msg) | |
| defined_groups = [ | |
| group["name"] for group in config["soil"]["microbial_group_definition"] | |
| ] | |
| if set(defined_groups) != set(expected_groups): | |
| if not set(expected_groups).issubset(set(defined_groups)): | |
| msg = ( | |
| "The following expected soil microbial groups are not defined: " | |
| f"{', '.join(set(expected_groups) - set(defined_groups))}" | |
| ) | |
| LOGGER.critical(msg) | |
| if not set(defined_groups).issubset(set(expected_groups)): | |
| msg = ( | |
| "The following microbial groups are not valid: " | |
| f"{', '.join(set(defined_groups) - set(expected_groups))}" | |
| ) | |
| LOGGER.critical(msg) | |
| raise ConfigurationError(msg) | |
| if "soil" not in config: | |
| msg = "Model configuration for soil model not found." | |
| LOGGER.critical(msg) | |
| raise ConfigurationError(msg) | |
| expected_groups = {"fungi", "bacteria"} | |
| defined_groups = { | |
| group["name"] for group in config["soil"]["microbial_group_definition"] | |
| } | |
| undefined_groups = expected_groups.difference(defined_groups) | |
| unexpected_groups = defined_groups.difference(expected_groups) | |
| if undefined_groups: | |
| msg = ( | |
| "The following expected soil microbial groups are not defined: " | |
| f"{', '.join(set(expected_groups) - set(defined_groups))}" | |
| ) | |
| LOGGER.critical(msg) | |
| if unexpected_groups: | |
| msg = ( | |
| "The following microbial groups are not valid: " | |
| f"{', '.join(set(defined_groups) - set(expected_groups))}" | |
| ) | |
| LOGGER.critical(msg) | |
| if undefined_groups or unexpected_groups: | |
| raise ConfigurationError("The soil microbial group configuration contains errors. Please check the log".) |
There was a problem hiding this comment.
Ahh that's really useful thanks, I didn't think I was approaching that part in an ideal manner.
Also, yes think reading from CSV is something to implement down the road, once we've had a bit more chance to see how using TOML works in practice for these groups
…om make_full_set_of_microbial_groups
Description
This PR splits the microbial pool into separate fungi and bacteria pools. At the moment these groups behave identically but can be parameterised differently. In future (see #761) I want to split enzymes up by which taxonomic group generated them.
In order to get the existing code to work with multiple microbial groups I had to introduce a
MicrobialGroupdata class. This basically exists to bundle the relevant constants for a specific microbial group together. I'm not sure whether the way I did this is the most sensible approach so I am open to suggestions.@hrlai, tagging you in so that you can see the changes coming in. The most significant for your purposes will be that most of the microbial parameters are now split between fungi and bacteria versions.
Fixes #758
Type of change
Key checklist
pre-commitchecks:$ pre-commit run -a$ poetry run pytestFurther checks