@nickwctan When we chatted in Teams, I hadn't realised that you wanted to push a datafile to act as part of a test suite for the new tool. That's quite a different use case from just uploading a datafile for people to check the code out with. I'm completely on board with adding a datafile that is used for code testing. It would be good to reduce the file size as much as possible though - maybe cut the file down to fewer time steps or cells?

On the location of the test files, there are a couple of choices for organising them, including keeping them within the code directory: https://docs.pytest.org/en/7.1.x/explanation/goodpractices.html. It doesn't have to be a separate folder as in the VE repo. I wouldn't worry about this now; we can restructure as needed later.
@nickwctan This is really neat - hadn't realised you were also building in testing.

I've got basically two types of comments:

- You have added `if __name__ == "__main__":` sections to both files, but neither of those really reflects how the files should be used in practice (for different reasons for the tests vs the tool).
- Your methods on the tool use `return self` a lot. That isn't generally how class methods are used in Python - if your method is updating the class instance (i.e. doing `self.something = "a thing"`), then the method can simply return `None`, and the class instance that you ran the method on has been updated.
```python
In [2]: class Test:
   ...:     def __init__(self):
   ...:         self.a = 12
   ...:
   ...:     def update_a(self) -> None:
   ...:         self.a = 24
   ...:
   ...: test = Test()
   ...: test.a
Out[2]: 12

In [3]: test.update_a()
   ...: test.a
Out[3]: 24
```

ETA:
- You have a ton of `print` statements. That is great for debugging, but do you actually want these unavoidably appearing in all your notebooks that use the tool? You can filter out printed messages in notebook code, but that isn't something users would normally expect to have to do. You could use the `logging` system to print these messages, which allows a user to suppress them. I think you have to make a decision about whether these are debugging messages that were useful for you during coding (but should now just be deleted) or whether they are actually useful output that users might sometimes want (in which case, `logging`, but more complex!)
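If the messages are kept, a minimal sketch of the `logging` approach might look like this (the module-level logger name `trophic_mass_flow` is an illustrative assumption, not taken from the PR):

```python
import logging

# Module-level logger for the (hypothetical) tool module. Users who want
# these messages opt in with logging.basicConfig(level=logging.INFO);
# by default, INFO-level messages are not shown at all.
logger = logging.getLogger("trophic_mass_flow")


def process_data() -> None:
    # logger.info replaces print(): hidden unless the user enables INFO output
    logger.info("Processing data...")
    logger.info("Processing complete!")
```

The key difference from `print` is that suppression is the user's choice, made once per notebook, rather than something baked into the tool.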
```python
# ================================================================
# running the test
# ================================================================

if __name__ == "__main__":
    print("Running tests")
    test_class_initialisation(analysis())
    test_load_data(analysis())
    test_process_data(analysis())
    print("All tests passed")
```
You wouldn't run the tests like this. You'd just do `pytest test_trophic_mass_flow.py` from the command line and pytest handles everything else.

A Python file does not have to have an `if __name__ == "__main__":` section, so you can simply delete these lines.
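For reference, pytest collects any `test_*` function in a `test_*.py` file automatically, so the file needs no entry point at all. A minimal sketch (the `Analysis` class here is a stand-in, not the real tool):

```python
# test_trophic_mass_flow.py (sketch): running `pytest test_trophic_mass_flow.py`
# discovers and runs every test_* function itself - no __main__ block needed.


class Analysis:
    """Stand-in for the real analysis class."""

    def __init__(self) -> None:
        self.loaded = False

    def load_data(self) -> None:
        self.loaded = True


def test_load_data() -> None:
    analysis = Analysis()
    analysis.load_data()
    assert analysis.loaded
```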
```python
if __name__ == "__main__":
    """Run test separately.

    Only run this chunk of code when executed directly and not imported.
    animal_trophic_interactions.csv can be generated when doing a ve_run.
    This csv can be found in ".../out"
    Place the csv file in the same folder as your py files.
    """
    # create instance for analysis
    analysis = TrophicFlowAnalysis(
        "animal_trophic_interactions.csv", config={"cumulative": False, "n_cols": 2}
    )

    # load and process data (method chaining)
    analysis.load_data().process_data()

    # summary
    analysis.summary()

    # create plots
    analysis.plot_faceted(save_path="mass_flow_faceted.png")

    print("\nDone")
```
Again, you'd never just execute this Python file like this, or need to move the output files. Instead, what you would do, in the Python analysis script that uses this tool, is:

```python
from ../../tools.python.trophic_mass_flow import TrophicFlowAnalysis

analysis = TrophicFlowAnalysis('../../data/scenarios/scenario/out/animal_trophic_interactions.csv')
```

There will be some things to sort out there with the relative locations of the tool module that I need to think about, but I would again remove this and maybe move the code into the docstring to show the workflow.
```python
print(f"trophic flow initialised for {file_path}")
```
I think breaking out the functionality into methods here is good practice, but in terms of actual users, is anyone ever going to create a `TrophicFlowAnalysis` instance without loading and processing the data?

If we include calling those methods in `__init__`, then the usage is simpler.
Suggested change:

```python
        print(f"trophic flow initialised for {file_path}")
        self.load_data()
        self.process_data()
```
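With loading and processing invoked from `__init__`, construction alone gives a ready-to-use object. A self-contained mock (the class body is a hypothetical stand-in for the real methods):

```python
class TrophicFlowAnalysisMock:
    """Hypothetical mock: __init__ runs the whole load/process pipeline."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path
        self.load_data()
        self.process_data()

    def load_data(self) -> None:
        # stand-in for pd.read_csv(self.file_path, ...)
        self.values = [0.5, 1.5]

    def process_data(self) -> None:
        # stand-in for the kg -> g unit conversion
        self.values = [v * 1000 for v in self.values]


# one line replaces the load_data().process_data() chain
analysis = TrophicFlowAnalysisMock("animal_trophic_interactions.csv")
```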
And actually, linking over to @Baizurah's recent PR (#241) - if you were using this in a notebook, could you imagine wanting to do other things with the data from animal_trophic_interactions.csv in the file? Would you want to filter it, plot other things, etc.?

If that is the case, then forcing a reload of that data within this class is much less flexible than having the class `__init__` take a `df` argument that passes in an existing loaded dataframe. You have to pass it a dataframe with the right structure, of course, but it is probably a better design for the code. This is sometimes called "separation of concerns": your class is about analysis of a particular dataset, not loading the dataset in the first place.
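A sketch of that design, under the assumption of a hypothetical `__init__` signature and method (not the code in this PR): the class receives an already-loaded dataframe and only does analysis.

```python
import pandas as pd


class TrophicAnalysisSketch:
    """Hypothetical separation-of-concerns design: analysis only."""

    def __init__(self, df: pd.DataFrame) -> None:
        # the caller owns loading; we just analyse what we are given
        self.df = df

    def total_mass(self, column: str = "mass") -> float:
        return float(self.df[column].sum())


# the notebook loads (and can filter or plot) the data itself ...
df = pd.DataFrame({"time": [1, 2, 3], "mass": [0.5, 1.0, 1.5]})
# ... and passes it in for the specialised analysis
analysis = TrophicAnalysisSketch(df)
```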
```python
def process_data(self) -> "TrophicFlowAnalysis":
    """Run all the processing steps in correct order.

    Returns:
        A processed dataframe.
    """
    print("\nProcessing data...")
    self.convert_units()
    self.group_and_aggregate()
    self.pivot_data()
    print("Processing complete!")
    return self
```
Don't return `self` here. The way to do this is to return `None` and have the object just update its own internal state in place.
Suggested change:

```python
def process_data(self) -> None:
    """Run all the processing steps in correct order.

    Updates the processed data attributes.
    """
    print("\nProcessing data...")
    self.convert_units()
    self.group_and_aggregate()
    self.pivot_data()
    print("Processing complete!")
```
```python
def pivot_data(
    self,
    index_col: str | None = None,
    columns_col: str | None = None,
    values_col: str | None = None,
) -> "TrophicFlowAnalysis":
```
This isn't returning anything, just updating the class to populate `self.pivoted_df`.
Suggested change:

```python
def pivot_data(
    self,
    index_col: str | None = None,
    columns_col: str | None = None,
    values_col: str | None = None,
) -> None:
```
```python
def load_data(self) -> "TrophicFlowAnalysis":
    """Load the animal trophic interaction output.

    Load data from the output CSV file from Virtual Ecosystem
    into a dataframe.

    Returns:
        A dataframe.
    """
    print("loading data...")
    self.df = pd.read_csv(self.file_path, parse_dates=["time"])
    print(f"  Loaded {len(self.df)} rows, {len(self.df.columns)} columns")
    return self
```
Suggested change:

```python
def load_data(self) -> None:
    """Load the animal trophic interaction output.

    Load data from the output CSV file from Virtual Ecosystem
    into a dataframe.
    """
    print("loading data...")
    self.df = pd.read_csv(self.file_path, parse_dates=["time"])
    print(f"  Loaded {len(self.df)} rows, {len(self.df.columns)} columns")
```
```python
def convert_units(
    self, columns: list[str] | None = None, multiplier: float = 1000
) -> "TrophicFlowAnalysis":
    """Convert measurement units (default: from kg to g).

    Args:
        columns: columns to multiply.
        multiplier: multiplier factor.

    Returns:
        A dataframe with units converted.
    """
    cols = columns or self.config["column_values"]
    if self.config["convert_to_grams"]:
        print(f"Converting {cols} by {multiplier} to grams")
        self.df[cols] = self.df[cols] * multiplier
    return self
```
Suggested change:

```python
def convert_units(
    self, columns: list[str] | None = None, multiplier: float = 1000
) -> None:
    """Convert measurement units (default: from kg to g).

    Args:
        columns: columns to multiply.
        multiplier: multiplier factor.
    """
    cols = columns or self.config["column_values"]
    if self.config["convert_to_grams"]:
        print(f"Converting {cols} by {multiplier} to grams")
        self.df[cols] = self.df[cols] * multiplier
```
This PR adds an analytical tool to visualise the animal trophic interaction output from Virtual Ecosystem and to calculate the trophic mass flow in terms of carbon, nitrogen or phosphorus.

Currently, by default, it tracks the summed carbon in each resource pool (AKA resource kind) across all consumer cohorts at each time step, and outputs a faceted plot for visualisation purposes.

I envisage that this will be extended in the future to create multiple graphical outputs for analysing trophic interactions.
@davidorme Is there a way we can mimic the pytest pathing in Virtual Ecosystem so that it also works similarly in the ve_data_science repo? At the moment I have placed the test file in the same folder, which is not ideal.
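One possible approach (an assumption about repo layout, not something tested against ve_data_science) is to configure pytest at the repository root so that `pytest` finds the tests wherever they are moved to, e.g. via `pyproject.toml`:

```toml
# pyproject.toml (sketch): the "tests" path is hypothetical
[tool.pytest.ini_options]
testpaths = ["tests"]
```

With something like this in place, the test file could live in a dedicated folder rather than next to the tool, and a plain `pytest` from the repo root would still collect it.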