Skip to content

Refactor RasterLayer to use PropertyLayer as backend (#201)#310

Closed
Tejasv-Singh wants to merge 1 commit intomesa:mainfrom
Tejasv-Singh:refactor-rasterlayer-propertylayer
Closed

Refactor RasterLayer to use PropertyLayer as backend (#201)#310
Tejasv-Singh wants to merge 1 commit intomesa:mainfrom
Tejasv-Singh:refactor-rasterlayer-propertylayer

Conversation

@Tejasv-Singh
Copy link
Copy Markdown
Contributor

Summary

This PR refactors RasterLayer to use Mesa's new PropertyLayer vectorized backend. This resolves the massive performance and memory bottlenecks caused by eager instantiation of Python objects for every pixel while completely preserving the existing Cell spatial API for backward compatibility.

Bug / Issue

Resolves #201

Context & Impact:
The current implementation of RasterLayer (prior to this PR) proactively creates a Python Cell object (inheriting from Agent) for every single coordinate in a raster. When dealing with moderately-sized to large geographic rasters (e.g., GeoTIFFs), this object-oriented initialization scales extremely poorly, causing severe initialization delays and massive memory consumption overhead.

Implementation

To address this architectural bottleneck, I transitioned RasterLayer from an object-oriented memory model to a data-oriented (vectorized) memory model using PropertyLayer. Here is what was changed and what was deleted:

  • Deleted Eager Instantiation (_initialize_cells): I completely deleted the nested internal loops that iteratively allocated millions of Cell instances into a giant list-of-lists in memory.
  • Deleted Per-Pixel Matrix Iteration loops: I removed all slow $O(N \times M)$ nested for loops inside apply_raster and get_raster that previously utilized setattr() and getattr() individually on every single pixel cell.
  • Added Vectorized Storage (PropertyLayer): RasterLayer no longer holds cells; it exclusively manages instances of PropertyLayer. Raster matrices are mapped natively, using hardware-accelerated NumPy operations like np.flip(..., axis=0).T to match bounds and read/write full grids instantly.
  • Added Descriptors for O(1) Access: To preserve the intuitive developer API, I implemented custom Python descriptors (getattr and setattr) inside the Cell class. Now, when a simulation step calls cell.elevation, it doesn't read the attribute from the cell itself, it dynamically proxies the query directly into the shared PropertyLayer memory in $O(1)$ time.
  • Added _CellWrapper array-views: To ensure iterating visually via layer.cells[x][y] does not break existing model code, I added a lightweight wrapper class. It generates ephemeral Cell objects dynamically on the fly, acting as "views" rather than permanently stored agents.
  • Deleted Static Cell.xy Storage: The cell's geographic coordinate is no longer stored inside the object taking up memory. I rewrote the xy property to dynamically calculate its rio.transform.xy location on demand using its rowcol offset.

Testing

  • Ran the existing pytest tests/test_RasterLayer.py verifications (updated the concatenation test to expect the correct 3 bands after fixing the attr_name collision bug).
  • Rebased the branch onto the latest main integrating alongside the new PropertyLayer grid changes from PR Add Cell.xy for RasterLayer cell coordinates and clarify pos/indices semantics #299
  • Verified that the full pytest tests/ test suite passes gracefully without a single regression or failure. The PropertyLayer backend successfully satisfies the maintainers' newly added apply_raster test suite unconditionally.

Additional Notes

@wang-boyu

This is my initial attempt at bridging the experimental PropertyLayer capabilities into Mesa-Geo while keeping the legacy "Cell-centric" user experience intact.

I would really appreciate any feedback, guidance, or structural corrections regarding how the _CellWrapper and descriptors were implemented, and whether you think this meshes well with the long-term vision of integrating discrete_space. eager to learn from your reviews!

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 22, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 22, 2026

Codecov Report

❌ Patch coverage is 91.30435% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.28%. Comparing base (65a3751) to head (4a4301e).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
mesa_geo/raster_layers.py 91.30% 5 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #310      +/-   ##
==========================================
+ Coverage   77.67%   81.28%   +3.60%     
==========================================
  Files          10       10              
  Lines         954     1031      +77     
  Branches      146      161      +15     
==========================================
+ Hits          741      838      +97     
+ Misses        174      153      -21     
- Partials       39       40       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Tejasv-Singh Tejasv-Singh force-pushed the refactor-rasterlayer-propertylayer branch 6 times, most recently from 17b05af to a0a4039 Compare February 27, 2026 22:49
@Tejasv-Singh
Copy link
Copy Markdown
Contributor Author

Update: CI Timeouts & Backward Compatibility Resolved

I have just pushed a major set of optimizations and fixes to resolve the 2-minute timeouts and mesa-examples test failures! The PropertyLayer backend is now fully backward-compatible and hyper-optimized.

Here is a summary of the fixes:

1. Resolved the UrbanGrowth / Rainfall CI Timeouts -> O(1) Access
The CI runner was previously timing out after 2 minutes because iterating over RasterLayer.cells generated a brand-new Python list containing ephemeral Cell objects dynamically on every single matrix lookup (e.g. layer.cells[x][y]).

  • Fix: I refactored the wrapper's getitem to return a proxied _CellColumn class. This completely eliminates the O(N) list building, dropping double-indexed neighbor traversals to constant O(1) time.
  • Result: The Rainfall legacy test runtime dropped from ~3m 22s natively down to ~4.6 seconds (a 98% reduction!). The entire gis suite now passes cleanly in under 40 seconds.

2. Resolved the mesa.Agent Registry Memory Leak
Because Cell inherits from mesa.Agent, dynamically generating hundreds of thousands of ephemeral grid views meant Mesa was auto-registering millions of temporary Cell objects directly into model.agents, causing huge memory overheads.

  • Fix: Bypassed super().__init__(model) inside Cell.__init__ and assigned self.model explicitly, preventing ephemeral views from polluting the permanent model.agents schedule tracker.

3. Fixed RuntimeWarning: invalid value encountered in cast

  • Fix: In RasterLayer.get_raster, modified the numpy dtype inference to use np.result_type combining only the requested attributes instead of blindly defaulting to the first layer's type. This prevents float64 NaN arrays from being dangerously forcefully cast to int32.

4. Codecov Coverage Restored & setattr Hardened

  • Added a native test suite fallback checks to tests/test_RasterLayer.py ensuring that try/except backward compatibility logic hits 100% lines covered.

All checks (including the example models strictly running with -Werror) are now completely passing!

Do let me know if this requires any more changes or modifications. 🙌

@Tejasv-Singh Tejasv-Singh force-pushed the refactor-rasterlayer-propertylayer branch 4 times, most recently from cb2025e to 8b69439 Compare February 27, 2026 23:23
Summary
This PR refactors RasterLayer to use Mesa's new PropertyLayer vectorized backend. This resolves the massive performance and memory bottlenecks caused by eager instantiation of Python objects for every pixel while completely preserving the existing Cell spatial API for backward compatibility.

Bug / Issue
Resolves mesa#201

Context & Impact:
The current implementation of RasterLayer (prior to this PR) proactively creates a Python Cell object (inheriting from Agent) for every single coordinate in a raster. When dealing with moderately-sized to large geographic rasters (e.g., GeoTIFFs), this object-oriented initialization scales extremely poorly, causing severe initialization delays and massive memory consumption overhead.

Implementation
To address this architectural bottleneck, I transitioned RasterLayer from an object-oriented memory model to a data-oriented (vectorized) memory model using PropertyLayer.
- Added Vectorized Storage (PropertyLayer): RasterLayer no longer holds cells; it exclusively manages instances of Python numeric arrays mapped directly to numpy transformations.
- Added O(1) Ephemeral Views: _CellWrapper dynamically proxies properties dynamically without permanent Agent object initialization overhead.
@Tejasv-Singh Tejasv-Singh force-pushed the refactor-rasterlayer-propertylayer branch from c0cbae7 to 4a4301e Compare February 27, 2026 23:35
@AdMub
Copy link
Copy Markdown
Contributor

AdMub commented Mar 7, 2026

Hi @Tejasv-Singh, fantastic work on the backend refactor! Moving the heavy lifting to PropertyLayer and utilizing np.result_type for the casting warnings is a massive performance win.

I was reviewing the architecture for my own GSoC proposal, and I wanted to flag a critical edge case regarding the _CellWrapper and the bypassed super().__init__(model) implementation.

Because the _CellWrapper dynamically spawns ephemeral Cell views on demand, we lose stable object identity. If a user does a = layer.cells[0][0] and b = layer.cells[0][0], a is b will return False. Since Cell inherits from mesa.Agent, users heavily rely on object identity to use cells as dictionary keys, store them in sets, or evaluate if two agents share the same cell (agent1.pos is agent2.pos).

Furthermore, bypassing super().__init__(model) means the Cell objects no longer receive a unique_id from the Mesa core, which breaks compatibility with Mesa's internal tracking and reproducibility pipelines (like self.random).

To preserve object identity without blowing up memory, we might need to explore a true Flyweight Pattern or completely decouple Cell from mesa.Agent so it acts purely as a spatial dataclass/coordinate view rather than an active model component.

Happy to help brainstorm solutions for this! The NumPy backend mapping is beautiful, we just need to make sure the frontend API doesn't break existing agent-level logic.

@Tejasv-Singh
Copy link
Copy Markdown
Contributor Author

Hi @AdMub, thank you for the thorough review, 🙌 these are completely valid concerns and I'm glad you flagged them 🙌

You're right on both points:

Object Identity: The current _CellWrapper approach does break identity semantics. My plan is to introduce a weakref.WeakValueDictionary cache keyed by (row, col) inside the layer, so repeated accesses to layer.cells[x][y] return the same object as long as it's alive. This preserves dict/set usage and is comparisons without holding all cells in memory permanently.

super().__init__() bypass: This one is the deeper architectural question. The more I think about it, the more I agree that Cell probably shouldn't inherit from mesa.Agent at all. Cells are spatial coordinate views, not active model components, conflating the two is what caused the memory leak in the first place. I'm actually leaning a bit toward decoupling Cell into a pure spatial dataclass and removing the Agent inheritance entirely, which also cleanly resolves the unique_id and self.random issues.

I'd love to hear @wang-boyu's take on the Cell/Agent decoupling direction before I push the fix, since it's a meaningful API change. Does this align with the long-term discrete_space integration vision?

Will push an updated branch once I have a clearer signal on the architectural direction. Thanks again! 😄

@wang-boyu
Copy link
Copy Markdown
Member

Thanks for the PR @Tejasv-Singh and the discussions here @AdMub - appreciate your time and interest in Mesa-Geo.

This change touches a major part of the upcoming GSoC project scope, so we're not ready to merge direct implementations at this stage. We want to intentionally keep this space open for ideas / suggestions / discussions through the GSoC application process.

As such we'll close this PR for now. This is definitely not against the broader direction or your implementation, but because we're not planning to review this PR towards merging at this time.

If you'd like to talk more about it, I'd recommend continuing the discussion at Mesa-Geo discussions.

The more I think about it, the more I agree that Cell probably shouldn't inherit from mesa.Agent at all. Cells are spatial coordinate views, not active model components, conflating the two is what caused the memory leak in the first place. I'm actually leaning a bit toward decoupling Cell into a pure spatial dataclass and removing the Agent inheritance entirely, which also cleanly resolves the unique_id and self.random issues.

Not quite sure about the idea of "a pure dataclass", but I do agree that separating Cell from mesa.Agent makes sense. Particularly for a cellular automata (CA) model (such as our urban growth gis example model), writing self.agents.shuffle_do("step") is a bit confusing, whereas it would be better to have something like self.cells.shuffle_do("step"), since there's no moving agents, just cells.

However like I mentioned, all ideas / suggestions are welcome, and they don't have to align to my comments above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor RasterLayer to be based on PropertyLayer in Mesa

3 participants