Add superslab filtering functionality #16

Merged: lgarrison merged 4 commits into master from filter-func on Sep 8, 2021

@lgarrison

This PR adds the ability to filter rows from the halo table on a per-superslab basis as they are loaded into memory. This saves memory, since the full, unfiltered table never needs to be constructed. The filter is a function, often a lambda, passed to the CompaSOHaloCatalog (CHC) constructor as filter_func; it takes the table and returns a boolean mask of the rows to keep:

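from abacusnbody.data.compaso_halo_catalog import CompaSOHaloCatalog

# Keep only halos with at least 100 particles; the filter runs as each superslab is loaded
cat = CompaSOHaloCatalog('/mnt/home/lgarrison/ceph/AbacusSummit/AbacusSummit_hugebase_c000_ph000/halos/z0.100/',
                         fields=['N','x_L2com'],
                         filter_func=lambda c: c['N'] >= 100,
                         )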
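
To illustrate why filtering per superslab lowers peak memory, here is a minimal sketch using plain NumPy structured arrays. It is not the abacusutils implementation: `load_superslab` and `load_filtered` are hypothetical stand-ins, and the random data only mimics the 'N' and 'x_L2com' fields from the example above.

```python
import numpy as np

def load_superslab(i, nrows=1000):
    """Hypothetical stand-in for reading one superslab's halo table from disk."""
    rng = np.random.default_rng(i)
    slab = np.empty(nrows, dtype=[('N', 'i4'), ('x_L2com', 'f4', (3,))])
    slab['N'] = rng.integers(1, 500, size=nrows)
    slab['x_L2com'] = rng.random((nrows, 3), dtype=np.float32)
    return slab

def load_filtered(nslabs, filter_func=None):
    """Load superslabs one at a time, keeping only the rows that pass the filter."""
    kept = []
    for i in range(nslabs):
        slab = load_superslab(i)
        if filter_func is not None:
            mask = filter_func(slab)  # boolean mask, one entry per row
            slab = slab[mask]         # copies the surviving rows; the full slab can then be freed
        kept.append(slab)
    # Only the filtered pieces are ever concatenated
    return np.concatenate(kept)

halos = load_filtered(4, filter_func=lambda c: c['N'] >= 100)
```

In this sketch, at most one unfiltered superslab is held in memory at a time, alongside the rows that have already passed the filter.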

If subsamples are requested, only those belonging to the surviving halos are loaded, but peak memory usage is still higher than it needs to be. That part of the code needs a refactor anyway (#7), but that's a future issue.

This required some refactoring of how the cleaning files are read: the original and cleaned files are now read as a pair for each superslab, rather than reading all the originals and then all the cleaned files.
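
The change in read order can be sketched as follows. This is only an illustration, not the code from this PR: `iter_superslab_pairs` is a hypothetical helper, and the file names are assumed for the example.

```python
def iter_superslab_pairs(orig_files, cleaned_files):
    """Yield (original, cleaned) file pairs, one superslab at a time."""
    if len(orig_files) != len(cleaned_files):
        raise ValueError("expected one cleaned file per original file")
    # Sorting keeps the two lists aligned by superslab index
    yield from zip(sorted(orig_files), sorted(cleaned_files))

# Assumed file names, for illustration only
orig = [f"halo_info_{i:03d}.asdf" for i in range(3)]
cleaned = [f"cleaned_halo_info_{i:03d}.asdf" for i in range(3)]

# Order in which files would be opened: original 0, cleaned 0, original 1, ...
read_order = [name for pair in iter_superslab_pairs(orig, cleaned) for name in pair]
```

Reading in pairs means each superslab has both of its inputs in hand before the next one is opened, which is presumably what lets the filter run on it right away.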

@lgarrison lgarrison merged commit 872a926 into master Sep 8, 2021
@lgarrison lgarrison deleted the filter-func branch September 8, 2021 22:19