Closed
…overview of which projects have a payload larger than the main memory of the host system
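Such an overview can be sketched as follows; this is a minimal illustration, not the commit's actual script, and the function names (`total_payload_bytes`, `oversized_projects`) are hypothetical. It assumes a POSIX host for the `os.sysconf` memory query.

```python
import os


def total_payload_bytes(project_dir: str) -> int:
    """Sum the on-disk size of all files below project_dir."""
    total = 0
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def host_memory_bytes() -> int:
    """Total physical RAM of the host (POSIX only)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def oversized_projects(project_dirs):
    """Return project directories whose payload exceeds host main memory."""
    limit = host_memory_bytes()
    return [d for d in project_dirs if total_payload_bytes(d) > limit]
```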
…gnored; hopefully this also fixes the bug whereby yaml emits a complex-formatted payload
…ssing and batch analysis
…st modified, author from the microscope directory as user, atom types
…and fixing an incorrect name for a row
… 2020ish nionswift project files, add a bypass to process small datasets first, as the system is currently busy
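The small-datasets-first bypass can be sketched as a size-ordered work queue; a minimal illustration, assuming datasets are individual files and the helper name `smallest_first` is hypothetical:

```python
import os


def smallest_first(dataset_paths):
    """Order datasets by on-disk size so small ones are processed first
    while the system is busy; large ones run once capacity frees up."""
    return sorted(dataset_paths, key=os.path.getsize)
```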
… or data to the NeXus file, i.e., when the nionswift project parser is used in so-called analysis mode, metadata from data.npy and hfive files is read from the respective file headers only, instead of loading the entire content; this should consume substantially less main memory than before and thus also be faster
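For the .npy case, header-only inspection can be achieved with numpy's memory-mapped loading, which exposes shape and dtype without pulling the array payload into RAM; a minimal sketch, not the parser's actual code (the helper name `npy_metadata` is hypothetical):

```python
import numpy as np


def npy_metadata(path: str):
    """Inspect shape and dtype of a .npy file without reading the full payload.

    mmap_mode="r" memory-maps the file: only the header is parsed eagerly,
    array data stays on disk until (and unless) it is actually indexed.
    """
    arr = np.load(path, mmap_mode="r")
    return arr.shape, arr.dtype
```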
…t least the analysis mode should work fine; documented a potential memory leak that needs addressing soon, likely related to switching logs and log state variables not being freed by the garbage collector; one should check carefully when running pynxtools in production mode on the actual datasets.
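If the leak is indeed caused by switching logs without releasing the old handlers, one possible mitigation is to close and detach handlers explicitly before attaching the next one; a hedged sketch only, since the actual cause is not yet confirmed (the function name `switch_log_file` is hypothetical):

```python
import logging


def switch_log_file(logger: logging.Logger, new_path: str) -> None:
    """Detach and close existing handlers before attaching the new one,
    so old handler state does not linger and can be garbage-collected."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    logger.addHandler(logging.FileHandler(new_path))
```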
…a few files had nionswift project files missing; this commit configures the script to generate a complete nsproj_to_eln file that identifies which hashed results logs and yaml files belong to which project
…d ipynb with analyses of metadata
…respective subsequent function call already; implement a fix for cases where individual ndata files were corrupted, causing an uncaught IOError that stopped parsing. Right now we warn about this issue but continue parsing. The idea is that the pynxtools parser should always warn when specific portions cannot be parsed, but it should not stop in an uncontrolled manner, even if that results in a NeXus file that ends up empty or not sufficiently filled with instance data to qualify for matching with NXem. Otherwise, such uncontrolled throwing would affect automated processing pipelines in RDM systems, e.g., NOMAD, which is undesired.
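The warn-and-continue behavior can be sketched as a per-file try/except around the parse call; a minimal illustration, not the actual pynxtools code (the helpers `parse_ndata_file` and `parse_all` are hypothetical stand-ins):

```python
import logging

logger = logging.getLogger(__name__)


def parse_ndata_file(path: str) -> dict:
    """Hypothetical per-file parser; raises OSError on unreadable or corrupt files."""
    with open(path, "rb") as fh:
        return {"path": path, "n_bytes": len(fh.read())}


def parse_all(paths):
    """Warn about unparsable files and keep going, instead of aborting the whole run."""
    results = []
    for path in paths:
        try:
            results.append(parse_ndata_file(path))
        except (IOError, OSError) as exc:  # IOError is an alias of OSError in Python 3
            logger.warning("could not parse %s: %s -- continuing", path, exc)
    return results
```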
… different NXdata types, each dataset with no more than 8 GiB payload; covering all 35 different types would demand allowing the processing of even the largest datasets with 130 GiB payload, which we do not wish to pursue after the Dec 11th talk, in order to stay focused while still sampling broadly
Collaborator
Author
Included and functionally superseded by #157
No description provided.