Conversation
@zhiltsov-max, is there any difficulty in resolving this: "The order of elements in a Dataset is maintained, but is not guaranteed to be the same after saving and loading"? I'm not sure it is critical, but I prefer deterministic behavior if it is easy to achieve.
@nmanovic, if a format represents a dataset as several subset files, it is impossible to reproduce the initial item ordering. Example:
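The example from the original comment is not preserved here. As a stand-in, the following is a minimal sketch of the effect, assuming a hypothetical format that writes one file per subset. `save_by_subset` and `load_by_subset` are illustrative helpers, not Datumaro functions, and the "files" are plain in-memory lists.

```python
from collections import defaultdict


def save_by_subset(items):
    """Group (item_id, subset) pairs into one 'file' (a list) per subset."""
    files = defaultdict(list)
    for item_id, subset in items:
        files[subset].append(item_id)
    return dict(files)


def load_by_subset(files):
    """Read the subset 'files' back one after another, in sorted name order."""
    return [
        (item_id, subset)
        for subset in sorted(files)
        for item_id in files[subset]
    ]


# Items of two subsets are interleaved in the original dataset.
original = [("a", "train"), ("b", "val"), ("c", "train")]
restored = load_by_subset(save_by_subset(original))

# The same items come back, but the interleaving between subsets is lost:
# nothing in the per-subset files records that "b" came before "c".
print(restored)
```

Order inside each subset survives, so a format could only restore the global order by storing extra information, such as a per-item index.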
@zhiltsov-max, should we update the documentation? Are you planning to add some short tutorials for the new use cases?
@nmanovic, I'd prefer to update the documentation after the new API for operations is introduced; otherwise the changes are hard to follow. Small, catchy examples were added earlier and they still work, but now they also perform well thanks to the added transparent caching. Thorough documentation will be added with r0.2 (VCS) / r0.3 (stable API), together with the stable API introduction.
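To illustrate what lazy operations with transparent caching can look like, here is a minimal sketch. `LazyDataset` is a hypothetical toy class written for this illustration; it does not reproduce the actual `Dataset` implementation or its API.

```python
class LazyDataset:
    """Toy dataset: transforms are queued and applied only on first read."""

    def __init__(self, source):
        self._source = source    # any iterable of items
        self._transforms = []    # queued operations, not yet applied
        self._cache = None       # filled on the first iteration

    def transform(self, func):
        """Queue a per-item operation without touching the data."""
        self._transforms.append(func)
        self._cache = None       # a new operation invalidates the cache
        return self

    def __iter__(self):
        if self._cache is None:
            items = list(self._source)
            for func in self._transforms:
                items = [func(item) for item in items]
            self._cache = items  # later reads reuse this result
        return iter(self._cache)


calls = []


def double(x):
    calls.append(x)  # record every real computation
    return x * 2


ds = LazyDataset([1, 2, 3]).transform(double)
assert calls == []   # nothing has been computed yet

first = list(ds)     # the transform runs here
second = list(ds)    # served from the cache, no recomputation
```

In this sketch the caller never manages the cache explicitly: it is filled on the first read and dropped whenever a new operation is queued.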
Summary
- `Dataset` operations are finally made lazy
- … `Dataset`
- The order of elements in a `Dataset` is maintained, but is not guaranteed to be the same after saving and loading
- … `Datumaro` format
- `Dataset` interface with cache control, changed data info, source path and format info
- `Dataset.get()` returns `None` instead of raising an exception when the item doesn't exist
- `in` operator for `Dataset`
- `get` operation for `Extractor`
- … `Dataset` class
- `Converter` interface is extended by optional operation to support partial data update (`patch()`). The default implementation uses the regular full-dataset saving.
- … `Exception`
- `Dataset` can track updates and generate patches. Transform is considered updating the whole dataset
- `Dataset.get_subset` provides modifiable slices

TBD:
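Separately from the open items, the lookup and patching behavior listed in the summary can be sketched as follows. `ToyDataset` and `ToyConverter` are hypothetical classes made up for this illustration, and the `patch(dataset, changed_ids)` signature is an assumption; the real `Dataset` and `Converter` signatures may differ.

```python
class ToyDataset:
    """Toy container keyed by item id."""

    def __init__(self, items=None):
        self._items = dict(items or {})

    def get(self, item_id):
        # Returns None for a missing item instead of raising an exception.
        return self._items.get(item_id)

    def __contains__(self, item_id):
        # Enables the `in` operator.
        return item_id in self._items

    def __iter__(self):
        return iter(self._items.items())


class ToyConverter:
    """Toy writer with an optional partial-update operation."""

    def __init__(self):
        self.saved = {}
        self.full_saves = 0

    def save(self, dataset):
        # Regular full-dataset saving.
        self.saved = dict(dataset)
        self.full_saves += 1

    def patch(self, dataset, changed_ids):
        # Default: ignore the change list and fall back to a full save.
        # A format that supports partial updates would override this.
        self.save(dataset)


ds = ToyDataset({"a": 1})
missing = ds.get("missing")   # no exception is raised
has_a = "a" in ds

conv = ToyConverter()
conv.patch(ds, changed_ids=["a"])
```

With this default, every converter accepts a patch request, and only formats that can genuinely write partial updates need to provide their own `patch()`.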
How to test
Checklist
- … `develop` branch
License
Feel free to contact the maintainers if that's a concern.