All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Add tabular data import/export (open-edge-platform#1089)
- Support video annotation import/export (open-edge-platform#1124)
- Add SAM OVMS and Triton server Docker image builders (open-edge-platform#1129)
- Remove xfail marks from the convert integration tests (open-edge-platform#1115)
- Enhance `ClassificationValidator` for multi-label classification datasets with `label_groups` (open-edge-platform#1116)
- Replace Roboflow `xml.etree` with `defusedxml` (open-edge-platform#1117)
- Define `GroupType` with `IntEnum`, where `0` is `EXCLUSIVE` (open-edge-platform#1116)
- Add Rust API to optimize COCOPageMapper performance (open-edge-platform#1120)
- Fix bugs for Tile transform (open-edge-platform#1123)
- Report errors for COCO (stream) and Datumaro importers (open-edge-platform#1110)
- Add documentation and notebook example for Prune API (open-edge-platform#1070)
- Changed supported Python version range (>=3.8, <=3.11) (open-edge-platform#1083)
- Migrate OpenVINO v2023.0.0 (open-edge-platform#1036)
- Add Roboflow data format support (COCO JSON, Pascal VOC XML, YOLOv5-PyTorch, YOLOv7-PyTorch, YOLOv8, YOLOv5 Oriented Bounding Boxes, Multiclass CSV, TFRecord, CreateML JSON) (open-edge-platform#1044)
- Add MissingAnnotationDetection transform (open-edge-platform#1049, open-edge-platform#1063, open-edge-platform#1064)
- Add OVMSLauncher (open-edge-platform#1056)
- Add Prune API (open-edge-platform#1058)
- Add TritonLauncher (open-edge-platform#1059)
- Migrate DVC v3.0.0 (open-edge-platform#1072)
- Stream dataset import/export (open-edge-platform#1077, open-edge-platform#1081, open-edge-platform#1082, open-edge-platform#1091, open-edge-platform#1093, open-edge-platform#1098, open-edge-platform#1102)
- Support mask annotations for CVAT data format (open-edge-platform#1078)
- Support list query for explorer (open-edge-platform#1087)
- Update contributing.md (open-edge-platform#1094)
- Update 3rd-party.txt for release 1.4.0 (open-edge-platform#1099)
- Give notice that the deprecation work will be done in datumaro==1.5.0 (open-edge-platform#1085)
- Unify COCO, Datumaro, VOC, YOLO importer/exporter progress reporter descriptions (open-edge-platform#1100)
- Enhance import performance for built-in plugins (open-edge-platform#1031)
- Change default dtype of load_image() to np.uint8 (open-edge-platform#1041)
- Add OTX ATSS detector model interpreter & refactor interfaces (open-edge-platform#1047)
- Refactor Launcher and ModelInterpreter (open-edge-platform#1055)
- Add CVAT data format document (open-edge-platform#1060)
- Reduce peak memory usage when importing COCO and Datumaro formats (open-edge-platform#1061)
- Enhance the error message for datum stats to be more user friendly (open-edge-platform#1069)
- Refactor dataset.py to separate DatasetStorage (open-edge-platform#1073)
- Create cache dir under only writable filesystem (open-edge-platform#1088)
- Fix: Dataset infos() can be broken if a transform not redefining infos() is stacked on the top (open-edge-platform#1101)
- Fix warnings in test_visualizer.py (open-edge-platform#1039)
- Fix LabelMe data format (open-edge-platform#1053)
- Prevent installing protobuf>=4 (open-edge-platform#1054)
- Fix UnionMerge (open-edge-platform#1086)
- Let CocoBase continue even if an InvalidAnnotationError is raised (open-edge-platform#1050)
- Install dvc version to 2.x (open-edge-platform#1048)
- Replace np.append() in Validator (open-edge-platform#1050)
- Fix Cityscapes format mis-detection (open-edge-platform#1029)
- Add CocoRoboflowImporter (open-edge-platform#976, open-edge-platform#1000)
- Add SynthiaSfImporter and SynthiaAlImporter (open-edge-platform#987)
- Add intermediate skill docs for filter (open-edge-platform#996)
- Add VocInstanceSegmentationImporter and VocInstanceSegmentationExporter (open-edge-platform#997)
- Add Segment Anything data format support (open-edge-platform#1005, open-edge-platform#1009)
- Add Correct transformation (open-edge-platform#1006)
- Implement ReindexAnnotations transform (open-edge-platform#1008)
- Add notebook examples for importing/exporting detection and segmentation data (open-edge-platform#1020, open-edge-platform#1023)
- Update CLI from diff to compare, add TableComparator (open-edge-platform#1012)
- Use autosummary for fully-automatic Python module docs generation (open-edge-platform#973)
- Enrich stack trace for better user experience when importing (open-edge-platform#992)
- Save and load hashkey for explorer (open-edge-platform#981) (open-edge-platform#1003)
- Add MOT and MOTS data format docs (open-edge-platform#999)
- Improve RemoveAnnotations to remove specific annotations with ids (open-edge-platform#1004)
- Add Jupyter notebook example of noisy label detection for detection tasks (open-edge-platform#1011)
- Fix Mapillary Vistas data format (open-edge-platform#977)
- Fix `bytes` property returning `None` if function is given to `data` (open-edge-platform#978)
- Fix Synthia-Rand data format (open-edge-platform#987)
- Fix `person_layout` categories and `action_classification` attributes in imported Pascal-VOC dataset (open-edge-platform#997)
- Drop a malformed transform from StackedTransform automatically (open-edge-platform#1001)
- Fix `Cityscapes` to drop `ImgsFine` directory (open-edge-platform#1023)
- Fix project level CVAT for images format import (open-edge-platform#980)
- Fix an info message when using the convert CLI command with no args.input_format (open-edge-platform#982)
- Fix media contents not returning bytes in arrow format (open-edge-platform#986)
- Add Skill Up section to documentation (open-edge-platform#920, open-edge-platform#933, open-edge-platform#935, open-edge-platform#945, open-edge-platform#949, open-edge-platform#953, open-edge-platform#959, open-edge-platform#960, open-edge-platform#967)
- Add LossDynamicsAnalyzer for noisy label detection (open-edge-platform#928)
- Add Apache Arrow format support (open-edge-platform#931, open-edge-platform#948)
- Add sort transform (open-edge-platform#931)
- Add multiprocessing to DatumaroBinaryBase (open-edge-platform#897)
- Refactor merge code (open-edge-platform#901, open-edge-platform#906)
- Refactor download CLI commands (open-edge-platform#909)
- Refactor CLI commands w/ and w/o project (open-edge-platform#910, open-edge-platform#952)
- Refactor Media to be initialized from explicit sources (open-edge-platform#911, open-edge-platform#921, open-edge-platform#944)
- Refactor hl_ops.py (open-edge-platform#912)
- Add tfds:uc_merced and tfds:eurosat download (open-edge-platform#914)
- Migrate documentation framework to Sphinx (open-edge-platform#917, open-edge-platform#922, open-edge-platform#947, open-edge-platform#954, open-edge-platform#958, open-edge-platform#961, open-edge-platform#962, open-edge-platform#963, open-edge-platform#964, open-edge-platform#965, open-edge-platform#969)
- Update merge tutorial for a real-life use case (open-edge-platform#930)
- Abbreviate "detect-format" to "detect" for prettifying (open-edge-platform#951)
- Add UserWarning if an invalid media_type comes to image statistics computation (open-edge-platform#891)
- Fix negated `is_encrypted` (open-edge-platform#907)
- Save extra images of PointCloud when exporting to datumaro format (open-edge-platform#918)
- Fix log issue when importing celeba and align celeba dataset (open-edge-platform#919)
- Fix to not export absolute media path in Datumaro and DatumaroBinary formats (open-edge-platform#896)
- Change pypi_publish.yml to publish_sdist_to_pypi.yml (open-edge-platform#895)
- Add with_subset_dirs decorator (Add ImagenetWithSubsetDirsImporter) (open-edge-platform#816)
- Add CommonSemanticSegmentationWithSubsetDirsImporter (open-edge-platform#826)
- Add DatumaroBinary format (open-edge-platform#828, open-edge-platform#829, open-edge-platform#830, open-edge-platform#831, open-edge-platform#880, open-edge-platform#883)
- Add Explorer CLI documentation (open-edge-platform#838)
- Add version to dataset exported as datumaro format (open-edge-platform#842)
- Add Ava action data format support (open-edge-platform#847)
- Add Shift Analyzer (both covariate and label shifts) (open-edge-platform#855)
- Add YOLO Loose format (open-edge-platform#856)
- Add Ultralytics YOLO format (open-edge-platform#859)
- Refactor Datumaro format code and test code (open-edge-platform#824)
- Add publish to PyPI Github action (open-edge-platform#867)
- Add --no-media-encryption option (open-edge-platform#875)
- Fix image filenames and anomaly mask appearance in MVTec exporter (open-edge-platform#835)
- Fix CIFAR10 and 100 detect function (open-edge-platform#836)
- Fix celeba and align_celeba detect function (open-edge-platform#837)
- Choose the top priority detect format for all directory depths (open-edge-platform#839)
- Fix MVTec format detect function (open-edge-platform#843)
- Fix wrong `__len__()` of Subset when the item is removed (open-edge-platform#854)
- Fix mask visualization bug (open-edge-platform#860)
- Fix detect unit tests to test false negatives as well (open-edge-platform#868)
- Add Data Explorer (open-edge-platform#773)
- Add Ellipse annotation type (open-edge-platform#807)
- Add MVTec anomaly data support (open-edge-platform#810)
- Refactor existing tests (open-edge-platform#803)
- Raise ImportError on importing malformed COCO directory (open-edge-platform#812)
- Remove the duplicated and cyclical category context in documentation (open-edge-platform#822)
- Fix for importing CVAT image 1.1 data format exported to project level (open-edge-platform#795)
- Fix a problem on setting log-level via CLI (open-edge-platform#800)
- Fix code format with the latest black==23.1.0 (open-edge-platform#802)
- Fix Explain command cannot find the model (#721) (open-edge-platform#804)
- Fix a problem found on model remove CLI command (open-edge-platform#805)
- Add Tile transformation (open-edge-platform#790)
- Add Video keyframe extraction (open-edge-platform#791)
- Add TileTransform documentation and Jupyter notebook example (open-edge-platform#794)
- Add MergeTile transformation (open-edge-platform#796)
- Improved mask_to_rle performance (open-edge-platform#770)
- N/A
- N/A
- Fix MacOS CI failures (open-edge-platform#789)
- Fix auto-documentation for the data_format plugins (open-edge-platform#793)
- Add security.md file for the SDL (open-edge-platform#798)
- Support for exclusive of labels with LabelGroup (open-edge-platform#742)
- Jupyter samples
  - Introducing how to merge datasets (open-edge-platform#738)
  - Introducing how to visualize dataset (open-edge-platform#747)
  - Introducing how to filter dataset (open-edge-platform#748)
  - Introducing how to transform dataset (open-edge-platform#759)
- Visualization Python API
  - Bbox feature (open-edge-platform#744)
  - Label, Points, Polygon, PolyLine, and Caption visualization features (open-edge-platform#746)
  - Mask, SuperResolution, Depth visualization features (open-edge-platform#747)
- Documentation for Python API (open-edge-platform#753)
  - dataset handler, visualizer, filter descriptions (open-edge-platform#761)
- `__repr__` for Dataset (open-edge-platform#750)
- Support for exporting as CVAT video format (open-edge-platform#757)
- CodeCov coverage reporting feature to CI/CD (open-edge-platform#756)
- Jupyter notebook example rendering to documentation (open-edge-platform#758)
- An interface to manipulate 'infos' to store the dataset meta-info (open-edge-platform#767)
- 'bbox' annotation when importing a COCO dataset (open-edge-platform#772)
- Wrap title text according to its plot width (open-edge-platform#769)
- Get list of subsets and support only Image media type in visualizer (open-edge-platform#768)
- N/A
- N/A
- Correcting static type checking (open-edge-platform#743)
- Fixing a VOC dataset export when a label contains 'space' (open-edge-platform#771)
- N/A
- Support for custom media types, new `PointCloud` media type, `DatasetItem.media` and `.media_as(type)` members (open-edge-platform#539)
- [API] A way to request dataset and extractor media type with `media_type` (open-edge-platform#539)
- BraTS format (import-only) (.npy and .nii.gz), new `MultiframeImage` media type (open-edge-platform#628)
- Common Semantic Segmentation dataset format (import-only) (open-edge-platform#685)
- An option to disable `data/` prefix inclusion in YOLO export (open-edge-platform#689)
- New command `describe-downloads` to print information about downloadable datasets (open-edge-platform#678)
- Detection for Cityscapes format (open-edge-platform#680)
- Maximum recursion `--depth` parameter for `detect-dataset` CLI command (open-edge-platform#680)
- An option to save a single subset in the `download` command (open-edge-platform#697)
- Common Super Resolution dataset format (import-only) (open-edge-platform#700)
- Kinetics 400/600/700 dataset format (import-only) (open-edge-platform#706)
- NYU Depth Dataset V2 format (import-only) (open-edge-platform#712)
- `env.detect_dataset()` now returns a list of detected formats at all recursion levels instead of just the lowest one (open-edge-platform#680)
- Open Images: allowed to store annotations file in root path as well (open-edge-platform#680)
- Improved parsing error messages in COCO, VOC and YOLO formats (open-edge-platform#684, open-edge-platform#686, open-edge-platform#687)
- YOLO format now supports almost any subset names, except `backup`, `names` and `classes` (instead of just `train` and `valid`). The reserved names now raise an error on exporting. (open-edge-platform#688)
- `--save-images` is replaced with `--save-media` in CLI and converter API (open-edge-platform#539)
- [API] `image`, `point_cloud` and `related_images` of `DatasetItem` are replaced with `media` and `media_as(type)` members and c-tor parameters (open-edge-platform#539)
- N/A
- Detection for LFW format (open-edge-platform#680)
- Adding depth value of image when dataset is exported in VOC format (open-edge-platform#726)
- Adding to handle the numerical labels in task chains properly (open-edge-platform#726)
- Fixing the issue that annotations inside another annotation (polygon) are duplicated during import for VOC format (open-edge-platform#726)
- N/A
- Ability to import a video as frames with the `video_frames` format and to split a video into frames with the `datum util split_video` command (open-edge-platform#555)
- `--subset` parameter in the `image_dir` format (open-edge-platform#555)
- `MediaManager` API to control loaded media resources at runtime (open-edge-platform#555)
- Command to detect the format of a dataset (open-edge-platform#576)
- More comfortable access to library API via `import datumaro` (open-edge-platform#630)
- CLI command-like free functions (`export`, `transform`, ...) (open-edge-platform#630)
- Reading specific annotation files for train dataset in Cityscapes (open-edge-platform#632)
- Random sampling transforms (`random_sampler`, `label_random_sampler`) to create smaller datasets from bigger ones (open-edge-platform#636, open-edge-platform#640)
- API to report dataset import and export progress; API to report dataset import and export errors and take action (skip, fail) (supported in COCO, VOC and YOLO formats) (open-edge-platform#650)
- Support for downloading the ImageNetV2 and COCO datasets (open-edge-platform#653, open-edge-platform#659)
- A way for formats to signal that they don't support detection (open-edge-platform#665)
- Removal transforms to remove items/annotations/attributes from dataset (`remove_items`, `remove_annotations`, `remove_attributes`) (open-edge-platform#670)
- Allowed direct file paths in `datum import`. Such sources are imported as if the `rpath` parameter were specified; however, only the selected path is copied into the project (open-edge-platform#555)
- Improved `stats` performance, added new filtering parameters, image stats (`unique`, `repeated`) moved to the `dataset` section, removed `mean` and `std` from the `dataset` section (open-edge-platform#621)
- Allowed `Image` creation from just `size` info (open-edge-platform#634)
- Added image search in VOC XML-based subformats (open-edge-platform#634)
- Added image path equality checks in simple merge, when applicable (open-edge-platform#634)
- Supported saving box attributes when downloading the TFDS version of VOC (open-edge-platform#668)
- Switched to a `pyproject.toml`-based build (open-edge-platform#671)
- TBD
- Official support of Python 3.6 (due to its EOL) (open-edge-platform#617)
- Backward compatibility annotation symbols in `components.extractor` (open-edge-platform#630)
- Prohibited calling `add`, `import` and `export` commands without a project (open-edge-platform#555)
- Calling `make_dataset` on empty project tree now produces the error properly (open-edge-platform#555)
- Saving (overwriting) a dataset in a project when rpath is used (open-edge-platform#613)
- Output image extension preserving in the `Resize` transform (open-edge-platform#606)
- Memory overuse in the `Resize` transform (open-edge-platform#607)
- Invalid image pixels produced by the `Resize` transform (open-edge-platform#618)
- Numeric warnings that sometimes occurred in `stats` command (e.g. open-edge-platform#607) (open-edge-platform#621)
- Added missing item attribute merging in simple merge (open-edge-platform#634)
- Inability to disambiguate VOC from LabelMe in some cases (open-edge-platform#658)
- TBD
- Command to download public datasets (open-edge-platform#582)
- Extension autodetection in `ByteImage` (open-edge-platform#595)
- MPII Human Pose Dataset (import-only) (.mat and .json) (open-edge-platform#584)
- MARS format (import-only) (open-edge-platform#585)
- The `pycocotools` dependency lower bound is raised to `2.0.4`. (open-edge-platform#449)
- `smooth_line` from `datumaro.util.annotation_util` - the function is renamed to `approximate_line` and has updated interface (open-edge-platform#592)
- Python 3.6 support
- TBD
- Fails in multimerge when lines are not approximated and when there are no label categories (open-edge-platform#592)
- Cannot convert LabelMe dataset, that has no subsets (open-edge-platform#600)
- TBD
- Video reading API (open-edge-platform#521)
- Python API documentation (open-edge-platform#526)
- Mapillary Vistas dataset format (Import-only) (open-edge-platform#537)
- Datumaro can now be installed on Windows on Python 3.9 (open-edge-platform#547)
- Import for SYNTHIA dataset format (open-edge-platform#532)
- Support of `score` attribute in KITTI detection (open-edge-platform#571)
- Support for Accuracy Checker dataset meta files in formats (open-edge-platform#553, open-edge-platform#569, open-edge-platform#575)
- Import for VoTT dataset format (open-edge-platform#573)
- Image resizing transform (open-edge-platform#581)
- The following formats can now be detected unambiguously: `ade20k2017`, `ade20k2020`, `camvid`, `coco`, `cvat`, `datumaro`, `icdar_text_localization`, `icdar_text_segmentation`, `icdar_word_recognition`, `imagenet_txt`, `kitti_raw`, `label_me`, `lfw`, `mot_seq`, `open_images`, `vgg_face2`, `voc`, `widerface`, `yolo` (open-edge-platform#531, open-edge-platform#536, open-edge-platform#550, open-edge-platform#557, open-edge-platform#558)
- Allowed Pytest-native tests (open-edge-platform#563)
- Allowed export options in the `datum merge` command (open-edge-platform#545)
- Using `Image`, `ByteImage` from `datumaro.util.image` - these classes are moved to `datumaro.components.media` (open-edge-platform#538)
- Equality comparison support between `datumaro.components.media.Image` and `numpy.ndarray` (open-edge-platform#568)
- Bug #560: import issue with MOT dataset when using seqinfo.ini file (open-edge-platform#564)
- Empty lines in VOC subset lists are not ignored (open-edge-platform#587)
- TBD
- Import for CelebA dataset format. (open-edge-platform#484)
- File `people.txt` became optional in LFW (open-edge-platform#509)
- File `image_ids_and_rotation.csv` became optional in Open Images (open-edge-platform#509)
- Allowed underscores (`_`) in subset names in COCO (open-edge-platform#509)
- Allowed annotation files with arbitrary names in COCO (open-edge-platform#509)
- The `icdar_text_localization` format is no longer detected in every directory (open-edge-platform#531)
- Updated `pycocotools` version to 2.0.2 (open-edge-platform#534)
- TBD
- TBD
- Unhandled exception when a file is specified as the source for a COCO or MOTS dataset (open-edge-platform#530)
- Exporting dataset without `color` attribute into the `icdar_text_segmentation` format (open-edge-platform#556)
- TBD
- A new installation target: `pip install datumaro[default]`, which should be used by default. The simple `datumaro` is intended for library users. (open-edge-platform#238)
- Dataset and project versioning capabilities (Git-like) (open-edge-platform#238)
- "dataset revpath" concept in CLI, allowing to pass a dataset path with the dataset format in `diff`, `merge`, `explain` and `info` CLI commands (open-edge-platform#238)
- `import`, `remove`, `commit`, `checkout`, `log`, `status`, `info` CLI commands (open-edge-platform#238)
- `Coco*Extractor` classes now have an option to preserve label IDs from the original annotation file (open-edge-platform#453)
- `patch` CLI command to patch datasets (open-edge-platform#401)
- `ProjectLabels` transform to change dataset labels for merging etc. (open-edge-platform#401, open-edge-platform#478)
- Support for custom labels in the KITTI detection format (open-edge-platform#481)
- Type annotations and docs for Annotation classes (open-edge-platform#493)
- Options to control label loading behavior in `imagenet_txt` import (open-edge-platform#434, open-edge-platform#489)
- A project can contain and manage multiple datasets instead of a single one. CLI operations can be applied to the whole project, or to separate datasets. Datasets are modified inplace, by default (open-edge-platform#328)
- CLI help for builtin plugins doesn't require project (open-edge-platform#328)
- Annotation-related classes were moved into a new module, `datumaro.components.annotation` (open-edge-platform#439)
- Rollback utilities replaced with Scope utilities (open-edge-platform#444)
- The `Project` class from `datumaro.components` is changed completely (open-edge-platform#238)
- `diff` and `ediff` are joined into a single `diff` CLI command (open-edge-platform#238)
- Projects use new file layout, incompatible with old projects. An old project can be updated with `datum project migrate` (open-edge-platform#238)
- Inheriting `CliPlugin` is not required in plugin classes (open-edge-platform#238)
- `Importer`s do not create `Project`s anymore and just return a list of extractor configurations (open-edge-platform#238)
- TBD
- `import`, `project merge` CLI commands (open-edge-platform#238)
- Support for project hierarchies. A project cannot be a source anymore (open-edge-platform#238)
- Project cannot have independent internal dataset anymore. All the project data must be stored in the project data sources (open-edge-platform#238)
- `datumaro_project` format (open-edge-platform#238)
- Unused `path` field of `DatasetItem` (open-edge-platform#455)
- Deprecation warning in `open_images_format.py` (open-edge-platform#440)
- `lazy_image` returning unrelated data sometimes (open-edge-platform#409)
- Invalid call to `pycocotools.mask.iou` (open-edge-platform#450)
- Importing of Open Images datasets without image data (open-edge-platform#463)
- Return value type in `Dataset.is_modified` (open-edge-platform#401)
- Remapping of secondary categories in `RemapLabels` (open-edge-platform#401)
- VOC dataset patching for classification and segmentation tasks (open-edge-platform#478)
- Exported mask label ids in KITTI segmentation (open-edge-platform#481)
- Missing `label` for `Points` read in the LFW format (open-edge-platform#494)
- TBD
- The Open Images format now supports bounding box and segmentation mask annotations (open-edge-platform#352, open-edge-platform#388).
- Bounding boxes values decrement transform (open-edge-platform#366)
- Improved error reporting in `Dataset` (open-edge-platform#386)
- Support ADE20K format (import only) (open-edge-platform#400)
- Documentation website at https://openvinotoolkit.github.io/datumaro (open-edge-platform#420)
- Datumaro no longer depends on scikit-image (open-edge-platform#379)
- `Dataset` remembers export options on saving / exporting for the first time (open-edge-platform#386)
- TBD
- TBD
- Application of `remap_labels` to dataset categories of different length (open-edge-platform#314)
- Patching of datasets in formats (open-edge-platform#348)
- Improved Cityscapes export performance (open-edge-platform#367)
- Incorrect format of `*_labelIds.png` in Cityscapes export (open-edge-platform#325, open-edge-platform#342)
- Item id in ImageNet format (open-edge-platform#371)
- Double quotes for ICDAR Word Recognition (open-edge-platform#375)
- Wrong display of builtin formats in CLI (open-edge-platform#332)
- Non utf-8 encoding of annotation files in Market-1501 export (open-edge-platform#392)
- Import of ICDAR, PASCAL VOC and VGGFace2 images from subdirectories on Windows (open-edge-platform#392)
- Saving of images with Unicode paths on Windows (open-edge-platform#392)
- Calling `ProjectDataset.transform()` with a string argument (open-edge-platform#402)
- Attributes casting for CVAT format (open-edge-platform#403)
- Loading of custom project plugins (open-edge-platform#404)
- Reading, writing anno file and saving name of the subset for test subset (open-edge-platform#447)
- Fixed unsafe unpickling in CIFAR import (open-edge-platform#362)
- Support for import/export zip archives with images (open-edge-platform#273)
- Subformat importers for VOC and COCO (open-edge-platform#281)
- Support for KITTI dataset segmentation and detection format (open-edge-platform#282)
- Updated YOLO format user manual (open-edge-platform#295)
- `ItemTransform` class, which describes item-wise dataset `Transform`s (open-edge-platform#297)
- `keep-empty` export parameter in VOC format (open-edge-platform#297)
- A base class for dataset validation plugins (open-edge-platform#299)
- Partial support for the Open Images format; only images and image-level labels can be read/written (open-edge-platform#291, open-edge-platform#315).
- Support for Supervisely Point Cloud dataset format (open-edge-platform#245, open-edge-platform#353)
- Support for KITTI Raw / Velodyne Points dataset format (open-edge-platform#245)
- Support for CIFAR-100 and documentation for CIFAR-10/100 (open-edge-platform#301)
- Tensorflow AVX check is made optional in API and disabled by default (open-edge-platform#305)
- Extensions for images in ImageNet_txt are now mandatory (open-edge-platform#302)
- Several dependencies now have lower bounds (open-edge-platform#308)
- TBD
- TBD
- Incorrect image layout on saving and a problem with encoding on loading (open-edge-platform#284)
- An error when XPath filter is applied to the dataset or its subset (open-edge-platform#259)
- Tracking of `Dataset` changes done by transforms (open-edge-platform#297)
- Improved CLI startup time in several cases (open-edge-platform#306)
- Known issue: loading CIFAR can result in arbitrary code execution (open-edge-platform#327)
- Support for escaping in attribute values in LabelMe format (open-edge-platform#49)
- Support for Segmentation Splitting (open-edge-platform#223)
- Support for CIFAR-10/100 dataset format (open-edge-platform#225, open-edge-platform#243)
- Support for COCO panoptic and stuff format (open-edge-platform#210)
- Documentation file and integration tests for Pascal VOC format (open-edge-platform#228)
- Support for MNIST and MNIST in CSV dataset formats (open-edge-platform#234)
- Documentation file for COCO format (open-edge-platform#241)
- Documentation file and integration tests for YOLO format (open-edge-platform#246)
- Support for Cityscapes dataset format (open-edge-platform#249)
- Support for Validator configurable threshold (open-edge-platform#250)
- LabelMe format saves dataset items with their relative paths by subsets without changing names (open-edge-platform#200)
- Allowed arbitrary subset count and names in classification and detection splitters (open-edge-platform#207)
- Annotation-less dataset elements now participate in subset splitting (open-edge-platform#211)
- Classification task in LFW dataset format (open-edge-platform#222)
- Testing is now performed with pytest instead of unittest (open-edge-platform#248)
- TBD
- TBD
- Added support for auto-merging (joining) of datasets with no labels and having labels (open-edge-platform#200)
- Allowed explicit label removal in `remap_labels` transform (open-edge-platform#203)
- Image extension in CVAT format export (open-edge-platform#214)
- Added a label "face" for bounding boxes in Wider Face (open-edge-platform#215)
- Allowed adding "difficult", "truncated", "occluded" attributes when converting to Pascal VOC if these attributes are not present (open-edge-platform#216)
- Empty lines in YOLO annotations are ignored (open-edge-platform#221)
- Export in VOC format when no image info is available (open-edge-platform#239)
- Fixed saving attribute in WiderFace extractor (open-edge-platform#251)
- TBD
- TBD
- Added an option to allow undeclared annotation attributes in CVAT format export (open-edge-platform#192)
- COCO exports images in separate dirs by subsets. Added an option to control this (open-edge-platform#195)
- TBD
- TBD
- Instance masks of `background` class no longer introduce an instance (open-edge-platform#188)
- Added support for label attributes in Datumaro format (open-edge-platform#192)
- TBD
- OpenVINO plugin examples (open-edge-platform#159)
- Dataset validation for classification and detection datasets (open-edge-platform#160)
- Arbitrary image extensions in formats (import and export) (open-edge-platform#166)
- Ability to set a custom subset name for an imported dataset (open-edge-platform#166)
- CLI support for NDR (open-edge-platform#178)
- Common ICDAR format is split into 3 sub-formats (open-edge-platform#174)
- TBD
- TBD
- The ability to work with file names containing Cyrillic and spaces (open-edge-platform#148)
- Image reading and saving in ICDAR formats (open-edge-platform#174)
- Unnecessary image loading on dataset saving (open-edge-platform#176)
- Allowed spaces in ICDAR captions (open-edge-platform#182)
- Saving of masks in VOC when masks are not requested (open-edge-platform#184)
- TBD
- TBD
- TBD
- TBD
- TBD
- Images with no annotations are exported again in VOC formats (open-edge-platform#123)
- Inference result for only one output layer in OpenVINO launcher (open-edge-platform#125)
- TBD
- `Icdar13/15` dataset format (open-edge-platform#96)
- Laziness, source caching, tracking of changes and partial updating for `Dataset` (open-edge-platform#102)
- `Market-1501` dataset format (open-edge-platform#108)
- `LFW` dataset format (open-edge-platform#110)
- Support of polygons' and masks' confusion matrices and mismatching classes in `diff` command (open-edge-platform#117)
- Add near duplicate image removal plugin (open-edge-platform#113)
- Sampler Plugin that analyzes inference result from the given dataset and selects samples for annotation (open-edge-platform#115)
- OpenVINO model launcher is updated for OpenVINO r2021.1 (open-edge-platform#100)
- TBD
- TBD
- High memory consumption and low performance of mask import/export, #53 (open-edge-platform#101)
- Masks, covered by class 0 (background), should be exported with holes inside (open-edge-platform#104)
- `diff` command invocation problem with missing class methods (open-edge-platform#117)
- TBD
- `WiderFace` dataset format (open-edge-platform#65, open-edge-platform#90)
- Function to transform annotations to labels (open-edge-platform#66)
- Dataset splits for classification, detection and re-id tasks (open-edge-platform#68, open-edge-platform#81)
- `VGGFace2` dataset format (open-edge-platform#69, open-edge-platform#82)
- Unique image count statistic (open-edge-platform#87)
- Installation with pip by name `datumaro`
- `Dataset` class extended with new operations: `save`, `load`, `export`, `import_from`, `detect`, `run_model` (open-edge-platform#71)
- Allowed importing `Extractor`-only defined formats (in `Project.import_from`, `dataset.import_from` and CLI/project import) (open-edge-platform#71)
- `datum project ...` commands replaced with `datum ...` commands (open-edge-platform#84)
- Supported more image formats in `ImageNet` extractors (open-edge-platform#85)
- Allowed adding `Importer`-defined formats as project sources (`source add`) (open-edge-platform#86)
- Added max search depth in `ImageDir` format and importers (open-edge-platform#86)
- `datum project ...` CLI context (open-edge-platform#84)
- TBD
- Allow plugins inherited from `Extractor` (instead of only `SourceExtractor`) (open-edge-platform#70)
- Windows installation with `pip` for `pycocotools` (open-edge-platform#73)
- `YOLO` extractor path matching on Windows (open-edge-platform#73)
- Fixed inplace file copying when saving images (open-edge-platform#76)
- Fixed `labelmap` parameter type checking in `VOC` converter (open-edge-platform#76)
- Fixed model copying on addition in CLI (open-edge-platform#94)
- TBD
- `CamVid` dataset format (open-edge-platform#57)
- Ability to install `opencv-python-headless` dependency with `DATUMARO_HEADLESS=1` environment variable instead of `opencv-python` (open-edge-platform#62)
- Allow empty supercategory in COCO (open-edge-platform#54)
- Allow Pascal VOC to search in subdirectories (open-edge-platform#50)
- TBD
- TBD
- TBD
- TBD
- `ImageNet` and `ImageNetTxt` dataset formats (open-edge-platform#41)
- TBD
- TBD
- TBD
- Default `label-map` parameter value for VOC converter (open-edge-platform#34)
- Randomness of random split transform (open-edge-platform#38)
- `Transform.subsets()` method (open-edge-platform#38)
- Supported unknown image formats in TF Detection API converter (open-edge-platform#40)
- Supported empty attribute values in CVAT extractor (open-edge-platform#45)
- TBD
- `ByteImage` class to represent encoded images in memory and avoid recoding on save (open-edge-platform#27)
- Implementation of format plugins simplified (open-edge-platform#22)
- `default` is now a default subset name, instead of `None`. The values are interchangeable. (open-edge-platform#22)
- Improved performance of transforms (open-edge-platform#22)
- TBD
- `image/depth` value from VOC export (open-edge-platform#27)
- Zero division errors in dataset statistics (open-edge-platform#31)
- TBD
- `reindex` option in COCO and CVAT converters (open-edge-platform#18)
- Support for relative paths in LabelMe format (open-edge-platform#19)
- MOTS png mask format support (https://github.com/openvinotoolkit/datumaro/21)
- TBD
- TBD
- TBD
- TBD
- TBD
- Initial release
## [Unreleased]
### New features
- TBD
### Enhancements
- TBD
### Deprecated
- TBD
### Removed
- TBD
### Bug fixes
- TBD
### Security
- TBD