open-edge-platform · wonjuleee · Apr 5, 2023 · Apr 3, 2023
diff --git a/docs/source/docs/data-formats/datumaro_format.md b/docs/source/docs/data-formats/datumaro_format.md
@@ -0,0 +1,137 @@
+# Datumaro Format
+
+Computer vision covers many tasks, such as classification, detection, and segmentation, as well as
+pose estimation and visual tracking, and public datasets are usually distributed in a format
+tailored to a single task. Even within the same segmentation task, some data formats provide
+annotations as polygons, while others provide them as masks. To ensure compatibility
+across tasks and formats, we provide a novel Datumaro format with the `.json` or `.datum`
+extension.
+
+A variety of metadata can be stored in the Datumaro format. First of all, the `dm_format_version`
+field is provided for backward compatibility and helps with tracing the data version. Various
+metadata can be added to the `infos` field. For example, you can record the task type, such as
+detection or segmentation, or the data creation time. Labels and attributes can be saved in the
+`categories` field, along with mask colormap information. In addition, to support hierarchical
+and multi-label classification tasks, the Datumaro format provides `label_groups` to record
+whether multiple labels in a group can be selected at once, and `parent` to specify the parent
+label of each label. Finally, in the `items` field, we can write the annotation information for
+each media id, and additionally write the data path and data size.
+
+Here is an example of a `json` annotation file:
+
+```json
+{
+    "dm_format_version": "1.0",
+    "infos": {
+        "task": "anomaly_detection",
+        "creation time": "2023.4.1"
+    },
+    "categories": {
+        "label": {
+            "labels": [
+                {
+                    "name": "Normal",
+                    "parent": "",
+                    "attributes": []
+                },
+                {
+                    "name": "Anomalous",
+                    "parent": "",
+                    "attributes": []
+                }
+            ],
+            "label_groups": [
+                {
+                    "name": "Label",
+                    "group_type": "exclusive",
+                    "labels": [
+                        "Anomalous",
+                        "Normal"
+                    ]
+                }
+            ],
+            "attributes": []
+        },
+        "mask": {
+            "colormap": [
+                {
+                    "label_id": 1,
+                    "r": 255,
+                    "g": 255,
+                    "b": 255
+                }
+            ]
+        }
+    },
+    "items": [
+        {
+            "id": "good_001",
+            "annotations": [
+                {
+                    "id": 0,
+                    "type": "label",
+                    "attributes": {},
+                    "group": 0,
+                    "label_id": 0
+                }
+            ],
+            "image": {
+                "path": "good_001.jpg",
+                "size": [
+                    900,
+                    900
+                ]
+            }
+        },
+        {
+            "id": "broken_small_001",
+            "annotations": [
+                {
+                    "id": 0,
+                    "type": "bbox",
+                    "attributes": {},
+                    "group": 0,
+                    "label_id": 1,
+                    "z_order": 0,
+                    "bbox": [
+                        350.8999938964844,
+                        151.3899993896484,
+                        275.1399841308594,
+                        126.4900054931640
+                    ]
+                }
+            ],
+            "image": {
+                "path": "broken_small_001.jpg",
+                "size": [
+                    900,
+                    900
+                ]
+            }
+        }
+    ]
+}
+```
+
+A Datumaro format directory has the following structure:
+
+<!--lint disable fenced-code-flag-->
+```
+dataset/
+├── dataset_meta.json # a list of non-format labels (optional)
+├── images/
+│   ├── train/  # directory with training images
+│   │    ├── img001.png
+│   │    ├── img002.png
+│   │    └── ...
+│   ├── val/  # directory with validation images
+│   │    ├── img001.png
+│   │    ├── img002.png
+│   │    └── ...
+│   └── ...
+│
+└── annotations/
+    ├── train.json  # annotation file with training data
+    ├── val.json  # annotation file with validation data
+    └── ...
+```
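Note on `datumaro_format.md`: the field layout described in this file can be illustrated with a minimal sketch that uses only the Python standard library, not the Datumaro API. The embedded document is a trimmed copy of the example annotation file with shortened `bbox` values. The helper name `summarize` is hypothetical, and treating `label_id` as an index into `categories.label.labels` is an assumption based on the example, where `label_id` 0 lines up with `Normal` and 1 with `Anomalous`.

```python
import json
from collections import Counter

# Trimmed copy of the example annotation file (bbox values shortened).
DOC = """
{
  "dm_format_version": "1.0",
  "infos": {"task": "anomaly_detection"},
  "categories": {
    "label": {
      "labels": [
        {"name": "Normal", "parent": "", "attributes": []},
        {"name": "Anomalous", "parent": "", "attributes": []}
      ],
      "label_groups": [
        {"name": "Label", "group_type": "exclusive",
         "labels": ["Anomalous", "Normal"]}
      ],
      "attributes": []
    }
  },
  "items": [
    {"id": "good_001",
     "annotations": [{"id": 0, "type": "label", "label_id": 0}],
     "image": {"path": "good_001.jpg", "size": [900, 900]}},
    {"id": "broken_small_001",
     "annotations": [{"id": 0, "type": "bbox", "label_id": 1,
                      "bbox": [350.9, 151.39, 275.14, 126.49]}],
     "image": {"path": "broken_small_001.jpg", "size": [900, 900]}}
  ]
}
"""


def summarize(doc):
    """Return (item id, annotation type, label name) for every annotation."""
    labels = [entry["name"] for entry in doc["categories"]["label"]["labels"]]
    rows = []
    for item in doc["items"]:
        for ann in item["annotations"]:
            # Assumption: label_id is a position in the "labels" list.
            rows.append((item["id"], ann["type"], labels[ann["label_id"]]))
    return rows


doc = json.loads(DOC)
rows = summarize(doc)
type_counts = Counter(ann_type for _, ann_type, _ in rows)

for row in rows:
    print(row)
print(dict(type_counts))
```

Because the sketch goes through `json.loads`, it also acts as a quick check that an annotation file is strictly valid JSON before it is handed to any importer.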
diff --git a/docs/source/docs/data-formats/index.rst b/docs/source/docs/data-formats/index.rst
@@ -0,0 +1,9 @@
+Data Formats
+############
+
+.. toctree::
+   :maxdepth: 1
+
+   supported_formats
+   media_formats
+   datumaro_format
diff --git a/.../source/docs/user-manual/media_formats.md → ...source/docs/data-formats/media_formats.md b/.../source/docs/user-manual/media_formats.md → ...source/docs/data-formats/media_formats.md
@@ -1,7 +1,8 @@
-# Media formats
+# Supported Media Formats
 
 Datumaro supports the following media types:
 - 2D RGB(A) images
+- Videos
 - KITTI Point Clouds
 
 To create an unlabelled dataset from an arbitrary directory with images use

diff --git a/...rce/docs/user-manual/supported_formats.md → ...ce/docs/data-formats/supported_formats.md b/...rce/docs/user-manual/supported_formats.md → ...ce/docs/data-formats/supported_formats.md
@@ -1,4 +1,4 @@
-# Dataset Formats
+# Supported Dataset Formats
 
 List of supported formats:
 - ADE20k (v2017) (import-only)

diff --git a/docs/source/docs/index.rst b/docs/source/docs/index.rst
@@ -13,6 +13,14 @@ Docs
   :caption: Guides
 
   user-manual/index
+  data-formats/index
+
+.. toctree::
+  :hidden:
+  :caption: Level Up
+
+  level-up/basic_skills/index
+  level-up/intermediate_skills/index
 
 .. toctree::
   :hidden:

diff --git a/docs/source/docs/level-up/basic_skills/index.rst b/docs/source/docs/level-up/basic_skills/index.rst
@@ -0,0 +1,17 @@
+Basic Skills
+#################
+
+.. toctree::
+   :maxdepth: 1
+
+   import
+   export
+   validate
+   visualize
+   filter
+   compare
+   transform
+   merge
+   split
+   search
+   generate
diff --git a/docs/source/docs/level-up/index.rst b/docs/source/docs/level-up/index.rst
@@ -0,0 +1,8 @@
+Level Up
+########
+
+.. toctree::
+   :maxdepth: 1
+
+   basic_skills/index
+   intermediate_skills/index
diff --git a/docs/source/docs/level-up/intermediate_skills/data_aggregation.md b/docs/source/docs/level-up/intermediate_skills/data_aggregation.md
@@ -0,0 +1,28 @@
+# Data Aggregation
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/data_comparison.md b/docs/source/docs/level-up/intermediate_skills/data_comparison.md
@@ -0,0 +1,28 @@
+# Data Comparison
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/data_exploration.md b/docs/source/docs/level-up/intermediate_skills/data_exploration.md
@@ -0,0 +1,28 @@
+# Data Exploration
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/data_generation.md b/docs/source/docs/level-up/intermediate_skills/data_generation.md
@@ -0,0 +1,28 @@
+# Data Generation
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/data_merge.md b/docs/source/docs/level-up/intermediate_skills/data_merge.md
@@ -0,0 +1,28 @@
+# Data Merge
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/data_refinement.md b/docs/source/docs/level-up/intermediate_skills/data_refinement.md
@@ -0,0 +1,28 @@
+# Data Refinement
+
+Datumaro aims to refine data
+
+``` bash
+datum create -o <project/dir>
+datum import -p <project/dir> -f image_dir <directory/path/>
+```
+
+or, if you work with the Datumaro API:
+
+- for using with a project:
+
+  ```python
+  from datumaro.project import Project
+
+  project = Project.init()
+  project.import_source('source1', format='image_dir', url='directory/path/')
+  dataset = project.working_tree.make_dataset()
+  ```
+
+- for using as a dataset:
+
+  ```python
+  from datumaro import Dataset
+
+  dataset = Dataset.import_from('directory/path/', 'image_dir')
+  ```
diff --git a/docs/source/docs/level-up/intermediate_skills/index.rst b/docs/source/docs/level-up/intermediate_skills/index.rst
@@ -0,0 +1,12 @@
+Intermediate Skills
+###################
+
+.. toctree::
+   :maxdepth: 1
+
+   data_refinement
+   data_comparison
+   data_aggregation
+   data_merge
+   data_exploration
+   data_generation