Split video directory by subset in datumaro format. #1485
wonjuleee merged 2 commits into open-edge-platform:develop from jihyeonyi:jihyeony/split_video_directory_by_subset
…load_image to fix test error
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1485      +/-   ##
===========================================
- Coverage    80.85%   80.76%   -0.10%
===========================================
  Files          271      272       +1
  Lines        30689    31179     +490
  Branches      6197     6283      +86
===========================================
+ Hits         24815    25183     +368
- Misses        4489     4584      +95
- Partials      1385     1412      +27
LGTM, but does the Datumaro importer need any modification to handle videos?
One more question: since a video can be huge, this change requires more storage (up to twice as much in the worst case). Is there a reason to save duplicated videos for each subset?
I have modified the importer, too. The basic assumption is that a video isn't shared between subsets, so there shouldn't be any duplicated videos. However, if this assumption is too strong, then we need another workaround for saving videos.
Summary
According to the Datumaro documentation, the video directory is not separated by subset, so a video is overwritten if another subset contains a video with the same filename.

The following example has two subsets, train and val, and both subsets have Video_0.avi, ..., Video_9.avi. The videos in the train subset are therefore overwritten by the videos in the val subset.

So I'd like to split the video directory by subset to avoid this situation.
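
The collision described above can be illustrated with a minimal sketch. It does not use Datumaro's exporter code; the `videos` directory name, the `dataset` root, and the helper functions `flat_video_path` and `subset_video_path` are hypothetical names chosen for illustration only.

```python
from pathlib import PurePosixPath


def flat_video_path(root: str, filename: str) -> PurePosixPath:
    # Hypothetical flat layout: every subset writes into one shared directory.
    return PurePosixPath(root) / "videos" / filename


def subset_video_path(root: str, subset: str, filename: str) -> PurePosixPath:
    # Hypothetical split layout: each subset gets its own video directory.
    return PurePosixPath(root) / "videos" / subset / filename


# Two subsets, each holding Video_0.avi ... Video_9.avi, as in the description.
items = [
    (subset, f"Video_{i}.avi")
    for subset in ("train", "val")
    for i in range(10)
]

# Distinct destination paths under each layout.
flat = {flat_video_path("dataset", name) for _, name in items}
split = {subset_video_path("dataset", subset, name) for subset, name in items}

# With the flat layout, 20 videos map to only 10 paths, so half are overwritten.
print(len(items), len(flat), len(split))  # 20 10 20
```

Under the assumption stated in the conversation (a video is not shared between subsets), the split layout stores each video exactly once, so it does not add storage compared with a collision-free flat layout.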