Combine export or "dump" of multiple Jobs into one .zip with train/test/val splits #791

@Sparrowtech

WORKFLOW WORKAROUNDS: We've created individual "Jobs" to represent different classes of objects (e.g. "car, truck, van, helicopter, airplane"), largely because CVAT has difficulty loading very large datasets. Each CVAT Job represents ~2,500 images and tends to be around 1 GB in total between the images and annotations. Currently there are ~60 different Jobs (classes of objects), totaling ~60 GB and ~150,000 images.

Routinely we create specific datasets (10-20 object classes, or "Jobs"), which requires a lot of heavy lifting after export: merging tfrecords or XML files into one file or into batches, not to mention splitting into train/test/val sets. I know there are a lot of tools out there to help with pre-processing, and we currently employ many.

It would be ideal to be able to choose "Car, Airplane, Helicopter, Bus, ... etc." from the dashboard and EXPORT THEM INTO ONE TASK, with the ability to choose the ratio of images split into train/test/val sets, e.g. 70% train, 15% test, 15% val. The result would be .zip file(s) containing images and annotations, or tfrecords. No extra processing for randomizing is needed: just extract the split percentage from each Job and combine the pieces, e.g. into "Train", ensuring well-balanced classes rather than relying on a later, unknown function that is just a random exercise.
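To make the request concrete, here is a rough sketch of the per-Job split I have in mind. It is not CVAT code: `split_job`, `combine_jobs`, and the file names are hypothetical, and it assumes each Job can be reduced to a list of image file names. Each Job is cut by the same ratios before the pieces are combined, so every class keeps the same share in each set.

```python
from typing import Dict, List, Sequence

SPLITS = ("train", "test", "val")


def split_job(items: Sequence[str],
              ratios: Sequence[float] = (0.70, 0.15, 0.15)) -> Dict[str, List[str]]:
    """Cut one Job's items into train/test/val, in their existing order.

    Hypothetical helper: no shuffling is done, matching the
    "no extra processing for randomizing" request.
    """
    items = list(items)
    n = len(items)
    # Clamp so the first two slices never run past the end of the list.
    n_train = min(n, round(n * ratios[0]))
    n_test = min(n - n_train, round(n * ratios[1]))
    return {
        "train": items[:n_train],
        "test": items[n_train:n_train + n_test],
        # val takes whatever is left, so no item is dropped by rounding.
        "val": items[n_train + n_test:],
    }


def combine_jobs(jobs: Dict[str, Sequence[str]],
                 ratios: Sequence[float] = (0.70, 0.15, 0.15)) -> Dict[str, List[tuple]]:
    """Split every Job with the same ratios, then merge the pieces per set."""
    combined: Dict[str, List[tuple]] = {name: [] for name in SPLITS}
    for job_name in sorted(jobs):
        parts = split_job(jobs[job_name], ratios)
        for split_name in SPLITS:
            # Keep the Job (class) name next to each file for later export.
            combined[split_name].extend((job_name, f) for f in parts[split_name])
    return combined


if __name__ == "__main__":
    # Made-up file names, standing in for two exported Jobs.
    jobs = {
        "car": [f"car_{i:04d}.jpg" for i in range(20)],
        "truck": [f"truck_{i:04d}.jpg" for i in range(40)],
    }
    for split_name, files in combine_jobs(jobs).items():
        print(split_name, len(files))
```

Writing each combined set into its own .zip (or tfrecord) would be a separate step on top of this.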

Thanks!

Labels

enhancement: New feature or request
