
yolo import - getting image list from image_info if possible #111

Merged
Eldies merged 6 commits into develop from dl/fix-yolo-subset-folder-processing on May 26, 2025

Eldies commented on May 22, 2025

Summary

An Ultralytics YOLO subset can be specified as a folder in data.yaml. In that case, if the dataset folder contains only annotations (no images), the importer finds no images, creates no items, and therefore reads no annotations.

This change uses image_info to get the list of items when it is available.
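
The fallback described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `list_subset_item_names`, the `IMAGE_EXTS` set, and the shape of `image_info` (a mapping from item name to `(height, width)`) are all assumptions.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_subset_item_names(
    subset_dir: Path,
    image_info: Optional[Dict[str, Tuple[int, int]]] = None,
) -> List[str]:
    """Return item names for a subset that is given as a folder.

    Images found on disk take priority. If the folder holds no images
    (an annotations-only dataset), fall back to the names in image_info.
    """
    found = sorted(
        p.stem
        for p in subset_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
    if found:
        return found
    # No images on disk: use the externally supplied image_info, if any.
    return sorted(image_info) if image_info else []
```

Without the fallback branch, an annotations-only folder yields an empty list, which matches the failure described above: no items, so no annotations are read.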

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added the description of my changes into CHANGELOG.
  • I have updated the documentation accordingly

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2025 CVAT.ai Corporation
#
# SPDX-License-Identifier: MIT

Eldies changed the title from "getting image list from image_info if possible" to "yolo import - getting image list from image_info if possible" on May 22, 2025
Comment thread src/datumaro/plugins/data_formats/yolo/base.py Outdated
Collaborator

The current approach of crafting fake image names doesn't look right to me. It also keeps these fake image paths in the media paths of the produced dataset items. It would probably be better to change _get_lazy_subset_items to generate a list of (image path, annotation path) pairs instead.

Author

Are you suggesting creating DatasetItems with empty media in such cases?

Collaborator

No, creating a list of tuples with one or both paths filled is probably enough.
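
For illustration, the list-of-tuples idea could look something like the sketch below. It is not Datumaro code: `pair_images_and_annotations` and `IMAGE_EXTS` are hypothetical names, and matching files by stem is a simplification of how a real YOLO layout relates images to label files.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}

PathPair = Tuple[Optional[Path], Optional[Path]]


def pair_images_and_annotations(files: List[Path]) -> List[PathPair]:
    """Group files by stem into (image path, annotation path) pairs.

    Either element may be None when only one of the two files exists.
    """
    pairs: Dict[str, List[Optional[Path]]] = {}
    for f in files:
        ext = f.suffix.lower()
        if ext in IMAGE_EXTS:
            index = 0  # image slot
        elif ext == ".txt":
            index = 1  # annotation slot
        else:
            continue  # ignore unrelated files
        pairs.setdefault(f.stem, [None, None])[index] = f
    return [(pairs[k][0], pairs[k][1]) for k in sorted(pairs)]
```

An annotations-only item then shows up as `(None, annotation_path)` rather than carrying an invented image path.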

Author

I mean, to create a DatasetItem with media, we need a path for that media. Unless we pass the correct extension from somewhere, it will be a fake path.

Collaborator

Ah, I forgot we removed the way to create a meta-only Image with just a size. OK, let's keep it as it is.

Collaborator

As an option, we could try changing the image info that is passed in so that it includes extensions.
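
A small sketch of what that option might allow, assuming (hypothetically) that image_info keys were file names such as `train/a.jpg` instead of bare item names. The helper `split_info_key` is invented for this example and is not part of Datumaro.

```python
from pathlib import PurePosixPath
from typing import Tuple


def split_info_key(key: str) -> Tuple[str, str]:
    """Split an image_info key into (item id, extension).

    Assumes the key uses forward slashes. A key without an extension
    yields an empty string as its second element. Note that a bare item
    name containing a dot (for example "img.v2") would be mis-split, so
    this only works if every key is known to carry an extension.
    """
    path = PurePosixPath(key)
    return str(path.with_suffix("")), path.suffix
```

With the real extension available, the media path could be built from the key itself instead of from a guessed extension.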

Eldies merged commit afc20ba into develop on May 26, 2025
6 checks passed
Eldies deleted the dl/fix-yolo-subset-folder-processing branch on May 26, 2025 10:58