Add custom max iou assigner to prevent CPU OOM in training phase #2228

Merged
jaegukhyun merged 4 commits into open-edge-platform:develop from jaegukhyun:safe-assign
Jun 8, 2023

@jaegukhyun
Contributor

@jaegukhyun jaegukhyun commented Jun 8, 2023

Summary

When a training image contains too many ground-truth (GT) boxes (more than 10,000), the current assigner, which matches anchor bboxes to GT bboxes, raises a CPU OOM error.
This happens because the assigner builds an overly large CPU tensor. To fix this, this PR adds a new custom max IoU assigner.
The new assigner splits the GT bboxes into chunks of a reasonable size when there are too many of them. This may increase training time, but it prevents sudden CPU OOM.
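
The chunking idea can be sketched in a few lines. This is a minimal illustration and not the code added in this PR: it uses plain Python lists instead of tensors, and the names `iou`, `chunked_max_iou`, and `chunk_size` are hypothetical. It assumes boxes in `(x1, y1, x2, y2)` format and shows only the "best GT per anchor" part of a max IoU assignment.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def chunked_max_iou(
    anchors: Sequence[Box], gts: Sequence[Box], chunk_size: int = 1000
) -> Tuple[List[float], List[int]]:
    """Return, per anchor, the best IoU and the index of the GT box achieving it.

    GT boxes are processed chunk_size at a time, so only a
    (chunk_size x len(anchors)) block of IoU values exists at once
    instead of the full (len(gts) x len(anchors)) matrix.
    """
    best_iou = [0.0] * len(anchors)
    best_gt = [-1] * len(anchors)  # -1 means no overlapping GT box found
    for start in range(0, len(gts), chunk_size):
        chunk = gts[start:start + chunk_size]
        # IoU block for this chunk only; discarded before the next chunk.
        block = [[iou(gt, anchor) for anchor in anchors] for gt in chunk]
        for offset, row in enumerate(block):
            for j, value in enumerate(row):
                if value > best_iou[j]:
                    best_iou[j] = value
                    best_gt[j] = start + offset
    return best_iou, best_gt


anchors = [(0, 0, 10, 10), (20, 20, 30, 30)]
gts = [(0, 0, 10, 10), (100, 100, 110, 110), (20, 20, 30, 25)]
small_chunks = chunked_max_iou(anchors, gts, chunk_size=2)
one_chunk = chunked_max_iou(anchors, gts, chunk_size=100)
```

In this sketch the chunk size changes only the peak memory, not the result: `small_chunks` and `one_chunk` hold the same values. The extra loop over chunks is where the time increase mentioned above would come from.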

How to test

My desktop, which has 64 GB of RAM, could not train on the Kiemgetal dataset. After this fix, it can.

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added e2e tests for validation.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

@github-actions github-actions bot added the ALGO (Any changes in OTX Algo Tasks implementation) and TEST (Any changes in tests) labels Jun 8, 2023
@github-actions github-actions bot added the DOC (Improvements or additions to documentation) label Jun 8, 2023
@jaegukhyun jaegukhyun marked this pull request as ready for review June 8, 2023 03:54
@jaegukhyun jaegukhyun requested a review from a team as a code owner June 8, 2023 03:54
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
sungmanc
sungmanc previously approved these changes Jun 8, 2023
Contributor

@sungmanc sungmanc left a comment

Thanks for the nice work. BTW, how about using our own docstring instead of the original one for CustomMaxIoUAssigner?

goodsong81
goodsong81 previously approved these changes Jun 8, 2023

@goodsong81 goodsong81 left a comment

Thank you for the great work. Minor comment and question from my side:

…m_max_iou_assigner.py

Co-authored-by: Songki Choi <songki.choi@intel.com>
@jaegukhyun jaegukhyun dismissed stale reviews from goodsong81 and sungmanc via be39736 June 8, 2023 04:21
@jaegukhyun jaegukhyun requested a review from sungmanc June 8, 2023 06:23
@jaegukhyun jaegukhyun merged commit 302a8dd into open-edge-platform:develop Jun 8, 2023
@jaegukhyun jaegukhyun deleted the safe-assign branch June 8, 2023 07:47
Labels

ALGO: Any changes in OTX Algo Tasks implementation
DOC: Improvements or additions to documentation
TEST: Any changes in tests

3 participants