[Algorithm] Support Rotated FCOS (ICCV'2019) #223
Merged
19 commits
cc1febf Support Rotated FCOS (liuyanyi)
6526f85 update fcos kld result (liuyanyi)
3a150d8 add polyiou (liuyanyi)
8c6e5e6 add model zoo (liuyanyi)
7f1bf47 rename poly_iou to rotated_iou (liuyanyi)
8d9d559 Merge branch 'dev' of https://github.com/open-mmlab/mmrotate into rot… (liuyanyi)
5f49be1 add test (liuyanyi)
212e1ce remove train_cfg, add import check for mmcv version (liuyanyi)
f3a8018 add readme (liuyanyi)
365fe93 add readme (liuyanyi)
68297c9 merge dev (liuyanyi)
a7d9801 Merge branch 'dev' of https://github.com/open-mmlab/mmrotate into rot… (liuyanyi)
dca9332 update dota result (liuyanyi)
31887d8 Revert "update dota result" (liuyanyi)
0c3fa57 resolve conflicts (liuyanyi)
507a6ac publish fcos models (liuyanyi)
f1cbc98 merge dev (liuyanyi)
7ad3b4a fix lint (liuyanyi)
a9b3157 add metafile.yml (liuyanyi)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| # Rotated FCOS | ||
|
|
||
| > [FCOS: Fully Convolutional One-Stage Object Detection](https://arxiv.org/abs/1904.01355) | ||
|
|
||
| <!-- [ALGORITHM] --> | ||
|
|
||
| ## Abstract | ||
|
|
||
| <div align=center> | ||
| <img src="https://user-images.githubusercontent.com/40661020/143882011-45b234bc-d04b-4bbe-a822-94bec057ac86.png"/> | ||
| </div> | ||
|
|
||
| We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction | ||
| fashion, analogue to semantic segmentation. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, | ||
| and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well | ||
| as proposal free. By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation | ||
| related to anchor boxes such as calculating overlapping during training. More importantly, we also avoid all | ||
| hyper-parameters related to anchor boxes, which are often very sensitive to the final detection performance. With the | ||
| only post-processing non-maximum suppression (NMS), FCOS with ResNeXt-64x4d-101 achieves 44.7% in AP with single-model | ||
| and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the | ||
| first time, we demonstrate a much simpler and flexible detection framework achieving improved detection accuracy. We | ||
| hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks. | ||
|
|
||
| ## Results and Models | ||
|
|
||
| DOTA1.0 | ||
|
|
||
| | Backbone | mAP | Angle | Separate Angle | Tricks | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download | | ||
| | :----------------------: | :---: | :---: | :------------: | :----: | :-----: | :------: | :------------: | :-: | :--------: | :---------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | | ||
| | ResNet50 (1024,1024,200) | 70.70 | le90 | Y | Y | 1x | 4.18 | 26.4 | - | 2 | [rotated_fcos_sep_angle_r50_fpn_1x_dota_le90](./rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py) | [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90-0be71a0c.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90_20220409_023250.log.json) | | ||
| | ResNet50 (1024,1024,200) | 71.28 | le90 | N | Y | 1x | 4.18 | 25.9 | - | 2 | [rotated_fcos_r50_fpn_1x_dota_le90](./rotated_fcos_r50_fpn_1x_dota_le90.py) | [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90-d87568ed.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90_20220413_163526.log.json) | | ||
| | ResNet50 (1024,1024,200) | 71.76 | le90 | Y | Y | 1x | 4.23 | 25.7 | - | 2 | [rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90](./rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py) | [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90-4e044ad2.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90_20220409_080616.log.json) | | ||
| | ResNet50 (1024,1024,200) | 71.89 | le90 | N | Y | 1x | 4.18 | 26.2 | - | 2 | [rotated_fcos_kld_r50_fpn_1x_dota_le90](./rotated_fcos_kld_r50_fpn_1x_dota_le90.py) | [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90-ecafdb2b.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90_20220409_202939.log.json) | | ||
|
|
||
| **Notes:** | ||
|
|
||
| - `MS` means multiple scale image split. | ||
| - `RR` means random rotation. | ||
| - `RotatedIoULoss` needs mmcv version 1.5.0 or above. | ||
| - `Separate Angle` means the angle loss is calculated separately. | ||
| In this case the bbox loss is a horizontal bbox loss such as `IoULoss` or `GIoULoss`. | ||
| - `Tricks` means setting `norm_on_bbox`, `centerness_on_reg`, and `center_sampling` to `True`. | ||
| - Inf Time was measured on a single RTX3090. | ||
|
|
||
| ## Citation | ||
|
|
||
| ``` | ||
| @article{tian2019fcos, | ||
| title={FCOS: Fully Convolutional One-Stage Object Detection}, | ||
| author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong}, | ||
| journal={arXiv preprint arXiv:1904.01355}, | ||
| year={2019} | ||
| } | ||
| ``` |
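The README note on the rotated IoU loss and the commit "remove train_cfg, add import check for mmcv version" both point at a minimum mmcv version of 1.5.0. The following is a minimal sketch of such a check, assuming plain `major.minor.patch` version strings. The helper names `parse_version` and `supports_rotated_iou_loss` are hypothetical and are not part of mmcv or mmrotate.

```python
def parse_version(version):
    """Convert a version string such as '1.5.0' into a tuple of three ints.

    Non-digit suffixes within a component (for example 'rc1') are ignored,
    so pre-releases compare equal to the final release in this sketch.
    """
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '1.5' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_rotated_iou_loss(mmcv_version, minimum='1.5.0'):
    """Return True if mmcv_version is at least the minimum from the notes."""
    return parse_version(mmcv_version) >= parse_version(minimum)


print(supports_rotated_iou_loss('1.4.8'))   # False
print(supports_rotated_iou_loss('1.10.0'))  # True
```

Comparing integer tuples rather than raw strings avoids the pitfall where `'1.10.0'` sorts before `'1.5.0'` lexicographically.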
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| Collections: | ||
| - Name: rotated_fcos | ||
| Metadata: | ||
| Training Data: DOTAv1.0 | ||
| Training Techniques: | ||
| - SGD with Momentum | ||
| - Weight Decay | ||
| Training Resources: 1x Tesla V100 | ||
| Architecture: | ||
| - ResNet | ||
| Paper: | ||
| URL: https://arxiv.org/abs/1904.01355 | ||
| Title: 'FCOS: Fully Convolutional One-Stage Object Detection' | ||
| README: configs/rotated_fcos/README.md | ||
|
|
||
| Models: | ||
| - Name: rotated_fcos_sep_angle_r50_fpn_1x_dota_le90 | ||
| In Collection: rotated_fcos | ||
| Config: configs/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py | ||
| Metadata: | ||
| Training Data: DOTAv1.0 | ||
| Results: | ||
| - Task: Oriented Object Detection | ||
| Dataset: DOTAv1.0 | ||
| Metrics: | ||
| mAP: 70.70 | ||
| Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90-0be71a0c.pth | ||
|
|
||
| - Name: rotated_fcos_r50_fpn_1x_dota_le90 | ||
| In Collection: rotated_fcos | ||
| Config: configs/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90.py | ||
| Metadata: | ||
| Training Data: DOTAv1.0 | ||
| Results: | ||
| - Task: Oriented Object Detection | ||
| Dataset: DOTAv1.0 | ||
| Metrics: | ||
| mAP: 71.28 | ||
| Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90-d87568ed.pth | ||
|
|
||
| - Name: rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90 | ||
| In Collection: rotated_fcos | ||
| Config: configs/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py | ||
| Metadata: | ||
| Training Data: DOTAv1.0 | ||
| Results: | ||
| - Task: Oriented Object Detection | ||
| Dataset: DOTAv1.0 | ||
| Metrics: | ||
| mAP: 71.76 | ||
| Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90-4e044ad2.pth | ||
|
|
||
| - Name: rotated_fcos_kld_r50_fpn_1x_dota_le90 | ||
| In Collection: rotated_fcos | ||
| Config: configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py | ||
| Metadata: | ||
| Training Data: DOTAv1.0 | ||
| Results: | ||
| - Task: Oriented Object Detection | ||
| Dataset: DOTAv1.0 | ||
| Metrics: | ||
| mAP: 71.89 | ||
| Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90-ecafdb2b.pth |
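Every entry in the metafile follows the same naming pattern: the config file is the model name plus `.py`, and the weights file is the model name, a hash suffix, and `.pth`. A minimal sketch of a consistency check for that pattern is shown below. The function `check_entry` is a hypothetical helper, and the entry is written as a plain Python dict rather than parsed from YAML.

```python
import posixpath


def check_entry(entry):
    """Return a list of naming problems found in one model entry."""
    problems = []
    name = entry['Name']
    config_file = posixpath.basename(entry['Config'])
    weights_file = posixpath.basename(entry['Weights'])
    if config_file != name + '.py':
        problems.append('config file does not match model name')
    if not (weights_file.startswith(name + '-')
            and weights_file.endswith('.pth')):
        problems.append('weights file does not match model name')
    return problems


# Values copied from the KLD entry of the metafile above.
BASE = 'https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos'
kld_entry = {
    'Name': 'rotated_fcos_kld_r50_fpn_1x_dota_le90',
    'Config': 'configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py',
    'Weights': BASE + '/rotated_fcos_kld_r50_fpn_1x_dota_le90/'
               'rotated_fcos_kld_r50_fpn_1x_dota_le90-ecafdb2b.pth',
}
print(check_entry(kld_entry))  # []
```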
30 changes: 30 additions & 0 deletions
configs/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| _base_ = 'rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py' | ||
| angle_version = 'le90' | ||
|
|
||
| # model settings | ||
| model = dict( | ||
| bbox_head=dict( | ||
| type='CSLRFCOSHead', | ||
| center_sampling=True, | ||
| center_sample_radius=1.5, | ||
| norm_on_bbox=True, | ||
| centerness_on_reg=True, | ||
| separate_angle=True, | ||
| scale_angle=False, | ||
| angle_coder=dict( | ||
| type='CSLCoder', | ||
| angle_version=angle_version, | ||
| omega=1, | ||
| window='gaussian', | ||
| radius=1), | ||
| loss_cls=dict( | ||
| type='FocalLoss', | ||
| use_sigmoid=True, | ||
| gamma=2.0, | ||
| alpha=0.25, | ||
| loss_weight=1.0), | ||
| loss_bbox=dict(type='GIoULoss', loss_weight=1.0), | ||
| loss_centerness=dict( | ||
| type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
| loss_angle=dict( | ||
| type='SmoothFocalLoss', gamma=2.0, alpha=0.25, loss_weight=0.2)), ) | ||
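The `angle_coder` settings in this config (`omega=1`, `window='gaussian'`, `radius=1`) treat the angle as a classification target over discrete bins, smoothed circularly so that bins near the true angle still receive credit. The following is a minimal sketch of that idea in plain Python. It assumes that `le90` covers [-90, 90) degrees with an offset of 90, and the functions `csl_encode_gaussian` and `csl_decode` are illustrative only, not the `CSLCoder` implementation.

```python
import math


def csl_encode_gaussian(angle_deg, omega=1, radius=1,
                        angle_range=180, angle_offset=90):
    """Return a circular smooth label (list of floats) for one angle."""
    num_bins = angle_range // omega
    # Index of the bin that contains the target angle.
    center = int((angle_deg + angle_offset) / omega) % num_bins
    label = [0.0] * num_bins
    half = num_bins // 2
    for d in range(-half, half):
        # Gaussian window centred on the target bin, wrapped circularly.
        value = math.exp(-(d * d) / (2.0 * radius ** 2))
        label[(center + d) % num_bins] = value
    return label


def csl_decode(label, omega=1, angle_offset=90):
    """Return the centre angle (degrees) of the highest-scoring bin."""
    best = max(range(len(label)), key=label.__getitem__)
    return (best + 0.5) * omega - angle_offset


label = csl_encode_gaussian(-90.0)
# The window wraps around: the last bin is as close to bin 0 as bin 1 is.
print(label[0], label[1] == label[-1])
```

With `omega=1` the label has 180 bins, and the wrap-around is what lets the loss treat angles near -90 and near 90 degrees as neighbours.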
11 changes: 11 additions & 0 deletions
configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| _base_ = 'rotated_fcos_r50_fpn_1x_dota_le90.py' | ||
|
|
||
| model = dict( | ||
| bbox_head=dict( | ||
| loss_bbox=dict( | ||
| _delete_=True, | ||
| type='GDLoss_v1', | ||
| loss_type='kld', | ||
| fun='log1p', | ||
| tau=1, | ||
| loss_weight=1.0)), ) |
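The KLD config inherits from the base file and sets `_delete_=True` inside `loss_bbox`, so the inherited loss dict is replaced rather than merged key by key. The sketch below imitates that behaviour with plain dicts. It is a simplified illustration, assuming only nested dicts, and `merge_config` is a hypothetical helper rather than the mmcv config loader. The `mode` key in the base dict is also hypothetical, added only to make the difference visible.

```python
import copy

DELETE_KEY = '_delete_'


def merge_config(child, base):
    """Merge child into a copy of base; a dict with _delete_=True replaces."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if key == DELETE_KEY:
            continue  # the marker itself never appears in the result
        if isinstance(value, dict):
            keep_base = (isinstance(merged.get(key), dict)
                         and not value.get(DELETE_KEY, False))
            merged[key] = merge_config(value, merged[key] if keep_base else {})
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(bbox_head=dict(
    type='RotatedFCOSHead',
    loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0, mode='linear')))
child = dict(bbox_head=dict(
    loss_bbox=dict(_delete_=True, type='GDLoss_v1', loss_type='kld',
                   fun='log1p', tau=1, loss_weight=1.0)))

merged = merge_config(child, base)
print(merged['bbox_head']['type'])       # RotatedFCOSHead
print(merged['bbox_head']['loss_bbox'])  # no leftover 'mode' key
```

Without the `_delete_` marker, the stale `mode` key from the base would survive and be passed to the new loss type.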
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,81 @@ | ||
| _base_ = [ | ||
| '../_base_/datasets/dotav1.py', '../_base_/schedules/schedule_1x.py', | ||
| '../_base_/default_runtime.py' | ||
| ] | ||
| angle_version = 'le90' | ||
|
|
||
| # model settings | ||
| model = dict( | ||
| type='RotatedFCOS', | ||
| backbone=dict( | ||
| type='ResNet', | ||
| depth=50, | ||
| num_stages=4, | ||
| out_indices=(0, 1, 2, 3), | ||
| frozen_stages=1, | ||
| zero_init_residual=False, | ||
| norm_cfg=dict(type='BN', requires_grad=True), | ||
| norm_eval=True, | ||
| style='pytorch', | ||
| init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')), | ||
| neck=dict( | ||
| type='FPN', | ||
| in_channels=[256, 512, 1024, 2048], | ||
| out_channels=256, | ||
| start_level=1, | ||
| add_extra_convs='on_output', # use P5 | ||
| num_outs=5, | ||
| relu_before_extra_convs=True), | ||
| bbox_head=dict( | ||
| type='RotatedFCOSHead', | ||
| num_classes=15, | ||
| in_channels=256, | ||
| stacked_convs=4, | ||
| feat_channels=256, | ||
| strides=[8, 16, 32, 64, 128], | ||
| center_sampling=True, | ||
| center_sample_radius=1.5, | ||
| norm_on_bbox=True, | ||
| centerness_on_reg=True, | ||
| separate_angle=False, | ||
| scale_angle=True, | ||
| bbox_coder=dict( | ||
| type='DistanceAnglePointCoder', angle_version=angle_version), | ||
| loss_cls=dict( | ||
| type='FocalLoss', | ||
| use_sigmoid=True, | ||
| gamma=2.0, | ||
| alpha=0.25, | ||
| loss_weight=1.0), | ||
| loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0), | ||
| loss_centerness=dict( | ||
| type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), | ||
| # training and testing settings | ||
| train_cfg=None, | ||
| test_cfg=dict( | ||
| nms_pre=2000, | ||
| min_bbox_size=0, | ||
| score_thr=0.05, | ||
| nms=dict(iou_thr=0.1), | ||
| max_per_img=2000)) | ||
|
|
||
| img_norm_cfg = dict( | ||
| mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) | ||
| train_pipeline = [ | ||
| dict(type='LoadImageFromFile'), | ||
| dict(type='LoadAnnotations', with_bbox=True), | ||
| dict(type='RResize', img_scale=(1024, 1024)), | ||
| dict( | ||
| type='RRandomFlip', | ||
| flip_ratio=[0.25, 0.25, 0.25], | ||
| direction=['horizontal', 'vertical', 'diagonal'], | ||
| version=angle_version), | ||
| dict(type='Normalize', **img_norm_cfg), | ||
| dict(type='Pad', size_divisor=32), | ||
| dict(type='DefaultFormatBundle'), | ||
| dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
| ] | ||
| data = dict( | ||
| train=dict(pipeline=train_pipeline, version=angle_version), | ||
| val=dict(version=angle_version), | ||
| test=dict(version=angle_version)) |
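The config above resizes images to 1024x1024 and predicts on five feature levels with `strides=[8, 16, 32, 64, 128]`. Since FCOS predicts per feature-map location, a rough count of locations per level can be sketched as follows. This assumes one location per feature-map cell and ceiling division for the feature size. The function `fcos_points_per_level` is a hypothetical helper.

```python
import math


def fcos_points_per_level(img_size, strides):
    """Return {stride: (feat_h, feat_w, num_points)} for one image size."""
    height, width = img_size
    levels = {}
    for stride in strides:
        feat_h = math.ceil(height / stride)
        feat_w = math.ceil(width / stride)
        levels[stride] = (feat_h, feat_w, feat_h * feat_w)
    return levels


levels = fcos_points_per_level((1024, 1024), [8, 16, 32, 64, 128])
for stride, (feat_h, feat_w, num_points) in levels.items():
    print(stride, feat_h, feat_w, num_points)
print(sum(n for _, _, n in levels.values()))  # 21824
```

Under these assumptions, most locations sit on the stride-8 level (16384 of 21824).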
83 changes: 83 additions & 0 deletions
configs/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| _base_ = [ | ||
| '../_base_/datasets/dotav1.py', '../_base_/schedules/schedule_1x.py', | ||
| '../_base_/default_runtime.py' | ||
| ] | ||
| angle_version = 'le90' | ||
|
|
||
| # model settings | ||
| model = dict( | ||
| type='RotatedFCOS', | ||
| backbone=dict( | ||
| type='ResNet', | ||
| depth=50, | ||
| num_stages=4, | ||
| out_indices=(0, 1, 2, 3), | ||
| frozen_stages=1, | ||
| zero_init_residual=False, | ||
| norm_cfg=dict(type='BN', requires_grad=True), | ||
| norm_eval=True, | ||
| style='pytorch', | ||
| init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')), | ||
| neck=dict( | ||
| type='FPN', | ||
| in_channels=[256, 512, 1024, 2048], | ||
| out_channels=256, | ||
| start_level=1, | ||
| add_extra_convs='on_output', # use P5 | ||
| num_outs=5, | ||
| relu_before_extra_convs=True), | ||
| bbox_head=dict( | ||
| type='RotatedFCOSHead', | ||
| num_classes=15, | ||
| in_channels=256, | ||
| stacked_convs=4, | ||
| feat_channels=256, | ||
| strides=[8, 16, 32, 64, 128], | ||
| center_sampling=True, | ||
| center_sample_radius=1.5, | ||
| norm_on_bbox=True, | ||
| centerness_on_reg=True, | ||
| separate_angle=True, | ||
| scale_angle=True, | ||
| bbox_coder=dict( | ||
| type='DistanceAnglePointCoder', angle_version=angle_version), | ||
| h_bbox_coder=dict(type='DistancePointBBoxCoder'), | ||
| loss_cls=dict( | ||
| type='FocalLoss', | ||
| use_sigmoid=True, | ||
| gamma=2.0, | ||
| alpha=0.25, | ||
| loss_weight=1.0), | ||
| loss_bbox=dict(type='GIoULoss', loss_weight=1.0), | ||
| loss_angle=dict(type='L1Loss', loss_weight=0.2), | ||
| loss_centerness=dict( | ||
| type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), | ||
| # training and testing settings | ||
| train_cfg=None, | ||
| test_cfg=dict( | ||
| nms_pre=2000, | ||
| min_bbox_size=0, | ||
| score_thr=0.05, | ||
| nms=dict(iou_thr=0.1), | ||
| max_per_img=2000)) | ||
|
|
||
| img_norm_cfg = dict( | ||
| mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) | ||
| train_pipeline = [ | ||
| dict(type='LoadImageFromFile'), | ||
| dict(type='LoadAnnotations', with_bbox=True), | ||
| dict(type='RResize', img_scale=(1024, 1024)), | ||
| dict( | ||
| type='RRandomFlip', | ||
| flip_ratio=[0.25, 0.25, 0.25], | ||
| direction=['horizontal', 'vertical', 'diagonal'], | ||
| version=angle_version), | ||
| dict(type='Normalize', **img_norm_cfg), | ||
| dict(type='Pad', size_divisor=32), | ||
| dict(type='DefaultFormatBundle'), | ||
| dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
| ] | ||
| data = dict( | ||
| train=dict(pipeline=train_pipeline, version=angle_version), | ||
| val=dict(version=angle_version), | ||
| test=dict(version=angle_version)) |
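This config pairs a rotated coder (`DistanceAnglePointCoder`) with a horizontal one (`h_bbox_coder`), which matches the README note that the bbox loss is a horizontal loss when the angle is handled separately. The sketch below shows how a point with four side distances can be decoded either way. It is an illustration under assumed conventions (distances ordered left, top, right, bottom; angle in radians; no stride normalisation) and does not reproduce the real coders.

```python
import math


def decode_horizontal(point, distances):
    """Decode (left, top, right, bottom) distances into (x1, y1, x2, y2)."""
    x, y = point
    left, top, right, bottom = distances
    return (x - left, y - top, x + right, y + bottom)


def decode_rotated(point, distances, angle):
    """Decode distances plus an angle into (cx, cy, w, h, angle)."""
    x, y = point
    left, top, right, bottom = distances
    width = left + right
    height = top + bottom
    # Offset from the point to the box centre in the box's own frame.
    dx = (right - left) / 2.0
    dy = (bottom - top) / 2.0
    # Rotate that offset into image coordinates.
    cx = x + dx * math.cos(angle) - dy * math.sin(angle)
    cy = y + dx * math.sin(angle) + dy * math.cos(angle)
    return (cx, cy, width, height, angle)


print(decode_horizontal((10, 10), (2, 3, 4, 5)))    # (8, 7, 14, 15)
print(decode_rotated((10, 10), (2, 3, 4, 5), 0.0))  # (11.0, 11.0, 6, 8, 0.0)
```

With a zero angle, the rotated box has the same extent as the horizontal one, so the horizontal IoU-style loss and the separate angle loss each supervise one part of the same prediction.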