open-mmlab · ZwwWayne · Jun 6, 2022 · Apr 9, 2022 · Apr 10, 2022 · Apr 13, 2022
diff --git a/README.md b/README.md
@@ -98,6 +98,7 @@ A summary can be found in the [Model Zoo](docs/en/model_zoo.md) page.
 - [x] [Rotated RetinaNet-OBB/HBB](configs/rotated_retinanet/README.md) (ICCV'2017)
 - [x] [Rotated FasterRCNN-OBB](configs/rotated_faster_rcnn/README.md) (TPAMI'2017)
 - [x] [Rotated RepPoints-OBB](configs/rotated_reppoints/README.md) (ICCV'2019)
+- [x] [Rotated FCOS](configs/rotated_fcos/README.md) (ICCV'2019)
 - [x] [RoI Transformer](configs/roi_trans/README.md) (CVPR'2019)
 - [x] [Gliding Vertex](configs/gliding_vertex/README.md) (TPAMI'2020)
 - [x] [Rotated ATSS-OBB](configs/rotated_atss/README.md) (CVPR'2020)

diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -95,6 +95,7 @@ MMRotate also provides other, more detailed tutorials:
 - [x] [Rotated RetinaNet-OBB/HBB](configs/rotated_retinanet/README.md) (ICCV'2017)
 - [x] [Rotated FasterRCNN-OBB](configs/rotated_faster_rcnn/README.md) (TPAMI'2017)
 - [x] [Rotated RepPoints-OBB](configs/rotated_reppoints/README.md) (ICCV'2019)
+- [x] [Rotated FCOS](configs/rotated_fcos/README.md) (ICCV'2019)
 - [x] [RoI Transformer](configs/roi_trans/README.md) (CVPR'2019)
 - [x] [Gliding Vertex](configs/gliding_vertex/README.md) (TPAMI'2020)
 - [x] [Rotated ATSS-OBB](configs/rotated_atss/README.md) (CVPR'2020)

diff --git a/configs/rotated_fcos/README.md b/configs/rotated_fcos/README.md
@@ -0,0 +1,54 @@
+# Rotated FCOS
+
+> [FCOS: Fully Convolutional One-Stage Object Detection](https://arxiv.org/abs/1904.01355)
+
+<!-- [ALGORITHM] -->
+
+## Abstract
+
+<div align=center>
+<img src="https://user-images.githubusercontent.com/40661020/143882011-45b234bc-d04b-4bbe-a822-94bec057ac86.png"/>
+</div>
+
+We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction
+fashion, analogue to semantic segmentation. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3,
+and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well
+as proposal free. By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation
+related to anchor boxes such as calculating overlapping during training. More importantly, we also avoid all
+hyper-parameters related to anchor boxes, which are often very sensitive to the final detection performance. With the
+only post-processing non-maximum suppression (NMS), FCOS with ResNeXt-64x4d-101 achieves 44.7% in AP with single-model
+and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the
+first time, we demonstrate a much simpler and flexible detection framework achieving improved detection accuracy. We
+hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.
+
+## Results and Models
+
+DOTA1.0
+
+|         Backbone         |  mAP  | Angle | Separate Angle | Tricks | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size |                                                Configs                                                |                                                                                                                                                                                   Download                                                                                                                                                                                   |
+| :----------------------: | :---: | :---: | :------------: | :----: | :-----: | :------: | :------------: | :-: | :--------: | :---------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| ResNet50 (1024,1024,200) | 70.70 | le90  |       Y        |   Y    |   1x    |   4.18   |      26.4      |  -  |     2      |    [rotated_fcos_sep_angle_r50_fpn_1x_dota_le90](./rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py)    |       [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90-0be71a0c.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90_20220409_023250.log.json)       |
+| ResNet50 (1024,1024,200) | 71.28 | le90  |       N        |   Y    |   1x    |   4.18   |      25.9      |  -  |     2      |              [rotated_fcos_r50_fpn_1x_dota_le90](./rotated_fcos_r50_fpn_1x_dota_le90.py)              |                           [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90-d87568ed.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90_20220413_163526.log.json)                           |
+| ResNet50 (1024,1024,200) | 71.76 | le90  |       Y        |   Y    |   1x    |   4.23   |      25.7      |  -  |     2      | [rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90](./rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py) | [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90-4e044ad2.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90_20220409_080616.log.json) |
+| ResNet50 (1024,1024,200) | 71.89 | le90  |       N        |   Y    |   1x    |   4.18   |      26.2      |  -  |     2      |          [rotated_fcos_kld_r50_fpn_1x_dota_le90](./rotated_fcos_kld_r50_fpn_1x_dota_le90.py)          |                   [model](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90-ecafdb2b.pth) \| [log](https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90_20220409_202939.log.json)                   |
+
+**Notes:**
+
+- `MS` means multiple scale image split.
+- `RR` means random rotation.
+- `Rotated IoU Loss` requires mmcv version 1.5.0 or above.
+- `Separate Angle` means the angle loss is calculated separately from the bbox loss.
+  In this case the bbox loss is a horizontal bbox loss such as `IoULoss` or `GIoULoss`.
+- `Tricks` means setting `norm_on_bbox`, `centerness_on_reg`, and `center_sampling` to `True`.
+- `Inf Time` was measured on a single RTX 3090.
+
+## Citation
+
+```
+@article{tian2019fcos,
+  title={FCOS: Fully Convolutional One-Stage Object Detection},
+  author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+  journal={arXiv preprint arXiv:1904.01355},
+  year={2019}
+}
+```
diff --git a/configs/rotated_fcos/metafile.yml b/configs/rotated_fcos/metafile.yml
@@ -0,0 +1,63 @@
+Collections:
+- Name: rotated_fcos
+  Metadata:
+    Training Data: DOTAv1.0
+    Training Techniques:
+      - SGD with Momentum
+      - Weight Decay
+    Training Resources: 1x Tesla V100
+    Architecture:
+      - ResNet
+  Paper:
+    URL: https://arxiv.org/abs/1904.01355
+    Title: 'FCOS: Fully Convolutional One-Stage Object Detection'
+  README: configs/rotated_fcos/README.md
+
+Models:
+  - Name: rotated_fcos_sep_angle_r50_fpn_1x_dota_le90
+    In Collection: rotated_fcos
+    Config: configs/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py
+    Metadata:
+      Training Data: DOTAv1.0
+    Results:
+      - Task: Oriented Object Detection
+        Dataset: DOTAv1.0
+        Metrics:
+          mAP: 70.70
+    Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90-0be71a0c.pth
+
+  - Name: rotated_fcos_r50_fpn_1x_dota_le90
+    In Collection: rotated_fcos
+    Config: configs/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90.py
+    Metadata:
+      Training Data: DOTAv1.0
+    Results:
+      - Task: Oriented Object Detection
+        Dataset: DOTAv1.0
+        Metrics:
+          mAP: 71.28
+    Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90/rotated_fcos_r50_fpn_1x_dota_le90-d87568ed.pth
+
+  - Name: rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90
+    In Collection: rotated_fcos
+    Config: configs/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py
+    Metadata:
+      Training Data: DOTAv1.0
+    Results:
+      - Task: Oriented Object Detection
+        Dataset: DOTAv1.0
+        Metrics:
+          mAP: 71.76
+    Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90-4e044ad2.pth
+
+  - Name: rotated_fcos_kld_r50_fpn_1x_dota_le90
+    In Collection: rotated_fcos
+    Config: configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py
+    Metadata:
+      Training Data: DOTAv1.0
+    Results:
+      - Task: Oriented Object Detection
+        Dataset: DOTAv1.0
+        Metrics:
+          mAP: 71.89
+    Weights: https://download.openmmlab.com/mmrotate/v0.1.0/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90/rotated_fcos_kld_r50_fpn_1x_dota_le90-ecafdb2b.pth
diff --git a/configs/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py b/configs/rotated_fcos/rotated_fcos_csl_gaussian_r50_fpn_1x_dota_le90.py
@@ -0,0 +1,30 @@
+_base_ = 'rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py'
+angle_version = 'le90'
+
+# model settings
+model = dict(
+    bbox_head=dict(
+        type='CSLRFCOSHead',
+        center_sampling=True,
+        center_sample_radius=1.5,
+        norm_on_bbox=True,
+        centerness_on_reg=True,
+        separate_angle=True,
+        scale_angle=False,
+        angle_coder=dict(
+            type='CSLCoder',
+            angle_version=angle_version,
+            omega=1,
+            window='gaussian',
+            radius=1),
+        loss_cls=dict(
+            type='FocalLoss',
+            use_sigmoid=True,
+            gamma=2.0,
+            alpha=0.25,
+            loss_weight=1.0),
+        loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
+        loss_centerness=dict(
+            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+        loss_angle=dict(
+            type='SmoothFocalLoss', gamma=2.0, alpha=0.25, loss_weight=0.2)), )
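
The CSL config above treats the angle as a classification target: `CSLCoder` is set up with `omega=1`, `window='gaussian'`, and `radius=1`. As a rough illustration of what a circular smooth label with a Gaussian window looks like, here is a minimal standard-library sketch. The function name `csl_gaussian_encode` and its defaults (180 one-degree bins, a 90-degree offset for `le90`) are assumptions made for this example; this is not mmrotate's `CSLCoder` implementation and may differ from it in detail.

```python
import math


def csl_gaussian_encode(angle_rad, angle_range=180, angle_offset=90,
                        omega=1, radius=1.0):
    """Encode one angle (radians) as a circular smooth label.

    Assumes the 'le90' convention, i.e. angles in [-pi/2, pi/2).
    """
    num_bins = int(angle_range // omega)
    # Shift the angle into [0, angle_range) degrees and pick its bin.
    angle_deg = math.degrees(angle_rad)
    target_bin = int((angle_deg + angle_offset) / omega) % num_bins

    label = [0.0] * num_bins
    half = num_bins // 2
    for offset in range(-half, num_bins - half):
        # Gaussian weight by distance from the target bin. The modulo wraps
        # the window around, so the first and last bins are neighbours.
        weight = math.exp(-(offset ** 2) / (2 * radius ** 2))
        label[(target_bin + offset) % num_bins] = weight
    return label


label = csl_gaussian_encode(0.0)
print(len(label), label[90])  # 180 bins, peak of 1.0 at the target bin
```

The wrap-around is the point of the encoding: an angle near -90 degrees still gives non-zero weight to bins near +90 degrees, so the classification loss does not penalise predictions that are close across the periodic boundary.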
diff --git a/configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py b/configs/rotated_fcos/rotated_fcos_kld_r50_fpn_1x_dota_le90.py
@@ -0,0 +1,11 @@
+_base_ = 'rotated_fcos_r50_fpn_1x_dota_le90.py'
+
+model = dict(
+    bbox_head=dict(
+        loss_bbox=dict(
+            _delete_=True,  # replace, not merge, the base loss_bbox
+            type='GDLoss_v1',
+            loss_type='kld',
+            fun='log1p',
+            tau=1,
+            loss_weight=1.0)), )
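
The KLD config above relies on config inheritance: it names a `_base_` file and overrides only `loss_bbox`, using `_delete_=True` so that the inherited loss dict is replaced rather than merged. The sketch below mimics that merge rule in plain Python to show the effect. `merge_config` is a hypothetical helper written for this example, not mmcv's `Config` code, and the `eps` key in the base dict is an invented extra argument used only to make the difference visible.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    A dict in `override` carrying `_delete_=True` replaces the matching
    base dict wholesale instead of being merged key by key.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict):
            value = dict(value)  # avoid mutating the caller's dict
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                merged[key] = merge_config(merged[key], value)
            else:
                merged[key] = merge_config({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(bbox_head=dict(
    type='RotatedFCOSHead',
    # `eps` is a hypothetical extra key, added for illustration only.
    loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0, eps=1e-6)))
override = dict(bbox_head=dict(
    loss_bbox=dict(_delete_=True, type='GDLoss_v1', loss_type='kld',
                   fun='log1p', tau=1, loss_weight=1.0)))

merged = merge_config(base, override)
print(merged['bbox_head']['type'])       # RotatedFCOSHead
print(merged['bbox_head']['loss_bbox'])  # only the GDLoss_v1 keys remain
```

With the two keys the base config in this PR actually sets (`type` and `loss_weight`), a plain merge would yield the same dict. `_delete_=True` guards against stale keys being passed to `GDLoss_v1` if the base loss ever gains more arguments.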
diff --git a/configs/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90.py b/configs/rotated_fcos/rotated_fcos_r50_fpn_1x_dota_le90.py
@@ -0,0 +1,81 @@
+_base_ = [
+    '../_base_/datasets/dotav1.py', '../_base_/schedules/schedule_1x.py',
+    '../_base_/default_runtime.py'
+]
+angle_version = 'le90'
+
+# model settings
+model = dict(
+    type='RotatedFCOS',
+    backbone=dict(
+        type='ResNet',
+        depth=50,
+        num_stages=4,
+        out_indices=(0, 1, 2, 3),
+        frozen_stages=1,
+        zero_init_residual=False,
+        norm_cfg=dict(type='BN', requires_grad=True),
+        norm_eval=True,
+        style='pytorch',
+        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
+    neck=dict(
+        type='FPN',
+        in_channels=[256, 512, 1024, 2048],
+        out_channels=256,
+        start_level=1,
+        add_extra_convs='on_output',  # use P5
+        num_outs=5,
+        relu_before_extra_convs=True),
+    bbox_head=dict(
+        type='RotatedFCOSHead',
+        num_classes=15,
+        in_channels=256,
+        stacked_convs=4,
+        feat_channels=256,
+        strides=[8, 16, 32, 64, 128],
+        center_sampling=True,
+        center_sample_radius=1.5,
+        norm_on_bbox=True,
+        centerness_on_reg=True,
+        separate_angle=False,
+        scale_angle=True,
+        bbox_coder=dict(
+            type='DistanceAnglePointCoder', angle_version=angle_version),
+        loss_cls=dict(
+            type='FocalLoss',
+            use_sigmoid=True,
+            gamma=2.0,
+            alpha=0.25,
+            loss_weight=1.0),
+        loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0),
+        loss_centerness=dict(
+            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
+    # training and testing settings
+    train_cfg=None,
+    test_cfg=dict(
+        nms_pre=2000,
+        min_bbox_size=0,
+        score_thr=0.05,
+        nms=dict(iou_thr=0.1),
+        max_per_img=2000))
+
+img_norm_cfg = dict(
+    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations', with_bbox=True),
+    dict(type='RResize', img_scale=(1024, 1024)),
+    dict(
+        type='RRandomFlip',
+        flip_ratio=[0.25, 0.25, 0.25],
+        direction=['horizontal', 'vertical', 'diagonal'],
+        version=angle_version),
+    dict(type='Normalize', **img_norm_cfg),
+    dict(type='Pad', size_divisor=32),
+    dict(type='DefaultFormatBundle'),
+    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
+]
+data = dict(
+    train=dict(pipeline=train_pipeline, version=angle_version),
+    val=dict(version=angle_version),
+    test=dict(version=angle_version))
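
In the config above the head regresses, for each feature-map location, four side distances plus an angle, and `DistanceAnglePointCoder` turns them into a rotated box. The following sketch shows one plausible way to do that decoding for a single location. It assumes the distances are ordered (left, top, right, bottom) and measured in the box's own rotated frame; `decode_rotated_box` is a hypothetical name, and mmrotate's coder may differ in details such as angle normalisation.

```python
import math


def decode_rotated_box(point, distances, angle):
    """Turn a location, four side distances and an angle into a rotated box.

    Returns (cx, cy, w, h, angle). `distances` is (left, top, right, bottom)
    in the box's rotated frame; `angle` is in radians.
    """
    px, py = point
    left, top, right, bottom = distances
    width, height = left + right, top + bottom
    # Offset from the sampling point to the box centre, in the box frame.
    dx, dy = (right - left) / 2.0, (bottom - top) / 2.0
    # Rotate that offset into image coordinates.
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx = px + cos_a * dx - sin_a * dy
    cy = py + sin_a * dx + cos_a * dy
    return cx, cy, width, height, angle


print(decode_rotated_box((10.0, 10.0), (2.0, 1.0, 4.0, 3.0), 0.0))
```

With a zero angle this reduces to the usual FCOS decoding of an axis-aligned box, which is why the same distance targets can be reused when the angle is handled separately.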
diff --git a/configs/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py b/configs/rotated_fcos/rotated_fcos_sep_angle_r50_fpn_1x_dota_le90.py
@@ -0,0 +1,83 @@
+_base_ = [
+    '../_base_/datasets/dotav1.py', '../_base_/schedules/schedule_1x.py',
+    '../_base_/default_runtime.py'
+]
+angle_version = 'le90'
+
+# model settings
+model = dict(
+    type='RotatedFCOS',
+    backbone=dict(
+        type='ResNet',
+        depth=50,
+        num_stages=4,
+        out_indices=(0, 1, 2, 3),
+        frozen_stages=1,
+        zero_init_residual=False,
+        norm_cfg=dict(type='BN', requires_grad=True),
+        norm_eval=True,
+        style='pytorch',
+        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
+    neck=dict(
+        type='FPN',
+        in_channels=[256, 512, 1024, 2048],
+        out_channels=256,
+        start_level=1,
+        add_extra_convs='on_output',  # use P5
+        num_outs=5,
+        relu_before_extra_convs=True),
+    bbox_head=dict(
+        type='RotatedFCOSHead',
+        num_classes=15,
+        in_channels=256,
+        stacked_convs=4,
+        feat_channels=256,
+        strides=[8, 16, 32, 64, 128],
+        center_sampling=True,
+        center_sample_radius=1.5,
+        norm_on_bbox=True,
+        centerness_on_reg=True,
+        separate_angle=True,
+        scale_angle=True,
+        bbox_coder=dict(
+            type='DistanceAnglePointCoder', angle_version=angle_version),
+        h_bbox_coder=dict(type='DistancePointBBoxCoder'),
+        loss_cls=dict(
+            type='FocalLoss',
+            use_sigmoid=True,
+            gamma=2.0,
+            alpha=0.25,
+            loss_weight=1.0),
+        loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
+        loss_angle=dict(type='L1Loss', loss_weight=0.2),
+        loss_centerness=dict(
+            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
+    # training and testing settings
+    train_cfg=None,
+    test_cfg=dict(
+        nms_pre=2000,
+        min_bbox_size=0,
+        score_thr=0.05,
+        nms=dict(iou_thr=0.1),
+        max_per_img=2000))
+
+img_norm_cfg = dict(
+    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations', with_bbox=True),
+    dict(type='RResize', img_scale=(1024, 1024)),
+    dict(
+        type='RRandomFlip',
+        flip_ratio=[0.25, 0.25, 0.25],
+        direction=['horizontal', 'vertical', 'diagonal'],
+        version=angle_version),
+    dict(type='Normalize', **img_norm_cfg),
+    dict(type='Pad', size_divisor=32),
+    dict(type='DefaultFormatBundle'),
+    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
+]
+data = dict(
+    train=dict(pipeline=train_pipeline, version=angle_version),
+    val=dict(version=angle_version),
+    test=dict(version=angle_version))
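
The separate-angle config above pairs a horizontal `GIoULoss` (weight 1.0) with an `L1Loss` on the angle (weight 0.2). A minimal sketch of how those two terms could be computed for one prediction is shown below. It assumes the predicted and ground-truth boxes have already been decoded without their angle into (x1, y1, x2, y2) form and have positive area. Both function names are hypothetical, and this is not the `RotatedFCOSHead` loss code.

```python
def giou_loss(pred, target):
    """1 - GIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    # Smallest axis-aligned box enclosing both inputs.
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    giou = inter / union - (enclose - union) / enclose
    return 1.0 - giou


def separate_angle_loss(pred_box, pred_angle, gt_box, gt_angle,
                        bbox_weight=1.0, angle_weight=0.2):
    """Return the weighted (bbox term, angle term) for one sample."""
    bbox_term = bbox_weight * giou_loss(pred_box, gt_box)
    angle_term = angle_weight * abs(pred_angle - gt_angle)  # L1 on the angle
    return bbox_term, angle_term


print(separate_angle_loss((0.0, 0.0, 2.0, 2.0), 0.1,
                          (1.0, 1.0, 3.0, 3.0), 0.3))
```

Keeping the two terms apart means the box term never sees the angle, so any horizontal IoU-style loss can be used unchanged; the trade-off is that the loss no longer measures rotated overlap directly, which is what `RotatedIoULoss` does in the non-separate config.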