Add ASFF (three fused feature layers) in the Head for V5 (s, m, l, x)

## 🚀 Feature
Add ASFF fusion layers to the Head: the feature maps at levels 1 to 3 are fused into three output feature maps, one per scale, and the fusion weights are adjusted adaptively.
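To make the idea concrete, here is a minimal pure-Python sketch of the fusion step at a single target level. It assumes the three input maps have already been resized to the same resolution and that one raw score (logit) per level and pixel is given directly. In ASFF these scores are produced by convolutions on the resized features, which is omitted here. The function names `softmax` and `asff_fuse` are illustrative only and are not the names used in the linked `common.py`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def asff_fuse(features, weight_logits):
    """Fuse per-level feature maps of identical shape.

    features:      list of maps (one per level), each H x W nested lists,
                   already resized to the target resolution (assumption).
    weight_logits: list of maps (one per level), each H x W, holding one
                   raw score per pixel (assumption: given, not learned here).
    """
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Per-pixel weights over the levels; they sum to 1.
            weights = softmax([logit[i][j] for logit in weight_logits])
            fused[i][j] = sum(w * f[i][j] for w, f in zip(weights, features))
    return fused


# Three hypothetical 1x2 single-channel maps at the same resolution.
features = [[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]]]
# Equal scores at pixel 0; the third level is strongly preferred at pixel 1.
logits = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 20.0]]]
fused = asff_fuse(features, logits)
```

With equal scores the output is the plain average of the three levels; when one score dominates, the output follows that level almost entirely. This is what "adaptively adjusted" means at each spatial location.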

## Motivation
1.  Follow the feature-fusion approach of yolov3_asff. [paper](https://arxiv.org/abs/1911.09516)
2.  Add four optional yolov5_asff model structures (as yaml files).
3.  ASFF suits the YOLO series well, and after reading the paper I find its rationale well explained. It can be offered as an alternative structure for V5.
4.  Integrate ASFF into the project; I hope this can be a contribution to the YOLOv5 project.
## Pitch
I added an ASFFV5 class at line 310 of https://github.com/positive666/yolov5/blob/master/models/common.py.
ASFF layer structures are added for yolov5 (s, m, l, x) and integrated into the YOLOv5 code. The implementation differs from v3_asff and adds an RFB block. For example, yolov5s.yaml:
```
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, ASFFV5, [0, 512, 0.5]],  # 24
   [[17, 20, 23], 1, ASFFV5, [1, 256, 0.5]],  # 25
   [[17, 20, 23], 1, ASFFV5, [2, 128, 0.5]],  # 26
   # [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5) without ASFF
   [[26, 25, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
```
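The layer indices in this head are easy to get wrong, so here is a small sanity check as a sketch. It assumes the backbone occupies indices 0 to 9 (consistent with the `# 13`, `# 17`, `# 20`, `# 23` comments in the yaml) and that a `from` value of `-1` means "the previous layer". The head is copied by hand as `(from, module)` pairs; `HEAD`, `BACKBONE_LAYERS`, `index_layers` and `resolve_inputs` are hypothetical names used only for this check.

```python
# Head layers from the yolov5s.yaml excerpt, reduced to (from, module).
HEAD = [
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "C3"),
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "C3"),
    (-1, "Conv"), ([-1, 14], "Concat"), (-1, "C3"),
    (-1, "Conv"), ([-1, 10], "Concat"), (-1, "C3"),
    ([17, 20, 23], "ASFFV5"),
    ([17, 20, 23], "ASFFV5"),
    ([17, 20, 23], "ASFFV5"),
    ([26, 25, 24], "Detect"),
]
BACKBONE_LAYERS = 10  # assumption: backbone layers are indices 0-9


def index_layers(head, offset):
    """Return {absolute layer index: module name} for the head."""
    return {offset + i: module for i, (_, module) in enumerate(head)}


def resolve_inputs(head, offset):
    """Return {absolute layer index: list of absolute input indices}."""
    resolved = {}
    for i, (src, _) in enumerate(head):
        idx = offset + i
        sources = src if isinstance(src, list) else [src]
        # Negative values are relative to the current layer.
        resolved[idx] = [idx + s if s < 0 else s for s in sources]
    return resolved


layers = index_layers(HEAD, BACKBONE_LAYERS)
inputs = resolve_inputs(HEAD, BACKBONE_LAYERS)
detect_index = max(layers)
asff_indices = sorted(i for i, m in layers.items() if m == "ASFFV5")
```

Under these assumptions the three ASFFV5 layers land on indices 24, 25 and 26, each reading from the C3 outputs 17, 20 and 23, and Detect reads only from the ASFFV5 outputs.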
## ASFF Interpretability  
The paper also explains why the fusion weights are computed from the feature maps themselves through a convolution: the appropriate fusion weight at each location is closely tied to the features at that location.
![Image](https://github.com/ruinmessi/ASFF/raw/master/doc/asff.png)
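For reference, the fusion rule can be written as follows. The notation follows the ASFF paper as I read it and should be checked against the original: $x^{n \to l}_{ij}$ is the feature at position $(i, j)$ resized from level $n$ to level $l$, and $\lambda^{l}_{\alpha,ij}$, $\lambda^{l}_{\beta,ij}$, $\lambda^{l}_{\gamma,ij}$ are the per-pixel scores produced by convolutions on the resized features.

$$
y^{l}_{ij} = \alpha^{l}_{ij}\, x^{1 \to l}_{ij} + \beta^{l}_{ij}\, x^{2 \to l}_{ij} + \gamma^{l}_{ij}\, x^{3 \to l}_{ij},
\qquad \alpha^{l}_{ij} + \beta^{l}_{ij} + \gamma^{l}_{ij} = 1
$$

$$
\alpha^{l}_{ij} = \frac{e^{\lambda^{l}_{\alpha,ij}}}{e^{\lambda^{l}_{\alpha,ij}} + e^{\lambda^{l}_{\beta,ij}} + e^{\lambda^{l}_{\gamma,ij}}}
$$

$\beta^{l}_{ij}$ and $\gamma^{l}_{ij}$ are defined the same way, so the three weights at each pixel are a softmax over the levels.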
## COCO

System | test-dev mAP | Time (V100) | Time (2080ti)
-- | -- | -- | --
YOLOv3 608 | 33.0 | 20 ms | 26 ms
YOLOv3 608 + BoFs | 37.0 | 20 ms | 26 ms
YOLOv3 608 (our baseline) | 38.8 | 20 ms | 26 ms
YOLOv3 608 + ASFF | 40.6 | 22 ms | 30 ms
YOLOv3 608 + ASFF* | 42.4 | 22 ms | 30 ms
YOLOv3 800 + ASFF* | 43.9 | 34 ms | 38 ms
YOLOv3 MobileNetV1 416 + BoFs | 28.6 | - | 22 ms
YOLOv3 MobileNetV2 416 (our baseline) | 29.0 | - | 22 ms
YOLOv3 MobileNetV2 416 + ASFF | 30.6 | - | 24 ms




I also plan to add some other tricks, such as an IoU-aware approach and transformer-based ideas. I will run experiments and make further changes in the future.

