Skip to content

Add ASFF (three fuse feature layers) int the Head for V5(s,m,l,x) #2348

@positive666

Description

@positive666

🚀 Feature

Add ASFF fuse feature layers to the Head : the level1-level 3 scale maps are respectively fused into 3 corresponding scale feature maps, and the fusion weights are adaptively adjusted.

Motivation

  1. Refer to the feature fusion case of yolov3_asff. paper
  2. Add optional four yolov5_asff models structure (in yaml file )
  3. The ASFF method is very suitable for the YOLO series, and through reading the paper, I found that it has a reasonable explanatory nature. It can be incorporated into an alternative structure of V5.
  4. Integrate ASFF functions into the project and hope to make a contribution for yoloV5 project

Pitch

I add ASFFV5 classes at 310 line in https://github.com/positive666/yolov5/blob/master/models/common.py :
Add asff layers structure for yolov5(s,m,x,l),Integrated into YOLOV5's code project. and different more than v3_asff and add RFB block.such as, yolov5s.yaml:

head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17,20,23], 1, ASFFV5, [0, 512, 0.5 ]],   
   [[17,20,23], 1, ASFFV5, [1, 256, 0.5 ]],   
   [[17,20,23], 1, ASFFV5, [2, 128 ,0.5]],  
  #[[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  [[26, 25, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

ASFF Interpretability

The paper also explains why the weight parameter of feature fusion comes from output feature + convolution, because the fusion weight parameter and feature are closely related .
Image

COCO

System test-dev mAP Time (V100) Time (2080ti)
YOLOv3 608 33.0 20ms 26ms
YOLOv3 608+ BoFs 37.0 20ms 26ms
YOLOv3 608 (our baseline) 38.8 20ms 26ms
YOLOv3 608+ ASFF 40.6 22ms 30ms
YOLOv3 608+ ASFF* 42.4 22ms 30ms
YOLOv3 800+ ASFF* 43.9 34ms 38ms
YOLOv3 MobileNetV1 416 + BoFs 28.6 - 22 ms
YOLOv3 MobileNetV2 416 (our baseline) 29.0 - 22 ms
YOLOv3 MobileNetV2 416 +ASFF 30.6 - 24 ms

I also plan to add some other tricks, such as aware IOU, and other transformer idea etc., I will conduct some experiments and changes in the future

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions