pt: fix se_a type_one_side performance degradation #3361

Merged
wanghan-iapcm merged 1 commit into deepmodeling:devel from njzjz:pt-fix-se-a-performance
Feb 29, 2024

@njzjz
Member

@njzjz njzjz commented Feb 28, 2024

The code in this PR is ugly, but applying a mask causes a performance degradation of ~3 ms/step.

When applying a mask, aten::nonzero has a high host time, as it causes host-device synchronization:
image

After fixing:
image

See pytorch/pytorch#12461 for more information.
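
To make the cost concrete, here is a minimal sketch of the pattern described above. It is not the code from this PR: the helper names `select_rows` and `masked_sum_fixed_shape` and the toy tensors are hypothetical. Boolean-mask indexing produces an output whose shape depends on the mask's contents, so on a GPU the number of selected rows has to be copied back to the host first. Two common ways to avoid that are to skip the mask entirely when it would select every row, or to keep the shape fixed with `torch.where`.

```python
from typing import Optional

import torch


def select_rows(x: torch.Tensor, mask: Optional[torch.Tensor]) -> torch.Tensor:
    """Return the rows of x selected by mask, or x itself when mask is None."""
    if mask is None:
        # No boolean indexing here, so no aten::nonzero call and no
        # host-device synchronization.
        return x
    # Boolean indexing has a data-dependent output shape; on a CUDA device
    # the count of selected rows must be read back on the host.
    return x[mask]


def masked_sum_fixed_shape(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Sum the selected rows without changing the tensor shape."""
    # Unselected rows are zeroed instead of dropped, so the shape is static.
    zeros = torch.zeros_like(x)
    return torch.where(mask.unsqueeze(-1), x, zeros).sum(dim=0)


# Toy data (assumed for illustration only).
x = torch.arange(12, dtype=torch.float32).reshape(4, 3)
all_true = torch.ones(4, dtype=torch.bool)
partial = torch.tensor([True, False, True, False])

# An all-true mask selects everything, so skipping it gives the same result.
same = torch.equal(select_rows(x, all_true), select_rows(x, None))

# The fixed-shape variant matches the boolean-indexed reduction.
fixed = masked_sum_fixed_shape(x, partial)
indexed = x[partial].sum(dim=0)
```

The first approach only applies when the mask is known in advance to be trivial; the second trades some extra arithmetic on the unselected rows for a static shape.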

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
@codecov

codecov Bot commented Feb 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.86%. Comparing base (2a1508d) to head (aa02b18).

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #3361   +/-   ##
=======================================
  Coverage   75.85%   75.86%           
=======================================
  Files         416      416           
  Lines       34908    34914    +6     
  Branches     1614     1614           
=======================================
+ Hits        26480    26486    +6     
  Misses       7560     7560           
  Partials      868      868           


@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Feb 29, 2024
github-merge-queue Bot pushed a commit that referenced this pull request Feb 29, 2024
The code in this PR is ugly, but applying a mask causes a performance
degradation of ~3 ms/step.

When applying a mask, `aten::nonzero` has a high host time, as it causes
host-device synchronization:

![image](https://github.com/deepmodeling/deepmd-kit/assets/9496702/86b3518c-206d-410d-928e-2f605746147c)

After fixing:

![image](https://github.com/deepmodeling/deepmd-kit/assets/9496702/af9e86fa-7908-4bbb-ace7-58b4602e167f)

See pytorch/pytorch#12461 for more
information.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
@wanghan-iapcm
Collaborator

Does the exclude_mask have a side effect on performance?

@wanghan-iapcm wanghan-iapcm removed this pull request from the merge queue due to a manual request Feb 29, 2024
@njzjz
Member Author

njzjz commented Feb 29, 2024

> Does the exclude_mask have a side effect on performance?

It has a different problem. It seems that integer indexing is slow (see pytorch/pytorch#15245), but I haven't tested it.

@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Feb 29, 2024
Merged via the queue into deepmodeling:devel with commit 48c8818 Feb 29, 2024
@njzjz njzjz mentioned this pull request Apr 2, 2024
2 participants