fix dummy-load deepseek nextn#5739

Closed
lambert0312 wants to merge 11 commits into sgl-project:main from lambert0312:fix_dummy_load_deepseek_nextn

Conversation

@lambert0312
Contributor

@lambert0312 lambert0312 commented Apr 25, 2025

Motivation

Ref #4535

Modifications

Clean up the load logic in deepseek_nextn.py.

Checklist

@ispobock
Collaborator

Could you add accuracy test results?

@lambert0312
Contributor Author

> Could you add accuracy test results?

@ispobock The accuracy test results are as follows:

  • gsm8k
      Accuracy: 0.950
      Invalid: 0.000
      Latency: 14.089 s
      Output throughput: 1527.533 token/s
  • mmlu
      Total latency: 146.207
      Average accuracy: 0.875

@ispobock
Collaborator

I think we can merge this PR first. I recently refactored the nextn part, which may conflict with this PR. I can rebase my changes after this one is merged.
cc: @merrymercy

@lambert0312
Contributor Author

> I think we can merge this PR first. I recently refactored the nextn part, which may conflict with this PR. I can rebase my changes after this one is merged. cc: @merrymercy

Can this be merged? @merrymercy @ispobock @zhyncs

@lambert0312
Contributor Author

@ispobock I see your PR #5793 has been merged. My PR seems unnecessary. I will close it.
