Description
Feedback from the MLCommons TF on automation and reproducibility suggests extending CM workflows to support the following MLC projects:
- Check how to add network and multi-node code to MLPerf inference and CM automation (in collaboration with the MLC Network TF)
  - Extend MLPerf inference with Flask code, gluing it with our reference client/server code (Python and later C++) and CM wrapping (see the sketch after this list)
  - Address suggestions from Nvidia:
    - --network-server=IP1,IP2...
    - --network-client
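
A minimal sketch of the Flask glue mentioned above, where a server process wraps the reference backend and a client forwards loadgen queries to it. The /predict endpoint, JSON payload format and port are illustrative assumptions, not the actual CM-MLPerf network protocol or flag handling:

```python
# Sketch of a Flask "network server" wrapping a reference backend, plus a client helper.
# Endpoint name, payload layout and port are assumptions for illustration only.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_backend(sample_index):
    # Placeholder for the reference implementation's inference call.
    return [0.0]

@app.route("/predict", methods=["POST"])
def predict():
    query = request.get_json()                    # e.g. {"sample_index": 42}
    result = run_backend(query["sample_index"])   # run inference on the server side
    return jsonify({"output": result})

def client_issue(server_url, sample_index):
    # Client side: forward one loadgen query to a server from --network-server=IP1,IP2...
    r = requests.post(f"{server_url}/predict", json={"sample_index": sample_index})
    return r.json()["output"]

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```
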
- Continue improving the unified CM interface to run MLPerf inference implementations from different vendors (see the illustrative API call after this list)
  - Optimized MLPerf inference implementations:
    - Intel submissions (see Intel docs)
      - Support installation of conda packages in CM
    - Qualcomm submission
      - Add CM scripts to preprocess, calibrate and compile QAIC models for ResNet50, RetinaNet and BERT
      - Test in AWS
      - Test on Thundercomm RB6
      - Automatic model installation from a host device
      - Automatic detection and usage of quantization parameters
    - Nvidia submission
    - Google submission
    - NeuralMagic submission
  - Add the possibility to run any MLPerf implementation, including the reference one
  - Add the possibility to change the target device (e.g. GeForce instead of A100)
  - Expose batch sizes from all existing MLPerf inference reference implementations (when applicable) in the edge category in a unified way for ONNX, PyTorch and TF via the CM interface; report implementations with a hardwired batch size
  - Request from Miro: improve MLPerf inference docs for various backends
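
As an illustration of the unified interface above, a hedged sketch using the CM Python API (cmind.access). The specific tags and input keys (model, implementation, device, backend, batch size) are assumptions based on current CM-MLPerf conventions and may change:

```python
# Hedged sketch: invoking the unified CM-MLPerf interface from Python.
# cmind.access() is the CM Python entry point; the tags and input keys below
# are illustrative, not the final unified interface.
import cmind

r = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'run-mlperf,inference,_find-performance',
    'model': 'resnet50',
    'implementation': 'reference',   # or a vendor implementation: nvidia, intel, qualcomm, ...
    'device': 'cuda',                # e.g. switch to a GeForce instead of an A100
    'backend': 'onnxruntime',
    'batch_size': '32',              # exposed in a unified way when the implementation supports it
    'out': 'con'
})
if r['return'] > 0:
    print(r['error'])
```
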
- Develop a universal CM-MLPerf Docker container to run any implementation with a local dataset and model (similar to the Nvidia and Intel containers, but with a unified CM interface)
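
A hedged sketch of how such a universal container could be launched with a local dataset and model mounted in. The image name, mount points and CM flags inside the container are assumptions, not a published image or final interface:

```python
# Sketch: launch a hypothetical universal CM-MLPerf container with local data/model mounted.
import subprocess

cmd = [
    "docker", "run", "--rm",
    "-v", "/local/datasets/imagenet:/data/imagenet",              # local dataset (hypothetical path)
    "-v", "/local/models/resnet50.onnx:/models/resnet50.onnx",    # local model (hypothetical path)
    "cm-mlperf-universal:latest",                                 # hypothetical image name
    "cm", "run", "script", "--tags=run-mlperf,inference",
    "--model=resnet50", "--implementation=reference",
    "--dataset_path=/data/imagenet", "--model_path=/models/resnet50.onnx",  # assumed flags
]
subprocess.run(cmd, check=True)
```
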
- Prototype new universal CM workflow to run any app on any target (with C++/Android/SSH)
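
For the any-app-on-any-target idea above, the remote (SSH) path could boil down to something like the sketch below; a real CM workflow would detect the target and choose between C++/Android/SSH backends. The host name, paths and binary are hypothetical:

```python
# Sketch: copy a cross-compiled app to a remote target over scp and run it over ssh.
import subprocess

TARGET = "user@embedded-board"          # hypothetical remote target
LOCAL_BIN = "./build/app_benchmark"     # hypothetical cross-compiled binary
REMOTE_BIN = "/tmp/app_benchmark"

def run_on_target():
    # Copy the binary to the target, execute it and capture its output.
    subprocess.run(["scp", LOCAL_BIN, f"{TARGET}:{REMOTE_BIN}"], check=True)
    out = subprocess.run(["ssh", TARGET, REMOTE_BIN],
                         check=True, capture_output=True, text=True)
    return out.stdout

if __name__ == "__main__":
    print(run_on_target())
```
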
- Add support for testing any ONNX+loadgen model with tuning (already prototyped)
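
A hedged sketch of the generic ONNX+loadgen testing mentioned above: any ONNX model wrapped behind the loadgen SUT/QSL callbacks. The model path, input shape and random sample data are placeholders, and the ConstructSUT/ConstructQSL signatures may differ across loadgen versions:

```python
# Sketch: run an arbitrary ONNX model under MLPerf loadgen (Offline, performance-only).
import numpy as np
import onnxruntime as ort
import mlperf_loadgen as lg

sess = ort.InferenceSession("model.onnx")          # placeholder model path
input_name = sess.get_inputs()[0].name
samples = {}                                       # sample_index -> preprocessed input

def load_samples(indices):
    for i in indices:
        samples[i] = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder data

def unload_samples(indices):
    for i in indices:
        samples.pop(i, None)

def issue_queries(query_samples):
    responses = []
    for qs in query_samples:
        sess.run(None, {input_name: samples[qs.index]})
        responses.append(lg.QuerySampleResponse(qs.id, 0, 0))
    lg.QuerySamplesComplete(responses)

def flush_queries():
    pass

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline
settings.mode = lg.TestMode.PerformanceOnly

sut = lg.ConstructSUT(issue_queries, flush_queries)
qsl = lg.ConstructQSL(1024, 256, load_samples, unload_samples)
lg.StartTest(sut, qsl, settings)
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```
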
- Improve CM docs (basic CM message and tutorials/notes for "users" and "developers")
- Update/improve a list of all reusable, portable and tech-agnostic CM-MLOps scripts
- Start adding FAQ/notes from Discord/GitHub discussions about CM-MLPerf
- Prototype/reuse the above universal CM workflow with ABTF for:
  - Inference
    - Support different targets (host, remote embedded, Android)
    - Get all info about the target
    - Add Python and C++ code for loadgen with different backends (PyTorch, ONNX, TF, TFLite, QAIC)
    - Add object detection with COCO and the trained model from Rod (without accuracy checking for now)
    - Connect with the training CM workflow
  - Training (https://github.com/mlcommons/abtf-ssd-pytorch)
    - Present CM-MLPerf at the Croissant TF and discuss possible collaboration (doc)
    - Add a CM script to get Croissant
    - Add datasets via Croissant (see the mlcroissant sketch after this list)
    - Train and save the model in the CM cache to be loaded for inference
    - Test with Rod
    - Present prototype progress at the next ABTF meeting (Grigori)
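
A hedged sketch of fetching a dataset via its Croissant metadata, as referenced in the training items above, assuming the mlcroissant package. The JSON-LD URL and record-set name are placeholders, not an actual ABTF/COCO Croissant descriptor:

```python
# Sketch: pull a dataset through Croissant metadata with the mlcroissant package.
import mlcroissant as mlc

ds = mlc.Dataset(jsonld="https://example.org/dataset/croissant.json")  # placeholder URL
for i, record in enumerate(ds.records(record_set="default")):          # placeholder record set
    if i >= 3:
        break
    print(record)  # e.g. image bytes / annotations, depending on the record set
```
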
- Unify experiment and visualization:
  - Prepare high-level meta to run the whole experiment
  - Aggregate and visualize results
  - If the MLPerf run is very short, calibrate it, for example by multiplying N by 10, similar to what I did in CK
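
A hedged sketch of that calibration step; the minimum-duration threshold and the 10x multiplier are illustrative assumptions mirroring the approach described above:

```python
# Sketch: if a benchmark run is too short to measure reliably, grow the query
# count (e.g. by 10x) until the run exceeds a minimum duration.
MIN_DURATION_S = 10.0   # assumed minimum useful run duration
MULTIPLIER = 10         # assumed scaling factor per calibration step

def calibrate_query_count(run_benchmark, n_queries):
    """run_benchmark(n) -> elapsed seconds; returns a query count long enough to measure."""
    elapsed = run_benchmark(n_queries)
    while elapsed < MIN_DURATION_S:
        n_queries *= MULTIPLIER
        elapsed = run_benchmark(n_queries)
    return n_queries
```
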