Summary
Implement a framework-independent DP model format.
Detailed Description
Background
Currently, the DP model file depends on the deep learning framework. The TensorFlow model is in ProtoBuf format (.pb), while the PyTorch model, which is still under development, is in .pt format. The two formats are hard to convert into each other. The ONNX package aims to do this at the OP level, but it is limited: both TensorFlow and PyTorch have many OPs that it does not support, and DP models may use customized OPs.
The DeePMD-kit needs a framework-independent DP model format in order to support multiple backends, as described below. Different frameworks are expected to behave similarly for the same model data.
Data structure
- The model data is based on the current input parameters, ensuring alignment across frameworks. Parameters that a framework has not implemented should also be aligned, and that framework raises a NotImplementedError at runtime.
- Add a @variables key to each layer's dictionary, with a type of dict[str, np.ndarray], to store the network parameters that the current init_frz_model needs to restore (which currently ensures complete restoration). The name "@variables" contains the special character @; it is reserved and should not be used for anything else in the future. The keys of @variables should be aligned across all frameworks. Type embedding should be written explicitly and not hidden.
{
    "argument1": ...,
    "@variables": {
        "variable1": ...,
    }
}
- Add the following meta-information at the top level: (1) Software, version, and module used to generate the model file. (2) Generation time. (3) A unified model definition version for all frameworks.
{
    "model": ...,
    "software": ...,
    "software_version": ...,
    "time": ...,
    "model_version": ...,
}
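As an illustration of the layout described above, the following minimal sketch builds one layer dictionary with an @variables entry and wraps it in the top-level meta-information. The helper names (make_layer_data, make_model_data, check_arguments), the argument names (neuron, activation_function), the variable names (w, b), and all values are assumptions made for illustration only. Plain nested lists stand in for np.ndarray so that the example runs without third-party packages.

```python
from datetime import datetime, timezone

# Reserved key that holds the network parameters of a layer.
VARIABLES_KEY = "@variables"


def make_layer_data(arguments, variables):
    """Return a layer dictionary: input arguments plus an "@variables" entry."""
    if VARIABLES_KEY in arguments:
        raise ValueError(f"{VARIABLES_KEY!r} is a reserved name")
    data = dict(arguments)
    data[VARIABLES_KEY] = dict(variables)
    return data


def make_model_data(model, software, software_version, model_version):
    """Wrap the model dictionary in the top-level meta-information."""
    return {
        "model": model,
        "software": software,
        "software_version": software_version,
        "time": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
    }


def check_arguments(arguments, implemented):
    """Raise NotImplementedError for an aligned parameter a backend lacks."""
    for name in arguments:
        if name not in implemented:
            raise NotImplementedError(f"parameter {name!r} is not implemented")


# Hypothetical layer: argument and variable names are placeholders.
layer = make_layer_data(
    {"neuron": 2, "activation_function": "tanh"},
    {"w": [[0.1, 0.2]], "b": [0.0, 0.0]},  # lists stand in for np.ndarray
)
# Placeholder software name and version strings, not real values.
model_data = make_model_data(layer, "example-software", "0.0.0", "1.0")
```

A backend that only implements some of the aligned parameters could call check_arguments with the set of names it supports before building its network.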
Data storage
An HDF5 file is used to store the data. h5py is a dependency of TensorFlow, PyTorch, and the existing DeePMD-kit, so this does not bring extra dependencies.
- All variables are stored in the HDF5 file, each under a unique path. The json path is reserved and should not be used for variables.
- The JSON data is stored at the json path, where the type of @variables is dict[str, str]. Each value of the @variables dict is the path to the corresponding variable, which may differ among platforms.
- Convert dict[str, np.ndarray] to dict[str, str] when saving the model, and convert it back when restoring the model.
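The conversion between the two forms of @variables can be sketched as follows. This is a minimal sketch under stated assumptions, not the DeePMD-kit implementation: a plain dict stands in for the HDF5 file, lists stand in for np.ndarray, and the function names and the "variables" root prefix (chosen here only to keep variable paths away from the reserved json path) are hypothetical.

```python
import json

VARIABLES_KEY = "@variables"
JSON_PATH = "json"  # reserved path that holds the JSON text


def save_variables(node, store, path="variables"):
    """Copy ``node``, writing each variable to ``store`` under a unique
    path and replacing it with that path in the returned copy."""
    if isinstance(node, list):
        return [save_variables(v, store, f"{path}/{i}") for i, v in enumerate(node)]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        child = f"{path}/{key}"
        if key == VARIABLES_KEY:
            out[key] = {}
            for name, array in value.items():
                store[f"{child}/{name}"] = array
                out[key][name] = f"{child}/{name}"
        else:
            out[key] = save_variables(value, store, child)
    return out


def load_variables(node, store):
    """Inverse of save_variables: replace each stored path by its value."""
    if isinstance(node, list):
        return [load_variables(v, store) for v in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == VARIABLES_KEY:
            out[key] = {name: store[p] for name, p in value.items()}
        else:
            out[key] = load_variables(value, store)
    return out


def save_model(model_data, store):
    """Write the variables, then the JSON text, into ``store``."""
    store[JSON_PATH] = json.dumps(save_variables(model_data, store))


def load_model(store):
    """Read the JSON text and restore the variables from ``store``."""
    return load_variables(json.loads(store[JSON_PATH]), store)


# Hypothetical model data; key names are placeholders.
store = {}
model = {
    "model": {"layers": [{"neuron": [2], "@variables": {"w": [[0.1, 0.2]]}}]},
    "model_version": "1.0",
}
save_model(model, store)
restored = load_model(store)
```

With h5py, an open file object would presumably take the place of the dict, with values read back as arrays; that variant is not exercised here.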
Binding with class
Add deserialize (a classmethod) and serialize to each class. The parent class should call the method of the subclass. The implementation should follow dpdispatcher:
https://github.com/deepmodeling/dpdispatcher/blob/065731a60be3b58979b54f1d33562ef189800158/dpdispatcher/submission.py#L97-L166
The deserialize (classmethod) and serialize of the top-level class can be called by external modules.
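One possible shape of this binding is sketched below. It does not reproduce the referenced dpdispatcher code; the class names, the "type" key used for dispatch, the registry, and the rcut argument are all assumptions introduced for illustration.

```python
class Descriptor:
    """Hypothetical parent class; subclasses register under a type name."""

    _registry = {}

    def __init_subclass__(cls, type_name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if type_name is not None:
            cls.type_name = type_name
            Descriptor._registry[type_name] = cls

    def serialize(self):
        """Return the model data of this object as a dictionary."""
        raise NotImplementedError

    @classmethod
    def deserialize(cls, data):
        """Rebuild an object from model data."""
        if cls is Descriptor:
            # The parent class calls the method of the subclass named in the data.
            return Descriptor._registry[data["type"]].deserialize(data)
        raise NotImplementedError


class ExampleDescriptor(Descriptor, type_name="example"):
    """Hypothetical subclass with one argument and its variables."""

    def __init__(self, rcut, variables):
        self.rcut = rcut
        self.variables = variables

    def serialize(self):
        return {
            "type": self.type_name,
            "rcut": self.rcut,
            "@variables": dict(self.variables),
        }

    @classmethod
    def deserialize(cls, data):
        return cls(rcut=data["rcut"], variables=data["@variables"])


# External code only needs the methods of the parent class.
original = ExampleDescriptor(6.0, {"w": [1.0, 2.0]})  # lists stand in for np.ndarray
data = original.serialize()
rebuilt = Descriptor.deserialize(data)
```

Keeping the dispatch in the parent class means an external module can restore any registered subclass without importing it by name.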
Progress
- Model: support DP native model format #2987
  - DOSModel
  - EnergyModel: support DP native model format #2987
  - FrozenModel
  - LinearModel
  - MultiModel
  - PairwiseDPRc
  - TensorModel
- Descriptor: support DP native model format #2987
  - DescrptHybrid
  - DescrptLocFrame
  - DescrptSeAEbdV2
  - DescrptSeAEbd
  - DescrptSeAEf
  - DescrptSeAMask
  - DescrptSeA: support DP native model format #2987
  - DescrptSeAttenV2
  - DescrptSeAtten
  - DescrptSeR
  - DescrptSeT
- Fitting: support DP native model format #2987
  - DipoleFittingSeA
  - DOSFitting
  - EnerFitting
  - PolarFittingSeA
- TypeEmbedNet
- aparam and fparam placeholders in convert_dp_to_pb
Further Information, Files, and Links
No response