A large error induced by compression for se_e3 descriptor #2250

@njzjz

Discussed in #2182

Originally posted by shihao-code December 15, 2022
When I used a hybrid descriptor of se_e2_a and se_e3, the RMSE of the deep potential was very small (3 meV/atom for energy and 59 meV/Å for atomic forces). After compressing the potential, however, the RMSE increased substantially (16 meV/atom for energy and 64 meV/Å for atomic forces). If I use only the se_e2_a descriptor, keeping all other parameters in the input.json file unchanged, the results are the same before and after compression. If I use only the se_e3 descriptor, compression again induces a large error.
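For readers who want to reproduce the comparison, the two error metrics quoted above (energy RMSE per atom and force RMSE per component) can be computed roughly as follows. This is a minimal sketch: the function names and the toy numbers are hypothetical, and it is not DeePMD-kit's own evaluation code. Running it once on predictions from the original model and once on predictions from the compressed model gives the before/after numbers.

```python
import math


def energy_rmse_per_atom(e_pred, e_ref, natoms):
    """RMSE of per-atom energy errors.

    e_pred, e_ref: total energies per frame (eV); natoms: atom count per frame.
    The result is in eV/atom; multiply by 1000 for meV/atom.
    """
    sq = [((p - r) / n) ** 2 for p, r, n in zip(e_pred, e_ref, natoms)]
    return math.sqrt(sum(sq) / len(sq))


def force_rmse(f_pred, f_ref):
    """RMSE over all force components.

    f_pred, f_ref: flat lists of force components (eV/Å) over all atoms and frames.
    """
    sq = [(p - r) ** 2 for p, r in zip(f_pred, f_ref)]
    return math.sqrt(sum(sq) / len(sq))


# Toy data (made up for illustration, not taken from the issue).
e_ref = [-10.0, -20.0]
e_model = [-10.006, -19.994]
natoms = [2, 2]
print("energy RMSE (meV/atom):", 1000 * energy_rmse_per_atom(e_model, e_ref, natoms))
```

Evaluating both models on the same frames with the same metric separates a genuine compression error from differences in the test set.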

Version of deepmd-kit: 2.1.5_cuda11.6

Command I used: dp compress -i FeH.pb -o FeH-compress.pb --step 0.002

The output of compression:

Loading BaseGPU/2021
  Loading requirement: nvhpc/21.3 cuda/11.2 openmpi/4.0.3cu11.2.v2
WARNING:tensorflow:From /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
DEEPMD INFO    


DEEPMD INFO    stage 1: compress the model
DEEPMD INFO     _____               _____   __  __  _____           _     _  _   
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |  
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_ 
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_ 
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO    Please read and cite:
DEEPMD INFO    Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO    installed to:         /home/conda/feedstock_root/build_artifacts/deepmd-kit_1663923590539/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO    source :              v2.1.5
DEEPMD INFO    source brach:         HEAD
DEEPMD INFO    source commit:        6e3d4a62
DEEPMD INFO    source commit at:     2022-09-23 16:10:28 +0800
DEEPMD INFO    build float prec:     double
DEEPMD INFO    build variant:        cuda
DEEPMD INFO    build with tf inc:    /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/tensorflow/include;/sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/include
DEEPMD INFO    build with tf lib:    
DEEPMD INFO    ---Summary of the training---------------------------------------
DEEPMD INFO    running on:           gpu0501
DEEPMD INFO    computing device:     gpu:0
DEEPMD INFO    CUDA_VISIBLE_DEVICES: 0,1
DEEPMD INFO    Count of visible GPU: 2
DEEPMD INFO    num_intra_threads:    0
DEEPMD INFO    num_inter_threads:    0
DEEPMD INFO    -----------------------------------------------------------------
DEEPMD INFO    training without frame parameter
DEEPMD INFO    training data with lower boundary: [-0.22680075 -0.29381635]
DEEPMD INFO    training data with upper boundary: [30.16753829 41.82551879]
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #157: KMP_AFFINITY: 1 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L3 cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L2 cache" is equivalent to "core".
OMP: Info #287: KMP_AFFINITY: topology layer "L1 cache" is equivalent to "core".
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 1 thread/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0 
OMP: Info #254: KMP_AFFINITY: pid 449229 tid 449422 thread 0 bound to OS proc set 0
OMP: Info #254: KMP_AFFINITY: pid 449229 tid 449421 thread 1 bound to OS proc set 0
DEEPMD INFO    training data with lower boundary: [-1505.35165116 -4165.88651941]
DEEPMD INFO    training data with upper boundary: [1505.35165116 4165.88651941]
DEEPMD INFO    built lr
DEEPMD INFO    built network
DEEPMD INFO    built training
DEEPMD INFO    initialize model from scratch
INFO:tensorflow:/sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.index
DEEPMD INFO    /sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.index
INFO:tensorflow:0
DEEPMD INFO    0
INFO:tensorflow:/sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.data-00000-of-00001
DEEPMD INFO    /sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.data-00000-of-00001
INFO:tensorflow:69300
DEEPMD INFO    69300
INFO:tensorflow:/sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.meta
DEEPMD INFO    /sqfs2/cmc/0/work/G14979/u6b368/bbb0/model-compression/model.ckpt.meta
INFO:tensorflow:1659000
DEEPMD INFO    1659000
DEEPMD INFO    finished compressing
DEEPMD INFO    


DEEPMD INFO    stage 2: freeze the model
INFO:tensorflow:Restoring parameters from model-compression/model.ckpt
DEEPMD INFO    Restoring parameters from model-compression/model.ckpt
DEEPMD INFO    The following nodes will be frozen: ['model_type', 'descrpt_attr/rcut', 'descrpt_attr/ntypes', 'model_attr/tmap', 'model_attr/model_type', 'model_attr/model_version', 'train_attr/min_nbor_dist', 'train_attr/training_script', 'o_energy', 'o_force', 'o_virial', 'o_atom_energy', 'o_atom_virial', 'fitting_attr/dfparam', 'fitting_attr/daparam']
WARNING:tensorflow:From /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:246: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
DEEPMD WARNING From /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:246: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
DEEPMD WARNING From /sqfs/work/G14979/u6b368/bin/deepmd_kit_gpu_2.1.5_cuda11.6/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
DEEPMD INFO    1258 ops in the final graph.

My input.json file:

        "descriptor": {
            "type": "hybrid",
            "list": [
                {
                    "type": "se_e2_a",
                    "sel": "auto",
                    "rcut_smth": 0.5,
                    "activation_function": "tanh",
                    "rcut": 6.5,
                    "neuron": [
                        30,
                        60,
                        120
                    ],
                    "resnet_dt": false,
                    "axis_neuron": 32,
                    "seed": 13290,
                    "_comment": " that's all"
                },
                {
                    "type": "se_e3",
                    "sel": "auto",
                    "rcut_smth": 0.5,
                    "activation_function": "tanh",
                    "rcut": 5.0,
                    "neuron": [
                        5,
                        10,
                        20
                    ],
                    "resnet_dt": false,
                    "seed": 1327,
                    "_comment": " that's all"
                }
            ]
        },
        "fitting_net": {
            "neuron": [
                320,
                320,
                320
            ],
            "resnet_dt": true,
            "seed": 6374,
            "_comment": " that's all"
        },
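The three cases compared in the report (hybrid, se_e2_a only, se_e3 only) differ only in the descriptor block, so the variant inputs can be generated from one file. The sketch below is a hypothetical helper, not part of DeePMD-kit. It assumes the fragment above sits under a top-level "model" key in input.json; the issue shows only the fragment, so adjust the path if your file is laid out differently.

```python
import copy


def descriptor_variants(config):
    """Return a dict of configs: the hybrid one plus one per sub-descriptor.

    Assumes config["model"]["descriptor"] is a hybrid descriptor with a "list"
    of sub-descriptors (an assumption about the file layout).
    """
    variants = {"hybrid": copy.deepcopy(config)}
    for sub in config["model"]["descriptor"]["list"]:
        single = copy.deepcopy(config)
        # Replace the hybrid block with this sub-descriptor alone;
        # every other setting (fitting_net etc.) is left unchanged.
        single["model"]["descriptor"] = copy.deepcopy(sub)
        variants[sub["type"]] = single
    return variants


# Abridged example config using values from the fragment above.
example = {
    "model": {
        "descriptor": {
            "type": "hybrid",
            "list": [
                {"type": "se_e2_a", "rcut": 6.5, "neuron": [30, 60, 120]},
                {"type": "se_e3", "rcut": 5.0, "neuron": [5, 10, 20]},
            ],
        },
        "fitting_net": {"neuron": [320, 320, 320], "resnet_dt": True},
    }
}

for name, cfg in descriptor_variants(example).items():
    print(name, "->", cfg["model"]["descriptor"]["type"])
```

Each variant can be written out with `json.dump` and trained and compressed separately, which isolates the descriptor responsible for the post-compression error.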

Labels: bug; critical (Critical bugs that may break the results without messages); reproduced (This bug has been reproduced by developers)

Status: Done