CPU-based pre-trained model #23

@johnyoonh

I am guessing that the provided model is intended for machines with a CUDA-capable device.
Do you happen to have a CPU version of the pre-trained cnndm_model.bin?

I patched biunilm/decode_seq2seq.py to load the checkpoint on the CPU:

@@ -165,7 +165,7 @@ def main():
     print(args.model_recover_path)
     for model_recover_path in glob.glob(args.model_recover_path.strip()):
         logger.info("***** Recover model: %s *****", model_recover_path)
-        model_recover = torch.load(model_recover_path)
+        model_recover = torch.load(model_recover_path, map_location="cpu")
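(With map_location="cpu", torch.load remaps tensors that were saved from a CUDA device onto the CPU, so loading the checkpoint itself no longer needs a GPU.)

With that patch applied, I ran decoding: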
DATA_DIR=../cnndm_data
MODEL_RECOVER_PATH=../cnndm_model.bin
EVAL_SPLIT=test
export PYTORCH_PRETRAINED_BERT_CACHE=/tmp/bert-cased-pretrained-cache
# run decoding
python biunilm/decode_seq2seq.py --fp16 --amp --bert_model bert-large-cased --new_segment_ids --mode s2s --need_score_traces \
  --input_file ${DATA_DIR}/${EVAL_SPLIT}.src --split ${EVAL_SPLIT} --tokenized_input \
  --model_recover_path ${MODEL_RECOVER_PATH} \
  --max_seq_length 768 --max_tgt_length 128 \
  --batch_size 64 --beam_size 5 --length_penalty 0 \
  --forbid_duplicate_ngrams --forbid_ignore_word ".|[X_SEP]"
11/04/2019 15:55:06 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt from cache at /tmp/bert-cased-pretrained-cache/cee054f6aafe5e2cf816d2228704e326446785f940f5451a5b26033516a4ac3d.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=51 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
  File "biunilm/decode_seq2seq.py", line 254, in <module>
    main()
  File "biunilm/decode_seq2seq.py", line 147, in main
    amp_handle = amp.init(enable_caching=True)
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/amp.py", line 65, in init
    handle = AmpHandle(enable_caching, verbose)
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/handle.py", line 14, in __init__
    self._default_scaler = LossScaler()
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/scaler.py", line 35, in __init__
    self._overflow_buf = torch.cuda.IntTensor([0])
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:51
[1]    72305 exit 1     python biunilm/decode_seq2seq.py --fp16 --amp --bert_model bert-large-cased

Without --amp (apex's loss scaler allocates a torch.cuda.IntTensor at init, so --amp hard-requires CUDA), but still with --fp16, I get:

Traceback (most recent call last):
  File "biunilm/decode_seq2seq.py", line 254, in <module>
    main()
  File "biunilm/decode_seq2seq.py", line 216, in main
    position_ids, input_mask, task_idx=task_idx, mask_qkv=mask_qkv)
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1409, in forward
    return self.beam_search(input_ids, token_type_ids, position_ids, attention_mask, task_idx=task_idx, mask_qkv=mask_qkv)
  File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1528, in beam_search
    output_all_encoded_layers=True, prev_embedding=prev_embedding, prev_encoded_layers=prev_encoded_layers, mask_qkv=mask_qkv)
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1062, in forward
    input_ids, token_type_ids, attention_mask)
  File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1037, in get_extended_attention_mask
    extended_attention_mask = (1.0 - extended_attention_mask) * -10000.0
  File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/tensor.py", line 371, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: "add_cpu" not implemented for 'Half'

Packages:

pytorch-pretrained-bert  0.4.0
torch                    1.1.0
tensorboardX             1.9
apex                     0.1
