name 'M' is not defined: M = Model(D_train.get_metadata()) seems to encounter an error #32

@no7dw

When I run the unit test in Docker (CPU version), it reports an error:

root@85a655cc87d1:/app/codalab# python run_local_test.py
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Begin running local test using
2020-05-07 06:36:42 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-05-07 06:36:42 INFO run_local_test.py: dataset_dir = miniciao
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_sample_result_submission
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_scoring_output
python /app/codalab/AutoDL_ingestion_program/ingestion.py --dataset_dir=/app/codalab/AutoDL_sample_data/miniciao --code_dir=/app/codalab/AutoDL_sample_code_submission --time_budget=1200.0
python /app/codalab/AutoDL_scoring_program/score.py --solution_dir=/app/codalab/AutoDL_sample_data/miniciao
2020-05-07 06:36:43,653 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: ******** Processing dataset Miniciao ********
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/tensor_array_ops.py:162: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-05-07 06:36:44,928 INFO ingestion.py: Creating model...this process should not exceed 20min.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 19, in <lambda>
    threading.Thread(target=lambda: torch.cuda.synchronize()),
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 398, in synchronize
    _lazy_init()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

2020-05-07 06:36:46,014 INFO ingestion.py: Initialization success, time spent so far 1.0854098796844482 sec
2020-05-07 06:36:46,014 ERROR ingestion.py: Failed to initializing model.
2020-05-07 06:36:46,015 ERROR ingestion.py: Encountered exception:
Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 208, in time_limit
    yield
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/app/codalab/AutoDL_sample_code_submission/model.py", line 54, in __init__
    self.domain_model = DomainModel(self.metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 42, in __init__
    super(Model, self).__init__(metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/skeleton/projects/logic.py", line 88, in __init__
    self.build()
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 66, in build
    self.model_9.init(model_dir=model_path, gain=1.0)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/architectures/resnet.py", line 244, in init
    model_dir=self.model_dir)
  File "/usr/local/lib/python3.5/dist-packages/torch/hub.py", line 499, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 576, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 155, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 131, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 115, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
2020-05-07 06:36:46,035 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-05-07 06:36:46,039 INFO ingestion.py: Failed to run ingestion.
2020-05-07 06:36:46,039 ERROR ingestion.py: Encountered exception:
name 'M' is not defined
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 358, in <module>
    if not hasattr(M, attr):
NameError: name 'M' is not defined
2020-05-07 06:36:46,044 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Overall time spent  0.01 sec
2020-05-07 06:36:46,079 INFO ingestion.py: [Ingestion terminated]
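
The final NameError in the log above looks like a follow-on failure rather than the root cause: the RuntimeError raised inside Model(...) is logged, M is never bound, and a later line that touches M then fails. A minimal sketch of that pattern, assuming the first exception is caught and execution continues; FailingModel and the attribute name "train" are hypothetical stand-ins, not the actual ingestion.py code:

```python
class FailingModel:
    """Hypothetical stand-in for a Model whose constructor raises."""

    def __init__(self, metadata):
        raise RuntimeError("Attempting to deserialize object on a CUDA device")


errors = []

# Step 1: the constructor raises, so the assignment to M never happens.
try:
    M = FailingModel({})
except RuntimeError as exc:
    errors.append(exc)  # logged, but the script keeps going

# Step 2: later code assumes M exists and fails with a second, misleading error.
try:
    if not hasattr(M, "train"):
        pass
except NameError as exc:
    errors.append(exc)

for exc in errors:
    print(type(exc).__name__, exc)
```

Read this way, the first recorded error (the CUDA deserialization failure) is the one to fix; the error about M should disappear once the model can be constructed.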

At first I thought it was a network issue while downloading the training data, but I tried running the test through a proxy, and I also downloaded the r9-xxx.pth.tar file myself. I even rebuilt on another machine (with Docker, of course), still without luck.

It's weird that the log reports:

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

even though I'm using the CPU version of the Docker image.
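
The error message itself points at a possible workaround: the checkpoint seems to have been saved from a CUDA device, so on a CPU-only machine it needs map_location when it is loaded. A minimal sketch, assuming the checkpoint is an ordinary torch.save file; the helper names choose_map_location and load_checkpoint are hypothetical and are not part of the sample submission:

```python
def choose_map_location(cuda_available):
    """Return the device string that saved tensors should be remapped to."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint, remapping CUDA-saved storages to the CPU if needed."""
    # Imported lazily so choose_map_location stays usable without PyTorch installed.
    import torch

    location = choose_map_location(torch.cuda.is_available())
    # map_location tells torch.load where to place the deserialized storages.
    return torch.load(path, map_location=location)
```

The traceback shows torch.hub.load_state_dict_from_url forwarding its map_location argument to torch.load, so passing map_location at the call in Auto_Image/architectures/resnet.py may be enough. I have not verified that against the repository.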
