When training a model with cifar10 and resnet50, I ran the following command:
python train_image_classifier.py \
--train_dir=cifar10/train_dir \
--dataset_name=cifar10 \
--dataset_split_name=train \
--dataset_dir=cifar10/data \
--model_name=resnet_v2_50 \
--checkpoint_path=pretrained/resnet_v2_50.ckpt \
--checkpoint_exclude_scopes=resnet_v2_50/logits \
--max_number_of_steps=50 \
--batch_size=16 \
--learning_rate=0.001 \
--log_every_n_steps=100 \
--optimizer=adam
This is the output I get:
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from cifar10/train_dir/model.ckpt-62
I0308 18:30:32.481478 140012097914688 saver.py:1280] Restoring parameters from cifar10/train_dir/model.ckpt-62
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Cannot assign a device for operation resnet_v2_50/Pad: node resnet_v2_50/Pad (defined at /home/belle/xun/slim/nets/resnet_utils.py:122) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]
Errors may have originated from an input operation.
Input Source operations connected to node resnet_v2_50/Pad:
fifo_queue_Dequeue (defined at train_image_classifier.py:488)
I0308 18:30:32.733913 140012097914688 coordinator.py:224] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Cannot assign a device for operation resnet_v2_50/Pad: node resnet_v2_50/Pad (defined at /home/belle/xun/slim/nets/resnet_utils.py:122) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]
Errors may have originated from an input operation.
Input Source operations connected to node resnet_v2_50/Pad:
fifo_queue_Dequeue (defined at train_image_classifier.py:488)
Traceback (most recent call last):
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1339, in _run_fn
self._extend_graph()
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation resnet_v2_50/Pad: {{node resnet_v2_50/Pad}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]
...
E0308 18:30:33.083234 140012097914688 tf_should_use.py:71] ==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'init_ops/report_uninitialized_variables/boolean_mask/GatherV2:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
File "train_image_classifier.py", line 608, in <module>
tf.app.run()
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/belle/.local/lib/python3.6/site-packages/absl/app.py", line 321, in run
raise
File "/home/belle/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "train_image_classifier.py", line 604, in main
sync_optimizer=optimizer if FLAGS.sync_replicas else None)
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 796, in train
should_retry = True
File "/home/belle/.local/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 193, in wrapped
return _add_should_use_warning(fn(*args, **kwargs))
return _add_should_use_warning(fn(*args, **kwargs))
What is going on here? Someone please help!
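The key sentence in the log seems to be the "Cannot assign a device" message: it names the device the op was pinned to and the devices TensorFlow could actually see. As a rough sketch of how to read it, the snippet below pulls both out of the message using only the Python standard library. The helper name `parse_device_error` is hypothetical, and the message string is shortened from the log above (the "defined at ..." part is dropped), so treat this as an illustration, not as part of TensorFlow or the slim scripts.

```python
import re

# Error sentence copied from the log above, shortened for readability
# (the "(defined at ...)" part is omitted; this is an assumption-free trim).
ERROR_TEXT = (
    "Cannot assign a device for operation resnet_v2_50/Pad: "
    "node resnet_v2_50/Pad was explicitly assigned to /device:GPU:0 "
    "but available devices are [ "
    "/job:localhost/replica:0/task:0/device:CPU:0, "
    "/job:localhost/replica:0/task:0/device:XLA_CPU:0, "
    "/job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. "
    "Make sure the device specification refers to a valid device."
)


def parse_device_error(text):
    """Hypothetical helper: return (requested, available) device names."""
    # Device the op was pinned to, e.g. "GPU:0".
    requested = re.search(r"explicitly assigned to /device:(\w+:\d+)", text)
    # Bracketed list of devices that the runtime reported.
    listed = re.search(r"available devices are \[(.*?)\]", text)
    if requested is None or listed is None:
        raise ValueError("text does not look like a device placement error")
    available = re.findall(r"/device:(\w+:\d+)", listed.group(1))
    return requested.group(1), available


requested, available = parse_device_error(ERROR_TEXT)
print("requested:", requested)
print("available:", available)
print("requested device present:", requested in available)
```

Read this way, the graph asks for `GPU:0` while the runtime only lists `CPU:0`, `XLA_CPU:0` and `XLA_GPU:0`, so the question may be less about the checkpoint itself and more about why no plain GPU device is visible to this TensorFlow install.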
