Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED UnknownError (see ab

I get the following error when training MTCNN. Does anyone know how to fix it? Many thanks!
(tensorflow_gpu) E:\TensorFlow\MTCNN-Tensorflow-master\train_models>python train_PNet.py
2020-03-19 08:22:03.806618: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-03-19 08:22:05.011111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:01:00.0
totalMemory: 2.00GiB freeMemory: 1.65GiB
2020-03-19 08:22:05.018715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-03-19 08:22:07.352337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-19 08:22:07.358422: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2020-03-19 08:22:07.361958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2020-03-19 08:22:07.367832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1399 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0)
E:/TensorFlow/MTCNN-Tensorflow-master/prepare_data/DATA/imglists/PNet\train_PNet_landmark.txt
Total size of the dataset is:  1429535
E:/TensorFlow/MTCNN-Tensorflow-master/data/MTCNN_model/PNet_landmark/PNet
dataset dir is: E:/TensorFlow/MTCNN-Tensorflow-master/prepare_data/DATA/imglists/PNet\train_PNet_landmark.tfrecord_shuffle
WARNING:tensorflow:From E:/TensorFlow/MTCNN-Tensorflow-master/prepare_data\read_tfrecord_v2.py:12: string_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(string_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
WARNING:tensorflow:From E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\input.py:276: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
WARNING:tensorflow:From E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From E:/TensorFlow/MTCNN-Tensorflow-master/prepare_data\read_tfrecord_v2.py:14: TFRecordReader.__init__ (from tensorflow.python.ops.io_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.TFRecordDataset`.
WARNING:tensorflow:From E:/TensorFlow/MTCNN-Tensorflow-master/prepare_data\read_tfrecord_v2.py:43: batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.batch(batch_size)` (or `padded_batch(...)` if `dynamic_pad=True`).
(384, 12, 12, 3)
load summary for :  conv1/add
(384, 10, 10, 10)
load summary for :  pool1/MaxPool
(384, 5, 5, 10)
load summary for :  conv2/add
(384, 3, 3, 16)
load summary for :  conv3/add
(384, 1, 1, 32)
load summary for :  conv4_1/Softmax
(384, 1, 1, 2)
load summary for :  conv4_2/BiasAdd
(384, 1, 1, 4)
load summary for :  conv4_3/BiasAdd
(384, 1, 1, 10)
WARNING:tensorflow:From E:\TensorFlow\MTCNN-Tensorflow-master\train_models\mtcnn_model.py:239: get_regularization_losses (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
2020-03-19 08:22:09.749695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-03-19 08:22:09.753009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-19 08:22:09.756651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2020-03-19 08:22:09.758998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2020-03-19 08:22:09.761490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1399 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0)
WARNING:tensorflow:From E:\TensorFlow\MTCNN-Tensorflow-master\train_models\train.py:192: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
2020-03-19 08:22:12.072682: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2020-03-19 08:22:12.077482: E tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
2020-03-19 08:22:12.082171: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2020-03-19 08:22:12.085237: E tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
Traceback (most recent call last):
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[{{node conv1/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adjust_saturation/Identity_1_0_0/_51, conv1/weights/read)]]
         [[{{node Mean_1/_215}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1195_Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_PNet.py", line 49, in <module>
    train_PNet(base_dir, prefix, end_epoch, display, lr)
  File "train_PNet.py", line 35, in train_PNet
    train(net_factory,prefix, end_epoch, base_dir, display=display, base_lr=lr)
  File "E:\TensorFlow\MTCNN-Tensorflow-master\train_models\train.py", line 221, in train
    _,_,summary = sess.run([train_op, lr_op ,summary_op], feed_dict={input_image: image_batch_array, label: label_batch_array, bbox_target: bbox_batch_array,landmark_target:landmark_batch_array})
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "E:\Users\Lenovo\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.err

Reply from a uj5u.com user:

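Try adding the following code to your program, before the training session is created:
    import os
    import tensorflow as tf
    os.environ["CUDA_VISIBLE_DEVICES"] = '0'   # make only the first GPU visible to this process
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.5  # cap this process at 50% of the selected GPU's memory
    config.gpu_options.allow_growth = True      # allocate GPU memory on demand instead of all at once
    sess = tf.Session(config=config)            # the session that runs training must be built with this config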
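
One plausible reading of the log (the thread does not confirm it) is that TensorFlow reserved almost all of the free GPU memory at startup, leaving too little for cuDNN to create its handle. A mismatched CUDA/cuDNN install can produce the same message, so treat this as a rough check only. The sketch below parses the two relevant lines copied from the question and prints how much memory was left over. The helper name `memory_headroom_mib` is made up for illustration, and it assumes the "MB" figure in the log is in the same binary units as the GiB figure.

```python
import re

# Two lines copied verbatim from the log in the question.
LOG = """\
totalMemory: 2.00GiB freeMemory: 1.65GiB
Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1399 MB memory)
"""


def memory_headroom_mib(log_text):
    """Return (free_mib, reserved_mib, headroom_mib) parsed from a TF 1.x startup log.

    Hypothetical helper: assumes the log contains a 'freeMemory: X GiB' entry
    and a 'with N MB memory' entry, as in the question's output.
    """
    free_gib = float(re.search(r"freeMemory:\s*([\d.]+)GiB", log_text).group(1))
    reserved_mib = float(re.search(r"with\s+(\d+)\s*MB memory", log_text).group(1))
    free_mib = free_gib * 1024  # GiB -> MiB
    return free_mib, reserved_mib, free_mib - reserved_mib


free_mib, reserved_mib, headroom_mib = memory_headroom_mib(LOG)
print(
    f"free: {free_mib:.0f} MiB, reserved by TensorFlow: {reserved_mib:.0f} MiB, "
    f"left over: {headroom_mib:.0f} MiB"
)
```

If the leftover figure is small, as it is here, capping `per_process_gpu_memory_fraction` or enabling `allow_growth` as suggested in the reply is a reasonable first thing to try.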

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/68400.html

Tags: machine vision