deepfacelab中文网

Original poster: xin376151654

Ubuntu 18.04: cannot select a GPU during training

Original poster | Posted 2022-2-18 18:17:23
suprea wrote on 2022-2-1 16:17:
The TF, CUDA, and cuDNN versions don't match.

Run it and see what error it reports.

>>> import tensorflow as tf
2022-02-18 18:14:22.979420: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
>>> tf.config.list_physical_devices()
2022-02-18 18:14:28.900155: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-02-18 18:14:28.902044: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-02-18 18:14:29.020662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:1a:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.022386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 1 with properties:
pciBusID: 0000:1b:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.024074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 2 with properties:
pciBusID: 0000:3d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.025775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 3 with properties:
pciBusID: 0000:3e:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.027459: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 4 with properties:
pciBusID: 0000:88:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.029156: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 5 with properties:
pciBusID: 0000:89:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.030838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 6 with properties:
pciBusID: 0000:b1:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.032508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 7 with properties:
pciBusID: 0000:b2:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-02-18 18:14:29.032531: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-02-18 18:14:29.038037: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-02-18 18:14:29.038084: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-02-18 18:14:29.039346: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-02-18 18:14:29.039660: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-02-18 18:14:29.039897: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2022-02-18 18:14:29.041051: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-02-18 18:14:29.041214: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-02-18 18:14:29.041231: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
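
A note on reading this output: all eight GPUs are found, but only the CPU is registered, and the two W-level lines explain why. The following is a minimal sketch for pulling the missing library names out of a saved log like this one. The function name `missing_libraries` is made up for illustration, and the pattern assumes the warning wording shown above.

```python
import re

# Matches the dso_loader warning wording seen in the log above (an assumption
# about the format; other TensorFlow versions may phrase it differently).
_MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in order, without duplicates."""
    names = []
    for name in _MISSING.findall(log_text):
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    # One warning line copied from the log above.
    sample = (
        "2022-02-18 18:14:29.039897: W tensorflow/stream_executor/platform/default/"
        "dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; "
        "dlerror: libcusolver.so.10: cannot open shared object file: "
        "No such file or directory"
    )
    print(missing_libraries(sample))  # ['libcusolver.so.10']
```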
suprea | Posted 2022-2-20 08:03:53
xin376151654 wrote on 2022-2-18 18:17:
>>> import tensorflow as tf
2022-02-18 18:14:22.979420: I tensorflow/stream_executor/platform/defa ...

The library is missing: “Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory”

I'd suggest using Anaconda and either pinning the versions yourself or letting it resolve them for you, for example “conda create -n dflenv -c conda-forge python=3.7.3 tensorflow-gpu=2.6.2”.
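
To check the "library is missing" diagnosis directly, or to sanity-check a freshly built environment, something like the sketch below may help. It asks the dynamic loader for each library named in the log earlier in the thread. The helper names `can_load` and `report` are hypothetical, and the result only reflects what the loader can find from the current Python process, which may differ from what TensorFlow sees if the search paths differ.

```python
import ctypes

# Library names copied from the dso_loader lines in the log earlier in this thread.
LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def can_load(name):
    """Return True if the dynamic loader can open `name` from this process."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def report(names=LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in names}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```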
Original poster | Posted 2022-2-21 15:18:40
suprea wrote on 2022-2-20 08:03:
The library is missing: “Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot ...

I had also searched online for this error: “Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory”.
I copied libcusolver.so.11 to libcusolver.so.10.
It now runs normally.
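
For anyone repeating this workaround, here is a rough sketch of the copy step. The function name `alias_library` is hypothetical, and the thread does not say which directory the library was in, so the directory is left as a parameter. It also assumes the .so.11 build is compatible enough for TensorFlow's purposes, as the result above suggests but does not guarantee.

```python
import shutil
from pathlib import Path


def alias_library(lib_dir, existing="libcusolver.so.11", wanted="libcusolver.so.10"):
    """Copy `existing` to `wanted` inside `lib_dir` unless `wanted` is already there.

    Returns the path of the `wanted` file.
    """
    lib_dir = Path(lib_dir)
    src = lib_dir / existing
    dst = lib_dir / wanted
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found")
    if not dst.exists():
        # copy2 follows symlinks, so dst ends up as a regular file.
        shutil.copy2(src, dst)
    return dst
```

A call might look like `alias_library("/usr/local/cuda/lib64")`, where that directory is only a guess at a typical install location and writing to it would normally need root. A symbolic link would serve the same purpose as a copy.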