deepfacelab中文网

 找回密码
 立即注册(仅限QQ邮箱)
查看: 1024|回复: 7

大家看看这是什么问题呀?

[复制链接]

2

主题

27

帖子

146

积分

高级丹童

Rank: 2

积分
146
 楼主| 发表于 2022-5-15 16:36:33 | 显示全部楼层 |阅读模式
星级打分
  • 1
  • 2
  • 3
  • 4
  • 5
平均分:NAN  参与人数:0  我的评分:未评


==================== Model Summary ====================
==                                                   ==
==            Model name: TestMuskIronMan_SAEHD      ==
==                                                   ==
==     Current iteration: 0                          ==
==                                                   ==
==------------------ Model Options ------------------==
==                                                   ==
==            resolution: 256                        ==
==             face_type: wf                         ==
==     models_opt_on_gpu: True                       ==
==                 archi: liae-ud                    ==
==               ae_dims: 512                        ==
==                e_dims: 128                        ==
==                d_dims: 128                        ==
==           d_mask_dims: 128                        ==
==       masked_training: False                      ==
==       eyes_mouth_prio: False                      ==
==           uniform_yaw: False                      ==
==         blur_out_mask: False                      ==
==             adabelief: True                       ==
==            lr_dropout: n                          ==
==           random_warp: True                       ==
==      random_hsv_power: 0.01                       ==
==       true_face_power: 0.0                        ==
==      face_style_power: 0.0                        ==
==        bg_style_power: 0.0                        ==
==               ct_mode: rct                        ==
==              clipgrad: True                       ==
==              pretrain: False                      ==
==       autobackup_hour: 4                          ==
== write_preview_history: False                      ==
==           target_iter: 0                          ==
==       random_src_flip: False                      ==
==       random_dst_flip: False                      ==
==            batch_size: 16                         ==
==             gan_power: 0.1                        ==
==        gan_patch_size: 32                         ==
==              gan_dims: 32                         ==
==                                                   ==

=======================================================
Starting. Press "Enter" to stop training and save model.

Trying to do the first iteration. If an error occurs, reduce the model parameters.

!!!
Windows 10 users IMPORTANT notice. You should set this setting in order to work correctly.
https://i.imgur.com/B7cmDCB.jpg
!!!
You are training the model from scratch. It is strongly recommended to use a pretrained model to speed up the training and improve the quality.

Error: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node Conv2D_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:101) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_15/concat/_151]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node Conv2D_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:101) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node Conv2D_26:
Pad_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87)
decoder/upscalem2/conv1/weight/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:61)

Input Source operations connected to node Conv2D_26:
Pad_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87)
decoder/upscalem2/conv1/weight/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:61)

Original stack trace for 'Conv2D_26':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 424, in on_initialize
    gpu_pred_src_src, gpu_pred_src_srcm = self.decoder(gpu_src_code)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 243, in forward
    m = self.upscalem2(m)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 71, in forward
    x = self.conv1(x)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 101, in forward
    x = tf.nn.conv2d(x, weight, strides, 'VALID', dilations=dilations, data_format=nn.data_format)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2397, in conv2d
    name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 972, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
    return fn(*args)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node Conv2D_26}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_15/concat/_151]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node Conv2D_26}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 774, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm, target_srcm_em, warped_dst, target_dst, target_dstm, target_dstm_em)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 584, in src_dst_train
    self.target_dstm_em:target_dstm_em,
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    run_metadata_ptr)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
    run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node Conv2D_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:101) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_15/concat/_151]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: OOM when allocating tensor with shape[16,1024,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node Conv2D_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:101) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node Conv2D_26:
Pad_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87)
decoder/upscalem2/conv1/weight/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:61)

Input Source operations connected to node Conv2D_26:
Pad_26 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87)
decoder/upscalem2/conv1/weight/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:61)

Original stack trace for 'Conv2D_26':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 424, in on_initialize
    gpu_pred_src_src, gpu_pred_src_srcm = self.decoder(gpu_src_code)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 243, in forward
    m = self.upscalem2(m)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 71, in forward
    x = self.conv1(x)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 101, in forward
    x = tf.nn.conv2d(x, weight, strides, 'VALID', dilations=dilations, data_format=nn.data_format)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2397, in conv2d
    name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 972, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)




回复

使用道具 举报

2

主题

27

帖子

146

积分

高级丹童

Rank: 2

积分
146
 楼主| 发表于 2022-5-15 16:37:23 | 显示全部楼层
我百度了下,好像是显存问题,但是我这些参数都不大呀,不知道为啥
回复 支持 反对

使用道具 举报

1

主题

111

帖子

1073

积分

初级丹圣

Rank: 8Rank: 8

积分
1073
发表于 2022-5-15 16:43:25 | 显示全部楼层
什么显卡啊,BS竟然开到16
回复 支持 反对

使用道具 举报

2

主题

27

帖子

146

积分

高级丹童

Rank: 2

积分
146
 楼主| 发表于 2022-5-15 16:52:46 | 显示全部楼层
saimon 发表于 2022-5-15 16:43
什么显卡啊,BS竟然开到16

我bs改小点,试试
回复 支持 反对

使用道具 举报

50

主题

1223

帖子

8020

积分

高级丹圣

Rank: 13Rank: 13Rank: 13Rank: 13

积分
8020
发表于 2022-5-15 17:25:27 | 显示全部楼层
==                e_dims: 128                        ==
==                d_dims: 128                        ==
==           d_mask_dims: 128                        ==

这几个参数很奇特啊,你这模型哪里来的呀?

来路不明的模型还是还要浪费时间也妙。
回复 支持 反对

使用道具 举报

11

主题

297

帖子

2426

积分

初级丹圣

Rank: 8Rank: 8

积分
2426
发表于 2022-5-15 17:34:06 | 显示全部楼层
爆显存了,bs开小点啊。
回复 支持 反对

使用道具 举报

14

主题

586

帖子

4395

积分

高级丹圣

Rank: 13Rank: 13Rank: 13Rank: 13

积分
4395
发表于 2022-5-15 17:42:22 | 显示全部楼层
一般操作都是调整bs
回复 支持 反对

使用道具 举报

6

主题

219

帖子

1479

积分

初级丹圣

Rank: 8Rank: 8

积分
1479
发表于 2022-5-15 20:18:34 | 显示全部楼层
BS竟然开到16太高了
回复 支持 反对

使用道具 举报

QQ|Archiver|手机版|deepfacelab中文网 |网站地图

GMT+8, 2024-9-22 01:11 , Processed in 0.125481 second(s), 9 queries , Redis On.

Powered by Discuz! X3.4

Copyright © 2001-2020, Tencent Cloud.

快速回复 返回顶部 返回列表