DeepFaceLab Chinese Forum (deepfacelab中文网)


Please help, everyone: I can't get training to run at all

Original poster | Posted 2023-2-27 21:45:01
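I tried training with the default parameters, but all I get is errors. Could someone please take a look and tell me what is actually causing this?


Thanks in advance for the help.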





Running trainer.

[new] No saved models found. Enter a name of a new model :
new

Model first run.

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU
  [0] : NVIDIA GeForce RTX 3060 Laptop GPU

[0] Which GPU indexes to choose? :
0




[0] Autobackup every N hour ( 0..24 ?:help ) :
0
[n] Write preview history ( y/n ?:help ) :
n
[0] Target iteration :
0
[n] Flip SRC faces randomly ( y/n ?:help ) :
n
[y] Flip DST faces randomly ( y/n ?:help ) :
y
[4] Batch_size ( ?:help ) :
4
[128] Resolution ( 64-640 ?:help ) :
128
[f] Face type ( h/mf/f/wf/head ?:help ) : wf
wf
[liae-ud] AE architecture ( ?:help ) : df-udt
df-udt
[256] AutoEncoder dimensions ( 32-1024 ?:help ) :
256
[64] Encoder dimensions ( 16-256 ?:help ) :
64
[64] Decoder dimensions ( 16-256 ?:help ) :
64
[22] Decoder mask dimensions ( 16-256 ?:help ) :
22
[y] Masked training ( y/n ?:help ) :
y
[n] Eyes and mouth priority ( y/n ?:help ) : y
[n] Uniform yaw distribution of samples ( y/n ?:help ) : y
[n] Blur out mask ( y/n ?:help ) : y
[y] Place models and optimizer on GPU ( y/n ?:help ) : y
[y] Use AdaBelief optimizer? ( y/n ?:help ) : y
[n] Use learning rate dropout ( n/y/cpu ?:help ) : y
y
[y] Enable random warp of samples ( y/n ?:help ) : n
[0.0] Random hue/saturation/light intensity ( 0.0 .. 0.3 ?:help ) :
0.0
[0.0] GAN power ( 0.0 .. 5.0 ?:help ) : 0.1
0.1
[16] GAN patch size ( 3-640 ?:help ) :
16
[16] GAN dimensions ( 4-512 ?:help ) :
16
[0.0] 'True face' power. ( 0.0000 .. 1.0 ?:help ) :
0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) :
0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) :
0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : rct
rct
[n] Enable gradient clipping ( y/n ?:help ) :
n
[n] Enable pretraining mode ( y/n ?:help ) :
n
Initializing models: 100%|###############################################################| 7/7 [00:02<00:00,  3.41it/s]
Loaded 14256 packed faces from C:\DeepFaceLab_NVIDIA_RTX3000_series\workspace\data_src\aligned
Sort by yaw: 100%|##################################################################| 128/128 [00:00<00:00, 462.57it/s]
Loaded 10229 packed faces from C:\DeepFaceLab_NVIDIA_RTX3000_series\workspace\data_dst\aligned
Sort by yaw: 100%|##################################################################| 128/128 [00:00<00:00, 675.91it/s]
======================== Model Summary ========================
==                                                           ==
==            Model name: new_SAEHD                          ==
==                                                           ==
==     Current iteration: 0                                  ==
==                                                           ==
==---------------------- Model Options ----------------------==
==                                                           ==
==            resolution: 128                                ==
==             face_type: wf                                 ==
==     models_opt_on_gpu: True                               ==
==                 archi: df-udt                             ==
==               ae_dims: 256                                ==
==                e_dims: 64                                 ==
==                d_dims: 64                                 ==
==           d_mask_dims: 22                                 ==
==       masked_training: True                               ==
==       eyes_mouth_prio: True                               ==
==           uniform_yaw: True                               ==
==         blur_out_mask: True                               ==
==             adabelief: True                               ==
==            lr_dropout: y                                  ==
==           random_warp: False                              ==
==      random_hsv_power: 0.0                                ==
==       true_face_power: 0.0                                ==
==      face_style_power: 0.0                                ==
==        bg_style_power: 0.0                                ==
==               ct_mode: rct                                ==
==              clipgrad: False                              ==
==              pretrain: False                              ==
==       autobackup_hour: 0                                  ==
== write_preview_history: False                              ==
==           target_iter: 0                                  ==
==       random_src_flip: False                              ==
==       random_dst_flip: True                               ==
==            batch_size: 4                                  ==
==             gan_power: 0.1                                ==
==        gan_patch_size: 16                                 ==
==              gan_dims: 16                                 ==
==                                                           ==
==----------------------- Running On ------------------------==
==                                                           ==
==          Device index: 0                                  ==
==                  Name: NVIDIA GeForce RTX 3060 Laptop GPU ==
==                  VRAM: 3.41GB                             ==
==                                                           ==
===============================================================
Starting. Press "Enter" to stop training and save model.

Trying to do the first iteration. If an error occurs, reduce the model parameters.

!!!
Windows 10 users IMPORTANT notice. You should set this setting in order to work correctly.
https://i.imgur.com/B7cmDCB.jpg
!!!
You are training the model from scratch. It is strongly recommended to use a pretrained model to speed up the training and improve the quality.

Error: OOM when allocating tensor with shape[4,32,65,65] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node conv2d_transpose_1 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:81) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


Errors may have originated from an input operation.
Input Source operations connected to node conv2d_transpose_1:
stack_1 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:74)
D_src/upconvs_1/weight/read (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:43)
concat_4 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\PatchDiscriminator.py:184)

Original stack trace for 'conv2d_transpose_1':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 518, in on_initialize
    gpu_pred_src_src_d2           = self.D_src(gpu_pred_src_src_masked_opt)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\PatchDiscriminator.py", line 183, in forward
    x = tf.nn.leaky_relu( upconv(x), 0.2 )
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py", line 81, in forward
    x = tf.nn.conv2d_transpose(x, weight, output_shape, strides, padding=self.padding, data_format=nn.data_format)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2613, in conv2d_transpose
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2698, in conv2d_transpose_v2
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1291, in conv2d_backprop_input
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
    return fn(*args)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,32,65,65] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node conv2d_transpose_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 774, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm, target_srcm_em, warped_dst, target_dst, target_dstm, target_dstm_em)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 584, in src_dst_train
    self.target_dstm_em:target_dstm_em,
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    run_metadata_ptr)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
    run_metadata)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,32,65,65] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node conv2d_transpose_1 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:81) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


Errors may have originated from an input operation.
Input Source operations connected to node conv2d_transpose_1:
stack_1 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:74)
D_src/upconvs_1/weight/read (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py:43)
concat_4 (defined at C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\PatchDiscriminator.py:184)

Original stack trace for 'conv2d_transpose_1':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 518, in on_initialize
    gpu_pred_src_src_d2           = self.D_src(gpu_pred_src_src_masked_opt)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\models\PatchDiscriminator.py", line 183, in forward
    x = tf.nn.leaky_relu( upconv(x), 0.2 )
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\layers\Conv2DTranspose.py", line 81, in forward
    x = tf.nn.conv2d_transpose(x, weight, output_shape, strides, padding=self.padding, data_format=nn.data_format)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2613, in conv2d_transpose
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2698, in conv2d_transpose_v2
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1291, in conv2d_backprop_input
    name=name)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "C:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)


Posted 2023-2-27 21:58:30
Read through the forum tutorials. An OOM error usually means there isn't enough VRAM.
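A quick arithmetic check on the error line in the log supports this reading. The tensor that failed to allocate is small, about 2 MiB, which suggests that the GPU memory was already almost fully allocated when the request was made, not that this one tensor was too large. The sketch below parses the OOM line and computes the request size; the helper name `oom_request_bytes` is made up for this example and is not part of DeepFaceLab or TensorFlow.

```python
import re

# Bytes per element for dtype names as they appear in TensorFlow OOM messages.
# Only "float" (float32) occurs in the log above.
DTYPE_BYTES = {"float": 4, "double": 8}

OOM_PATTERN = re.compile(
    r"OOM when allocating tensor with shape\[([\d,]+)\] and type (\w+)"
)


def oom_request_bytes(log_line):
    """Return the size in bytes of the allocation named in a TensorFlow OOM line."""
    match = OOM_PATTERN.search(log_line)
    if match is None:
        raise ValueError("no OOM message found in line")
    count = 1
    for dim in match.group(1).split(","):
        count *= int(dim)
    return count * DTYPE_BYTES[match.group(2)]


# Error line copied from the log in the original post.
line = (
    "Error: OOM when allocating tensor with shape[4,32,65,65] and type float "
    "on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc"
)
size = oom_request_bytes(line)
print(f"{size} bytes = {size / 2**20:.2f} MiB")  # 2163200 bytes = 2.06 MiB
```

Compared with the 3.41GB reported in the Model Summary, a 2 MiB request failing means the rest of the model had already consumed nearly everything available.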
Posted 2023-2-28 14:19:40
Device index: 0                                  ==
==                  Name: NVIDIA GeForce RTX 3060 Laptop GPU ==
==                  VRAM: 3.41GB                             ==
==                                                           ==
==========================
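DFL demands a lot of GPU VRAM. VRAM is productivity.

Your 3060 has only 3.41 GB of usable VRAM. 3.4 GB is far too little; even an ordinary model will run out of VRAM.

Even a 3050 8 GB card has up to 7.1 GB of usable VRAM.
[Attached image: 4.jpg]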

Posted 2023-2-28 17:26:51
A laptop 3060? That is a VRAM error. Either switch to a smaller model or get a GPU with more VRAM.
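As a rough sketch of what "a smaller model" could mean here: the stack trace fails inside `PatchDiscriminator.py` on `D_src`, which is the GAN discriminator, and the posted run has `gan_power: 0.1` and `batch_size: 4`. The snippet below takes a subset of the posted options and returns a lower-memory variant. `lower_memory_variant` is a hypothetical helper, not a DeepFaceLab API, and the reduced values are untested guesses. It also assumes that DFL builds `D_src` only when `gan_power` is nonzero.

```python
# Subset of the options shown in the Model Summary of the original post.
POSTED_OPTIONS = {
    "resolution": 128,
    "batch_size": 4,
    "ae_dims": 256,
    "gan_power": 0.1,
}


def lower_memory_variant(options):
    """Return a copy of `options` with two memory-hungry settings reduced.

    Hypothetical helper for illustration; the reductions are guesses,
    not values verified on this GPU.
    """
    reduced = dict(options)
    # The trace fails inside PatchDiscriminator (D_src), the GAN discriminator.
    # Assumption: DFL builds D_src only when gan_power is nonzero, so setting
    # it to 0.0 would remove that network from the graph.
    reduced["gan_power"] = 0.0
    # Halve the batch size, but never go below 1.
    reduced["batch_size"] = max(1, options["batch_size"] // 2)
    return reduced


reduced = lower_memory_variant(POSTED_OPTIONS)
for key in POSTED_OPTIONS:
    if POSTED_OPTIONS[key] != reduced[key]:
        print(f"{key}: {POSTED_OPTIONS[key]} -> {reduced[key]}")
```

These values would be entered at the trainer's prompts on the next run. If the first iteration still fails, lowering `ae_dims` or `resolution` on a fresh model would be the next things to try.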
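Posted 2023-3-1 11:15:20
Try raising the virtual memory (page file) size, then answer n to "Place models and optimizer on GPU" so that the model is kept in system RAM. It will be much slower, but it should run.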
Posted 2023-3-5 11:30:11
Try the ice build of DFL pinned at the top of the software download section; that should work.