deepfacelab中文网


Total newbie's first post — help needed!!

OP | Posted 2023-2-23 12:04:43
I'm a complete beginner. Laptop RTX 3050 with 4 GB VRAM. Quick96 works fine and runs fairly fast, but I have a few questions for the experts:

1. With Quick96, is it true that the face never becomes sharp no matter how many hundred thousand iterations I run? It always looks slightly blurred.

2. I can't train with SAEHD at all — what's wrong? Here's the output:

======================== Model Summary ========================
==                                                           ==
==            Model name: hhh_SAEHD                          ==
==                                                           ==
==     Current iteration: 0                                  ==
==                                                           ==
==---------------------- Model Options ----------------------==
==                                                           ==
==            resolution: 128                                ==
==             face_type: f                                  ==
==     models_opt_on_gpu: True                               ==
==                 archi: liae-ud                            ==
==               ae_dims: 256                                ==
==                e_dims: 64                                 ==
==                d_dims: 64                                 ==
==           d_mask_dims: 22                                 ==
==       masked_training: True                               ==
==       eyes_mouth_prio: True                               ==
==           uniform_yaw: False                              ==
==         blur_out_mask: True                               ==
==             adabelief: True                               ==
==            lr_dropout: n                                  ==
==           random_warp: True                               ==
==      random_hsv_power: 0.0                                ==
==       true_face_power: 0.0                                ==
==      face_style_power: 0.0                                ==
==        bg_style_power: 0.0                                ==
==               ct_mode: none                               ==
==              clipgrad: False                              ==
==              pretrain: False                              ==
==       autobackup_hour: 1                                  ==
== write_preview_history: False                              ==
==           target_iter: 0                                  ==
==       random_src_flip: False                              ==
==       random_dst_flip: True                               ==
==            batch_size: 4                                  ==
==             gan_power: 0.0                                ==
==        gan_patch_size: 16                                 ==
==              gan_dims: 16                                 ==
==                                                           ==
==----------------------- Running On ------------------------==
==                                                           ==
==          Device index: 0                                  ==
==                  Name: NVIDIA GeForce RTX 3050 Laptop GPU ==
==                  VRAM: 1.63GB                             ==
==                                                           ==
===============================================================
Starting. Press "Enter" to stop training and save model.

Trying to do the first iteration. If an error occurs, reduce the model parameters.

!!!
Windows 10 users IMPORTANT notice. You should set this setting in order to work correctly.
https://i.imgur.com/B7cmDCB.jpg
!!!
You are training the model from scratch. It is strongly recommended to use a pretrained model to speed up the training and improve the quality.




Error: 2 root error(s) found.
  (0) Resource exhausted: failed to allocate memory
         [[node mul_229 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:63) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_8/concat/_123]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: failed to allocate memory
         [[node mul_229 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:63) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node mul_229:
src_dst_opt/ms_decoder/upscalem0/conv1/weight_0/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37)

Input Source operations connected to node mul_229:
src_dst_opt/ms_decoder/upscalem0/conv1/weight_0/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37)

Original stack trace for 'mul_229':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 564, in on_initialize
    src_dst_loss_gv_op = self.src_dst_opt.get_update_op (nn.average_gv_list (gpu_G_loss_gvs))
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 63, in get_update_op
    m_t = self.beta_1*ms + (1.0-self.beta_1) * g
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1076, in _run_op
    return tensor_oper(a.value(), *args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1400, in r_binary_op_wrapper
    return func(x, y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1710, in _mul_dispatch
    return multiply(x, y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 530, in multiply
    return gen_math_ops.mul(x, y, name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6245, in mul
    "Mul", x=x, y=y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
    return fn(*args)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: failed to allocate memory
         [[{{node mul_229}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_8/concat/_123]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: failed to allocate memory
         [[{{node mul_229}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 774, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm, target_srcm_em, warped_dst, target_dst, target_dstm, target_dstm_em)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 584, in src_dst_train
    self.target_dstm_em:target_dstm_em,
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    run_metadata_ptr)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
    run_metadata)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: failed to allocate memory
         [[node mul_229 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:63) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

         [[concat_8/concat/_123]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) Resource exhausted: failed to allocate memory
         [[node mul_229 (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:63) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node mul_229:
src_dst_opt/ms_decoder/upscalem0/conv1/weight_0/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37)

Input Source operations connected to node mul_229:
src_dst_opt/ms_decoder/upscalem0/conv1/weight_0/read (defined at D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37)

Original stack trace for 'mul_229':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 564, in on_initialize
    src_dst_loss_gv_op = self.src_dst_opt.get_update_op (nn.average_gv_list (gpu_G_loss_gvs))
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 63, in get_update_op
    m_t = self.beta_1*ms + (1.0-self.beta_1) * g
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1076, in _run_op
    return tensor_oper(a.value(), *args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1400, in r_binary_op_wrapper
    return func(x, y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1710, in _mul_dispatch
    return multiply(x, y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\math_ops.py", line 530, in multiply
    return gen_math_ops.mul(x, y, name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6245, in mul
    "Mul", x=x, y=y, name=name)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\DeepFaceLab_NVIDIA_RTX3000_series\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

[Attachment: sa.jpg]

Many thanks!
OP | Posted 2023-2-24 07:54:32
Last edited by softglow on 2023-2-24 09:04

Help, anyone — is this a case of running out of VRAM?
Posted 2023-2-25 13:50:58
Any OOM error means you've run out of VRAM. Quick96's default resolution is quite low, which is why it can run at all.
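
For reference, a trimmed-down SAEHD configuration along these lines will often fit in 4 GB of VRAM — the exact values are illustrative, not guaranteed, and what fits depends on your source material and driver overhead:

    resolution: 96
    ae_dims: 128
    e_dims: 48
    d_dims: 48
    d_mask_dims: 16
    models_opt_on_gpu: False   (keeps optimizer state in system RAM instead of VRAM)
    adabelief: False           (AdaBelief keeps extra optimizer state per weight)
    batch_size: 2

Lower resolution and dims have the biggest effect; batch size can be raised again once training starts without errors.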
Posted 2023-2-25 13:54:21
With Quick96, more iterations is not always better — it depends on your material. Sometimes the model actually gets worse the longer it trains, and the parameters matter too. Quick96 is fine for casual training, but for a serious, polished result it won't do, because you can't adjust its parameters. Your problem, though, is probably just a hardware limitation...
OP | Posted 2023-2-25 19:17:11
Thanks a lot — it was indeed the GPU falling short. After changing models_opt_on_gpu to "no" it runs. Many thanks for everyone's help!
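
For anyone landing here later: that option is one of the interactive prompts shown when you launch the SAEHD trainer. Answering it roughly like this (exact prompt wording may differ between DFL builds) moves the model weights and optimizer state into system RAM, trading speed for VRAM headroom:

    [y] Place models and optimizer on GPU ( y/n ?:help ) : n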
OP | Posted 2023-2-25 19:23:30
Quick96 really isn't good enough. My source footage is all 1920×1080 close-up faces; after 200,000 iterations the motion keeps up fine, but the face is blurry and the mask edges are harsh. It's only good for a first taste.
Posted 2023-3-3 10:38:23
Last edited by come3002 on 2023-3-3 10:43

OP: in DFL, VRAM is productivity. On your laptop the 3050 only has about 1 GB of VRAM actually available, which is far too little — normally you'd want around 3 GB.
I suggest reading the tutorials on raising VRAM utilization; get it to at least 2.7 GB.
Another strategy is to enable virtual memory: put the page file on the same drive as DFL, with a minimum size of 70 GB and a maximum of 100 GB (see the sketch below).

My laptop only has a 1650, weaker than yours, with 3.2 GB usable and a 70-100 GB page file. It can run Cat's 224 model and even 256 —
far better results than 128.
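
A minimal sketch of the page-file setup described above, assuming Windows 10 and DFL installed on drive D: (the drive and sizes are just this thread's suggestion):

    1. System Properties -> Advanced -> Performance "Settings..." -> Advanced -> Virtual memory "Change..."
    2. Untick "Automatically manage paging file size for all drives"
    3. Select the DFL drive (D: here) and choose "Custom size"
    4. Initial size: 71680 MB (70 GB), Maximum size: 102400 MB (100 GB)
    5. Click Set, then OK, and reboot

This doesn't add real VRAM; it gives the CPU-side model/optimizer data somewhere to spill when system RAM runs short.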
[Attachments: 1650 显存.jpg, 4.jpg, 正方形 对比明显.jpg — VRAM screenshot and side-by-side comparison]
