deepfacelab中文网

Original poster: Hedwig

LIAE-UDT_288_WF 160W (1.6M) female-only pretrained model

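Posted 2022-3-3 08:36:08
Hedwig wrote on 2022-3-3 08:31:
OOM means you are out of VRAM; the batch size is set too high. Try 4 or 2. If that still fails, turn off GPU optimization. This model's parameters are already fairly large ...

Thanks a lot! It looks like the settings are too high for my graphics card to cope with. I'll try again. It can't be helped; the RX6600 was the only card at a reasonable price. I'll move up to a 4080 when I get the chance.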


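Original poster | Posted 2022-3-3 08:33:13
小鲤鱼 wrote on 2022-3-2 22:29:
The full error message finally came up. Could an expert help interpret it?
Initializing model:  80%|############################################## ...

The error shown is OOM: there is not enough RAM or VRAM. Try lowering the batch size to 4 or 2; if that still fails, turn off GPU optimization.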


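Original poster | Posted 2022-3-3 08:31:06
WinKK wrote on 2022-3-2 20:34:
I did as you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly? ...

OOM means you are out of VRAM; the batch size is set too high. Try 4 or 2. If that still fails, turn off GPU optimization. This model's parameters are already fairly large.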
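The advice above amounts to stepping the batch size down until model initialization stops failing. As a rough illustration only: DeepFaceLab asks for the batch size interactively, and nothing below is part of it. The names `find_working_batch_size`, `try_run` and `fake_trainer` are hypothetical, and Python's built-in `MemoryError` stands in for TensorFlow's `ResourceExhaustedError`.

```python
def find_working_batch_size(try_run, candidates=(8, 4, 2)):
    """Return the first batch size for which try_run succeeds, else None."""
    for batch_size in candidates:
        try:
            try_run(batch_size)
            return batch_size
        except MemoryError:
            # This size did not fit; fall through to the next smaller one.
            continue
    return None


def fake_trainer(batch_size, limit=4):
    # Stand-in for a real training start (hypothetical): pretends that any
    # batch size above `limit` runs out of memory.
    if batch_size > limit:
        raise MemoryError(f"OOM at batch size {batch_size}")


print(find_working_batch_size(fake_trainer))
```

If every candidate fails, the function returns `None`, which corresponds to the second half of the advice: change something other than the batch size, such as where the optimizer is placed.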

Posted 2022-3-3 07:09:25
Thanks for sharing.

Posted 2022-3-2 22:29:36
The full error message finally came up. Could an expert help interpret it?
Initializing model:  80%|#########################################################6              | 4/5 [03:20<00:50, 50.22s/it]
Error: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 328, in on_initialize
    self.src_dst_opt.initialize_variables (self.src_dst_trainable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1479, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1220, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 547, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 499, in _true_getter
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 176, in _variable_v1_call
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 155, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2495, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 217, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1395, in __init__
    constraint=constraint)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1547, in _init_from_args
    validate_shape=validate_shape).op
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Traceback (most recent call last):
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 623, in on_initialize
    model.init_weights()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\layers\Saveable.py", line 104, in init_weights
    nn.init_weights(self.get_weights())
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\ops\__init__.py", line 48, in init_weights
    nn.tf_sess.run (ops)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 328, in on_initialize
    self.src_dst_opt.initialize_variables (self.src_dst_trainable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1479, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1220, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 547, in get_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 499, in _true_getter
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 176, in _variable_v1_call
    aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 155, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2495, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 217, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1395, in __init__
    constraint=constraint)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1547, in _init_from_args
    validate_shape=validate_shape).op
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
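
One way to read the log above: the allocation that failed was tiny, which suggests the GPU memory was already full by the time the optimizer variables were being created, rather than this particular tensor being too big. A minimal sketch for pulling the numbers out of the OOM line follows. The regular expression and the `describe_oom` helper are assumptions written for this message format, not part of TensorFlow or DeepFaceLab, and the dtype size table is deliberately incomplete.

```python
import re

# Pattern written against the OOM line quoted in this thread (an assumption,
# not an official format specification).
OOM_PATTERN = re.compile(
    r"OOM when allocating tensor with shape\[(?P<shape>[\d,\s]*)\] "
    r"and type (?P<dtype>\w+) on (?P<device>\S+)"
)

# Bytes per element for a few dtype names; not exhaustive.
DTYPE_BYTES = {"float": 4, "half": 2, "double": 8}


def describe_oom(message):
    """Extract shape, dtype, device and requested bytes from an OOM line."""
    match = OOM_PATTERN.search(message)
    if match is None:
        return None
    dims = [int(d) for d in match.group("shape").split(",") if d.strip()]
    elements = 1
    for d in dims:
        elements *= d
    size = DTYPE_BYTES.get(match.group("dtype"))
    return {
        "shape": dims,
        "dtype": match.group("dtype"),
        "device": match.group("device"),
        # None when the dtype is not in the table above.
        "bytes": elements * size if size is not None else None,
    }


line = ("Error: OOM when allocating tensor with shape[3840] and type float on "
        "/job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc")
info = describe_oom(line)
print(info["bytes"])  # 3840 elements * 4 bytes each = 15360
```

A request of about 15 KB failing on an 8 GB card is consistent with the replies in this thread: the remedy is to lower overall memory use (batch size, optimizer placement), not to look for a problem with this specific tensor.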

Posted 2022-3-2 22:28:35
-UDT_288_WF 160W female-only pre

Posted 2022-3-2 22:26:44
Model initialization fails; it always gets stuck at 80%.
inter_AB has already been deleted.

Could an expert give me some pointers?
Starting the training program.

Choose a model, or enter a name to create a new model.
[r] : rename
[d] : delete

[0] : TT-LIAE - latest
:
0
Loading model named TT-LIAE_SAEHD...

Available devices:

[CPU] : CPU
  [0] : GeForce RTX 2070

[0] Which device to choose? :
0

Press Enter within two seconds to modify the model parameters...
[0] Autobackup every N hour ( 0..24 ?:help ) :
0
[n] Write preview history ( y/n ?:help ) :
n
[1800000] Target iteration :
1800000
[n] Flip SRC faces randomly ( y/n ?:help ) :
n
[y] Flip DST faces randomly ( y/n ?:help ) :
y
[12] Batch_size ( ?:help ) : 8
8
[y] Masked training ( y/n ?:help ) :
y
[y] Eyes and mouth priority ( y/n ?:help ) :
y
[n] Uniform yaw distribution of samples ( y/n ?:help ) :
n
[y] Place models and optimizer on GPU ( y/n ?:help ) :
y
[y] Use AdaBelief optimizer? ( y/n ?:help ) :
y
[n] Use learning rate dropout ( n/y/cpu ?:help ) :
n
[n] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 5.0 ?:help ) :
0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) :
0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) :
0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) :
none
[y] Enable gradient clipping ( y/n ?:help ) :
y
[y] Enable pretraining mode ( y/n ?:help ) : n
Initializing model:  80%|#########################################################6              | 4/5 [00:00<00:00,  3.83it/s]
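
The settings in this post keep both the model and the optimizer on the GPU with AdaBelief enabled. AdaBelief, like Adam, tracks two running estimates per trainable weight, so the optimizer state by itself takes roughly twice the memory of the weights. The sketch below only makes that arithmetic concrete. The parameter count is hypothetical (the real figure for this model is not given in the thread), float32 storage is assumed, and activations, which grow with batch size and resolution, are ignored.

```python
def weights_bytes(num_params, bytes_per_value=4):
    """Bytes used by the weights themselves (float32 assumed)."""
    return num_params * bytes_per_value


def adabelief_state_bytes(num_params, bytes_per_value=4, state_tensors_per_weight=2):
    """Bytes used by the optimizer's extra state: two running estimates per weight."""
    return num_params * bytes_per_value * state_tensors_per_weight


# Hypothetical parameter count, chosen only to make the arithmetic concrete.
num_params = 100_000_000
weights_mib = weights_bytes(num_params) / 2**20
state_mib = adabelief_state_bytes(num_params) / 2**20
print(f"weights: {weights_mib:.0f} MiB, optimizer state: {state_mib:.0f} MiB")
```

This is one plausible reason why answering `n` to "Place models and optimizer on GPU" can help when a lower batch size alone does not: memory of this kind does not shrink with the batch size.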

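Posted 2022-3-2 20:44:03
WinKK wrote on 2022-3-2 20:34:
I did as you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly? ...

I changed the batch size to 8. Are there other parameters I need to reduce as well? I'm a beginner and really don't understand this.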

Posted 2022-3-2 20:34:29
I did as you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly?
屏幕截图 2022-03-02 203035 New 1.png
屏幕截图 2022-03-02 203109 NEW2.png

Posted 2022-3-2 20:18:38
Hedwig wrote on 2022-3-2 18:42:
First delete inter_ab, and for the first training run set pretrain to false; the model was still in pretraining mode when I posted it ...

I see. Thanks for the advice!
