LIAE-UDT_288_WF 160W (1.6 million) female-only pretrained model

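WinKK · posted 2022-3-3 08:36:08

Hedwig posted 2022-3-3 08:31
OOM: not enough VRAM, the batch size is set too high. Try 4 or 2. If that still fails, turn off GPU optimization. This model's parameters are already fairly large ...

Thanks! Looks like the settings are too high for my graphics card to handle. I'll try again. Nothing to be done about it; the RX6600 was the only card at a reasonable price. I'll move up to a 4080 when I get the chance.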

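Hedwig · posted 2022-3-3 08:33:13

小鲤鱼 posted 2022-3-2 22:29
The full error message has come up. Could an expert help interpret it:
Initializing model: 80%|############################################## ...

The error shown is OOM: not enough RAM or VRAM. Try lowering the batch size to 4 or 2; if that still fails, turn off GPU optimization.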
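
The order of fallbacks suggested here can be written down as a small generator. This is only a sketch of the advice, not part of DeepFaceLab: the function name and the halving step are assumptions, and "GPU optimization" is assumed to mean the "Place models and optimizer on GPU" prompt that appears in the console transcript elsewhere in this thread.

```python
def fallback_settings(start_batch_size, min_batch_size=2):
    """Yield (batch_size, models_opt_on_gpu) pairs, least conservative first.

    Hypothetical helper: it only enumerates settings to try by hand.
    """
    batch_size = start_batch_size
    # First keep everything on the GPU and shrink the batch.
    while batch_size >= min_batch_size:
        yield batch_size, True
        batch_size //= 2
    # Last resort: smallest batch with models and optimizer kept off the GPU.
    yield min_batch_size, False


for batch_size, on_gpu in fallback_settings(8):
    print(f"batch_size={batch_size}, models and optimizer on GPU: {on_gpu}")
```

Starting from a batch size of 8, this lists 8, 4 and 2 with everything on the GPU, then 2 with the GPU option switched off.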

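Hedwig · posted 2022-3-3 08:31:06

WinKK posted 2022-3-2 20:34
I did what you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly? ...

OOM: not enough VRAM, the batch size is set too high. Try 4 or 2. If that still fails, turn off GPU optimization. This model's parameters are already fairly large.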
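
To see why a large model can fill 8 GB before training even starts, here is a rough back-of-the-envelope sketch. It assumes float32 values and an optimizer that keeps two state tensors per trainable weight (usual for AdaBelief-style optimizers); the parameter count below is a made-up illustration, not the real size of this model.

```python
def training_state_bytes(param_count, bytes_per_value=4, optimizer_slots=2):
    """Bytes for weights plus per-weight optimizer state (activations excluded)."""
    return param_count * bytes_per_value * (1 + optimizer_slots)


# Hypothetical parameter count, chosen only to illustrate the arithmetic.
HYPOTHETICAL_PARAMS = 500_000_000

total = training_state_bytes(HYPOTHETICAL_PARAMS)
weights_only = training_state_bytes(HYPOTHETICAL_PARAMS, optimizer_slots=0)

print(f"weights + optimizer state: {total / 2**30:.2f} GiB")
print(f"weights only:              {weights_only / 2**30:.2f} GiB")
```

Batch size does not appear in this estimate because it only affects activation memory. That is why lowering it helps up to a point, and why moving the optimizer state off the GPU is the next thing to try.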

tktkl6 · posted 2022-3-3 07:09:25

Thanks for sharing

小鲤鱼 · posted 2022-3-2 22:29:36

The full error message has come up. Could an expert help interpret it:
Initializing model:  80%|#########################################################6             | 4/5 [03:20<00:50, 50.22s/it]
Error: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 328, in on_initialize
self.src_dst_opt.initialize_variables (self.src_dst_trainable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1479, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1220, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 547, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 499, in _true_getter
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __call__
return cls._variable_v1_call(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 176, in _variable_v1_call
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 155, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2495, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 217, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1395, in __init__
constraint=constraint)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1547, in _init_from_args
validate_shape=validate_shape).op
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 223, in assign
validate_shape=validate_shape)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 64, in assign
use_locking=use_locking, name=name)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
return fn(*args)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[{{node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 623, in on_initialize
model.init_weights()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\layers\Saveable.py", line 104, in init_weights
nn.init_weights(self.get_weights())
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\ops\__init__.py", line 48, in init_weights
nn.tf_sess.run (ops)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
run_metadata_ptr)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
run_metadata)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
debug=debug)
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
self.on_initialize()
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 328, in on_initialize
self.src_dst_opt.initialize_variables (self.src_dst_trainable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1479, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1220, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 547, in get_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 499, in _true_getter
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __call__
return cls._variable_v1_call(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 176, in _variable_v1_call
aggregation=aggregation)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 155, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2495, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 217, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1395, in __init__
constraint=constraint)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1547, in _init_from_args
validate_shape=validate_shape).op
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 223, in assign
validate_shape=validate_shape)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 64, in assign
use_locking=use_locking, name=name)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
  File "G:\DFL-1120-030122\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[3840] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[node src_dst_opt/vs_inter_B/upscale1/conv1/bias_0/Assign (defined at G:\DFL-1120-030122\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
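
One detail worth reading out of the log above is how small the failing allocation is. The sketch below parses the OOM line and multiplies out the shape; the regular expression and the dtype-to-bytes table are assumptions made for this illustration, not a TensorFlow API.

```python
import re

# Assumed byte widths for the dtype names printed in these messages.
DTYPE_BYTES = {"half": 2, "float": 4, "double": 8}

OOM_PATTERN = re.compile(
    r"OOM when allocating tensor with shape\[([^\]]*)\] and type (\w+)"
)


def oom_request_bytes(log_line):
    """Return the size in bytes of the allocation that failed, or None."""
    match = OOM_PATTERN.search(log_line)
    if match is None:
        return None
    dims_text, dtype = match.groups()
    if dtype not in DTYPE_BYTES:
        return None
    count = 1
    for dim in dims_text.split(","):
        if dim.strip():
            count *= int(dim)
    return count * DTYPE_BYTES[dtype]


line = ("Error: OOM when allocating tensor with shape[3840] and type float "
        "on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc")
print(oom_request_bytes(line))  # 3840 values * 4 bytes = 15360
```

A request of roughly 15 KB failing suggests the GPU memory was already full by the time the optimizer variables were being created, so the tensor named in the error is not itself the problem.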

大牛666 · posted 2022-3-2 22:28:35

-UDT_288_WF 160W female-only pre

小鲤鱼 · posted 2022-3-2 22:26:44

Model initialization fails and always gets stuck at 80%.
inter_AB has already been deleted.

Any pointers would be appreciated:
Starting the training program.

Choose a model, or enter a name to create a new model.
[r] : rename
[d] : delete

[0] : TT-LIAE - latest
:
0
Loading the model named TT-LIAE_SAEHD...

Available devices:

[CPU] : CPU
[0] : GeForce RTX 2070

[0] Which device to choose? :
0

Press Enter within two seconds to modify the model parameters...
[0] Autobackup every N hour ( 0..24 ?:help ) :
0
[n] Write preview history ( y/n ?:help ) :
n
[1800000] Target iteration :
1800000
[n] Flip SRC faces randomly ( y/n ?:help ) :
n
[y] Flip DST faces randomly ( y/n ?:help ) :
y
[12] Batch_size ( ?:help ) : 8
8
[y] Masked training ( y/n ?:help ) :
y
[y] Eyes and mouth priority ( y/n ?:help ) :
y
[n] Uniform yaw distribution of samples ( y/n ?:help ) :
n
[y] Place models and optimizer on GPU ( y/n ?:help ) :
y
[y] Use AdaBelief optimizer? ( y/n ?:help ) :
y
[n] Use learning rate dropout ( n/y/cpu ?:help ) :
n
[n] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 5.0 ?:help ) :
0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) :
0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) :
0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) :
none
[y] Enable gradient clipping ( y/n ?:help ) :
y
[y] Enable pretraining mode ( y/n ?:help ) : n
Initializing model: 80%|#########################################################6 | 4/5 [00:00<00:00, 3.83it/s]

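WinKK · posted 2022-3-2 20:44:03

WinKK posted 2022-3-2 20:34
I did what you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly? ...

I changed the batch size to 8. Are there other parameters that need to be lowered as well? I'm new to this and really don't understand.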

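WinKK · posted 2022-3-2 20:34:29

I did what you said, but an error still occurs. I've pasted it below.
My graphics card is an RX6600 8G; is there something I haven't set correctly?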

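WinKK · posted 2022-3-2 20:18:38

Hedwig posted 2022-3-2 18:42
First delete inter_ab, and for the first training run set pretrain to false; it was still in pretraining mode when I posted it ...

I see, thanks for the pointer!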
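
Deleting the inter_AB weights can also be scripted. The sketch below assumes the model files sit in a single folder and that the relevant files start with the model name followed by `_inter_AB` (for example `TT-LIAE_SAEHD_inter_AB.npy`). The file naming, the `workspace/model` path and the function itself are assumptions, so it defaults to a dry run that only lists matches.

```python
from pathlib import Path


def remove_inter_ab(model_dir, model_name, dry_run=True):
    """List (and optionally delete) files named <model_name>_inter_AB*.

    Hypothetical helper: check the returned list before passing dry_run=False.
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    matches = sorted(model_dir.glob(f"{model_name}_inter_AB*"))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Dry run: prints what would be removed, deletes nothing.
for path in remove_inter_ab("workspace/model", "TT-LIAE_SAEHD"):
    print("would delete", path)
```

Setting pretraining to false is done separately, by answering `n` at the "Enable pretraining mode" prompt when the trainer starts.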
