DeepFaceLab Chinese Forum (deepfacelab中文网)

Model initialization fails with Error: OOM; it worked yesterday (4 GB VRAM)

Original poster | Posted 2022-11-7 21:45:06
Loading training.....

[new] No saved models found. Enter a name for a new model : 128
128

Model first run.

Choose one or more GPU devices (separated by commas)

[CPU] : CPU
  [0] : NVIDIA GeForce RTX 3050 Ti Laptop GPU

[0] Which GPU devices to choose? : 0
0

[] Session name?
Name of the summary.txt log file to load ( ?:help ) :

[0] Autobackup every N hour
Back up every how many hours? ( 0..24 ?:help ) :
0
[24] Maximum N backups
Maximum number of backups? ( ?:help ) :
24
[n] Write preview history
Save preview history? ( y/n ?:help ) :
n
[4] Number of samples to preview
Number of samples in the preview window? ( 1 - 16 ?:help ) :
4
[n] Use old preview panel?
Use the old preview panel? ( y/n ) :
n
[0] Target iteration
Target iteration for training; 0 means no limit :
0
[n] Retrain high loss samples?
Retrain high-loss samples? ( y/n ?:help ) :
n
[n] Flip SRC faces randomly
Randomly flip SRC faces? ( y/n ?:help ) :
n
[y] Flip DST faces randomly
Randomly flip DST faces? ( y/n ?:help ) :
y
[4] Batch_size
Batch size? (too large a value will cause OOM) ( ?:help ) :
4
[n] Use fp16
Use FP16 half-precision floats? (default N) ( y/n ?:help ) :
n
[8] Max cpu cores to use.
Maximum number of CPU cores to use? (default 8) ( 1 - 256 ?:help ) :
8
[128] Resolution ( 64-640 ?:help ) :
128
[f] Face type ( h/mf/f/wf/head/custom ?:help ) :
f
[liae-ud] AE architecture ( ?:help ) :
liae-ud
[256] AutoEncoder dimensions ( 32-1024 ?:help ) :
256
[64] Encoder dimensions ( 16-256 ?:help ) :
64
[64] Decoder dimensions ( 16-256 ?:help ) :
64
[22] Decoder mask dimensions ( 16-256 ?:help ) :
22
[n] Eyes priority
Prioritize eyes during training? ( y/n ?:help ) :
n
[n] Mouth priority
Prioritize the mouth during training? ( y/n ?:help ) :
n
[n] Uniform yaw distribution of samples
Enable uniform distribution of sample yaw angles? ( y/n ?:help ) :
n
[n] Blur out mask
Train a blurred mask for the mask edge region? ( y/n ?:help ) :
n
[y] Place models and optimizer on GPU
Place the models and optimizer on the GPU? ( y/n ?:help ) :
y
[y] Use AdaBelief optimizer
Use the AdaBelief optimizer? (turn it off if VRAM is insufficient) ( y/n ?:help ) :
y
[n] Use learning rate dropout
Enable learning rate dropout? (recommended for late-stage training) ( n/y/cpu ?:help ) :
n
[SSIM] Loss function
Loss function for image quality assessment (keeping the default is recommended) ( SSIM/MS-SSIM/MS-SSIM+L1 ?:help ) :
SSIM
[5e-05] Learning rate
Learning rate (keeping the default is recommended) ( 0.0 .. 1.0 ?:help ) :
5e-05
[y] Enable random warp of samples
Enable random warp of samples during training? (not recommended in the middle and late stages) ( y/n ?:help ) :
y
[0.0] Random hue/saturation/light intensity
Random hue/saturation/light intensity? ( 0.0 .. 0.3 ?:help ) :
0.0
[n] Enable random downsample of samples
Enable random downsampling of samples ( y/n ?:help ) :
n
[n] Enable random noise added to samples
Enable random noise added to samples ( y/n ?:help ) :
n
[n] Enable random blur of samples
Enable random blur of samples ( y/n ?:help ) :
n
[n] Enable random jpeg compression of samples
Enable random jpeg compression of samples ( y/n ?:help ) :
n
[none] Enable random shadows and highlights of samples
Enable random shadows and highlights of samples ( none/src/dst/all ?:help ) :
none
[0.0] GAN power
GAN (generative adversarial) training power (recommended to start at 0.1 in late-stage training) ( 0.0 .. 10.0 ?:help ) :
0.0
[0.0] Background power
Learn the area outside the mask to help smooth the region near the mask boundary? ( 0.0..1.0 ?:help ) :
0.0
[0.0] Face style power
Learning power for face lighting and color? ( 0.0..100.0 ?:help ) :
0.0
[0.0] Background style power
Learning power for background lighting and color? ( 0.0..100.0 ?:help ) :
0.0
[none] Color transfer for src faceset
Apply color transfer to src? ( none/rct/lct/mkl/idt/sot/fs-aug ?:help ) :
none
[n] Random color
Enable random color? ( y/n ?:help ) :
n
[n] Enable gradient clipping
Use gradient clipping (enable it to prevent model collapse; training becomes slower) ( y/n ?:help ) :
n
[n] Enable pretraining mode
Use pretraining mode? (default N for regular training) ( y/n ?:help ) :
n
Initializing models...:  80%|#######################################################2             | 4/5 [02:11<00:32, 32.90s/it]
when allocating tensor with shape[2048] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/ms_inter_B/upscale1/conv1/bias_0/Assign (defined at D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


Original stack trace for 'src_dst_opt/ms_inter_B/upscale1/conv1/bias_0/Assign':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\mainscripts\Trainer.py", line 110, in trainerThread
    reduce_clutter= kwargs.get('reduce_clutter', False)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\ModelBase.py", line 265, in __init__
    self.on_initialize()
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 444, in on_initialize
    self.src_dst_opt.initialize_variables (self.src_dst_saveable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
    ms = { v.name : tf.get_variable ( f'ms_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
    ms = { v.name : tf.get_variable ( f'ms_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1595, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1338, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 593, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 545, in _true_getter
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 963, in _get_single_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 266, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 227, in _variable_v1_call
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 205, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2642, in default_variable_creator
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 270, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1670, in __init__
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1853, in _init_from_args
    validate_shape=validate_shape).op
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 358, in assign
    validate_shape=validate_shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
    return fn(*args)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2048] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node src_dst_opt/ms_inter_B/upscale1/conv1/bias_0/Assign}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\mainscripts\Trainer.py", line 110, in trainerThread
    reduce_clutter= kwargs.get('reduce_clutter', False)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\ModelBase.py", line 265, in __init__
    self.on_initialize()
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 861, in on_initialize
    model.init_weights()
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\layers\Saveable.py", line 106, in init_weights
    nn.init_weights(self.get_weights())
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\ops\__init__.py", line 48, in init_weights
    nn.tf_sess.run (ops)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    run_metadata_ptr)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
    run_metadata)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2048] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node src_dst_opt/ms_inter_B/upscale1/conv1/bias_0/Assign (defined at D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:37) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


Original stack trace for 'src_dst_opt/ms_inter_B/upscale1/conv1/bias_0/Assign':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\mainscripts\Trainer.py", line 110, in trainerThread
    reduce_clutter= kwargs.get('reduce_clutter', False)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\ModelBase.py", line 265, in __init__
    self.on_initialize()
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 444, in on_initialize
    self.src_dst_opt.initialize_variables (self.src_dst_saveable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in initialize_variables
    ms = { v.name : tf.get_variable ( f'ms_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 37, in <dictcomp>
    ms = { v.name : tf.get_variable ( f'ms_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant(0.0), trainable=False) for v in trainable_weights }
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1595, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1338, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 593, in get_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 545, in _true_getter
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 963, in _get_single_variable
    aggregation=aggregation)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 266, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 227, in _variable_v1_call
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 205, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2642, in default_variable_creator
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 270, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1670, in __init__
    shape=shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1853, in _init_from_args
    validate_shape=validate_shape).op
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\state_ops.py", line 358, in assign
    validate_shape=validate_shape)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "D:\faceAI\RTX3000-4090_2022_10_14\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

[Attached screenshots: explorer_svGLpHlBMG.png, Taskmgr_L6AZmbZVGV.png]
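Original poster | Posted 2022-11-7 21:48:08
Help! I spent half a month failing to get faceswap installed, switched to DeepFaceLab, and now it throws an error that is beyond my understanding.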
Administrator | Posted 2022-11-7 23:31:19
This is just the very common case of insufficient VRAM.
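
For a sense of scale, the allocation that failed in the log above is tiny: a tensor of shape [2048] and type float. A minimal Python sketch of the arithmetic, assuming "float" means 4-byte float32 (the usual meaning in TensorFlow messages); the helper name `tensor_bytes` is made up for illustration:

```python
from math import prod

# Assumption: "type float" in the TensorFlow message is float32 (4 bytes per element).
BYTES_PER_FLOAT32 = 4

def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the size in bytes of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_element

failed = tensor_bytes([2048])  # shape taken from the OOM message in the log
print(failed, "bytes =", failed / 1024, "KiB")
```

A request of about 8 KiB failing suggests the GPU memory pool was already full by the time the optimizer variables were being created, so the remedy is likely to lower overall usage rather than to look at this one tensor.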
Posted 2022-11-7 23:41:27
Every setting changes VRAM usage.
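
To illustrate how one setting can shift memory use: the stack trace in the first post shows the optimizer creating an extra `ms_...` variable with the same shape and dtype as each trainable weight. A rough sketch of that effect follows. It assumes float32 weights; the parameter count and the number of state tensors per weight are hypothetical inputs, not measurements of this model, and activations (which grow with batch size and resolution) are not counted.

```python
def weights_and_state_mib(num_params, state_tensors_per_weight, bytes_per_element=4):
    """Estimate MiB used by weights plus same-shaped optimizer state tensors."""
    copies = 1 + state_tensors_per_weight  # the weights themselves + optimizer state
    return num_params * bytes_per_element * copies / (1024 ** 2)

# Hypothetical model with 100 million trainable parameters.
params = 100_000_000
for slots in (0, 1, 2):
    mib = weights_and_state_mib(params, slots)
    print(f"{slots} state tensor(s) per weight: {mib:.0f} MiB")
```

Each extra state tensor per weight adds another full copy of the weights, which fits the prompt text in the log advising to turn AdaBelief off when VRAM is short.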
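Posted 2022-11-7 23:56:24
Take my advice: don't run liae on a 3050 Ti. Your setup can barely handle the zibi (自闭) model; the people who run liae use cards like the 3090 or P100. A 3050 Ti has no business joining in.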
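Posted 2022-11-8 08:48:35
chc5101614 posted on 2022-11-7 23:56:
Take my advice: don't run liae on a 3050 Ti. Your setup can barely handle the zibi (自闭) model; the people who run liae use cards like the 3090 or P100, and you with a 3050 Ti ...

And his is the mobile version at that. Honestly, I could cry.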
Posted 2022-11-9 11:59:56
Not enough VRAM.
Posted 2023-3-13 19:56:45
Same thing happens to me. I could cry.
Posted 2023-3-13 20:07:03
zibi2.0 256 df runs fine on a 3050 Ti (it crashes occasionally), but forget about liae, my friend.