Posted on 2021-11-11 16:20:45
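Mask (XSeg) training fails on my RTX 3060 Ti. With batch size 8, VRAM usage never climbs above 2 GB. SAEHD and Quick96 both train without problems; only mask training misbehaves. Sometimes it hangs on the first image and never gets past it, and sometimes it fails straight away with an OOM error. Could someone help?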
Model name: XSeg
Current iteration: 1335305
---------------------Model options----------------------
face_type: wf
batch_size: 8
pretrain: False
---------------------Run info----------------------
Device index: 0
Device name: NVIDIA GeForce RTX 3060 Ti
VRAM size: 6.46GB
===============================================
Starting. Press Enter to stop training and save progress.
Save time | Iteration | Iteration time | SRC loss | DST loss
Error: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[8,32,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node Square (defined at C:\0906\_internal\DeepFaceLab\core\leras\layers\FRNorm2D.py:34) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[concat_7/concat/_95]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[8,32,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node Square (defined at C:\0906\_internal\DeepFaceLab\core\leras\layers\FRNorm2D.py:34) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
Errors may have originated from an input operation.
Input Source operations connected to node Square:
Add (defined at C:\0906\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:107)
Input Source operations connected to node Square:
Add (defined at C:\0906\_internal\DeepFaceLab\core\leras\layers\Conv2D.py:107)
Original stack trace for 'Square':
File "threading.py", line 884, in _bootstrap
File "threading.py", line 916, in _bootstrap_inner
File "threading.py", line 864, in run
File "C:\0906\_internal\DeepFaceLab\mainscripts\Trainer.py", line 63, in trainerThread
debug=debug)
File "C:\0906\_internal\DeepFaceLab\models\Model_XSeg\Model.py", line 17, in __init__
super().__init__(*args, force_model_class_name='XSeg', **kwargs)
File "C:\0906\_internal\DeepFaceLab\models\ModelBase.py", line 199, in __init__
self.on_initialize()
File "C:\0906\_internal\DeepFaceLab\models\Model_XSeg\Model.py", line 106, in on_initialize
gpu_pred_logits_t, gpu_pred_t = self.model.flow(gpu_input_t, pretrain=self.pretrain)
File "C:\0906\_internal\DeepFaceLab\facelib\XSegNet.py", line 85, in flow
return self.model(x, pretrain=pretrain)
File "C:\0906\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
return self.forward(*args, **kwargs)
File "C:\0906\_internal\DeepFaceLab\core\leras\models\XSeg.py", line 96, in forward
x = self.conv01(x)
File "C:\0906\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
return self.forward(*args, **kwargs)
File "C:\0906\_internal\DeepFaceLab\core\leras\models\XSeg.py", line 16, in forward
x = self.frn(x)
File "C:\0906\_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
return self.forward(*args, **kwargs)
File "C:\0906\_internal\DeepFaceLab\core\leras\layers\FRNorm2D.py", line 34, in forward
nu2 = tf.reduce_mean(tf.square(x), axis=nn.conv2d_spatial_axes, keepdims=True)
File "C:\0906\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 10174, in square
"Square", x=x, name=name)
File "C:\0906\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "C:\0906\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3536, in _create_op_internal
op_def=op_def)
File "C:\0906\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1990, in __init__
self._traceback = tf_stack.extract_stack()
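
For anyone reading the trace: the tensor that failed to allocate, shape[8,32,256,256] and type float, is not large by itself. Here is a quick sketch of the arithmetic, assuming `float` here means 4-byte float32. The helper name `tensor_bytes` is made up for illustration and is not part of DeepFaceLab or TensorFlow.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the raw size in bytes of a dense tensor.

    bytes_per_element=4 is an assumption (float32); it is not stated
    explicitly in the log beyond "type float".
    """
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape copied from the OOM message in the log above.
shape = (8, 32, 256, 256)
size_bytes = tensor_bytes(shape)
size_mib = size_bytes / 2**20

print(f"{size_bytes} bytes = {size_mib:.0f} MiB")  # 67108864 bytes = 64 MiB
```

That is about 64 MiB against the 6.46GB reported for the device. So the failure probably reflects memory that was already unavailable to the allocator when this op ran, rather than this one tensor being too big. That reading is an inference from the numbers, not something the log states.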