Error with Shennong MVE... could someone please take a look at what's causing it? Thanks

cat750515 · Posted 2024-3-4 18:58:47

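In Shennong MVE, running "Train SAEHD original model" with the Wucai (五彩) 320 WF pretrained model fails with an error saying the training data cannot be found.

I have checked the STR and SRC locations, and both are correct.

After converting the model to the Shennong ME model format, the data is read, but the error changes to the one below.

Could someone take a look at what's causing this? Thanks!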

[Save time][Iterations][Time per iteration][Source loss][Target loss]
Error: Graph execution error:

Detected at node 'mul_186' defined at (most recent call last):
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
   self._bootstrap_inner()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
   self.run()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
   self._target(*self._args, **self._kwargs)
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
   model = models.import_model(model_class_name)(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
   self.on_initialize()
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
   src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
   v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
Detected at node 'mul_186' defined at (most recent call last):
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
   self._bootstrap_inner()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
   self.run()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
   self._target(*self._args, **self._kwargs)
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
   model = models.import_model(model_class_name)(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
   self.on_initialize()
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
   src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
   v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

      [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Original stack trace for 'mul_186':
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
self._bootstrap_inner()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
self.run()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
model = models.import_model(model_class_name)(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
self.on_initialize()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\variables.py", line 1074, in _run_op
return tensor_oper(a.value(), *args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1442, in r_binary_op_wrapper
return func(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1767, in _mul_dispatch
return multiply(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\dispatch.py", line 1176, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 529, in multiply
return gen_math_ops.mul(x, y, name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6588, in mul
_, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 797, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\ops.py", line 3800, in _create_op_internal
ret = Operation(

Traceback (most recent call last):
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1378, in _do_call
return fn(*args)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1361, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1454, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

      [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 235, in trainerThread
iter, iter_time = model.train_one_iter()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 905, in train_one_iter
losses = self.onTrainOneIter()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 2196, in onTrainOneIter
src_loss, dst_loss = self.src_dst_train(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1704, in src_dst_train
s, d = nn.tf_sess.run(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
result = self._run(None, fetches, feed_dict, options_ptr,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
results = self._do_run(handle, final_targets, final_fetches,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1371, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1397, in _do_call
raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: Graph execution error:

Detected at node 'mul_186' defined at (most recent call last):
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
   self._bootstrap_inner()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
   self.run()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
   self._target(*self._args, **self._kwargs)
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
   model = models.import_model(model_class_name)(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
   self.on_initialize()
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
   src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
   v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
Detected at node 'mul_186' defined at (most recent call last):
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
   self._bootstrap_inner()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
   self.run()
File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
   self._target(*self._args, **self._kwargs)
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
   model = models.import_model(model_class_name)(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
   self.on_initialize()
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
   src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
   v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

      [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

  (1) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.

0 successful operations.
0 derived errors ignored.

Original stack trace for 'mul_186':
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
self._bootstrap_inner()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
self.run()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
model = models.import_model(model_class_name)(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
self.on_initialize()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\variables.py", line 1074, in _run_op
return tensor_oper(a.value(), *args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1442, in r_binary_op_wrapper
return func(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1767, in _mul_dispatch
return multiply(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\dispatch.py", line 1176, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 529, in multiply
return gen_math_ops.mul(x, y, name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6588, in mul
_, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 797, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\ops.py", line 3800, in _create_op_internal
ret = Operation(
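
A trace this long repeats the same root cause several times. As a rough aid for reading it, the sketch below counts the root-error codes and the graph nodes they name. It uses only the Python standard library; the function name `summarize_tf_log` and the regular expressions are illustrative assumptions based on the lines shown above, not part of DeepFaceLab or TensorFlow.

```python
import re
from collections import Counter

# Matches lines such as "  (0) RESOURCE_EXHAUSTED: failed to allocate memory".
ROOT_ERROR = re.compile(r"^\s*\(\d+\)\s+([A-Z_]+):\s*(.*)$")
# Matches node markers such as "[[{{node mul_186}}]]".
NODE = re.compile(r"\[\[\{\{node ([^}]+)\}\}\]\]")


def summarize_tf_log(text):
    """Count root-error codes and the graph nodes named next to them."""
    codes = Counter()
    nodes = Counter()
    for line in text.splitlines():
        match = ROOT_ERROR.match(line)
        if match:
            codes[match.group(1)] += 1
        for name in NODE.findall(line):
            nodes[name] += 1
    return codes, nodes


# A few lines copied from the trace above.
sample = """\
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
  (1) RESOURCE_EXHAUSTED: failed to allocate memory
      [[{{node mul_186}}]]
"""

codes, nodes = summarize_tf_log(sample)
print(codes)
print(nodes)
```

Every root error in this trace carries the code `RESOURCE_EXHAUSTED`, which is TensorFlow's status for a failed memory allocation.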

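wtxx8888 · Posted 2024-3-4 19:50:26

Last edited by wtxx8888 on 2024-3-4 20:01

In an error message this long, the OOM marker shows up several times. Didn't you see it? You ran out of VRAM.
Your graphics card can't handle this pretrained model with its current parameters.
Try lowering batch_size yourself, or turn off other options that consume VRAM.
If you can't follow the fix above, go read the basic tutorial: Parameter Details.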
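
The advice to lower batch_size can be pictured as a simple back-off loop: try a batch size, and halve it whenever the step runs out of memory. The sketch below is only an illustration of that idea, not how DeepFaceLab or Shennong MVE chooses its batch size. `OutOfMemory`, `find_fitting_batch_size`, and `fake_step` are hypothetical names, and `fake_step` merely pretends that the GPU can hold at most 6 samples.

```python
class OutOfMemory(Exception):
    """Hypothetical stand-in for an out-of-memory error such as
    TensorFlow's ResourceExhaustedError."""


def find_fitting_batch_size(try_step, start, minimum=1):
    """Halve the batch size until try_step(batch) no longer raises
    OutOfMemory. Returns the first size that fits, or None."""
    batch = start
    while batch >= minimum:
        try:
            try_step(batch)
            return batch
        except OutOfMemory:
            batch //= 2  # back off and retry with half the batch
    return None


def fake_step(batch_size, limit=6):
    """Hypothetical training step: pretend only `limit` samples fit."""
    if batch_size > limit:
        raise OutOfMemory(f"batch_size={batch_size} does not fit")


# Tries 16, then 8, then 4; the first size that fits is returned.
print(find_fitting_batch_size(fake_step, 16))
```

In the real tool the value is changed in the training settings and training is started again; the loop above only shows the reasoning behind picking a smaller value.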

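cat750515 · Posted 2024-3-4 22:13:54

wtxx8888 posted on 2024-3-4 19:50
In an error message this long, the OOM marker shows up several times. Didn't you see it? You ran out of VRAM.
Your graphics card can't handle this pretrained model with its current parameters.
Try ...

So that's the reason! Thank you very much for the reply.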
