deepfacelab中文网


Error when using Shennong (神农) MVE... could someone help figure out the cause? Thanks

Original poster | Posted 2024-3-4 18:58:47

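With the "Train SAEHD original model" option in Shennong MVE, using the 五彩 320WF pretrained model, I get an error saying the training data cannot be found.


I have checked the DST and SRC locations and both are correct.


After converting the model to the Shennong ME model format, the data is read, but training then fails with the error below.



Could the experts here please take a look at what the cause is? Thanks!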


[Save time][Iterations][Time per iteration][SRC loss][DST loss]
Error: Graph execution error:


Detected at node 'mul_186' defined at (most recent call last):
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
      self._bootstrap_inner()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
      self.run()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
      self._target(*self._args, **self._kwargs)
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
      model = models.import_model(model_class_name)(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
      self.on_initialize()
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
      src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
      v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
Detected at node 'mul_186' defined at (most recent call last):
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
      self._bootstrap_inner()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
      self.run()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
      self._target(*self._args, **self._kwargs)
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
      model = models.import_model(model_class_name)(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
      self.on_initialize()
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
      src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
      v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


         [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


  (1) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


0 successful operations.
0 derived errors ignored.


Original stack trace for 'mul_186':
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
    self._bootstrap_inner()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
    model = models.import_model(model_class_name)(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
    self.on_initialize()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
    src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
    v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\variables.py", line 1074, in _run_op
    return tensor_oper(a.value(), *args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1442, in r_binary_op_wrapper
    return func(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1767, in _mul_dispatch
    return multiply(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 529, in multiply
    return gen_math_ops.mul(x, y, name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6588, in mul
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 797, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\ops.py", line 3800, in _create_op_internal
    ret = Operation(


Traceback (most recent call last):
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1378, in _do_call
    return fn(*args)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1361, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1454, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


         [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


  (1) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


0 successful operations.
0 derived errors ignored.


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 235, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 905, in train_one_iter
    losses = self.onTrainOneIter()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 2196, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1704, in src_dst_train
    s, d = nn.tf_sess.run(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1371, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\client\session.py", line 1397, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: Graph execution error:


Detected at node 'mul_186' defined at (most recent call last):
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
      self._bootstrap_inner()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
      self.run()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
      self._target(*self._args, **self._kwargs)
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
      model = models.import_model(model_class_name)(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
      self.on_initialize()
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
      src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
      v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
Detected at node 'mul_186' defined at (most recent call last):
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
      self._bootstrap_inner()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
      self.run()
    File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
      self._target(*self._args, **self._kwargs)
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
      model = models.import_model(model_class_name)(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
      self.on_initialize()
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
      src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
    File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
      v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
Node: 'mul_186'
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


         [[concat_5/concat/_1157]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


  (1) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.


0 successful operations.
0 derived errors ignored.


Original stack trace for 'mul_186':
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 937, in _bootstrap
    self._bootstrap_inner()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
    model = models.import_model(model_class_name)(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\ModelBase.py", line 344, in __init__
    self.on_initialize()
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\models\Model_ME\Model.py", line 1679, in on_initialize
    src_dst_loss_gv_op = self.src_dst_opt.get_update_op(
  File "E:\DFL-MVE-SN-1.6.1\_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 64, in get_update_op
    v_t = self.beta_2*vs + (1.0-self.beta_2) * tf.square(g-m_t)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\variables.py", line 1074, in _run_op
    return tensor_oper(a.value(), *args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1442, in r_binary_op_wrapper
    return func(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1767, in _mul_dispatch
    return multiply(x, y, name=name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\util\dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\math_ops.py", line 529, in multiply
    return gen_math_ops.mul(x, y, name)
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6588, in mul
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 797, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "E:\DFL-MVE-SN-1.6.1\_internal\python-3.9.18\lib\site-packages\tensorflow\python\framework\ops.py", line 3800, in _create_op_internal
    ret = Operation(
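
The trace above is long and repeats the same frames several times, so the root cause is easy to miss. As a minimal sketch (not part of DeepFaceLab or Shennong MVE), the following condenses such a log into its TensorFlow status codes and the graph nodes it names. The function name `summarize_tf_log` and the `sample` text are assumptions made for illustration; the two regular expressions only rely on the line shapes visible in the trace above.

```python
import re
from collections import Counter


def summarize_tf_log(log_text):
    """Count TensorFlow status codes and collect the graph nodes named in a log."""
    # Root-error lines look like "  (0) RESOURCE_EXHAUSTED: failed to allocate memory".
    statuses = Counter(re.findall(r"\(\d+\)\s+([A-Z_]+):", log_text))
    # Failing nodes are printed as "[[{{node mul_186}}]]".
    nodes = sorted(set(re.findall(r"\{\{node ([^}]+)\}\}", log_text)))
    return statuses, nodes


# A shortened excerpt of the log above, used only to exercise the function.
sample = """\
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
  (1) RESOURCE_EXHAUSTED: failed to allocate memory
         [[{{node mul_186}}]]
"""

statuses, nodes = summarize_tf_log(sample)
print(dict(statuses), nodes)
```

A `RESOURCE_EXHAUSTED` count above zero means TensorFlow failed to allocate memory for the listed node, which here is a multiplication inside the AdaBelief optimizer update.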


Posted 2024-3-4 19:50:26
Last edited by wtxx8888 on 2024-3-4 20:01

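In an error message this long, the OOM marker shows up several times. Did you not see it? You have run out of VRAM.
Your graphics card cannot handle this model with its current parameters.
Try lowering batch_size yourself, or turn off other options that consume VRAM.
If you do not understand that fix, go read the basic tutorial on parameter details.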
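
The advice to lower batch_size can be pictured as a simple back-off loop. This is only a rough sketch under stated assumptions: in DeepFaceLab the batch size is a training option you set yourself, and nothing below calls DeepFaceLab or TensorFlow. `find_fitting_batch_size` and `fake_train_step` are hypothetical names, and Python's built-in `MemoryError` stands in for the `ResourceExhaustedError` seen in the trace.

```python
def find_fitting_batch_size(train_step, start_batch_size, min_batch_size=1):
    """Halve the batch size until train_step stops raising MemoryError."""
    batch_size = start_batch_size
    while batch_size >= min_batch_size:
        try:
            train_step(batch_size)
            return batch_size
        except MemoryError:
            # Out of memory: retry with half the batch.
            batch_size //= 2
    raise MemoryError("no batch size >= %d fits" % min_batch_size)


def fake_train_step(batch_size, limit=6):
    # Hypothetical stand-in for one training iteration: pretend that
    # anything above `limit` samples does not fit in VRAM.
    if batch_size > limit:
        raise MemoryError("simulated out-of-memory")


print(find_fitting_batch_size(fake_train_step, 16))
```

Halving is a coarse search; once a working value is found, a slightly larger one in between may also fit.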

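Original poster | Posted 2024-3-4 22:13:54
wtxx8888 posted on 2024-3-4 19:50
In an error message this long, the OOM marker shows up several times. Did you not see it? You have run out of VRAM.
Your graphics card cannot handle this model with its current parameters.
Try ...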

I see! Thank you very much for the reply.