Original poster | Posted on 2024-10-9 15:57:33
So far there is no real technical obstacle; the main problem is that processing speed is poor. The DirectML build of PyTorch has no memory-efficient attention implementation, so it runs out of memory (OOM) at runtime. I also tried the --sub-quad-attention approach that ComfyUI adopted, but it runs very slowly on my card and the generated images show some corruption. For that reason I am not going to release this half-finished DirectML version for now. If anyone knows a better DirectML-compatible replacement for torch.nn.functional.scaled_dot_product_attention, please let me know and I will release a usable version as soon as possible.
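For anyone who wants to experiment with the memory problem described above, here is a minimal sketch of query-chunked attention written with plain PyTorch ops only. It is not the sub-quadratic algorithm behind --sub-quad-attention and it has not been tested on torch-directml; the function name `chunked_attention` and the `chunk_size` parameter are made up for illustration. It computes the same result as unmasked scaled dot-product attention, but only materializes a `chunk_size x L_k` score matrix at a time instead of the full `L_q x L_k` one.

```python
import math
import torch


def chunked_attention(q, k, v, chunk_size=1024):
    """Unmasked scaled dot-product attention, processed in query chunks.

    q: (..., L_q, d), k: (..., L_k, d), v: (..., L_k, d_v).
    Hypothetical helper for illustration; no mask and no dropout support.
    """
    scale = 1.0 / math.sqrt(q.shape[-1])
    k_t = k.transpose(-2, -1)  # (..., d, L_k)
    outputs = []
    for start in range(0, q.shape[-2], chunk_size):
        q_chunk = q[..., start:start + chunk_size, :]
        # Scores for this chunk only: (..., chunk, L_k)
        scores = torch.matmul(q_chunk, k_t) * scale
        # Each query row still sees every key, so the softmax is exact.
        weights = torch.softmax(scores, dim=-1)
        outputs.append(torch.matmul(weights, v))
    return torch.cat(outputs, dim=-2)
```

A smaller `chunk_size` lowers peak memory at the cost of more kernel launches, so the right value probably depends on the card. Whether this actually avoids the OOM or runs at an acceptable speed under DirectML is an open question.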