Could someone help? A problem with facial landmark extraction

day270010678 · posted on 2026-1-13 11:01:58

Last edited by day270010678 on 2026-1-21 10:18

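I switched completely to a DeepFaceLab framework rebuilt on PyTorch. I have spent an hour tuning the face-detection parameters alone and still cannot get a clean result.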

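I had some spare time last night and went at this again, and I finally found the cause and fixed it. The root of the problem was weight-file loading: because I had converted the weights to PyTorch format, many layers were not mapped correctly, especially those from 3dfan.npy and 2dfan.npy. Their layer structure is painfully complex, the BN and normalization layers most of all. So be careful when you convert, and make sure the special layers are handled properly.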
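
A conversion like the one described can be checked before loading. The sketch below is a hedged illustration, not the converter actually used in this post: it assumes the source convolution kernels are stored in TensorFlow's height-width-in-out layout (PyTorch expects out-in-height-width), and the key names are hypothetical. It reports entries that are missing, unexpected, or have the wrong shape, which is where silently mis-mapped BN statistics and kernels tend to show up.

```python
import numpy as np

def hwio_to_oihw(kernel):
    """Reorder a conv kernel from (H, W, in, out) to (out, in, H, W).

    Assumption: the source weights use the TensorFlow layout.
    """
    return np.ascontiguousarray(np.transpose(kernel, (3, 2, 0, 1)))

def report_mismatches(expected_shapes, converted):
    """Compare converted weights against the shapes the model expects.

    expected_shapes: dict name -> shape tuple (e.g. taken from the model's state dict)
    converted:       dict name -> numpy array produced by the conversion
    """
    missing = sorted(k for k in expected_shapes if k not in converted)
    unexpected = sorted(k for k in converted if k not in expected_shapes)
    wrong_shape = sorted(
        k for k in expected_shapes
        if k in converted and tuple(converted[k].shape) != tuple(expected_shapes[k])
    )
    return missing, unexpected, wrong_shape

# Hypothetical layer names, for illustration only.
expected = {
    "conv1.weight": (32, 16, 3, 3),
    "bn1.weight": (32,),
    "bn1.bias": (32,),
    "bn1.running_mean": (32,),
    "bn1.running_var": (32,),
}
raw_kernel = np.zeros((3, 3, 16, 32), dtype=np.float32)
converted = {
    "conv1.weight": raw_kernel,          # not transposed yet: wrong shape
    "bn1.weight": np.ones(32, dtype=np.float32),
    "bn1.bias": np.zeros(32, dtype=np.float32),
    "bn1.moving_mean": np.zeros(32, dtype=np.float32),  # name never remapped
}
missing, unexpected, wrong_shape = report_mismatches(expected, converted)
print("missing:", missing)
print("unexpected:", unexpected)
print("wrong shape:", wrong_shape)
```

A shape check cannot catch everything (a square kernel with equal input and output channels passes even when transposed wrongly), so comparing the outputs of the old and new models on the same input is still worth doing.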

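Any software that uses the S3FD detector and FAN landmark extraction will place the landmarks away from their true positions, as described above, if the weight file is not loaded correctly. What does that lead to? Even the faces you outline by hand are extracted imperfectly. The manual work is not completely wasted, but it is not far off, and the merged result shows jitter and a string of other problems.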

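My advice: avoid manual outlining where you can and let the software extract automatically. If you really must do it by hand, do all of it by hand. The worst case is mixing manual and automatic extraction: the more perfect your manual outlines are, the more problems appear. This is caused by the software itself.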

model_path = Path(__file__).parent / ("2DFAN.pth" if not landmarks_3D else "3DFAN.pth")
if not model_path.exists():
    raise Exception("Unable to load FANExtractor model")

# Load weights - now using standard PyTorch format
state_dict = torch.load(model_path, map_location='cpu', weights_only=False)
self.model.load_state_dict(state_dict, strict=True)
print(f"Loaded {len(state_dict)} weights from PyTorch format")
# Move model to device
self.model.to(self.device)
self.model.eval()

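I forgot to mention speed. I cut a short video into 162 frames. Face extraction with the original, unmodified build (the so-called TensorFlow 2.7 + CUDA 11.8 one) took a little over 2 minutes. With Python 3.10 + TensorFlow 2.10, the same 162 frames took 1 minute 20 seconds. With the rebuilt PyTorch framework, over several runs, the slow runs took a bit over 40 seconds and the fast ones a bit over 30. My test card is a 3050 with 6 GB.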
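
For anyone who wants to reproduce a comparison like this, a minimal timing sketch follows. It is not the script used for the numbers above; `extract_fn` is a hypothetical stand-in for whatever per-image extraction call your build exposes. A warm-up pass is included on the assumption that the first call pays one-off costs such as model loading.

```python
import time

def time_extraction(extract_fn, images, warmup=1):
    """Time extract_fn over images and return simple throughput stats."""
    # Warm-up calls are excluded from the measurement.
    for img in images[:warmup]:
        extract_fn(img)

    start = time.perf_counter()
    for img in images:
        extract_fn(img)
    elapsed = time.perf_counter() - start

    return {
        "images": len(images),
        "seconds": elapsed,
        "images_per_second": len(images) / elapsed if elapsed > 0 else float("inf"),
    }

# Placeholder workload: replace the lambda with the real extraction call.
stats = time_extraction(lambda img: img, list(range(162)))
print(stats)
```

Running the same frame set through each build several times and comparing `images_per_second` gives a fairer picture than a single run.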

day270010678 · posted on 2026-1-13 11:02:59

After refining it, I found there is still a problem:
# Apply a small adjustment to reduce the spread slightly
# Based on historical experience, use a factor between 0.95 and 0.98
fine_tune_factor = 0.97 # Reduce spread by 3%

day270010678 · posted on 2026-1-13 11:26:18

def transform(self, point, center, scale, resolution):
    """
    Transform a point from model space to image space
    """
    pt = np.array([point[0], point[1], 1.0], dtype=np.float32)  # Ensure float32
    h = 200.0 * scale
    m = np.eye(3, dtype=np.float32)  # Ensure float32
    m[0, 0] = resolution / h
    m[1, 1] = resolution / h
    m[0, 2] = resolution * (-center[0] / h + 0.5)
    m[1, 2] = resolution * (-center[1] / h + 0.5)
    m = np.linalg.inv(m)
    # Apply a small adjustment to the transformed points based on actual face size
    # This is based on empirical observation that PyTorch version tends to produce slightly smaller landmarks
    adjusted_pt = np.matmul(m, pt)[0:2].astype(np.float32)
    # Calculate the actual face width and height from the bounding box
    face_width = scale * 195.0  # This is the actual face width in pixels
    # Use a reference face size (e.g., 200 pixels) as baseline
    reference_face_size = 200.0
    # Calculate adjustment factor based on actual face size
    adjustment_factor = face_width / reference_face_size
    # Apply a small adjustment to reduce the spread slightly
    fine_tune_factor = 0.98  # Reduce by 2%
    # Apply adjustment to the transformed point
    if adjustment_factor > 0:
        # Calculate centroid of landmarks (approximate)
        centroid = np.array([center[0], center[1]], dtype=np.float32)
        # Scale distances from centroid by the adjustment factor
        adjusted_pt = centroid + (adjusted_pt - centroid) * adjustment_factor * fine_tune_factor
    return adjusted_pt

Adding a dynamic adjustment based on the actual face size to the transform method does not work either. Damn.
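
For reference, the matrix that transform() builds and then inverts has a simple closed form, which makes it easier to see what any extra scaling does. The sketch below is only an illustration of that algebra; the function names are mine and are not part of DeepFaceLab.

```python
import numpy as np

def image_to_heatmap(point, center, scale, resolution):
    """Forward crop mapping: x_hm = resolution * ((x_img - c) / h + 0.5), h = 200 * scale."""
    h = 200.0 * scale
    point = np.asarray(point, dtype=np.float32)
    center = np.asarray(center, dtype=np.float32)
    return resolution * ((point - center) / h + 0.5)

def heatmap_to_image(point, center, scale, resolution):
    """Inverse mapping: x_img = c + (x_hm / resolution - 0.5) * h.

    This is what inverting the 3x3 matrix in transform() evaluates to.
    """
    h = 200.0 * scale
    point = np.asarray(point, dtype=np.float32)
    center = np.asarray(center, dtype=np.float32)
    return center + (point / resolution - 0.5) * h

center, scale, res = (100.0, 120.0), 1.0, 64
print(heatmap_to_image((32, 32), center, scale, res))  # heatmap centre -> face centre
print(heatmap_to_image((0, 0), center, scale, res))    # heatmap corner -> crop corner
```

Because this mapping is already the exact inverse of the crop, an additional scaling about the centre moves correctly decoded points away from where the model placed them. With scale = 2, for example, the factor in the posted code is 2 × 195 / 200 × 0.98 = 1.911, so the landmarks would spread to almost twice their distance from the centre.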

day270010678 · posted on 2026-1-13 11:29:58

A precise adjustment based on the heatmap peak position does not work either.

def get_pts_from_predict(self, a, center, scale):
      """
      Extract points from the model prediction heatmaps
      """
      a_ch, a_h, a_w = a.shape

      print(f"get_pts_from_predict: Input shape = ({a_ch}, {a_h}, {a_w}), dtype = {a.dtype}")

      # Make sure we have the right number of channels (should be 68 for landmarks)
      if a_ch != 68:
         print(f"Warning: Expected 68 channels for landmarks, got {a_ch}. Using fallback landmarks.")
         base_landmarks = LandmarksProcessor.landmarks_2D
         # Create fallback landmarks based on center
         fallback_pts = base_landmarks.astype(np.float32) * 100 + np.array([center[0], center[1]], dtype=np.float32)
         return fallback_pts

      b = a.reshape((a_ch, a_h*a_w))
      c = b.argmax(1).reshape((a_ch, 1)).repeat(2, axis=1).astype(np.float32)  # Force to float32
      c[:,0] %= a_w
      c[:,1] = np.apply_along_axis(lambda x: np.floor(x / a_w), 0, c[:,1])

      for i in range(a_ch):
         pX, pY = int(c[i,0]), int(c[i,1])
         # Use dynamic boundary check instead of hardcoding 63
         if pX > 0 and pX < a_w-1 and pY > 0 and pY < a_h-1:
            diff = np.array([a[i,pY,pX+1]-a[i,pY,pX-1], a[i,pY+1,pX]-a[i,pY-1,pX]], dtype=np.float32)  # Force to float32
            c[i] += np.sign(diff)*0.25

      c += 0.5

      # Apply a scaling factor to adjust landmark size if needed
      pts = np.array([self.transform(c[i], center, scale, a_w) for i in range(a_ch)], dtype=np.float32)  # Ensure output is float32

      # Add a small offset to adjust for potential bias in the model's output
      # This is based on empirical observation that FAN model tends to produce slightly oversized landmarks
      offset_factor = 0.95  # Reduce by 5%

      # Calculate centroid of landmarks
      centroid = np.mean(pts, axis=0)
      # Scale distances from centroid by the offset factor
      pts = centroid + (pts - centroid) * offset_factor

      return pts
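
One way to tell whether the decoding step or the model output is at fault is to feed the decoder a synthetic heatmap whose peak position is known. The sketch below re-implements only the argmax plus quarter-pixel step from the function above as a standalone helper (the names `decode_peaks` and `gaussian_heatmap` are mine), so it can run without the model. If this check passes but real landmarks are still off, the heatmaps themselves are the more likely suspect.

```python
import numpy as np

def decode_peaks(heatmaps):
    """Argmax per channel, then a quarter-pixel nudge towards the higher neighbour."""
    ch, h, w = heatmaps.shape
    flat = heatmaps.reshape(ch, h * w).argmax(axis=1)
    xs, ys = flat % w, flat // w
    pts = np.stack([xs, ys], axis=1).astype(np.float32)
    for i in range(ch):
        x, y = int(xs[i]), int(ys[i])
        if 0 < x < w - 1 and 0 < y < h - 1:
            diff = np.array(
                [heatmaps[i, y, x + 1] - heatmaps[i, y, x - 1],
                 heatmaps[i, y + 1, x] - heatmaps[i, y - 1, x]],
                dtype=np.float32,
            )
            pts[i] += np.sign(diff) * 0.25
    return pts + 0.5  # shift from pixel index to pixel-centre coordinates

def gaussian_heatmap(cx, cy, size=64, sigma=1.5):
    """Synthetic heatmap with a Gaussian bump centred at pixel index (cx, cy)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float32)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

true_peaks = np.array([[20.3, 40.0], [10.0, 12.7], [50.6, 33.2]], dtype=np.float32)
maps = np.stack([gaussian_heatmap(x, y) for x, y in true_peaks])
decoded = decode_peaks(maps)
# Error relative to the known peaks, expressed in pixel-centre coordinates.
print(np.abs(decoded - (true_peaks + 0.5)).max())
```

The decoded positions should stay within half a heatmap pixel of the known peaks; a larger error would point at the decoding code rather than the weights.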

别看我我不在 · posted on 2026-1-14 18:52:18

day270010678 posted on 2026-1-13 11:29
A precise adjustment based on the heatmap peak position does not work either.

def get_pts_from_predict(self, a, center, scale):

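Impressive, you really know your stuff. I have run into this too. Incorrect face detection feels unsolvable; the only way around it is to cut the hard-to-detect parts out in editing.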

lhs · posted on 2026-1-20 08:27:16

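I don't follow what problem you are describing. But if you mean the landmarks are inaccurate, the error seems to be present already in the input to this code, that is, in the heatmap the model predicts; your code only transforms it. You should switch to a different library (model) for detection, such as insightface, rather than tuning accuracy in post-processing.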

day270010678 · posted on 2026-1-21 10:30:35

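lhs posted on 2026-1-20 08:27
I don't follow what problem you are describing. But if you mean the landmarks are inaccurate, the error seems to be present already in the input and was only transformed. In the model ...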

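Switching libraries makes no difference. Don't reach for a third-party library every time something goes wrong, especially inside a framework as tightly integrated as this one. Take LandmarksProcessor.py in DeepFaceLab: when I rebuilt the framework I hit a third-party library problem there, with OpenCV. You read that right, the leading vision library. cv2.getAffineTransform() threw an exception and aborted the program even though the values passed to it were reasonable. At first I assumed the data I was passing was bad, so I added validation and filtering, but half a day of that got me nowhere. I traced the whole execution flow and still could not find the cause. In the end I could only suspect a third-party compatibility issue, and that turned out to be right: the problem went away only after I used np.asarray() to force the dtype to float32 and np.ascontiguousarray() to make the array contiguous in memory. Why does this compatibility problem occur? In essence, the data is fine at the moment it is passed, but something uncontrollable at a lower level turns it into float64. Who would have thought that data passed as float32 would, for whatever reason, end up as float64 in its hands.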
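
The fix described above can be wrapped in a small helper. This is a sketch, not the code in the rebuilt LandmarksProcessor.py, and `as_cv_points` is a hypothetical name. cv2.getAffineTransform() expects three 2D points in 32-bit float for both source and destination. As a hedged aside, float64 often appears through ordinary NumPy behaviour: an array built from Python floats defaults to float64, and mixing a float32 array with a float64 array promotes the result to float64.

```python
import numpy as np

def as_cv_points(pts):
    """Return pts as a C-contiguous float32 array of shape (3, 2)."""
    arr = np.ascontiguousarray(pts, dtype=np.float32)
    if arr.shape != (3, 2):
        raise ValueError(f"expected 3 points of shape (3, 2), got {arr.shape}")
    return arr

# Two common ways to end up with float64 without noticing.
from_python_floats = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
mixed = np.ones((3, 2), dtype=np.float32) + np.zeros((3, 2), dtype=np.float64)
print(from_python_floats.dtype, mixed.dtype)

# A strided view is not contiguous; the helper copies it into a contiguous buffer.
view = np.zeros((3, 4), dtype=np.float64)[:, ::2]
src = as_cv_points(view)
print(src.dtype, src.flags["C_CONTIGUOUS"])

# Intended use (requires OpenCV, not imported here):
# mat = cv2.getAffineTransform(as_cv_points(src_pts), as_cv_points(dst_pts))
```

Converting right at the call site keeps the guarantee local, so upstream arithmetic can change dtype without breaking the OpenCV call.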
