
Conversation

@taradaidv

Summary

Add a Metal implementation of GGML_OP_DIAG_MASK_INF so the op can run on Apple GPUs.

Details

Previously, the Metal backend did not support GGML_OP_DIAG_MASK_INF, so models using this op failed with:

unsupported op 'DIAG_MASK_INF'

For example, stable-diffusion.cpp’s SDXL CLIP path could not run with keep_clip_on_cpu = false on Metal.

This PR:

  • Implements a Metal kernel for GGML_OP_DIAG_MASK_INF matching the existing CPU/CUDA semantics.
  • Wires it into ggml-metal dispatch so attention masks are applied correctly for batched tensors on Apple GPUs.

Related: leejet/stable-diffusion.cpp#1040
