@libinta libinta commented Dec 8, 2025

1. Fix the crash shown below. inputs_embeds is expected to be 2D, but the current logic returns inputs_embeds.shape=torch.Size([1, 1024, 5120]) when bs=1. bs>1 is not yet enabled for this flow.

"/root/litang/github/qwen3/vllm/vllm/model_executor/models/qwen3_vl.py", line 1563, in _compute_deepstack_embeds
(EngineCore_DP0 pid=202) ERROR 12-08 15:59:28 [v1/engine/core.py:845] deepstack_input_embeds = deepstack_input_embeds.view(
(EngineCore_DP0 pid=202) ERROR 12-08 15:59:28 [v1/engine/core.py:845] RuntimeError: shape '[1, 3, 5120]' is invalid for input of size 3072
2. Enable multi-modal bucket warmup for qwen3-vl
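
For readers following the traceback, here is a minimal sketch of the shape arithmetic behind the error message in item 1. It uses only the numbers quoted above (3072 elements, target shape [1, 3, 5120], and the 3D shape [1, 1024, 5120]). The helper names `view_is_valid` and `flatten_batch` are hypothetical and are not part of vLLM or of this PR; the sketch does not reproduce the actual fix, it only illustrates why the view is rejected and what a 2D layout of the bs=1 tensor would look like.

```python
from math import prod


def view_is_valid(numel: int, shape: tuple[int, ...]) -> bool:
    """A view is only possible when the target shape holds exactly numel elements."""
    return prod(shape) == numel


def flatten_batch(shape: tuple[int, ...]) -> tuple[int, ...]:
    """Collapse all leading dims so (bs, seq, hidden) becomes (bs * seq, hidden)."""
    *leading, hidden = shape
    return (prod(leading), hidden)


# Numbers taken from the traceback: 1 * 3 * 5120 = 15360 elements requested,
# but the tensor only holds 3072, so the view is rejected.
print(view_is_valid(3072, (1, 3, 5120)))

# The 3D shape reported for bs=1, collapsed to the 2D layout the code expects.
print(flatten_batch((1, 1024, 5120)))
```

With torch tensors, the same collapse is commonly written as `x.reshape(-1, x.shape[-1])`; whether the PR does it this way is not stated in the description.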

@libinta libinta force-pushed the libinta/qwen3-vl_enable branch from 91e8049 to 8bab89c Compare December 10, 2025 00:14
@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@libinta libinta changed the title fix crash for qwen3-vl enablement qwen3-vl enablement Dec 10, 2025