vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.
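The pattern described above can be illustrated with a minimal sketch. The function and variable names below are hypothetical, not the actual vLLM code: building the expanded token list with `+` concatenation copies the entire accumulated list on every placeholder expansion, which is quadratic in the output length, while `extend`/`append` keeps it linear.

```python
def expand_placeholders_quadratic(tokens, placeholder_id, repeat_len):
    """Inefficient pattern: `result + [...]` copies the whole growing
    list on each step, so total work is O(n^2) in the output length."""
    result = []
    for tok in tokens:
        if tok == placeholder_id:
            result = result + [placeholder_id] * repeat_len  # full copy each time
        else:
            result = result + [tok]  # full copy each time
    return result


def expand_placeholders_linear(tokens, placeholder_id, repeat_len):
    """Patched-style pattern: in-place extend/append is amortized O(1)
    per token, so total work is O(n) in the output length."""
    result = []
    for tok in tokens:
        if tok == placeholder_id:
            result.extend([placeholder_id] * repeat_len)
        else:
            result.append(tok)
    return result
```

With a large `repeat_len` (e.g., a precomputed audio/image embedding length) and many placeholders in one request, the quadratic variant's copying cost grows fast enough to exhaust server CPU time, which is the denial-of-service vector described here.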
History

Wed, 30 Apr 2025 00:45:00 +0000

Type Values Removed Values Added
Description vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.
Title vLLM phi4mm: Quadratic Time Complexity in Input Token Processing leads to denial of service
Weaknesses CWE-1333
References
Metrics cvssV3_1

{'score': 6.5, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}


MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published: 2025-04-30T00:24:53.750Z

Updated: 2025-04-30T00:24:53.750Z

Reserved: 2025-04-24T21:10:48.174Z

Link: CVE-2025-46560

Vulnrichment

No data.

NVD

Status: Received

Published: 2025-04-30T01:15:52.097

Modified: 2025-04-30T01:15:52.097

Link: CVE-2025-46560

Redhat

No data.