Score Explanation:
The vulnerability has a high confidentiality impact (C:H) because a successful timing attack can leak sensitive information. The issue is exploitable remotely (AV:N), though it requires significant effort and precision (AC:H). No privileges are required (PR:N), and the attack does not require user interaction (UI:N). Integrity is unaffected (I:N), but availability can be minimally impacted due to computational overhead (A:L).
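The base score implied by this vector can be reproduced directly from the CVSS 3.1 specification formula. The following is a minimal Python sketch using the published metric weights for this exact vector; note that the formula evaluates it to 6.5 (Medium severity):

```python
import math

# Metric weights from the CVSS 3.1 specification for
# CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:L
C, I, A = 0.56, 0.0, 0.22                 # C:H, I:N, A:L
AV, AC, PR, UI = 0.85, 0.44, 0.85, 0.85   # AV:N, AC:H, PR:N, UI:N

iss = 1 - (1 - C) * (1 - I) * (1 - A)     # impact sub-score
impact = 6.42 * iss                        # scope unchanged (S:U)
exploitability = 8.22 * AV * AC * PR * UI

def roundup(x):
    # CVSS 3.1 "round up to one decimal place"
    return math.ceil(x * 10) / 10

base_score = roundup(min(impact + exploitability, 10))
print(base_score)  # 6.5
```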
Executive Summary
DeepSeek-V3, a Mixture-of-Experts (MoE) LLM, is vulnerable to a timing attack against secret verification due to its use of a non-constant-time comparison routine in token processing. This flaw could allow an attacker to infer secret values, such as authentication tokens or cryptographic keys, by measuring response times. While DeepSeek-V3 delivers state-of-the-art performance, this issue poses a serious risk to applications that rely on it for secure processing of sensitive inputs.
Detailed Finding:
While performing a security code review of DeepSeek, we observed that the noted code repository is susceptible to a timing attack against secret verification. Specifically, the vulnerability arises from how prompt_mask is computed and used in DeepSeek-V3's inference pipeline: the comparison logic at lines 59 and 68 in generate.py introduces timing discrepancies based on token values.
Line 59 (prompt_mask = tokens != -1) checks whether a token exists but does not use a constant-time approach, leading to processing time variations based on token content.
Line 68 (finished |= torch.logical_and(~prompt_mask[:, cur_pos], next_token == eos_id)) introduces further timing variations when checking the end-of-sequence (EOS) token.
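To make the two lines concrete, here is a minimal, self-contained sketch of the masking logic described above. Variable names follow generate.py; the tensor contents and shapes are illustrative assumptions, with -1 marking slots that hold no prompt token:

```python
import torch

# Illustrative batch of 2 sequences, 4 positions each.
tokens = torch.tensor([[12,  7, -1, -1],
                       [ 3, -1, -1, -1]])
eos_id = 7

# Line 59: True wherever a prompt token exists.
prompt_mask = tokens != -1

# Line 68: a sequence is finished once it emits EOS at a
# position not covered by the prompt.
cur_pos = 1
next_token = torch.tensor([9, 7])   # hypothetical sampled tokens
finished = torch.zeros(2, dtype=torch.bool)
finished |= torch.logical_and(~prompt_mask[:, cur_pos], next_token == eos_id)
print(finished)  # only the second sequence is marked finished
```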
The timing attack vulnerability we’ve identified in DeepSeek-V3 aligns with the following OWASP Top 10 for Large Language Model (LLM) Applications 2025 categories:
LLM02:2025 Sensitive Information Disclosure: This category addresses scenarios where LLMs inadvertently expose confidential data, including personal identifiable information (PII), financial details, or proprietary business information. In the context of DeepSeek-V3, the timing attack could allow attackers to infer sensitive information by analyzing response times, leading to unauthorized data access and privacy violations.
LLM08:2025 Vector and Embedding Weaknesses: This risk pertains to vulnerabilities in systems utilizing vectors and embeddings, especially in Retrieval Augmented Generation (RAG) setups. Weaknesses in how vectors and embeddings are generated, stored, or retrieved can be exploited to inject harmful content, manipulate model outputs, or access sensitive information. In DeepSeek-V3, the non-constant-time verification routine could be exploited to manipulate embeddings, leading to potential data leakage or unauthorized access.
Impact:
An attacker could craft special inputs, measure response delays, and infer private data by exploiting these timing inconsistencies. Additionally, an attacker could perform the following attacks:
Sensitive token leakage: Attackers could extract secret keys, authentication tokens, or model-internal data through statistical analysis of response times.
Potential model poisoning: If DeepSeek is used in a multi-tenant environment, adversaries could deduce how different input sequences affect the model’s state.
Increased risk in security-critical deployments: AI-driven access control mechanisms, chatbots handling confidential queries, or secure computations are at risk.
While DeepSeek-V3 offers efficient inference and high performance, this timing flaw could undermine its security, especially in sensitive use cases.
Replicate Finding:
Clone the impacted repo: git clone https://github.com/deepseek-ai/DeepSeek-V3
Navigate into the noted repo
Open the impacted file(s) (inference/generate.py)
Go to the impacted lines (59 and 68)
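The replication steps above can be scripted as follows (the sed range is simply a convenient way to print the lines surrounding 59 and 68):

```shell
git clone https://github.com/deepseek-ai/DeepSeek-V3
cd DeepSeek-V3
# Inspect the comparison logic around the affected lines.
sed -n '55,70p' inference/generate.py
```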
Mitigation/Remediation:
We recommend implementing constant-time operations for secret-dependent comparisons. Instead of branching on individual token values, apply the comparison uniformly across the entire tensor so that execution time does not depend on token content. Additionally, we recommend the following hardening steps:
Use padding techniques to mask timing variances.
Normalize execution times to reduce distinguishability.
Conduct differential analysis to detect timing discrepancies in responses.
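As one example of the "normalize execution times" step above, a response handler can be padded to a fixed time floor so that fast and slow paths become indistinguishable below that floor. This is a minimal sketch under stated assumptions: the handler and the 50 ms floor are illustrative, not part of DeepSeek-V3:

```python
import time

FLOOR_SECONDS = 0.05  # illustrative minimum response time

def respond_with_normalized_timing(handler, *args):
    """Run handler, then sleep so total time is at least FLOOR_SECONDS."""
    start = time.perf_counter()
    result = handler(*args)
    remaining = FLOOR_SECONDS - (time.perf_counter() - start)
    if remaining > 0:
        time.sleep(remaining)  # pad fast paths up to the floor
    return result

# Usage: a trivial handler now takes at least ~50 ms regardless of input.
respond_with_normalized_timing(lambda: "ok")
```

Note that padding only hides variation below the chosen floor; slow outliers above it remain observable, so the floor must exceed the worst-case secret-dependent path.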
Fix (Using Constant-Time Masking)
Modify generate.py to ensure constant-time computation:
import torch

# Branch-free comparison: the same tensor operation is applied to every
# token, so execution time does not depend on individual token values.
# The mask stays boolean so that ~ below remains a logical negation.
def constant_time_mask_comparison(tokens, value):
    return torch.ne(tokens, value)

# Fix for line 59
prompt_mask = constant_time_mask_comparison(tokens, -1)

# Fix for line 68
finished |= torch.logical_and(~prompt_mask[:, cur_pos], next_token == eos_id)
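For secrets compared outside tensor code (for example, API keys or authentication tokens handled by a serving layer in front of the model), Python's standard library already provides a constant-time comparison. A minimal sketch with placeholder token values:

```python
import hmac

def tokens_match(supplied: bytes, expected: bytes) -> bool:
    # hmac.compare_digest runs in time independent of where the
    # inputs first differ, defeating byte-by-byte timing probes.
    return hmac.compare_digest(supplied, expected)

print(tokens_match(b"example-token", b"example-token"))  # True
print(tokens_match(b"example-tokeX", b"example-token"))  # False
```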
For more information & context, please see the reference section below.
Please ensure that the above patch is applied to all affected software, services, applications, instances or systems managed by the team.
Risk: Medium (6.5)
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:L
Affected Assets:
Affected File(s):
Evidence:
DeepSeek-V3/inference/generate.py:59
DeepSeek-V3/inference/generate.py:68
References: