[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]  elvis [@omarsar0](/creator/twitter/omarsar0) on x 254.7K followers Created: 2025-07-14 15:17:11 UTC "Master keys" break LLM judges Simple, generic lead-ins (e.g., “Let’s solve this step by step”) and even punctuation marks can elicit false YES judgments from top reward models. This manipulation works across models (GPT-4o, Claude-4, Qwen2.5, etc.), tasks (math and general reasoning), and prompt formats, reaching up to XX% false positive rates in some cases.  XXXXX engagements  **Related Topics** [manipulation](/topic/manipulation) [marks](/topic/marks) [llm](/topic/llm) [elvis](/topic/elvis) [Post Link](https://x.com/omarsar0/status/1944778206504231201)
[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]
elvis @omarsar0 on x 254.7K followers
Created: 2025-07-14 15:17:11 UTC
"Master keys" break LLM judges
Simple, generic lead-ins (e.g., “Let’s solve this step by step”) and even punctuation marks can elicit false YES judgments from top reward models.
This manipulation works across models (GPT-4o, Claude-4, Qwen2.5, etc.), tasks (math and general reasoning), and prompt formats, reaching up to XX% false positive rates in some cases.
XXXXX engagements
Related Topics manipulation marks llm elvis
/post/tweet::1944778206504231201