Pro tip to reduce Time-to-First-Token (TTFT) for long prompts via API: warm up the prompt cache.
Send your system prompt ahead of the user prompt. Claude will cache it without generating a response.
When the actual user request arrives, it will hit the “warmed” cache, significantly speeding up your response time. 🏋️♂️
