Anthropic last month reduced the TTL (time to live) for the Claude Code prompt cache from one hour to five minutes for many requests, but said this should not increase costs despite users reporting ...
If you've been going through your token budget faster than ever, this change might be why.
The CPU cache reduces memory latency by keeping recently accessed data from main system memory close to the processor. Developers can and should take advantage of the CPU cache to improve application performance. Modern CPUs ...