Apr 10, 2026 • Project Discovery
How We Cut LLM Costs by 59% With Prompt Caching
Executive Summary
This article details ProjectDiscovery's optimization of their autonomous security testing platform, Neo. The platform utilizes multi-agent workflows involving large language models (LLMs) to conduct vulnerability assessments and code reviews. Initially, operational costs were prohibitive, with complex tasks consuming up to 60 million tokens using Opus 4.5. To mitigate these costs, the team implemented prompt caching strategies. This adjustment resulted in a 59% reduction in LLM expenses, enabling more scalable continuous testing across the development lifecycle. From a threat intelligence perspective, this report contains no information regarding active threat actors, malware families, or adversarial tactics. It serves as an operational update on defensive tooling efficiency rather than an incident response or threat analysis report. Organizations should note this as a vendor update regarding security automation capabilities.
Summary
At ProjectDiscovery, we've been building Neo, an autonomous security testing platform that runs multi-agent, multi-step workflows, routinely executing 20-40+ LLM steps per task: vulnerability assessments, code reviews, and security audits at scale, enabling continuous testing across the entire development lifecycle. When we launched, our LLM costs were staggering; a single complex task with Opus 4.5 could consume 60 million tokens. Then we implemented prompt caching. Here's what changed:
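The article doesn't show Neo's implementation, but prompt caching with Anthropic's Messages API generally works by marking the large, static prefix of a request (system prompt, tool definitions) with a cache breakpoint so that repeated agent steps reuse it at the cheaper cache-read rate. A minimal sketch, assuming the `cache_control` feature of the Messages API; the model id, helper name, and prompt text are illustrative, not taken from the article:

```python
def build_cached_request(system_prompt: str, user_message: str) -> dict:
    """Build a Messages API payload whose static system prompt is marked
    cacheable, so repeated multi-step agent calls reuse the cached prefix."""
    return {
        "model": "claude-opus-4-5",  # placeholder model id
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Cache breakpoint: content up to and including this block
                # is cached; subsequent requests with the same prefix are
                # billed at the reduced cache-read rate.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }


# Each agent step varies only the user message; the cached system prompt
# (often tens of thousands of tokens in a security-audit agent) is reused.
payload = build_cached_request(
    "You are a security-audit agent...",  # hypothetical prompt
    "Review this diff for injection flaws.",
)
```

Savings scale with how much of each request is static prefix versus fresh input, which is why long multi-step agent loops benefit the most.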