Beyond the Scan: The Hidden Reality of Unfixed Security Risks Revealed by Pentesting Data

Cybersecurity is a critical concern for organizations of all sizes, yet a significant gap often exists between the perception of security and the reality of exploitable risks. While most organizations express confidence in their security posture and ability to meet compliance requirements, pentesting data tells a more complex story. Automated scanners may provide a baseline, but they frequently miss hidden vulnerabilities, leaving organizations exposed to real, exploitable threats that only expert-led pentesting can uncover.

Drawing on data from thousands of pentests conducted via the Cobalt Offensive Security Platform and a survey of 450 security leaders and practitioners, the State of Pentesting Report 2025 offers crucial insights into the current landscape of identified vulnerabilities and, more importantly, the challenging reality of resolving them.

The Essential Role of Pentesting

Pentesting is widely viewed as an essential part of modern security programs and a major business driver. For 94% of survey respondents, pentests are foundational to ensuring a strong security posture, highlighting their role in providing assurance that defenses are truly solid. Beyond foundational security, pentesting is also seen as important for compliance (91% of respondents), organizational strategy and senior leadership objectives (92%), improving customer trust (over three-quarters), and reducing corporate and personal liability related to unaddressed issues. Third-party pentest reports are frequently requested by customers and regulators, even more so than vulnerability scans or compliance certifications.

Crucially, pentest findings are not theoretical; they are proven, exploitable vulnerabilities discovered by human expertise in real-world scenarios. While automated tools might flag potential weaknesses, pentesters confirm whether these can actually be exploited given relevant circumstances and mitigating controls. This process typically yields a smaller set of real risks compared to scanner results. Pentests conducted via Cobalt found at least one reportable finding in every test, with a median of six findings per test.

Unmasking the Top Vulnerabilities Across Diverse Assets

Pentesting reveals different vulnerabilities depending on the type of asset being tested. Testers work from established methodologies, such as the OWASP Top 10 list for the relevant technology and OSSTMM for networks.

  • Web Applications and APIs: These assets are frequently exposed to attackers. The most common findings include Server Security Misconfiguration (28.4%), reflecting issues like default settings, unnecessary services, and missing patches. Missing Access Control is the second most common (19.2%), indicating improper authentication and authorization enforcement allowing unauthorized access. Other prevalent issues are Server-Side Injection, Sensitive Data Exposure, and Authentication/Sessions vulnerabilities.
  • Mobile Applications: Modern mobile apps are often tightly coupled with web services, leading to similar findings like security misconfigurations and missing access control. Unique mobile findings include a Lack of Binary Hardening, which allows attackers to reverse engineer and modify code. Mobile Security Misconfiguration is a broad category encompassing lack of SSL pinning, no jailbreak/root detection, and unnecessary components.
  • AI and LLMs: The rapid adoption of AI introduces new and unique threats. Nearly all organizations (98%) are integrating genAI into their products, and securing genAI tops their list of concerns. Cobalt's AI testing finds more vulnerabilities than any other type of test. Notably, AI and LLM pentests have a significantly higher proportion of serious findings (32%) compared to the overall average (13%). The most prevalent LLM-specific findings include:
    • Insecure Output Handling (19.4% of LLM findings): Improper validation of model responses leading to data leaks or injection attacks.
    • Sensitive Information Disclosure (14.5%): LLMs inadvertently exposing confidential data.
    • Model Denial of Service (11.5%): Overloading or disrupting LLMs. The concept of Unbounded Consumption, including Denial of Wallet (DoW), is an emerging concern in this area.
    • Prompt Injection (9.7%): Manipulating model inputs to bypass safeguards or alter behavior.
    • Overreliance (4.2%): Risks stemming from misinformation or inappropriate content generated by LLMs.
    • AI applications are also still prone to common application vulnerabilities.
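To make Insecure Output Handling concrete, here is a minimal Python sketch that treats a model reply as untrusted input before it is placed in a web page. The function name `render_llm_reply` and the length cap are illustrative assumptions, not something taken from the report. A real application would also need encoding suited to each destination (SQL parameters, shell arguments, and so on).

```python
import html

MAX_REPLY_CHARS = 4000  # hypothetical cap; tune per application


def render_llm_reply(raw_reply: str) -> str:
    """Return an HTML-safe fragment for an untrusted model reply."""
    # Truncate first so an oversized reply cannot bloat the page.
    trimmed = raw_reply[:MAX_REPLY_CHARS]
    # Escape <, >, &, and quotes so any markup in the reply is shown as text.
    return "<p>" + html.escape(trimmed, quote=True) + "</p>"


if __name__ == "__main__":
    # A reply containing a script tag is rendered as inert text.
    print(render_llm_reply('<script>alert("x")</script>'))
```

The point of the sketch is the direction of trust: the model's output is handled exactly like user-supplied input, never like trusted application code.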

Overall, about 13% of all non-informational pentest findings are rated as "serious" (high or very high likelihood and impact). Common vulnerability types among these serious findings include Missing Access Control, Cross-Site Scripting (XSS), Server Security Misconfiguration, Sensitive Data Exposure, and Components with Known Vulnerabilities.
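The "serious" label can be expressed as a simple rule over two ratings. The sketch below assumes a five-step scale and reads the definition as requiring both likelihood and impact to be high or very high. Both the scale and that reading are assumptions made for illustration, and the names `Level`, `is_serious`, and `serious_share` are hypothetical.

```python
from enum import IntEnum
from typing import Iterable, Tuple


class Level(IntEnum):
    # Hypothetical five-step scale; the report's exact scale is not given here.
    VERY_LOW = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    VERY_HIGH = 5


def is_serious(likelihood: Level, impact: Level) -> bool:
    """Assumed reading: both ratings must be high or very high."""
    return likelihood >= Level.HIGH and impact >= Level.HIGH


def serious_share(findings: Iterable[Tuple[Level, Level]]) -> float:
    """Fraction of (likelihood, impact) pairs that count as serious."""
    rated = list(findings)
    if not rated:
        return 0.0
    return sum(is_serious(lik, imp) for lik, imp in rated) / len(rated)
```

Applied to a list of non-informational findings, `serious_share` yields the kind of proportion quoted above.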

The Remediation Gap: Why Many Findings Remain Unfixed

Despite the consensus that pentests demand attention, fewer than half (48%) of all pentest findings actually get resolved. The resolution rate improves to 69% for serious findings, but that still leaves nearly a third of high-risk, exploitable vulnerabilities unfixed. The proportion of serious findings resolved has hovered at around 55-60% for several years, indicating a kind of stalemate where organizations are not making significant forward progress. Over half of organizations do resolve 90% or more of their serious findings, showing it's achievable, but 15% resolve 10% or less.

Why aren't findings acted upon? The report points to several reasons:

  • Organizations may only do what is required for compliance or third-party approval, with risk remediation being a lower priority.
  • Organizational issues: the development teams responsible for fixes often sit in different groups, with different priorities, from the security team that ordered the test.
  • The complexity of remediation processes for vulnerabilities that fall outside standard patch management can challenge less mature teams.
  • Technology roadblocks, including reliance on legacy or fragile systems that are difficult to change.
  • Resource constraints, which compound all of the above.

Conversely, findings get resolved for several key reasons:

  • Criticality of finding is the top driver (36%).
  • Ease of fix is a significant factor (21%), as fixing easy issues delivers fast value.
  • Compliance requirement (16%).
  • Risk related to PII or data exposure (15%).
  • Asset sensitivity (12%).

The Time Lag: Perception vs. Reality in Resolution Speed

The perception of how quickly vulnerabilities should be fixed often clashes with reality. Three-quarters of organizations have SLAs requiring fixes within two weeks. However, the median time to resolve (MTTR) stands at 67 days for all findings, nearly five times the typical two-week SLA. For serious findings, the MTTR is faster at 50 days, but that is still more than three times the SLA. While the MTTR for serious findings has improved significantly since 2017 (dropping from 112 days in 2017 to 37 days in 2024), organizations still fall short of their stated goals.

Measuring MTTR only accounts for issues that are actually resolved. A more realistic measure of true progress is the half-life, which tracks how long it takes to resolve 50% of all identified findings, including those that remain unfixed. The overall half-life for pentest findings is approximately 3.2 years. For serious findings, the half-life is much shorter at 104 days, but still longer than the MTTR, highlighting that even among high-risk issues, many remain open for extended periods. Non-serious findings have an astonishingly long half-life of 1,781 days (about 5 years).
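The difference between the two measures is easy to see on a small, made-up data set. The sketch below is a simplified illustration, not the report's method: `mttr_days` and `half_life_days` are hypothetical helper names, and the half-life calculation assumes every finding was observed for the same, sufficiently long window. A fuller treatment would account for findings that have not been open long enough to judge, for example with survival analysis.

```python
import math
from statistics import median
from typing import List, Optional


def mttr_days(resolution_days: List[Optional[int]]) -> Optional[float]:
    """Median time to resolve, counting only findings that were resolved."""
    resolved = [d for d in resolution_days if d is not None]
    return median(resolved) if resolved else None


def half_life_days(resolution_days: List[Optional[int]]) -> Optional[int]:
    """Day by which half of ALL findings (resolved or not) were resolved.

    Returns None if fewer than half were ever resolved. Simplification:
    assumes every finding was observed for the same, long enough window.
    """
    needed = math.ceil(len(resolution_days) / 2)
    resolved = sorted(d for d in resolution_days if d is not None)
    if needed == 0 or len(resolved) < needed:
        return None
    return resolved[needed - 1]


# Made-up sample: days to resolution; None means the finding is still open.
sample = [5, 10, 20, 30, 200, 400, None, None, None, None]
print(mttr_days(sample))       # 25.0: median of the six resolved findings
print(half_life_days(sample))  # 200: day the 5th of 10 findings was resolved
```

In this sample the MTTR looks healthy at 25 days because the four open findings are invisible to it, while the half-life of 200 days reflects how long it took to clear half of everything that was found.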

Factors Influencing Resolution Speed and Rate

Several factors beyond criticality influence how quickly and how often findings are resolved:

  • Type of Pentest: Resolution rates for serious findings vary significantly. Web (73.5%) and API (75.5%) pentests have the highest serious finding resolution rates, often because product/development teams ordering the tests are motivated to fix issues. AI and LLM tests show the lowest proportion of serious findings resolved (21.1%), likely due to the newness of the field, lack of expertise, and reliance on model developers for some fixes. In terms of speed (MTTR), serious AI/LLM findings that are fixed tend to be resolved relatively quickly (MTTR 19 days), suggesting organizations fix the easy minority quickly while leaving the rest. Internal network issues, conversely, have a notably longer MTTR (44 days).
  • Organizational Size: Smaller firms (1-100 employees) are better at fixing findings, resolving a higher proportion (80.8%) and boasting significantly faster MTTRs (27 days) for serious findings compared to larger organizations (e.g., 61 days for 5,001+ employees). Larger organizations face challenges with complex environments, processes, and legacy systems.
  • Industry Sector: Both the proportion of serious findings resolved and the MTTR for serious findings vary by sector. Utilities, Hospitality, Education, and Healthcare have the lowest serious finding resolution rates, which is concerning given the potential impact on human safety. Manufacturing (122 days) and Education (80 days) have unusually long MTTRs. Hospitality has a very fast MTTR (20 days) but a low resolution rate, suggesting they fix a few issues quickly while leaving many open.

Recommendations for Accelerating Risk Reduction

The data underscores the urgent need for organizations to move beyond ad hoc testing and compliance checkboxes towards a more proactive and effective approach to offensive security and remediation.

  1. Build a Programmatic Approach: Shift from irregular testing to a structured offensive security program. Prioritize testing based on risk, starting with critical assets. Expand testing beyond web applications to include network, cloud, and increasingly, AI/LLM solutions, working with experienced pentesters in new areas.
  2. Establish Processes for Resolving Findings: A significant disconnect exists between testing and addressing issues, with less than 70% of serious findings resolved. Implement an annual pentest calendar and align testing with product development roadmaps for continuous security integration. Use a centralized system to track findings and streamline the remediation workflow.
  3. Foster Collaboration and Alignment: Effective remediation requires close cooperation between security and development teams. Align security goals with engineering metrics to encourage shared responsibility. Escalate significant security concerns to leadership, using data on past incidents and vulnerabilities to highlight real risks. Build strong relationships with cross-functional peers, define clear ownership for fixes, and agree on realistic yet effective SLAs to ensure accountability and continuous improvement.
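As a rough sketch of what centralized tracking with clear ownership and agreed SLAs might look like, the Python below records an owner and a due date for each finding and lists overdue items per team. The `Finding` fields, the `overdue_by_owner` helper, and the 60-day target for non-serious findings are hypothetical. Only the two-week target echoes the SLA figure cited earlier.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

# Hypothetical SLA targets in days; agree on real values with engineering.
SLA_DAYS = {"serious": 14, "other": 60}


@dataclass
class Finding:
    title: str
    severity: str  # "serious" or "other" in this sketch
    owner: str     # team accountable for the fix
    reported: date
    resolved: Optional[date] = None

    @property
    def due(self) -> date:
        """Date by which the SLA says this finding should be fixed."""
        return self.reported + timedelta(days=SLA_DAYS[self.severity])

    def is_overdue(self, today: date) -> bool:
        # Only open findings can be overdue.
        return self.resolved is None and today > self.due


def overdue_by_owner(findings: List[Finding], today: date) -> Dict[str, List[str]]:
    """Group the titles of overdue findings by owning team."""
    report: Dict[str, List[str]] = {}
    for finding in findings:
        if finding.is_overdue(today):
            report.setdefault(finding.owner, []).append(finding.title)
    return report
```

Grouping by owner rather than by test keeps the output aligned with the teams that can actually ship the fix, which is the kind of shared accountability the recommendations describe.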

Conclusion

The State of Pentesting Report 2025 clearly demonstrates that pentesting is invaluable for revealing the real, exploitable vulnerabilities that lurk beneath the surface of perceived security postures. However, the persistent challenge lies in translating these findings into actual risk reduction. Low resolution rates and lengthy remediation timelines, especially for non-critical issues, mean that organizations often remain exposed to known threats for months or even years. By adopting a programmatic approach to offensive security, establishing robust resolution processes, and fostering genuine collaboration between security and development teams, organizations can close the remediation gap and build a truly resilient defense against the evolving threat landscape, including the rapidly emerging risks in AI and LLMs.
