Overview
This page explains how Vibe Check public detection records are produced and calculated, how the multi-pass validation chain works, what the measurements mean, how failed scans and safe-control validation are handled, and what the current corpus does and does not prove.
A detection record is a public accounting of measured Vibe Check results on a defined validation corpus. It reports scanner output on that corpus only. It does not publish comparative claims or competitor framing.
This corpus is useful for cross-framework calibration, persistence quality, exploitability-retention review, and pipeline discipline.
It is not a substitute for production-code validation on ordinary maintained repositories.
Deduplication collapses 275 artifact rows into 269 unique classified findings. The framework template applied field identifies the framework-aware scan profile selected for that target. Manifest capture is defined as:

manifest capture = credited seeded cases / total seeded cases
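As a worked sketch of that formula (the function below is illustrative; it is not from the Vibe Check codebase):

```python
def manifest_capture(credited_seeded: int, total_seeded: int) -> float:
    """manifest capture = credited seeded cases / total seeded cases."""
    if total_seeded == 0:
        raise ValueError("manifest capture is undefined with no seeded cases")
    return credited_seeded / total_seeded

# A target with 4 seeded vulnerabilities, 2 of them credited:
assert manifest_capture(2, 4) == 0.5  # reported as 50%
```

Note that only seeded cases enter the denominator; additional non-manifest findings do not change the ratio.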
Pass 1 classifies persisted findings against broad CVE, CWE, and OWASP reference sets. It is intentionally permissive and catches likely pattern matches quickly. Possible outputs:

- PATTERN_MATCH
- LIKELY_FALSE_POSITIVE
- UNMATCHED
- NOVEL_FINDING
- TRUE_POSITIVE

Pass 2 re-checks findings with fuller file context and benchmark awareness. It is narrower than Pass 1 and is used to separate intentionally vulnerable training material from weaker broad labels.
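A minimal sketch of how the two passes relate, assuming a simple sequential pipeline. The label strings come from this record, but the data shapes, helper names, and decision rules are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    rule_id: str
    file_path: str
    labels: dict = field(default_factory=dict)  # pass name -> label

def pass1(finding: Finding, broad_refs: set) -> str:
    # Intentionally permissive: any CVE/CWE/OWASP reference hit is
    # credited quickly as a likely pattern match.
    label = "PATTERN_MATCH" if finding.rule_id in broad_refs else "UNMATCHED"
    finding.labels["pass1"] = label
    return label

def pass2(finding: Finding, file_context: str, benchmark_files: set) -> str:
    # Narrower re-check: fuller file context plus benchmark awareness,
    # separating intentionally vulnerable training material from
    # weaker broad labels left by Pass 1.
    if finding.labels.get("pass1") != "PATTERN_MATCH":
        label = finding.labels.get("pass1", "UNMATCHED")
    elif finding.file_path in benchmark_files:
        label = "TRUE_POSITIVE"          # planted vulnerability, confirmed
    elif finding.rule_id not in file_context:
        label = "LIKELY_FALSE_POSITIVE"  # broad match unsupported by context
    else:
        label = "PATTERN_MATCH"
    finding.labels["pass2"] = label
    return label

f = Finding(rule_id="CWE-89", file_path="src/login.py")
pass1(f, broad_refs={"CWE-89", "CWE-79"})               # -> "PATTERN_MATCH"
pass2(f, file_context="raw SQL built from user input (CWE-89)",
      benchmark_files=set())                            # -> "PATTERN_MATCH"
```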
G1 used a different, adversarial methodology to challenge both earlier passes and surface any novelty or false-positive blind spots. It sampled findings across severity levels and framework families rather than simply repeating earlier broad classification logic.
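One way to read "sampled findings across severity levels and framework families" is stratified sampling. A sketch under that assumption (grouping keys and sample sizes are illustrative, not the published G1 procedure):

```python
import random
from collections import defaultdict

def stratified_sample(findings, per_stratum=3, seed=1):
    """Sample findings per (severity, framework) stratum instead of
    re-running the broad classification over the whole corpus."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for f in findings:
        strata[(f["severity"], f["framework"])].append(f)
    sample = []
    for bucket in strata.values():
        sample.extend(rng.sample(bucket, min(per_stratum, len(bucket))))
    return sample
```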
I1 normalized the full corpus into provenance-aware labels:
- TRAINING_APP_INTENTIONAL
- TEST_FIXTURE

That normalization step is what turns this corpus from a broad vulnerability list into an honest account of what kind of material was actually scanned.
The remaining escalation queue was then resolved by deterministic provenance-plus-context review so that code-level classification decisions did not require founder review.
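A sketch of what a deterministic provenance-plus-context rule could look like; the provenance values and rule order are assumptions, not the published decision procedure:

```python
def resolve_escalation(provenance: str, in_test_dir: bool) -> str:
    """Deterministic resolution: the same inputs always yield the same
    label, so no code-level decision requires founder review."""
    if provenance == "intentionally_vulnerable_training_app":
        return "TRAINING_APP_INTENTIONAL"
    if provenance == "synthetic_fixture" or in_test_dir:
        return "TEST_FIXTURE"
    # Anything that is neither training material nor a fixture would
    # stay in the queue under the normal classification labels.
    return "UNRESOLVED"
```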
The final normalized counts are:
- TRAINING_APP_INTENTIONAL: 205
- TEST_FIXTURE: 64
- PATTERN_MATCH: 0
- LIKELY_FALSE_POSITIVE: 0
- UNMATCHED: 0
- NOVEL_FINDING: 0

The two nonzero labels sum to 269, matching the deduplicated finding count. This corpus is built from intentionally vulnerable training apps, public benchmarks, and synthetic fixtures we control. Those targets are designed to surface known or deliberately planted vulnerability classes.
A result of 0 NOVEL_FINDING on this corpus is therefore the expected honest outcome. It is not evidence that the validation chain failed.
Questions about how Vibe Check performs on ordinary maintained software require production-code validation, which is tracked separately.
Failed scans are reported as n/a (scan failed), not 0.0%.
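A minimal sketch of that reporting rule (names and types are illustrative):

```python
from typing import Optional

def report_capture(credited: Optional[int], total_seeded: int,
                   scan_ok: bool) -> str:
    # A failed scan yields no measurement, so the public record shows
    # n/a (scan failed) rather than a misleading 0.0%.
    if not scan_ok or credited is None:
        return "n/a (scan failed)"
    return f"{100.0 * credited / total_seeded:.1f}%"

print(report_capture(2, 4, scan_ok=True))      # 50.0%
print(report_capture(None, 4, scan_ok=False))  # n/a (scan failed)
```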
Taxonomy normalization distinguishes between findings on deliberately vulnerable training material and findings on ordinary software. That distinction prevents the public record from overstating what the corpus demonstrates. In this record, normalization means:
- findings on intentionally vulnerable training apps are labeled TRAINING_APP_INTENTIONAL
- findings on controlled synthetic fixtures are labeled TEST_FIXTURE
- neither label is presented as TRUE_POSITIVE or NOVEL_FINDING
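A minimal sketch of that guarantee, assuming normalization runs as a final relabeling step over classified findings:

```python
from typing import Optional

PROVENANCE_LABELS = {"TRAINING_APP_INTENTIONAL", "TEST_FIXTURE"}

def published_label(raw_label: str, provenance_label: Optional[str]) -> str:
    # Provenance wins: material that is intentionally vulnerable or a
    # controlled fixture is never published as TRUE_POSITIVE or NOVEL_FINDING.
    if provenance_label in PROVENANCE_LABELS:
        return provenance_label
    return raw_label

assert published_label("TRUE_POSITIVE",
                       "TRAINING_APP_INTENTIONAL") == "TRAINING_APP_INTENTIONAL"
```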
The Java Spring rows currently shown in the public record come from pre-fix production scan IDs captured before the later Java fetch-window correction. They remain historically accurate, but they are not the final word on post-fix Java coverage. Until Q1 completes, Java Spring stays qualified.
Production-code validation is a separate evidence stream from the training corpus. Firefox and later maintained production repositories are the right place to evaluate how the scanner behaves on ordinary maintained code.
As noted above, 275 artifact rows deduplicate into 269 unique classified findings.

For example, if a target has 4 seeded vulnerabilities and Vibe Check catches all 4, manifest capture is 100%. If a target has 4 seeded vulnerabilities, Vibe Check catches 2, misses 2, and also finds 3 additional non-manifest issues, manifest capture is still 50%: non-manifest findings do not change the denominator.

Findings on an intentionally vulnerable repository are labeled TRAINING_APP_INTENTIONAL, because the repository is intentionally vulnerable by design.

The corpus is defined in tests/comparison_corpus/repositories.json. Anyone can rerun the corpus and verify:

1. the same targets and refs
2. persisted findings and severity mix
3. exploitability-retention outcomes
4. normalized taxonomy counts
5. the distinction between corpus validation and production-code validation
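A sketch of what such a rerun check could look like; only the tests/comparison_corpus/repositories.json path comes from this page, and the manifest fields and expected-count check are assumptions:

```python
import json
from collections import Counter

# Published normalized counts from this record.
EXPECTED = {
    "TRAINING_APP_INTENTIONAL": 205,
    "TEST_FIXTURE": 64,
    "PATTERN_MATCH": 0,
    "LIKELY_FALSE_POSITIVE": 0,
    "UNMATCHED": 0,
    "NOVEL_FINDING": 0,
}

def load_targets(manifest_path="tests/comparison_corpus/repositories.json"):
    """Step 1: confirm the same targets and refs. The entry fields
    ("url", "ref") are assumed, not documented on this page."""
    with open(manifest_path) as fh:
        return [(entry["url"], entry["ref"]) for entry in json.load(fh)]

def check_taxonomy(normalized_labels):
    """Step 4: normalized taxonomy counts from a local rerun should
    match the published record exactly."""
    counts = Counter(normalized_labels)
    return all(counts[label] == n for label, n in EXPECTED.items())
```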