Probes

Jailbreak (DAN + variants). Encoding attacks. Data leakage. Toxicity elicitation. Continuously growing library.

Advertisement

Detectors

Per probe, detector checks if attack succeeded. Rules-based + ML classifiers.

Advertisement

Reports

Vulnerability score per category. Suitable for management dashboards.