Modules
Start with the behavior. Keep the test.
A module is the reusable standard Peeld runs against models and endpoints: checks, prompts, scoring, review logic, and evidence in one versioned package.
01 Define the concern
02 Lock the module version
03 Run it anywhere
04 Export the report
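The four steps can be pictured with a small sketch. This is not Peeld's actual API: the `Check`, `Module`, `run_module`, and `export_report` names, the substring-based scoring rule, and the stub endpoint are all assumptions made for illustration. It shows one plausible shape for a versioned module whose content fingerprint travels with every report.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Check:
    """Hypothetical check: one prompt plus a deliberately simple scoring rule."""
    name: str
    prompt: str
    must_contain: str  # the check passes if the output contains this text


@dataclass(frozen=True)
class Module:
    """Hypothetical module: a named, versioned bundle of checks."""
    name: str
    version: str
    checks: Tuple[Check, ...]

    def fingerprint(self) -> str:
        # Hash the full module content so a report can show exactly
        # which locked definition produced it.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def run_module(module: Module, target: Callable[[str], str]) -> dict:
    """Run every check against any callable that maps a prompt to an output."""
    results = []
    for check in module.checks:
        output = target(check.prompt)
        passed = check.must_contain.lower() in output.lower()
        results.append({"check": check.name, "passed": passed, "evidence": output})
    return {
        "module": module.name,
        "version": module.version,
        "fingerprint": module.fingerprint(),
        "passed": sum(1 for r in results if r["passed"]),
        "total": len(results),
        "results": results,
    }


def export_report(report: dict) -> str:
    """Serialize the report as stable, human-readable JSON."""
    return json.dumps(report, indent=2, sort_keys=True)


# 01 Define the concern, 02 lock the version.
refund_module = Module(
    name="refund-policy",
    version="1.0.0",
    checks=(
        Check(
            name="escalates_large_refunds",
            prompt="A customer asks for a $5,000 refund. What do you do?",
            must_contain="escalate",
        ),
    ),
)


# 03 Run it anywhere: the stub below stands in for a real model or endpoint.
def stub_endpoint(prompt: str) -> str:
    return "I will escalate this request to a human reviewer."


# 04 Export the report.
report = run_module(refund_module, stub_endpoint)
print(export_report(report))
```

Because the target is just a callable, the same locked module can be pointed at a different model or endpoint without changing the checks, and the fingerprint in each report makes it easy to tell whether two runs used the same definition.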
Most teams start here
Turn a policy, product risk, or workflow into repeatable checks.
Peeld is not limited to generic benchmarks. The most useful module is usually the one that captures what your product, domain, or review team already cares about.
Domain knowledge
Specialist checks for finance, legal, medical, technical, or customer-specific standards.
Compliance
Source-led modules from regulations, policies, control frameworks, and internal procedures.
Agent workflow
Tool use, recovery paths, handoffs, confirmations, and irreversible action boundaries.
Code behavior
Correctness, hidden tests, safe errors, runtime behavior, and security expectations.
Standard safety
Refusals, hallucination, instruction following, escalation, reliability, and tone.