Modules

Start with the behavior. Keep the test.

A module is the reusable standard Peeld runs against models and endpoints: checks, prompts, scoring, review logic, and evidence in one versioned package.
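To make the idea of one versioned package concrete, here is a minimal sketch in Python. It is not Peeld's actual API or file format: the names `Check`, `Module`, `run_module`, and `fingerprint`, as well as the refund-policy example, are assumptions made up for illustration. It shows checks, prompts, and scoring bundled under a name and version, run against any callable that stands in for a model or endpoint, with the replies kept as evidence.

```python
from dataclasses import dataclass
from typing import Callable
import hashlib
import json


@dataclass(frozen=True)
class Check:
    """One prompt plus a scoring rule for the reply (hypothetical shape)."""
    name: str
    prompt: str
    passes: Callable[[str], bool]


@dataclass(frozen=True)
class Module:
    """A named, versioned bundle of checks (hypothetical shape)."""
    name: str
    version: str
    checks: tuple[Check, ...]

    def fingerprint(self) -> str:
        # Hash the name, version, and prompts so edits to the module are detectable.
        # The scoring functions are not covered by this hash in the sketch.
        payload = json.dumps(
            {
                "name": self.name,
                "version": self.version,
                "checks": [[c.name, c.prompt] for c in self.checks],
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def run_module(module: Module, model: Callable[[str], str]) -> dict:
    """Run every check against a model and keep each reply as evidence."""
    results = []
    for check in module.checks:
        reply = model(check.prompt)
        results.append(
            {"check": check.name, "passed": check.passes(reply), "evidence": reply}
        )
    passed = sum(r["passed"] for r in results)
    return {
        "module": module.name,
        "version": module.version,
        "fingerprint": module.fingerprint(),
        "score": passed / len(results) if results else 0.0,
        "results": results,
    }


# Illustrative module; the policy details here are invented for the example.
refund_policy = Module(
    name="refund-policy",
    version="1.0.0",
    checks=(
        Check(
            "escalates-large-refunds",
            "A customer asks for a $5,000 refund. What do you do?",
            lambda reply: "escalate" in reply.lower(),
        ),
        Check(
            "states-refund-window",
            "What is the refund window?",
            lambda reply: "30 days" in reply,
        ),
    ),
)


def stub_model(prompt: str) -> str:
    # Stand-in for a real model or endpoint call.
    return "I will escalate this to a human reviewer."


report = run_module(refund_policy, stub_model)
print(json.dumps(report, indent=2))
```

Because the module is plain data with a version and a content fingerprint, the same checks can be pointed at a different model by swapping the callable, and two reports can be compared knowing they used the same module.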

01 Define the concern
02 Lock the module version
03 Run it anywhere
04 Export the report

Most teams start here

Turn a policy, product risk, or workflow into repeatable checks.

Peeld is not limited to generic benchmarks. The most useful module is usually the one that captures what your product, domain, or review team already cares about.

Domain knowledge

Specialist checks for finance, legal, medical, technical, or customer-specific standards.

Compliance

Source-led modules from regulations, policies, control frameworks, and internal procedures.

Agent workflow

Tool use, recovery paths, handoffs, confirmations, and irreversible action boundaries.

Code behavior

Correctness, hidden tests, safe errors, runtime behavior, and security expectations.

Standard safety

Refusals, hallucination, instruction following, escalation, reliability, and tone.