Inference benchmarking · configuration recommendation · auditable exports

Make inference configuration decisions with evidence, not guesswork.

Sigilant Labs runs controlled benchmarks across candidate configurations (quantization, context, batch, and runtime parameters) and produces a recommendation with the supporting artifacts you need for review and reproducibility.

Outputs include per-variant metrics, quality-gate results, and exportable JSON/CSV artifacts suitable for internal sign-off and iteration tracking.
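
For illustration, one per-variant record in a JSON export might look like the sketch below. The field names and values are hypothetical placeholders, not Sigilant's actual export schema.

    import json

    # Hypothetical shape of one per-variant record in a JSON export.
    # Field names are illustrative, not Sigilant's actual schema.
    variant_record = {
        "variant_id": "q4_k_m-ctx4096-b8",
        "quantization": "Q4_K_M",
        "context_length": 4096,
        "batch_size": 8,
        "metrics": {
            "latency_p50_ms": 210.4,
            "latency_p95_ms": 318.9,
            "throughput_tok_s": 42.7,
            "peak_memory_mb": 5830,
        },
        "gates": {"quality_perplexity": "pass", "latency_budget": "pass"},
    }

    print(json.dumps(variant_record, indent=2))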

What you receive

  • Recommendation with the top configuration for the declared target profile.
  • Variant breakdown (latency/throughput/memory + quality gates) for each candidate.
  • Exports: JSON + CSV artifacts designed for audits and reproducible comparisons.

Current focus

  • GGUF / llama.cpp deployments
  • CPU and cloud target profiles
  • Multiple quantizations and context/batch ladders
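
To make "context/batch ladders" concrete, the sketch below expands a few placeholder quantizations, context lengths, and batch sizes into the full set of candidate variants; the specific values are illustrative, not defaults.

    from itertools import product

    # Illustrative ladder: every combination of quantization, context length,
    # and batch size becomes one candidate variant to benchmark.
    quantizations = ["Q4_K_M", "Q5_K_M", "Q8_0"]
    context_lengths = [2048, 4096, 8192]
    batch_sizes = [1, 4, 8]

    candidates = [
        {"quant": q, "context": c, "batch": b}
        for q, c, b in product(quantizations, context_lengths, batch_sizes)
    ]
    print(len(candidates), "candidate variants")  # 3 x 3 x 3 = 27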

How it works

1) Define the run

Select a model artifact and specify a target hardware profile (e.g., a CPU class or cloud instance type), then choose candidate quantizations and any constraints.
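
As a minimal sketch of what such a run definition could capture, the snippet below uses hypothetical field names (model_artifact, target_profile, candidate_quants, max_latency_p95_ms); the console's actual inputs may differ.

    from dataclasses import dataclass, field
    from typing import List, Optional

    # Hypothetical run definition; field names are illustrative assumptions.
    @dataclass
    class RunSpec:
        model_artifact: str                 # e.g. a path or reference to a GGUF file
        target_profile: str                 # e.g. a CPU class or cloud instance type
        candidate_quants: List[str] = field(default_factory=list)
        max_latency_p95_ms: Optional[float] = None  # optional latency constraint

    spec = RunSpec(
        model_artifact="models/example-7b.gguf",
        target_profile="cpu-8core-16gb",
        candidate_quants=["Q4_K_M", "Q5_K_M", "Q8_0"],
        max_latency_p95_ms=400.0,
    )
    print(spec)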

2) Execute controlled benchmarks

Sigilant evaluates variants under consistent conditions to reduce run-to-run variance and surface tradeoffs.
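
As a generic illustration of the principle (not Sigilant's actual harness), repeating the same workload under fixed settings and reporting the spread is what lets variants be compared on typical behavior rather than a single lucky run:

    import statistics
    import time

    def run_once(prompt: str) -> float:
        """Placeholder for a single inference call; returns elapsed seconds."""
        start = time.perf_counter()
        # ... invoke the model here (e.g., via a llama.cpp binding) ...
        time.sleep(0.05)  # stand-in for real work
        return time.perf_counter() - start

    # Repeat the identical workload and report both the mean and the spread.
    timings = [run_once("benchmark prompt") for _ in range(10)]
    print(f"mean={statistics.mean(timings)*1000:.1f} ms "
          f"stdev={statistics.stdev(timings)*1000:.1f} ms")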

3) Review and export

Inspect metrics and gates, then export artifacts (JSON/CSV) for documentation, sharing, and future comparisons.
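
One way to use the exports for future comparisons, sketched under assumed file and column names (variant_id, throughput_tok_s) that are placeholders rather than the real export schema:

    import csv

    # Load two hypothetical CSV exports and compare per-variant throughput
    # between a baseline run and a later re-test.
    def load_export(path):
        with open(path, newline="") as f:
            return {row["variant_id"]: float(row["throughput_tok_s"])
                    for row in csv.DictReader(f)}

    baseline = load_export("run_baseline.csv")  # placeholder file names
    retest = load_export("run_retest.csv")

    for variant_id, tok_s in retest.items():
        if variant_id in baseline:
            delta = tok_s - baseline[variant_id]
            print(f"{variant_id}: {tok_s:.1f} tok/s ({delta:+.1f} vs baseline)")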

FAQ

Is this a subscription? Not initially. We currently offer prepaid credit packs. Subscription plans may be introduced later.

What consumes credits? Credits are consumed when a run is executed. Estimated consumption is shown before confirmation.

Do results vary? Yes. Performance depends on hardware, model, and workload. We provide controlled settings and report variance where applicable.

How do I get access? Use the contact page to request access; we onboard accounts and provide console credentials.