Submission Guide
A complete walkthrough from applying for access to seeing your model on the leaderboard. The whole process typically takes less than a week, with most of the time spent on the 1–3 business day review.
Prerequisites
- An organization doing legitimate veterinary AI development or research
- A model with a REST API endpoint (HTTPS) or a containerized inference adapter
- Agreement to our Data Access Policy and Acceptable Use Policy
End-to-End Process
Submit an access request
Go to the Request Access form on the homepage or call POST /api/applications directly. You'll need:
- Your name and work email
- Your organization name and type (vendor, academic, etc.)
- A brief description of the AI tool you want to evaluate and why
You'll receive an immediate confirmation email. Our team reviews within 1–3 business days.
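If you call POST /api/applications directly, the request body might look like the sketch below. The field names here are illustrative assumptions mapped from the list above, not the documented schema; check the API Docs for the canonical names.

```python
import json

# Hypothetical payload for POST /api/applications.
# Field names are assumptions -- verify against the API Docs.
application = {
    "name": "Jane Doe",
    "email": "jane@acmevetai.com",
    "organizationName": "Acme Vet AI",
    "organizationType": "vendor",
    "description": "Evaluate our clinical note summarizer against VAULT.",
}

payload = json.dumps(application)
print(payload)
```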
Review and approval
Our team reviews all requests for legitimacy, appropriate use case, and alignment with benchmark governance. We may contact you with follow-up questions. If approved, you'll receive an email with a link to sign the Participant Agreement.
Sign the Participant Agreement
Before receiving API credentials, all participants must sign our Data Access Agreement and Acceptable Use Policy. This is a legal and ethical requirement — violations result in immediate access revocation.
Create an API key
Once approved, log in to your dashboard and create an API key with at least the benchmark:run scope:
curl -X POST https://animl.health/api/api-keys \
-H "Authorization: Bearer vault_sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"name": "CI benchmark key",
"scopes": ["benchmark:run", "benchmark:read", "model:write"]
}'
Register your model
Register the inference endpoint you want VAULT to call:
curl -X POST https://animl.health/api/models \
-H "Authorization: Bearer vault_sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"name": "Acme Vet Summarizer v2",
"endpointType": "REST_API",
"endpointUrl": "https://api.acmevetai.com/summarize",
"authHeaderName": "X-API-Key",
"authHeaderValue": "my-model-api-key"
}'
Your endpoint must accept POST with a JSON case input and return a JSON summary. See the API Docs for the exact contract.
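As a minimal sketch of what a compliant endpoint could look like, here is a bare-bones HTTP server. The `caseText` and `summary` field names are assumptions for illustration only; the API Docs define the real request/response contract.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def summarize(case: dict) -> dict:
    # Placeholder logic -- replace with your model's inference call.
    # "caseText" and "summary" are assumed field names, not the real contract.
    text = case.get("caseText", "")
    return {"summary": text[:200]}

class VaultHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        case = json.loads(self.rfile.read(length))
        body = json.dumps(summarize(case)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8080) -> None:
    # Run behind HTTPS termination in production -- VAULT requires an HTTPS URL.
    HTTPServer(("", port), VaultHandler).serve_forever()
```

In practice you would put this behind a proper WSGI/ASGI server and TLS; the sketch only shows the request/response shape.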
Trigger a benchmark run
curl -X POST https://animl.health/api/benchmark-runs \
-H "Authorization: Bearer vault_sk_live_..." \
-H "Content-Type: application/json" \
-H "Idempotency-Key: acme-v2-run-001" \
-d '{
"modelId": "clxyz...",
"benchmarkSuiteId": "clsuite_clinical_summarization_v1_3",
"label": "v2.1 release candidate",
"notifyWebhook": "https://your-server.com/hooks/vault"
}'
The run is queued immediately. VAULT will call your endpoint once per case, in random order, within a sandboxed environment. Your model never receives the case identifiers.
Poll for results or wait for webhook
# Poll every 60 seconds
curl https://animl.health/api/benchmark-runs/<runId> \
  -H "Authorization: Bearer vault_sk_live_..."

# Or rely on the notifyWebhook you configured when triggering the run
# to be notified automatically
When status reaches COMPLETE, your scores are available under metrics. A detailed report is generated and accessible from your dashboard.
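The polling loop above can be sketched in Python as follows. The fetch function is injected so the control flow is clear, and the terminal statuses beyond COMPLETE (FAILED, CANCELLED) are assumptions; confirm the full status set in the API Docs.

```python
import time

# Assumed terminal statuses -- only COMPLETE is confirmed by the guide.
TERMINAL = {"COMPLETE", "FAILED", "CANCELLED"}

def wait_for_run(fetch_run, interval_s: float = 60.0, sleep=time.sleep):
    """Poll fetch_run() until the run reaches a terminal status.

    fetch_run should GET /api/benchmark-runs/<runId> with your API key
    and return the parsed JSON body.
    """
    while True:
        run = fetch_run()
        if run.get("status") in TERMINAL:
            return run
        sleep(interval_s)
```

Injecting `sleep` keeps the loop testable; in production you would pass a real HTTP call as `fetch_run` and leave the defaults alone.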
Typical run duration for the full 5,000-case suite depends on your model's latency. Estimate: (cases × median latency) ÷ effective parallelism, where parallelism is applied at VAULT's discretion up to your endpoint's concurrency limits.
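As a back-of-envelope sketch (the concurrency of 8 below is just an example; actual parallelism is at VAULT's discretion):

```python
def estimate_run_seconds(cases: int, median_latency_s: float, concurrency: int = 1) -> float:
    """Rough wall-clock estimate: total inference time divided by parallel workers."""
    return cases * median_latency_s / concurrency

# e.g. the full 5,000-case suite at 2 s median latency with 8 concurrent requests:
# estimate_run_seconds(5000, 2.0, 8) -> 1250.0 seconds (~21 minutes)
```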
Request leaderboard publication
Your results are private by default. If you want to appear on the public leaderboard:
- Consent to publication from your dashboard
- Our team reviews the submission (typically within 2 business days)
- Once approved, your entry is added to the leaderboard
Common Questions
What if my model times out on some cases?
Each case has a 30-second timeout. Timed-out cases are scored as failures and reduce your composite score. Ensure your endpoint can reliably respond within 30 seconds under load.
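One way to avoid hard timeouts is to enforce your own, shorter deadline inside the endpoint and return a best-effort answer instead. This is a sketch: `run_model`, the `caseText` field, and the 25-second internal budget are all illustrative assumptions, not platform requirements.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def summarize_with_budget(run_model, case, budget_s: float = 25.0) -> dict:
    """Run inference with an internal deadline below VAULT's 30 s timeout.

    run_model is your inference function (case dict -> summary string);
    the 25 s default budget is an illustrative choice.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(run_model, case)
        try:
            return {"summary": future.result(timeout=budget_s)}
        except FutureTimeout:
            # A degraded answer may score better than an outright timeout.
            return {"summary": case.get("caseText", "")[:200]}
```

Note that the worker thread keeps running after the budget expires; for truly runaway inference you would need process-level isolation rather than a thread pool.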
Can I re-run with a newer model version?
Yes — register a new model (or update an existing one) and trigger a new run. Each run is independently scored. You can have multiple runs in your history.
What if my endpoint needs to warm up?
VAULT doesn't send a pre-warm request before the benchmark. Ensure your service is already warmed and ready before triggering a run. Cold-start latency counts against your median latency score.
Is there a rate limit on benchmark runs?
Beta participants can run up to 3 benchmark runs per calendar month. Contact us if you need more runs for iterative development.
Ready to benchmark?
Submit your access request — approval typically takes 1–3 business days.