The ingest endpoint is the entry point for all LLM event data you send to LumiqTrace. You post newline-delimited JSON (NDJSON), one event object per line, and the platform buffers and processes the data asynchronously. Most integrations use the LumiqTrace SDK, which handles batching, compression, and retries automatically. Use this endpoint directly if you are building a custom integration or sending events from an environment where the SDK is not available.
Endpoint
Authentication is via the `x-api-key` header (see Authentication).
Request Headers
| Header | Value | Required |
|---|---|---|
| `x-api-key` | `lqt_your_api_key` | Yes |
| `Content-Type` | `application/x-ndjson` | Yes |
| `Content-Encoding` | `gzip` | Recommended |
Gzip compression is strongly recommended. LLM event payloads compress very well — you can expect 5–10x size reduction, which reduces latency and bandwidth costs significantly.
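To see the effect, here is a small standard-library sketch (the helper is illustrative, not part of the LumiqTrace SDK) that compares the raw and gzip-compressed sizes of a repetitive event batch:

```python
import gzip
import json

def compress_ndjson(events):
    """Serialize events as NDJSON (one JSON object per line) and return
    both the raw and gzip-compressed bytes, to compare sizes.
    Illustrative helper; not part of the LumiqTrace SDK."""
    raw = "\n".join(json.dumps(e, separators=(",", ":")) for e in events).encode("utf-8")
    return raw, gzip.compress(raw)
```

Because event objects repeat the same keys and many of the same values on every line, gzip typically shrinks a batch to a small fraction of its raw size.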
Request Body
The body must be newline-delimited JSON: one complete JSON event object per line, separated by `\n`. The maximum is 100 events per request; if you have more than 100 events to send, split them into multiple requests.
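The 100-event cap is easy to respect with a small batching helper; a sketch (the function name is ours, not an SDK API):

```python
def chunk_events(events, max_per_request=100):
    """Split a list of event objects into batches that each respect the
    100-events-per-request limit. Illustrative helper, not an SDK call."""
    return [events[i:i + max_per_request]
            for i in range(0, len(events), max_per_request)]
```

Each resulting batch can then be serialized and posted as its own request.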
Event Schema
- A UUIDv4 uniquely identifying this event. Used for deduplication: submitting the same `event_id` twice will not create a duplicate record.
- Groups related spans into a single trace. All spans in one user request should share the same `trace_id`.
- Uniquely identifies this span within the trace.
- The `span_id` of the parent span. Omit for root spans. Used to build the nested span tree in the Trace viewer.
- ISO 8601 UTC timestamp for when the LLM call began. Example: `"2026-04-20T10:00:00.000Z"`.
- The LLM provider. One of `"openai"`, `"anthropic"`, `"google"`, or `"custom"`.
- The model identifier as returned by the provider. Examples: `"gpt-4o"`, `"claude-sonnet-4-6"`, `"gemini-2.5-flash"`.
- The type of LLM operation. One of `"chat"`, `"embed"`, `"image"`, `"tts"`, or `"custom"`.
- Total time in milliseconds from request sent to full response received.
- Time to first token in milliseconds. Only relevant for streaming responses.
- Number of tokens in the prompt / input.
- Number of tokens in the model's response.
- Number of input tokens served from the provider's prompt cache. Used to calculate cache savings.
- Total cost of this call in US dollars, calculated by your SDK using current provider pricing.
- Outcome of the call. One of `"success"`, `"error"`, `"timeout"`, `"rate_limited"`, or `"cancelled"`.
- Provider error code, if any. Example: `"context_length_exceeded"`.
- Human-readable error message from the provider.
- The reason the model stopped generating. Examples: `"stop"`, `"length"`, `"tool_calls"`.
- Whether this call used streaming (`true`) or a blocking response (`false`).
- The deployment environment. Examples: `"production"`, `"staging"`, `"development"`.
- Your application's user identifier. Used for per-user cost and usage analytics.
- Identifier for a user session grouping multiple traces.
- Free-form key-value string pairs for custom segmentation. Example: `{ "feature": "summarizer", "team": "growth" }`.
- Names of any tools or functions called during this LLM turn.
- Custom evaluation scores as a map of metric name to numeric value. Example: `{ "faithfulness": 0.94 }`.
- A hash of the prompt template, used to track prompt version performance over time.
- The version of the LumiqTrace SDK sending this event. Example: `"1.4.2"`.
Example Request
The example below sends two events as gzip-compressed NDJSON.
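A minimal Python sketch of such a request, using only the standard library. The ingest URL below is a placeholder assumption (substitute your real endpoint), and every event field name other than `event_id`, `trace_id`, and `span_id` is illustrative rather than confirmed by this page:

```python
import gzip
import json
import uuid
from urllib import request

# Placeholder URL; substitute your organization's real ingest endpoint.
INGEST_URL = "https://example.invalid/ingest"

def build_body(events):
    """Encode up to 100 events as gzip-compressed NDJSON."""
    if len(events) > 100:
        raise ValueError("at most 100 events per request")
    ndjson = "\n".join(json.dumps(e) for e in events)
    return gzip.compress(ndjson.encode("utf-8"))

def send_events(events, api_key, url=INGEST_URL):
    """POST a batch and return the parsed JSON response body."""
    req = request.Request(
        url,
        data=build_body(events),
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/x-ndjson",
            "Content-Encoding": "gzip",
        },
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Two spans from one trace: a root span and a child span.
trace_id = str(uuid.uuid4())
events = [
    {"event_id": str(uuid.uuid4()), "trace_id": trace_id, "span_id": "root"},
    {"event_id": str(uuid.uuid4()), "trace_id": trace_id, "span_id": "child",
     # "parent_span_id" is our guess at the parent-reference field name.
     "parent_span_id": "root"},
]
# send_events(events, api_key="lqt_your_api_key")
```

The actual send is commented out; `build_body` can be exercised offline to verify the NDJSON framing and the 100-event guard.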
Response
202 Accepted: the batch was accepted for asynchronous processing.
- `accepted`: The number of events accepted into the processing queue.
Error Responses
| Status | Condition |
|---|---|
| 400 | More than 100 events in the request body. |
| 401 | Missing or invalid `x-api-key`. |
| 429 | Monthly event quota exceeded for your organization. |
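Of these, only 429 is worth retrying; 400 and 401 will fail identically on resubmission. A client-side sketch under an assumed retry policy (the docs do not prescribe one, and `send` here is a hypothetical callable that submits a batch and returns the HTTP status code):

```python
import time

def send_with_retry(send, events, max_retries=5, base_delay=1.0):
    """Retry on 429 with exponential backoff; fail fast on 400/401,
    which retrying cannot fix. `send` is any callable that submits a
    batch and returns the HTTP status code (assumed interface)."""
    for attempt in range(max_retries):
        status = send(events)
        if status == 202:
            return True
        if status in (400, 401):
            raise RuntimeError(f"non-retryable ingest error: {status}")
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return False
```

Exponential backoff spaces out resubmissions so a quota-limited client does not hammer the endpoint; tune `max_retries` and `base_delay` to your own tolerance.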