The TypeScript SDK wraps your existing LLM client instances and automatically captures latency, token counts, costs, and errors for every call, with no changes to your application logic beyond initialization.
Installation
Install the package
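Assuming the package is published on npm under the name lumiqtrace:

```bash
npm install lumiqtrace
```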
Initialize LumiqTrace
Call lumiqtrace.init() once at application startup, before any LLM calls are made.
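A minimal sketch; the package import path and the apiKey option name are assumptions (the full option list follows below):

```typescript
import * as lumiqtrace from "lumiqtrace"; // import path assumed

// Initialize once, before any LLM clients are created or called.
lumiqtrace.init({
  apiKey: process.env.LUMIQTRACE_API_KEY!, // "lqt_..." key; option name assumed
});
```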
Init options
lumiqtrace.init(options) accepts the following configuration:
- API key: Your LumiqTrace API key. Must start with lqt_. Find this in your project settings.
- Environment: Label attached to every event. Use "staging" or "development" to separate traces by environment.
- storePrompts: When true, prompt text and completion text are stored alongside traces. Disabled by default for privacy. Enable only after reviewing your data retention policy.
- Sample rate: Fraction of events to send, between 0.0 and 1.0. Set to 0.1 to trace 10% of calls. Useful for high-volume production applications.
- Redacted keys: Keys whose values are redacted before any data is stored. Applied regardless of storePrompts.
- Debug logging: When true, logs internal errors to the console. Enable during integration testing.
- Batch size: Number of events to accumulate before flushing. Events are also flushed every flushInterval milliseconds.
- flushInterval: Milliseconds between automatic flushes. The SDK also flushes on process exit.
- Base URL: Override the API base URL. Use this only if you are self-hosting the ingest endpoint.
- OpenTelemetry export: When true, also exports spans to an OpenTelemetry-compatible backend. Requires otelEndpoint or a configured OTEL exporter.
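A fuller configuration sketch: storePrompts, flushInterval, and otelEndpoint are named in the descriptions above, while the remaining option names are assumptions chosen to match them:

```typescript
lumiqtrace.init({
  apiKey: process.env.LUMIQTRACE_API_KEY!, // option name assumed
  environment: "staging",                  // option name assumed
  storePrompts: false,          // keep prompt/completion text out of storage
  sampleRate: 0.1,              // assumed name: trace 10% of calls
  redactKeys: ["email", "ssn"], // assumed name: redacted regardless of storePrompts
  debug: true,                  // assumed name: log internal SDK errors while integrating
  batchSize: 50,                // assumed name: events accumulated before a flush
  flushInterval: 5000,          // flush every 5 seconds; also flushed on process exit
});
```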
Wrapping providers

OpenAI
wrapOpenAI(client) patches client.chat.completions.create for both streaming and non-streaming calls. Cached token counts are extracted from prompt_tokens_details.cached_tokens.
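For example (the wrapOpenAI import path is an assumption):

```typescript
import OpenAI from "openai";
import { wrapOpenAI } from "lumiqtrace"; // import path assumed

// Wrap once; the wrapped client is used exactly like the original.
const openai = wrapOpenAI(new OpenAI());

const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
```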
TTFT is measured from the call start to the first non-empty chunk and appears as ttft_ms in your LumiqTrace dashboard.

Anthropic
wrapAnthropic(client) patches client.messages.create. Token fields are extracted from result.usage.input_tokens, output_tokens, and cache_read_input_tokens.
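A sketch along the same lines (import path assumed):

```typescript
import Anthropic from "@anthropic-ai/sdk";
import { wrapAnthropic } from "lumiqtrace"; // import path assumed

const anthropic = wrapAnthropic(new Anthropic());

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-latest",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello" }],
});
```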
Prompt cache read tokens (cache_read_input_tokens) are tracked separately and appear as cached_tokens in traces. This lets you see the real savings from Anthropic’s prompt caching.

Google Generative AI
wrapGoogle(client) patches the getGenerativeModel() method and captures token counts from response.usageMetadata.
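For example (the wrapGoogle import path is an assumption):

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";
import { wrapGoogle } from "lumiqtrace"; // import path assumed

const genAI = wrapGoogle(new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!));
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

const result = await model.generateContent("Hello");
```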
OpenRouter
wrapOpenRouter(client) wraps OpenRouter clients, which share the OpenAI SDK interface.
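Since OpenRouter speaks the OpenAI wire protocol, the OpenAI SDK is pointed at the OpenRouter base URL before wrapping (import path assumed):

```typescript
import OpenAI from "openai";
import { wrapOpenRouter } from "lumiqtrace"; // import path assumed

const openrouter = wrapOpenRouter(
  new OpenAI({
    baseURL: "https://openrouter.ai/api/v1",
    apiKey: process.env.OPENROUTER_API_KEY,
  })
);
```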
AWS Bedrock
wrapBedrock(client) patches the converse method on an AWS Bedrock runtime client. Token counts are extracted from result.usage.inputTokens, outputTokens, and cacheReadInputTokens.
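A sketch using the aggregated BedrockRuntime client, which exposes converse() directly (the wrapBedrock import path is an assumption):

```typescript
import { BedrockRuntime } from "@aws-sdk/client-bedrock-runtime";
import { wrapBedrock } from "lumiqtrace"; // import path assumed

const bedrock = wrapBedrock(new BedrockRuntime({ region: "us-east-1" }));

const result = await bedrock.converse({
  modelId: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  messages: [{ role: "user", content: [{ text: "Hello" }] }],
});
```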
Note that the converseStream method is not currently wrapped, so streaming Bedrock calls are not traced.

Mistral
wrapMistral(client) patches client.chat.complete. Token counts are extracted from result.usage.promptTokens and completionTokens.
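For example (import path assumed):

```typescript
import { Mistral } from "@mistralai/mistralai";
import { wrapMistral } from "lumiqtrace"; // import path assumed

const mistral = wrapMistral(new Mistral({ apiKey: process.env.MISTRAL_API_KEY }));

const res = await mistral.chat.complete({
  model: "mistral-small-latest",
  messages: [{ role: "user", content: "Hello" }],
});
```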
Groq
wrapGroq(client) patches client.chat.completions.create. Token counts are extracted from result.usage.prompt_tokens and completion_tokens.
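For example (import path assumed):

```typescript
import Groq from "groq-sdk";
import { wrapGroq } from "lumiqtrace"; // import path assumed

const groq = wrapGroq(new Groq());

const chat = await groq.chat.completions.create({
  model: "llama-3.1-8b-instant",
  messages: [{ role: "user", content: "Hello" }],
});
```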
LiteLLM
wrapLiteLLM(options) creates an OpenAI-compatible client pointed at your LiteLLM proxy and wraps it with the OpenAI wrapper. Use this when you route multiple providers through a LiteLLM gateway.
wrapLiteLLM(options) accepts:

- Base URL: The base URL of your LiteLLM proxy server.
- API key: Optional API key for your LiteLLM proxy. Defaults to "litellm" if omitted.
Context enrichment

Use withLumiqtraceContext to attach a userId, sessionId, or custom tags to all traces generated within a function scope. This is useful for associating LLM calls with a specific user session or request.
withLumiqTrace is an alias for withLumiqtraceContext and behaves identically.
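A sketch of scoped enrichment; the exact signature is an assumption (a context object plus a callback whose LLM calls inherit the attached fields):

```typescript
import OpenAI from "openai";
import { wrapOpenAI, withLumiqtraceContext } from "lumiqtrace"; // import path assumed

const openai = wrapOpenAI(new OpenAI());

// Signature assumed: every wrapped-client call inside the callback
// carries the userId, sessionId, and tags attached here.
const reply = await withLumiqtraceContext(
  { userId: "user_123", sessionId: "sess_abc", tags: { plan: "pro" } },
  async () =>
    openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: "Hello" }],
    })
);
```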