# API Reference
Complete reference for all LambdaLLM public APIs.
## Core

### handler(model, timeout_strategy, timeout_buffer, max_retries, fallback_model, middleware, session, router)
Decorator that wraps a Lambda handler with LLM orchestration.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | Model or str | CLAUDE_3_HAIKU | Default model to use |
| timeout_strategy | str | "fail-fast" | How to handle timeout: fail-fast, truncate, checkpoint |
| timeout_buffer | int | 5 | Seconds to reserve before Lambda timeout |
| max_retries | int | 3 | Retries on transient model errors |
| fallback_model | Model or str | None | Model to use if primary fails |
| middleware | list | None | Middleware instances to apply |
| session | Session | None | Session config for state management |
| router | Router | None | Cost-aware model router |
Example:

```python
@handler(model=Model.CLAUDE_3_HAIKU, max_retries=3, fallback_model=Model.TITAN_TEXT_EXPRESS)
def lambda_handler(event, context):
    result = context.invoke("Hello {name}", name="World")
    return {"statusCode": 200, "body": result}
```
### Prompt(template, input_schema, output_schema, system_prompt, model, max_tokens, temperature, name, version)
Type-safe prompt template with validation and structured output.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| template | str | required | Prompt template with {variable} placeholders |
| input_schema | dict | None | Expected input types: {"var": type} |
| output_schema | dict | None | Expected output structure (enables JSON parsing) |
| system_prompt | str | None | System prompt prepended to template |
| model | Model | None | Override default model for this prompt |
| max_tokens | int | 1024 | Maximum response tokens |
| temperature | float | 0.7 | Model temperature |
| name | str | None | Prompt name (for analytics tracking) |
| version | str | "1.0.0" | Prompt version (for A/B testing) |
Methods:
| Method | Returns | Description |
|---|---|---|
| format(**kwargs) | str | Format template with variables |
| invoke(_context, **kwargs) | str or dict | Invoke LLM and return result |
| to_dict() | dict | Serialize prompt to dictionary |
| from_dict(data) | Prompt | Deserialize from dictionary |
| from_yaml(path) | Prompt | Load from YAML file |
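Putting the parameters and methods together, a minimal sketch (the template text, schemas, and variable names below are illustrative, not part of the library):

```python
from lambdallm import Prompt, Model

summarize = Prompt(
    template="Summarize the following text in {style} style:\n{text}",
    input_schema={"text": str, "style": str},
    output_schema={"summary": str, "key_points": list},  # enables JSON parsing
    model=Model.CLAUDE_3_HAIKU,
    max_tokens=512,
    temperature=0.2,
    name="summarize",
    version="1.0.0",
)

# format() only renders the template; invoke() sends it to the model.
rendered = summarize.format(text="Lambda is a compute service.", style="bullet")
```

Because `output_schema` is set, `invoke()` returns a parsed `dict` rather than a raw string, per the methods table above.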
### Model (Enum)
Supported model identifiers.
| Value | Model ID |
|---|---|
| Model.CLAUDE_3_HAIKU | anthropic.claude-3-haiku-20240307-v1:0 |
| Model.CLAUDE_3_SONNET | anthropic.claude-3-sonnet-20240229-v1:0 |
| Model.CLAUDE_3_5_SONNET | anthropic.claude-3-5-sonnet-20241022-v2:0 |
| Model.CLAUDE_3_OPUS | anthropic.claude-3-opus-20240229-v1:0 |
| Model.TITAN_TEXT_EXPRESS | amazon.titan-text-express-v1 |
| Model.TITAN_TEXT_LITE | amazon.titan-text-lite-v1 |
| Model.LLAMA3_8B | meta.llama3-8b-instruct-v1:0 |
| Model.LLAMA3_70B | meta.llama3-70b-instruct-v1:0 |
### LambdaLLMContext
Context object passed to handler functions. Provides model invocation and state access.
Properties:
| Property | Type | Description |
|---|---|---|
| remaining_time_ms | int | Milliseconds before Lambda timeout |
| should_checkpoint | bool | Whether to save progress now |
| total_cost | float | Cumulative cost in USD |
| session | Session | Loaded session state |
| metrics | Metrics | Metrics collector |
Methods:
| Method | Returns | Description |
|---|---|---|
| invoke(prompt, **kwargs) | str | Invoke LLM with prompt template |
| invoke_structured(prompt, schema, **kwargs) | dict | Invoke and parse JSON response |
| get_raw_client(service) | boto3.Client | Escape hatch: raw AWS client |
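As a sketch of how a handler might use the context (the event shape and the 10-second threshold are assumptions for illustration):

```python
@handler(model=Model.CLAUDE_3_HAIKU)
def lambda_handler(event, context):
    # Structured invocation: the response is parsed against the schema.
    data = context.invoke_structured(
        "Extract the sender and subject from: {email}",
        schema={"sender": str, "subject": str},
        email=event["body"],
    )

    # Check the timeout budget before doing more work.
    if context.should_checkpoint or context.remaining_time_ms < 10_000:
        return {"statusCode": 202, "body": "partial"}

    return {"statusCode": 200, "body": data, "cost": context.total_cost}
```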
## Chains

### Chain(name, steps, timeout_strategy, max_total_cost)
Declarative multi-step LLM pipeline.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Chain identifier |
| steps | list[Step] | required | Ordered list of steps |
| timeout_strategy | str | "fail-fast" | checkpoint, truncate, or fail-fast |
| max_total_cost | float | None | USD limit for entire chain |
Methods:
| Method | Returns | Description |
|---|---|---|
| run(context, **kwargs) | ChainResult | Execute the chain |
| get_step(name) | Step | Get step by name |
### Step(name, prompt, func, model, output_schema, condition)
A single step in a chain.
| Parameter | Type | Description |
|---|---|---|
| name | str | Unique step identifier |
| prompt | str | LLM prompt template (use {var} and {step.output}) |
| func | Callable | Python transform function (alternative to prompt) |
| model | Model | Override the default model for this step |
| output_schema | dict | Expected JSON output structure |
| condition | Callable | Skip step if returns False |
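A sketch of a two-step chain (the step names, prompt text, and the signature assumed for the `condition` callable are illustrative):

```python
from lambdallm import Chain, Step

pipeline = Chain(
    name="summarize-then-translate",
    steps=[
        Step(
            name="summarize",
            prompt="Summarize this document:\n{document}",
        ),
        Step(
            name="translate",
            # {summarize.output} references the previous step's result
            prompt="Translate to French:\n{summarize.output}",
            condition=lambda inputs: inputs.get("target_lang") == "fr",  # skipped if False
        ),
    ],
    timeout_strategy="checkpoint",
    max_total_cost=0.50,
)

result = pipeline.run(context, document="...", target_lang="fr")
```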
## State

### Session(store, ttl_hours, memory, max_messages)
Conversation session with persistent state.
| Parameter | Type | Default | Description |
|---|---|---|---|
| store | str | "dynamodb" | State backend: "dynamodb" or "memory" |
| ttl_hours | int | 24 | Session expiration time |
| memory | MemoryStrategy | SLIDING_WINDOW | How to manage history |
| max_messages | int | 20 | Max messages to keep |
Methods:
| Method | Description |
|---|---|
| load(session_id) | Load session from store |
| save() | Persist session (only if modified) |
| add_message(role, content) | Add message to history |
| get_history() | Get messages as list of dicts |
| format_history() | Get messages as formatted string |
| clear() | Remove all messages |
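A sketch of a typical session round trip (the session ID and message text are examples):

```python
from lambdallm import Session

session = Session(store="dynamodb", ttl_hours=24, max_messages=20)

session.load("user-123")                        # hydrate state from the backend
session.add_message("user", "What is Lambda?")
history = session.get_history()                 # list of {"role": ..., "content": ...} dicts
session.save()                                  # no-op if nothing was modified
```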
## Agents

### Agent(name, system_prompt, tools, max_iterations, timeout_buffer, max_cost_usd)
ReAct-style AI agent with tool usage.
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Agent identifier |
| system_prompt | str | required | Agent instructions |
| tools | list | required | List of @Tool decorated functions |
| max_iterations | int | 10 | Max reasoning loops |
| timeout_buffer | int | 30 | Seconds reserved before timeout |
| max_cost_usd | float | None | Cost limit per execution |
Methods:
| Method | Returns | Description |
|---|---|---|
| run(query, context) | AgentResult | Execute agent reasoning loop |
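A sketch of defining and running an agent (the tool body, prompts, and query are illustrative stubs):

```python
from lambdallm import Agent, Tool

@Tool(description="Look up current weather")
def get_weather(city: str) -> str:
    """Return a short weather summary for a city."""
    return "Sunny, 22C"  # stub implementation

agent = Agent(
    name="assistant",
    system_prompt="You are a helpful assistant. Use tools when needed.",
    tools=[get_weather],
    max_iterations=10,
    max_cost_usd=0.25,
)

result = agent.run("What's the weather in Paris?", context)
```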
### @Tool(description, name, timeout_seconds)
Decorator to register a function as an agent tool.
```python
@Tool(description="Search documents")
def search(query: str, max_results: int = 5) -> list:
    """Search the knowledge base.

    Args:
        query: Search query string.
        max_results: Max results to return.
    """
    pass
```
Parameters and descriptions are auto-extracted from the function signature and docstring.
## Middleware

### Middleware (Base Class)
```python
class MyMiddleware(Middleware):
    def before_invoke(self, event, context):
        # Process before handler
        return event

    def after_invoke(self, event, result, context):
        # Process after handler
        return result

    def on_error(self, event, error, context):
        # Handle errors
        pass
```
### Built-in Middleware
| Class | Description |
|---|---|
| LoggingMiddleware | Structured JSON logging |
| CostTrackingMiddleware | Budget enforcement |
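Middleware is attached through the `@handler` decorator's `middleware` parameter. The constructor arguments for the built-in classes are not documented here, so this sketch instantiates them with no arguments:

```python
@handler(
    model=Model.CLAUDE_3_HAIKU,
    middleware=[LoggingMiddleware(), CostTrackingMiddleware()],
)
def lambda_handler(event, context):
    return {"statusCode": 200, "body": context.invoke("Ping")}
```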
## Observability

### Tracer

```python
from lambdallm.observability import Tracer

tracer = Tracer()
with tracer.span("model.invoke") as span:
    span.set_attribute("model_id", "claude-3-haiku")
    # ... do work
```
### MetricsEmitter

```python
from lambdallm.observability import MetricsEmitter

emitter = MetricsEmitter(namespace="MyApp")
emitter.record("custom.metric", 42.0, unit="Count")
emitter.flush()
```
### CostTracker

```python
from lambdallm.observability import CostTracker

tracker = CostTracker(daily_budget=50.0)
tracker.check_budget()  # Raises BudgetExceededError if over
```
## Testing

### MockProvider

```python
from lambdallm.testing import MockProvider, mock_model, MockLambdaContext

@mock_model(responses=["Test response"])
def test_my_handler():
    result = my_handler({"body": '{"text": "test"}'}, MockLambdaContext())
    assert result["statusCode"] == 200
```