# LambdaLLM
Serverless-native LLM orchestration framework for AWS Lambda.
## The Problem
Existing LLM frameworks (LangChain, LlamaIndex) assume long-running servers. They break on Lambda:
- :material-timer-alert: **Cold starts:** 500MB+ dependency trees add seconds to every cold start
- :material-memory: **Stateless:** no conversation memory between invocations
- :material-clock-alert: **15-minute timeout:** long agent loops are killed mid-run
- :material-package-variant: **250MB limit:** LangChain alone exceeds the deployment package cap
## The Solution
```python
from lambdallm import handler, Prompt, Model

summarize = Prompt(
    template="Summarize in {max_words} words:\n\n{document}",
    output_schema={"summary": str, "key_points": list},
)

@handler(model=Model.CLAUDE_3_HAIKU)
def lambda_handler(event, context):
    return summarize.invoke(
        _context=context,
        document=event["body"]["text"],
        max_words=100,
    )
```
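Because the prompt declares an `output_schema`, the handler returns parsed JSON rather than raw model text. An illustrative response shape (the values here are invented for this example):

```json
{
  "summary": "A one-paragraph summary of the document.",
  "key_points": ["First key point", "Second key point"]
}
```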
## Install
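Assuming the distribution names match the import names used in the examples (`lambdallm` on PyPI, `substrai-lambdallm` on npm):

```bash
# Python (package name assumed from the import in The Solution above)
pip install lambdallm

# TypeScript / Node.js (package name from the quick start below)
npm install substrai-lambdallm
```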
## TypeScript Quick Start
```typescript
import { handler, Model, Chain, Step } from 'substrai-lambdallm';

export const lambdaHandler = handler(
  { model: Model.CLAUDE_3_HAIKU },
  async (event, context) => {
    const result = await context.invoke('Summarize: {text}', { text: event.body.text });
    return { statusCode: 200, body: { result, cost: context.totalCost } };
  }
);
```
## Key Features
| Feature | Description |
|---|---|
| < 5MB package | vs 400MB+ for LangChain |
| Cold-start optimized | Lazy imports, connection pooling |
| DynamoDB state | Conversation memory that persists |
| Multi-step chains | Checkpoint/resume on timeout (sketched below) |
| AI Agents | ReAct loop with tool sandboxing |
| Cost-aware routing | Auto-select cheapest model |
| One-command deploy | `lambdallm deploy` |
| A/B testing | Compare prompt versions |
| Full observability | X-Ray + CloudWatch built-in |
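`Chain` and `Step` appear among the TypeScript exports above. Here is a minimal Python sketch of what a checkpointed multi-step chain might look like, assuming the Python package mirrors those names and that `Chain.invoke` follows the same `_context` convention as `Prompt.invoke` (none of this is confirmed API):

```python
from lambdallm import handler, Model, Chain, Step, Prompt

# Hypothetical two-step pipeline: extract claims, then summarize them.
# With checkpoint/resume, an invocation that hits the Lambda timeout
# mid-chain would restart from the last completed step, not from scratch.
pipeline = Chain(
    steps=[
        Step(Prompt(template="List the key claims in:\n\n{document}")),
        Step(Prompt(template="Summarize these claims in {max_words} words:\n\n{claims}")),
    ]
)

@handler(model=Model.CLAUDE_3_HAIKU)
def lambda_handler(event, context):
    return pipeline.invoke(
        _context=context,
        document=event["body"]["text"],
        max_words=100,
    )
```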
## Quick Start
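Deploy with the one-command CLI from the feature table:

```bash
lambdallm deploy
```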
Then test:
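For example, with the standard AWS CLI (the function name `lambdallm-summarize` is a placeholder for whatever `lambdallm deploy` created):

```bash
aws lambda invoke \
  --function-name lambdallm-summarize \
  --cli-binary-format raw-in-base64-out \
  --payload '{"body": {"text": "LambdaLLM is a serverless-native LLM orchestration framework for AWS Lambda."}}' \
  response.json

cat response.json
```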
Built by SubstrAI — Open-source GenAI frameworks for serverless infrastructure.
Author: Gaurav Kumar Sinha (gaurav@substrai.dev)