docs/features/chat/architecture.md

Chat Architecture

Technical deep-dive into the chat system architecture.

System Diagram

Package Responsibilities

@portfolio/chat-contract

Defines shared types with Zod:

// Request schema
const ChatRequestSchema = z.object({
  message: z.string().min(1).max(2000),
  conversationId: z.string().optional(),
});

// Document schema
const DocumentSchema = z.object({
  id: z.string(),
  content: z.string(),
  metadata: z.object({
    source: z.string(),
    timestamp: z.string().optional(),
  }),
  embedding: z.array(z.number()).optional(),
});

@portfolio/chat-data

Implements hybrid search:

// Create search index
const index = createSearchIndex(documents, {
  fields: ['content', 'title'],
  storeFields: ['content', 'metadata'],
});

// Hybrid search combining lexical + semantic
const results = search(index, {
  query: userMessage,
  embedding: queryEmbedding,
  topK: 8,
  weights: {
    textWeight: 0.3,
    semanticWeight: 0.5,
    recencyLambda: 0.05,
  },
});

@portfolio/chat-orchestrator

Manages the LLM pipeline:

// Create orchestrator
const orchestrator = createOrchestrator({
  openai: openaiClient,
  config: chatConfig,
  documents: loadedDocuments,
});

// Stream response
async function* chat(message: string, context: Context) {
  // Plan retrieval
  const retrievalPlan = await planQuery(message);

  // Retrieve documents
  const docs = await retrieveDocuments(retrievalPlan);

  // Build prompt
  const prompt = buildPrompt(message, docs, persona);

  // Stream response
  for await (const chunk of streamCompletion(prompt)) {
    yield chunk;
  }
}

@portfolio/chat-next-api

Handles HTTP layer:

export async function POST(request: NextRequest) {
  // Validate request
  const body = await request.json();
  const { message } = ChatRequestSchema.parse(body);

  // Check rate limit
  await checkRateLimit(request);

  // Check budget
  await checkBudget();

  // Stream response
  const stream = createStreamingResponse(
    orchestrator.chat(message)
  );

  // Track cost after completion
  trackCost(stream.usage);

  return new Response(stream);
}

@portfolio/chat-next-ui

React integration:

export function useChat() {
  const [messages, setMessages] = useState([]);
  const [isStreaming, setIsStreaming] = useState(false);

  const sendMessage = async (message: string) => {
    setIsStreaming(true);

    const response = await fetch('/api/chat', {
      method: 'POST',
      body: JSON.stringify({ message }),
    });

    const reader = response.body.getReader();
    const parser = createEventStreamParser();

    for await (const event of parseStream(reader, parser)) {
      if (event.type === 'chunk') {
        appendToMessage(event.content);
      }
    }

    setIsStreaming(false);
  };

  return { messages, sendMessage, isStreaming };
}

Retrieval Strategy

Document Sources

| Source | Content | Update Frequency | |--------|---------|------------------| | Profile | Bio, skills | Build time | | Resume | Work experience | Build time | | Projects | Project descriptions | Build time | | Persona | AI personality | Build time |

Scoring Algorithm

Documents are scored using weighted combination:

score = (textWeight × lexicalScore)
      + (semanticWeight × semanticScore)
      × exp(-recencyLambda × ageInDays)

Relevance Filtering

Documents below minRelevanceScore threshold are excluded:

retrieval:
  minRelevanceScore: 0.3  # 30% of top match

Streaming Architecture

Lambda Function URL

Chat uses Lambda Function URL with response streaming:

// open-next.config.ts
functions: {
  chat: {
    patterns: ['/api/chat', '/api/chat/*'],
    override: {
      wrapper: 'aws-lambda-streaming',
    },
  },
}

SSE Format

Server-Sent Events with JSON payloads:

data: {"type":"chunk","content":"Hello"}
data: {"type":"chunk","content":" world"}
data: {"type":"done","usage":{"prompt":100,"completion":50}}

Cost Management

Budget Enforcement

async function checkBudget(): Promise<boolean> {
  const monthlySpend = await getMonthlySpend();
  const budget = config.cost.budgetUsd;

  if (monthlySpend >= budget) {
    throw new BudgetExceededError();
  }

  return true;
}

Cost Tracking

Costs tracked in DynamoDB:

| Column | Description | |--------|-------------| | owner_env | Owner/environment | | year_month | YYYY-MM | | total_cost | Running total | | request_count | Number of requests |

CloudWatch Metrics

Published after each request:

await cloudwatch.putMetricData({
  Namespace: 'PortfolioChat/OpenAI',
  MetricData: [{
    MetricName: 'EstimatedCost',
    Value: requestCost,
    Unit: 'None',
  }],
});

Error Handling

Validation Errors

Return 400 with error details:

{
  "error": "validation_error",
  "message": "Message too long",
  "field": "message"
}

Rate Limiting

Return 429 with retry information:

{
  "error": "rate_limited",
  "retryAfter": 60
}

Budget Exceeded

Return 503 with explanation:

{
  "error": "budget_exceeded",
  "message": "Monthly chat budget reached"
}