
RAG Implementation Plan

Created: November 23, 2025
Status: Phase 0 (Data Collection) → Ready to implement
Next Review: January 6, 2026 (before Phase 8.5)


Executive Summary

This document outlines the phased implementation of Retrieval-Augmented Generation (RAG) for CodeSlick's AI-powered fix suggestions.

Goal: Use RAG to improve fix quality, consistency, and team-specific customization WITHOUT replacing the core LLM.

Approach: Start with lightweight data collection (Phase 0), prove ROI with code style learning (Phase 1), then expand to repository context and documentation retrieval (Phases 2-3).

Timeline: - Phase 0 (NOW): Data collection infrastructure ✅ - Phase 1 (Q1 2026): Lightweight RAG POC (2 weeks) - Phase 2 (Q2 2026): Repository context (3 weeks) - Phase 3 (Q3 2026): Documentation retrieval (2 weeks)


Problem Statement

Current Limitations

  1. No Repository Context (300-line limit)
     • AI analyzes files in isolation
     • Doesn't understand project structure, dependencies, or patterns
     • Fixes are generic (Stack Overflow-style)

  2. No Code Style Awareness
     • AI doesn't learn team preferences (async/await vs .then(), single vs double quotes)
     • Developers must manually edit fixes to match style
     • Lowers acceptance rate

  3. No Documentation Grounding
     • AI may suggest outdated fixes
     • Missing framework-specific best practices
     • No citations (OWASP, CVE, docs)

Impact on Metrics

Without RAG (Current): - Acceptance rate: 70% (target: 85%+) - False positive rate: 5% (target: <3%) - Modification rate: 25% (fixes need manual edits) - Team NPS: 7.5/10 (target: 8.5+)

With RAG (Projected): - Acceptance rate: 85%+ (↑15pp) - False positive rate: <3% (↓2pp) - Modification rate: <10% (↓15pp) - Team NPS: 8.5+/10 (↑1 point)


Solution: Three-Phased RAG Implementation

Overview

| Phase | Focus | Timeline | ROI | Risk |
| --- | --- | --- | --- | --- |
| Phase 0 | Data Collection | NOW | Foundation | Low |
| Phase 1 | Code Style Learning | Q1 2026 (2 weeks) | High | Low |
| Phase 2 | Repository Context | Q2 2026 (3 weeks) | Very High | Medium |
| Phase 3 | Documentation Retrieval | Q3 2026 (2 weeks) | Medium | Low |

Why This Phasing?

  1. Phase 1 first → Creates data flywheel (the more fixes, the better the next fixes)
  2. Phase 2 second → Solves 300-line limit (biggest pain point for large files)
  3. Phase 3 last → Nice-to-have, not critical for PMF

Phase 0: Data Collection Infrastructure ✅ COMPLETE

Goal

Build infrastructure to collect high-quality labeled data for future RAG.

Implementation

Completed (Nov 23, 2025): - ✅ Database schema (schema-fix-suggestions.ts) - ✅ Data collector service (fix-suggestion-collector.ts) - ✅ Helper functions (extractRepoContext, detectCodeStyle) - ✅ Documentation (src/lib/rag/README.md)

Next Steps (This Week): 1. Run migration to create database tables 2. Integrate into Apply Fix API (collect when user accepts fix) 3. Integrate into AI fix generation (collect when AI creates suggestion) 4. Add reject endpoint (collect when user rejects fix)
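As a sketch of step 2 above: once the fix is committed, the Apply Fix route only needs one extra call. This assumes a recordUserAction() helper on FixSuggestionCollector and Next.js App Router paths; the real collector API in fix-suggestion-collector.ts may differ.

// src/app/api/teams/[id]/apply-fix/route.ts (illustrative sketch only)
import { NextResponse } from 'next/server';
import { FixSuggestionCollector } from '@/lib/rag/fix-suggestion-collector'; // adjust path to project layout

export async function POST(req: Request, { params }: { params: { id: string } }) {
  const { fixSuggestionId } = await req.json();

  // ... existing apply-fix logic (validate, commit the suggested change) ...

  // Phase 0 addition: record that the user accepted this suggestion.
  // recordUserAction() is a hypothetical helper name.
  await FixSuggestionCollector.recordUserAction({
    teamId: params.id,
    fixSuggestionId,
    userAction: 'accepted',
  });

  return NextResponse.json({ ok: true });
}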

Data Schema

fix_suggestions table stores:
  • Vulnerability details (type, severity, CVSS, OWASP, CWE)
  • Code context (original, fixed, language, framework)
  • AI model details (provider, name, tokens, generation time)
  • User actions (accepted, rejected, modified, ignored)
  • RAG context:
    • repoContext: dependencies, architecture, related files
    • codeStylePatterns: quotes, async style, destructuring, etc.
    • relatedDocs: OWASP, CVE, framework docs links
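For orientation, a hedged Drizzle sketch of the main columns; the canonical definition lives in src/lib/db/schema-fix-suggestions.ts and may differ in names and types.

// Illustrative shape only — see schema-fix-suggestions.ts for the real schema.
import { pgTable, text, integer, jsonb, timestamp, uuid } from 'drizzle-orm/pg-core';

export const fixSuggestions = pgTable('fix_suggestions', {
  id: uuid('id').primaryKey().defaultRandom(),
  teamId: uuid('team_id').notNull(),
  vulnerabilityType: text('vulnerability_type').notNull(), // + severity, CVSS, OWASP, CWE columns
  language: text('language'),
  framework: text('framework'),
  originalCode: text('original_code').notNull(),
  suggestedFix: text('suggested_fix').notNull(),
  explanation: text('explanation'),
  userAction: text('user_action'),              // accepted | rejected | modified | ignored
  acceptanceScore: integer('acceptance_score'),
  repoContext: jsonb('repo_context'),           // dependencies, architecture, related files
  codeStylePatterns: jsonb('code_style_patterns'),
  relatedDocs: jsonb('related_docs'),           // OWASP / CVE / framework doc links
  metadata: jsonb('metadata'),
  createdAt: timestamp('created_at').defaultNow(),
});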

Success Criteria

  • ✅ 100+ fix suggestions collected in first 2 weeks
  • ✅ User actions captured (accept/reject/modify)
  • ✅ Code style patterns detected automatically
  • ✅ No performance impact on Apply Fix API (<50ms overhead)

Decision Gate: If data collection works reliably for 2 weeks → Proceed to Phase 1


Phase 1: Code Style Learning (Lightweight RAG POC)

Goal

Prove RAG value with minimal investment: Learn team code style from past accepted fixes.

Timeline

Duration: 2 weeks (Q1 2026) Prerequisites: 100+ accepted fixes in database

Implementation

Week 1: Embeddings & Retrieval

Day 1-2: Add pgvector

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Add a vector column alongside the existing JSONB embedding
-- (the full JSONB → vector migration is in the appendix)
ALTER TABLE fix_embeddings
  ADD COLUMN embedding_vector vector(1536);

-- Create index for cosine similarity search
CREATE INDEX ON fix_embeddings
  USING ivfflat (embedding_vector vector_cosine_ops)
  WITH (lists = 100);

Day 3-4: Generate Embeddings

// src/lib/rag/embeddings-generator.ts
import OpenAI from 'openai';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db'; // adjust import paths to the project layout
import { fixSuggestions, fixEmbeddings, type FixSuggestion } from '@/lib/db/schema-fix-suggestions';

export async function generateFixEmbedding(fix: Partial<FixSuggestion>): Promise<number[]> {
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

  // Embedding text: combine context for best retrieval
  // (fields missing on a partial issue simply become empty)
  const embeddingText = `
    Language: ${fix.language}
    Framework: ${fix.framework}
    Vulnerability: ${fix.vulnerabilityType}
    Original Code: ${fix.originalCode}
    Fixed Code: ${fix.suggestedFix ?? ''}
    Explanation: ${fix.explanation ?? ''}
  `.trim();

  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small', // $0.0001 per 1k tokens
    input: embeddingText,
  });

  return response.data[0].embedding; // 1536 dimensions
}

// Batch process: Embed all past accepted fixes
export async function backfillEmbeddings() {
  const acceptedFixes = await db.query.fixSuggestions.findMany({
    where: eq(fixSuggestions.userAction, 'accepted'),
  });

  for (const fix of acceptedFixes) {
    const embedding = await generateFixEmbedding(fix);

    await db.insert(fixEmbeddings).values({
      fixSuggestionId: fix.id,
      embedding: embedding, // stored in the pgvector column
      embeddingText: `${fix.originalCode}\n---\n${fix.suggestedFix}`,
      embeddingType: 'fix_pair',
    });
  }
}

Day 5: Similarity Retrieval

// src/lib/rag/fix-retriever.ts
import { sql } from 'drizzle-orm';
import { db } from '@/lib/db';
import { generateFixEmbedding } from './embeddings-generator';
// VulnerabilityIssue / FixSuggestion come from the project's shared types

export async function retrieveSimilarFixes(
  issue: VulnerabilityIssue,
  teamId: string,
  topK: number = 3
): Promise<FixSuggestion[]> {
  // 1. Generate embedding for current issue
  const currentEmbedding = await generateFixEmbedding({
    language: issue.language,
    framework: issue.framework,
    vulnerabilityType: issue.type,
    originalCode: issue.code,
  });

  // 2. Find top-K similar past fixes using cosine similarity
  //    (pgvector accepts the '[0.1, 0.2, ...]' literal produced by JSON.stringify)
  const embeddingLiteral = JSON.stringify(currentEmbedding);
  const similarFixes = await db.execute(sql`
    SELECT fs.*, fe.embedding_vector <=> ${embeddingLiteral}::vector AS distance
    FROM fix_suggestions fs
    JOIN fix_embeddings fe ON fe.fix_suggestion_id = fs.id
    WHERE fs.team_id = ${teamId}
      AND fs.user_action IN ('accepted', 'modified')
      AND fs.language = ${issue.language}
      AND fs.acceptance_score > 75
    ORDER BY fe.embedding_vector <=> ${embeddingLiteral}::vector
    LIMIT ${topK}
  `);

  return similarFixes as unknown as FixSuggestion[];
}

Week 2: Integration & A/B Testing

Day 1-2: Update AI Prompt

// src/lib/ai/fix-generator.ts
export async function generateFixWithRAG(issue: VulnerabilityIssue, teamId: string) {
  // Retrieve similar past fixes
  const similarFixes = await retrieveSimilarFixes(issue, teamId, 3);

  // Enhanced prompt with RAG context
  const prompt = `
You are a security engineer fixing vulnerabilities in code.

**Past fixes in this repository** (use as style guide):
${similarFixes.map((fix, i) => `
  Example ${i + 1}:
  - File: ${fix.filePath}
  - Vulnerability: ${fix.vulnerabilityType}
  - Original Code:
    ${fix.originalCode}
  - Fixed Code:
    ${fix.suggestedFix}
  - User Action: ${fix.userAction} (acceptance score: ${fix.acceptanceScore}/100)
  - Code Style: ${JSON.stringify(fix.codeStylePatterns)}
`).join('\n')}

**Current issue**:
- File: ${issue.filePath}
- Vulnerability: ${issue.type}
- Severity: ${issue.severity}
- Code:
  ${issue.code}

**Instructions**:
1. Fix the vulnerability following the style of past accepted fixes
2. Match the code style patterns (quotes, async style, destructuring, etc.)
3. Provide a diff and explanation

Respond in JSON format: { fixedCode, explanation, confidence }
`;

  const response = await aiModel.chat({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: prompt }],
  });

  // Include the retrieval count so the A/B collector below can log it
  return { ...parseAIResponse(response), similarFixesCount: similarFixes.length };
}

Day 3-4: A/B Test

// Randomly assign 50% to RAG, 50% to baseline
const useRAG = Math.random() < 0.5;

const fix = useRAG
  ? await generateFixWithRAG(issue, teamId)
  : await generateFixBaseline(issue);

// Store which version was used (generateFixWithRAG returns similarFixesCount)
await FixSuggestionCollector.collectSuggestion({
  ...fix,
  metadata: {
    ragEnabled: useRAG,
    similarFixesCount: useRAG ? fix.similarFixesCount : 0,
  },
});
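Note that Math.random() re-assigns a team on every request; if each team should stay in one bucket for the whole experiment, a deterministic hash of the team id works (a sketch, assuming per-team bucketing is acceptable for this test):

import { createHash } from 'crypto';

// Stable 50/50 assignment: the same team always gets the same variant.
export function isInRagBucket(teamId: string): boolean {
  const digest = createHash('sha256').update(teamId).digest();
  return digest[0] % 2 === 0;
}

// const useRAG = isInRagBucket(teamId);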

Day 5: Analyze Results

-- Compare acceptance rates: RAG vs Baseline
SELECT
  metadata->>'ragEnabled' as rag_enabled,
  COUNT(*) as total_suggestions,
  SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) as accepted,
  ROUND(100.0 * SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) / COUNT(*), 2) as acceptance_rate,
  AVG(acceptance_score) as avg_acceptance_score
FROM fix_suggestions
WHERE created_at >= NOW() - INTERVAL '7 days'
GROUP BY metadata->>'ragEnabled';

Success Criteria

Metrics (measured over 2 weeks): - ✅ Acceptance rate: 70% → 80%+ (↑10pp minimum) - ✅ Modification rate: 25% → <15% (↓10pp minimum) - ✅ User NPS: +0.5 points - ✅ Cost per fix: <$0.10 additional (embeddings are cheap)

Qualitative: - ✅ Fixes match team code style without manual edits - ✅ No performance regression (<10s generation time maintained) - ✅ User feedback: "Fixes feel more tailored to our codebase"

Decision Gate: If all criteria met → Proceed to Phase 2. If not → Investigate root cause before expanding.

Cost Analysis

Embedding Generation:
- Model: text-embedding-3-small ($0.0001 per 1k tokens)
- Avg fix: 500 tokens = $0.00005 per embedding
- 1000 fixes/month = $0.05/month

Storage (pgvector):
- 1536 dimensions × 4 bytes = 6.14 KB per embedding
- 1000 embeddings = 6.14 MB
- Cost: negligible

Retrieval:
- Cosine similarity search: <10ms (with ivfflat index)
- 1000 retrievals/month = negligible compute cost

Total Additional Cost: <$5/month (includes buffer)

Phase 2: Repository Context

Goal

Break the 300-line limit by retrieving relevant context from the entire repository.

Timeline

Duration: 3 weeks (Q2 2026) Prerequisites: Phase 1 success validated

Implementation

Week 1: Repository Indexing

Day 1-2: Index Repository Structure

// src/lib/rag/repo-indexer.ts
export interface RepoIndex {
  id: string; // stable id for this index (used by repo_embeddings below)
  files: FileIndex[];
  dependencies: Record<string, string>;
  architecture: string; // "REST API", "GraphQL", "Microservices"
}

export interface FileIndex {
  path: string;
  language: string;
  imports: string[]; // Imported modules
  exports: string[]; // Exported functions/classes
  functions: FunctionSignature[];
  astHash: string; // For change detection
}

export async function indexRepository(
  owner: string,
  repo: string,
  branch: string
): Promise<RepoIndex> {
  // 1. Fetch all source files from GitHub
  const files = await fetchAllSourceFiles(owner, repo, branch);

  // 2. Parse each file to extract structure
  const fileIndexes = await Promise.all(
    files.map(async (file) => {
      const ast = await parseFileAST(file.content, file.language);
      return {
        path: file.path,
        language: file.language,
        imports: extractImports(ast),
        exports: extractExports(ast),
        functions: extractFunctionSignatures(ast),
        astHash: hashAST(ast),
      };
    })
  );

  // 3. Detect architecture pattern
  const architecture = detectArchitecture(fileIndexes);

  // 4. Extract dependencies
  const dependencies = await extractDependencies(owner, repo, branch);

  return {
    id: `${owner}/${repo}@${branch}`, // simple stable id; a DB row id could be used instead
    files: fileIndexes,
    dependencies,
    architecture,
  };
}
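detectArchitecture is referenced above but not shown; one plausible heuristic (an assumption, not the shipped logic) is to classify the repo from its imports and file layout:

// Heuristic sketch only: classify the repo from its imports and paths.
function detectArchitecture(files: FileIndex[]): string {
  const allImports = files.flatMap((f) => f.imports);
  if (allImports.some((i) => i.includes('graphql') || i.includes('apollo'))) return 'GraphQL';
  if (files.some((f) => f.path.includes('.proto') || f.path.includes('grpc'))) return 'Microservices';
  return 'REST API';
}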

Day 3-4: Embed File Chunks

// Chunk strategy: Embed each file in 200-line chunks with 50-line overlap
export async function embedRepositoryFiles(repoIndex: RepoIndex): Promise<void> {
  for (const file of repoIndex.files) {
    const fileContent = await fetchFileContent(file.path);
    const chunks = chunkFile(fileContent, 200, 50); // 200 lines, 50 overlap

    for (const chunk of chunks) {
      const embedding = await generateEmbedding({
        path: file.path,
        language: file.language,
        imports: file.imports,
        code: chunk.code,
      });

      await db.insert(repoEmbeddings).values({
        repoId: repoIndex.id,
        filePath: file.path,
        chunkIndex: chunk.index,
        chunkCode: chunk.code,
        embedding: embedding,
      });
    }
  }
}
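chunkFile is assumed above; a minimal line-based implementation with overlap could look like this (assumption: raw line windows are good enough for a first pass, without AST-aware splitting):

interface Chunk {
  index: number;
  code: string;
}

// Fixed-size line windows with overlap, e.g. chunkFile(src, 200, 50).
function chunkFile(content: string, chunkLines: number, overlap: number): Chunk[] {
  const lines = content.split('\n');
  const chunks: Chunk[] = [];
  const step = chunkLines - overlap;

  for (let start = 0, index = 0; start < lines.length; start += step, index++) {
    chunks.push({ index, code: lines.slice(start, start + chunkLines).join('\n') });
    if (start + chunkLines >= lines.length) break; // last window reached the end of the file
  }
  return chunks;
}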

Day 5: Implement Caching

// Cache repo index for 24 hours (invalidate on push)
export async function getRepoIndex(
  owner: string,
  repo: string,
  branch: string
): Promise<RepoIndex> {
  const cacheKey = `repo:${owner}/${repo}:${branch}`;
  const cached = await redis.get(cacheKey);

  if (cached) {
    return JSON.parse(cached);
  }

  const repoIndex = await indexRepository(owner, repo, branch);
  await redis.setex(cacheKey, 86400, JSON.stringify(repoIndex)); // 24 hours
  return repoIndex;
}
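The "invalidate on push" part lives only in the comment above; a hedged sketch of the GitHub push webhook handler that drops the cached index (route path and the shared redis client are assumptions):

// src/app/api/webhooks/github/route.ts (sketch)
import { NextResponse } from 'next/server';
import { redis } from '@/lib/redis'; // assumed shared client

export async function POST(req: Request) {
  const event = req.headers.get('x-github-event');
  const payload = await req.json();

  if (event === 'push') {
    const [owner, repo] = payload.repository.full_name.split('/');
    const branch = payload.ref.replace('refs/heads/', '');
    // Drop the cached index so the next scan re-indexes changed files.
    await redis.del(`repo:${owner}/${repo}:${branch}`);
  }

  return NextResponse.json({ ok: true });
}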

Week 2: Context Retrieval

Day 1-3: Retrieve Related Code

// src/lib/rag/context-retriever.ts
export async function retrieveRelevantContext(
  issue: VulnerabilityIssue,
  repoIndex: RepoIndex,
  topK: number = 5
): Promise<CodeContext> {
  // 1. Embed the current issue
  const issueEmbedding = await generateEmbedding({
    path: issue.filePath,
    language: issue.language,
    code: issue.code,
  });

  // 2. Find top-K similar code chunks in repo
  const similarChunks = await db.execute(sql`
    SELECT * FROM repo_embeddings
    WHERE repo_id = ${repoIndex.id}
      AND file_path != ${issue.filePath} -- Exclude current file
    ORDER BY embedding <=> ${issueEmbedding}::vector
    LIMIT ${topK}
  `);

  // 3. Also retrieve explicitly related files (imports, callers)
  const relatedFiles = await findRelatedFiles(issue.filePath, repoIndex);

  return {
    similarCode: similarChunks,
    imports: relatedFiles.imports,
    callers: relatedFiles.callers,
    architecture: repoIndex.architecture,
    dependencies: repoIndex.dependencies,
  };
}
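findRelatedFiles is referenced above but not defined; one way to derive imports and callers purely from the RepoIndex, matching the imp.name / imp.path / imp.sampleUsage shape used in the prompt below (a sketch with deliberately loose path matching):

interface ImportRef {
  name: string;        // imported module specifier
  path: string;        // file that provides it (best-effort match)
  sampleUsage: string; // example line shown to the model
}

interface RelatedFiles {
  imports: ImportRef[]; // modules the current file depends on
  callers: FileIndex[]; // files that import the current file
}

function findRelatedFiles(filePath: string, repoIndex: RepoIndex): RelatedFiles {
  const current = repoIndex.files.find((f) => f.path === filePath);
  const basename = filePath.replace(/\.[^.]+$/, ''); // strip extension for loose matching

  const imports = (current?.imports ?? []).map((specifier) => {
    const provider = repoIndex.files.find((f) => f.path.includes(specifier.replace(/^[./]+/, '')));
    return {
      name: specifier,
      path: provider?.path ?? specifier,
      sampleUsage: `import ... from '${specifier}'`,
    };
  });

  const callers = repoIndex.files.filter(
    (f) => f.path !== filePath && f.imports.some((imp) => basename.endsWith(imp.replace(/^[./]+/, '')))
  );

  return { imports, callers };
}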

Day 4-5: Update AI Prompt with Context

const context = await retrieveRelevantContext(issue, repoIndex);

const prompt = `
You are a security engineer fixing vulnerabilities.

**Repository Architecture**: ${context.architecture}

**Dependencies**: ${JSON.stringify(context.dependencies)}

**Related Code in Repository** (similar patterns):
${context.similarCode.map((chunk) => `
  File: ${chunk.filePath}
  Code:
  ${chunk.chunkCode}
`).join('\n')}

**Imported Modules** (how they're used elsewhere):
${context.imports.map((imp) => `
  ${imp.name} from ${imp.path}
  Usage: ${imp.sampleUsage}
`).join('\n')}

**Current Issue**:
File: ${issue.filePath}
Line: ${issue.line}
Vulnerability: ${issue.type}
Code:
${issue.code}

Fix this vulnerability while maintaining consistency with the repository's architecture and existing patterns.
`;

Week 3: Integration & Testing

Day 1-2: Performance Optimization

// Parallel retrieval (don't block on embeddings)
const [similarFixes, repoContext] = await Promise.all([
  retrieveSimilarFixes(issue, teamId, 3), // Phase 1
  retrieveRelevantContext(issue, repoIndex, 5), // Phase 2
]);

// Combine contexts
const prompt = buildPrompt({
  similarFixes, // Past team fixes
  repoContext, // Repository structure
  issue,
});

Day 3-5: End-to-End Testing

Test cases:
  1. Large file (500+ lines) → Can AI now fix using context from repo?
  2. Cross-file vulnerability (function called from multiple files) → Does AI understand callers?
  3. Framework-specific pattern → Does AI use repo's existing auth/DB patterns?

Success Criteria

Metrics: - ✅ Can analyze files >300 lines (by retrieving relevant chunks) - ✅ Cross-file awareness: AI understands how functions are called elsewhere - ✅ Acceptance rate: 80% → 85%+ (↑5pp) - ✅ Generation time: <15s (including retrieval overhead)

Qualitative: - ✅ Fixes match repository architecture (REST vs GraphQL vs Microservices) - ✅ Fixes use existing helper functions (don't reinvent the wheel) - ✅ Enterprise feedback: "Understands our entire codebase"

Cost Analysis

Repository Indexing:
- One-time per repo: ~500 files × 200 lines = 100K lines
- Embedding cost: 100K lines ≈ 2M tokens × $0.0001 = $0.20 per repo

Incremental Updates:
- Only re-index changed files (GitHub webhook)
- Avg 10 files/day = 2K lines = $0.004/day = $0.12/month

Storage:
- 500 files × 5 chunks/file × 6 KB/chunk = 15 MB per repo
- 100 repos = 1.5 GB = $0.50/month (S3/Postgres)

Retrieval:
- <50ms per query (with ivfflat index)

Total Additional Cost: ~$1-2/month per active repo

Phase 3: Documentation Retrieval

Goal

Ground AI fixes in official security guidelines and framework documentation.

Timeline

Duration: 2 weeks (Q3 2026) Prerequisites: Phase 2 complete

Implementation

Week 1: Index Documentation

Day 1-2: Scrape & Index OWASP

// src/lib/rag/docs-indexer.ts
export async function indexOWASPDocs(): Promise<void> {
  const owaspTopTen = await fetchOWASPTopTen2025();

  for (const category of owaspTopTen) {
    const embedding = await generateEmbedding({
      title: category.title, // e.g., "A03:2025 - Injection"
      description: category.description,
      examples: category.examples,
      mitigation: category.mitigation,
    });

    await db.insert(docEmbeddings).values({
      docType: 'owasp',
      docId: category.id,
      title: category.title,
      content: category.fullText,
      embedding,
    });
  }
}

Day 3-4: Index Framework Docs

// Index popular framework security docs
const frameworks = [
  'express', 'django', 'flask', 'spring-boot',
  'react', 'vue', 'angular', 'next.js'
];

for (const framework of frameworks) {
  const securityDocs = await fetchFrameworkSecurityDocs(framework);
  // Embed and store...
}

Day 5: Index CVE Database

// Daily sync with NVD (National Vulnerability Database)
export async function syncCVEDatabase(): Promise<void> {
  const recentCVEs = await fetchRecentCVEs(30); // Last 30 days

  for (const cve of recentCVEs) {
    const embedding = await generateEmbedding({
      cveId: cve.id,
      description: cve.description,
      affectedPackages: cve.affectedPackages,
      mitigation: cve.mitigation,
    });

    await db.insert(cveEmbeddings).values({
      cveId: cve.id,
      severity: cve.severity,
      description: cve.description,
      embedding,
    });
  }
}

Week 2: Retrieval & Integration

Day 1-3: Retrieve Relevant Docs

export async function retrieveRelevantDocs(
  issue: VulnerabilityIssue,
  framework?: string
): Promise<Documentation> {
  const issueEmbedding = await generateEmbedding({
    vulnerabilityType: issue.type,
    description: issue.message,
    code: issue.code,
  });

  // Retrieve top-3 OWASP docs
  const owaspDocs = await retrieveSimilarDocs(issueEmbedding, 'owasp', 3);

  // Retrieve top-2 framework docs (if framework detected)
  const frameworkDocs = framework
    ? await retrieveSimilarDocs(issueEmbedding, framework, 2)
    : [];

  // Retrieve top-2 related CVEs
  const cveDocs = await retrieveSimilarDocs(issueEmbedding, 'cve', 2);

  return {
    owasp: owaspDocs,
    framework: frameworkDocs,
    cve: cveDocs,
  };
}
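retrieveSimilarDocs is assumed above; a minimal pgvector query over doc_embeddings could look like this (db and sql imported as in fix-retriever.ts; the 'cve' variant would run the same query against cve_embeddings):

interface DocRow {
  doc_id: string;
  title: string;
  content: string;
  distance: number;
}

// Sketch: cosine-distance search filtered by document type.
async function retrieveSimilarDocs(embedding: number[], docType: string, topK: number): Promise<DocRow[]> {
  const literal = JSON.stringify(embedding); // pgvector accepts the '[0.1, 0.2, ...]' literal
  const rows = await db.execute(sql`
    SELECT doc_id, title, content,
           embedding <=> ${literal}::vector AS distance
    FROM doc_embeddings
    WHERE doc_type = ${docType}
    ORDER BY embedding <=> ${literal}::vector
    LIMIT ${topK}
  `);
  return rows as unknown as DocRow[];
}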

Day 4-5: Update AI Prompt

const docs = await retrieveRelevantDocs(issue, issue.framework);

const prompt = `
You are a security engineer fixing vulnerabilities.

**Official Security Guidelines**:

${docs.owasp.map(doc => `
  [${doc.title}](${doc.url})
  ${doc.description}
  Recommended Fix: ${doc.mitigation}
`).join('\n')}

**Framework Best Practices** (${issue.framework}):
${docs.framework.map(doc => `
  ${doc.title}: ${doc.content}
`).join('\n')}

**Related CVEs**:
${docs.cve.map(doc => `
  ${doc.cveId} (${doc.severity}): ${doc.description}
  Affected: ${doc.affectedPackages}
`).join('\n')}

**Current Issue**:
${issue.code}

Fix following official guidelines. Include citation links in your explanation.
`;

Success Criteria

Metrics: - ✅ Fix explanations include citations (OWASP, CVE, docs) - ✅ Suggestions align with framework best practices - ✅ User trust: "Fixes feel authoritative" (qualitative feedback) - ✅ Acceptance rate: 85% → 88%+ (↑3pp)

Enterprise Value: - ✅ Compliance reports include OWASP/CVE mappings - ✅ Option to add internal company security docs (self-hosted)

Cost Analysis

One-Time Indexing:
- OWASP Top 10: 10 categories × 5K tokens = 50K tokens ≈ $0.005
- Framework Docs: 8 frameworks × 10K tokens = 80K tokens ≈ $0.008
- CVE Database: 1000 recent CVEs × 1K tokens = 1M tokens ≈ $0.10
Total: ~$0.11 (one-time)

Daily Sync:
- New CVEs: ~5/day × 1K tokens ≈ $0.0005/day ≈ $0.015/month

Storage:
- 1000 docs × 6 KB = 6 MB = negligible

Total Recurring Cost: <$1/month

Privacy & Security

Data Retention Policy

| Tier | Retention | Auto-Delete | Export |
| --- | --- | --- | --- |
| Free | 30 days | Yes | JSON |
| Team | 90 days | Yes | JSON, CSV |
| Enterprise | 1 year (configurable) | Optional | JSON, CSV, SQL dump |

PII Redaction

Automatic Redaction (before storage):

export function redactPII(code: string): string {
  return code
    .replace(/\b[\w._%+-]+@[\w.-]+\.[A-Z]{2,}\b/gi, '[EMAIL_REDACTED]')
    .replace(/\b[A-Z0-9]{20,}\b/g, '[API_KEY_REDACTED]')
    .replace(/\bsk_live_[A-Za-z0-9]+/g, '[STRIPE_KEY_REDACTED]')
    .replace(/\bAKIA[0-9A-Z]{16}/g, '[AWS_KEY_REDACTED]');
}

On-Premise Deployment

Enterprise Option: - Self-host RAG database in customer VPC - Embeddings generated on-premise (local OpenAI proxy) - No code leaves customer infrastructure - Still benefit from public docs (OWASP, CVE)

Hybrid Mode: - Private code → On-premise embeddings - Public docs → CodeSlick cloud embeddings
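A hedged sketch of how the hybrid split could be wired: embedding calls for private code go to an on-premise OpenAI-compatible proxy, while public-doc embeddings go to the cloud (the env var names and baseURL override are assumptions):

import OpenAI from 'openai';

const cloudClient = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const onPremClient = new OpenAI({
  apiKey: process.env.ONPREM_API_KEY,
  baseURL: process.env.ONPREM_EMBEDDINGS_URL, // e.g. http://embeddings.internal:8080/v1 (assumed)
});

export async function embed(text: string, source: 'private-code' | 'public-docs'): Promise<number[]> {
  // Private code never leaves the customer VPC; public docs can use CodeSlick cloud.
  const client = source === 'private-code' ? onPremClient : cloudClient;
  const res = await client.embeddings.create({ model: 'text-embedding-3-small', input: text });
  return res.data[0].embedding;
}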

GDPR Compliance

User Rights: - Right to access: Export all stored suggestions via API - Right to deletion: DELETE /api/teams/{id}/rag-data - Right to portability: Download JSON dump

Data Processing Agreement: - Code snippets = "pseudonymized data" (no PII) - Encrypted at rest (AES-256) - Encrypted in transit (TLS 1.3) - SOC2 Type II compliant (by Q4 2026)


Cost-Benefit Analysis

Total Cost (per team, per month)

| Phase | Indexing | Retrieval | Storage | Total |
| --- | --- | --- | --- | --- |
| Phase 0 (Data Collection) | $0 | $0 | $0.10 | $0.10 |
| Phase 1 (Code Style) | $0.05 | $0.02 | $0.15 | $0.22 |
| Phase 2 (Repo Context) | $0.20 | $0.10 | $0.50 | $0.80 |
| Phase 3 (Docs) | $0.10 | $0.05 | $0.05 | $0.20 |

Total Additional Cost: ~$1.50/team/month (all phases combined)

Revenue Impact

Without RAG (Current): - Acceptance rate: 70% - Churn rate: 5%/month - MRR per team: €99

With RAG (Projected): - Acceptance rate: 85% → Higher perceived value - Churn rate: 3%/month → Better retention (-2pp) - MRR per team: €99 (same price, higher value)

Churn Reduction Impact:

100 teams × €99/month = €9,900 MRR
Without RAG: 5% churn = -€495/month lost revenue
With RAG: 3% churn = -€297/month lost revenue
Net Impact: +€198/month retained revenue

Annual Impact: +€2,376/year per 100 teams (conservative: each retained team keeps paying in later months)
Cost: ~€1,800/year (100 teams × $1.50/month × 12)

ROI: ~130% (€2,376 / €1,800)

Non-Financial Benefits

  1. Competitive Moat: Data flywheel (more usage = better fixes)
  2. Enterprise Sales: "Learns your codebase" is a killer feature
  3. User Delight: Fixes that feel "magical" (exactly what they would've written)
  4. Reduced Support: Fewer "why did AI suggest this?" questions

Rollout Strategy

Gradual Rollout

Phase 1 (Code Style RAG): - Week 1: Internal testing (CodeSlick team repos) - Week 2: 10% of teams (early adopters) - Week 3: 50% of teams (if metrics good) - Week 4: 100% of teams (if no issues)

Phase 2 (Repo Context): - Week 1: 5 beta teams (large repos >100 files) - Week 2: 20% of teams - Week 3: 100% of teams

Feature Flags:

const ragConfig = {
  codeStyleRAG: process.env.ENABLE_CODE_STYLE_RAG === 'true',
  repoContextRAG: process.env.ENABLE_REPO_CONTEXT_RAG === 'true',
  docsRAG: process.env.ENABLE_DOCS_RAG === 'true',
};

// Per-team override
const teamConfig = await getTeamRAGConfig(teamId);
const useRAG = teamConfig?.enableRAG ?? ragConfig.codeStyleRAG;

Monitoring & Alerts

Key Metrics (track in real-time): - Acceptance rate (RAG vs baseline) - False positive rate - Generation time (p50, p95, p99) - Cost per fix - User NPS (weekly survey)

Alerts: - Acceptance rate drops >5pp → Investigate immediately - Generation time >15s → Scale retrieval infrastructure - Cost per fix >$0.50 → Optimize embeddings

Rollback Plan

If metrics regress: 1. Disable RAG via feature flag (instant rollback) 2. Investigate root cause (bad embeddings? Retrieval quality?) 3. Fix issue in staging environment 4. Re-enable for 10% of teams (test fix) 5. Gradual rollout again


Success Metrics Summary

| Metric | Baseline | Phase 1 | Phase 2 | Phase 3 | Target |
| --- | --- | --- | --- | --- | --- |
| Acceptance Rate | 70% | 80% | 85% | 88% | 85%+ |
| False Positive Rate | 5% | 3% | 2.5% | 2% | <3% |
| Modification Rate | 25% | 15% | 10% | 8% | <10% |
| Generation Time | 8s | 9s | 12s | 13s | <15s |
| Cost per Fix | $0.05 | $0.10 | $0.20 | $0.25 | <$0.30 |
| User NPS | 7.5 | 8.0 | 8.5 | 9.0 | 8.5+ |
| Churn Rate | 5% | 4% | 3% | 2.5% | <3% |

Decision Gates

Phase 0 → Phase 1

Criteria: - ✅ 100+ fix suggestions collected - ✅ User actions tracked reliably - ✅ Code style patterns detected automatically - ✅ No performance impact

Decision: If criteria met → Proceed to Phase 1 (Q1 2026)

Phase 1 → Phase 2

Criteria: - ✅ Acceptance rate: 70% → 80%+ (↑10pp) - ✅ Modification rate: 25% → <15% (↓10pp) - ✅ User NPS: +0.5 points - ✅ Cost per fix: <$0.10

Decision: If criteria met → Proceed to Phase 2 (Q2 2026)

Phase 2 → Phase 3

Criteria: - ✅ Can analyze files >300 lines - ✅ Cross-file awareness validated - ✅ Acceptance rate: 80% → 85%+ (↑5pp) - ✅ Generation time: <15s

Decision: If criteria met → Proceed to Phase 3 (Q3 2026)

Phase 3 → Private Model

Criteria: - ✅ 10,000+ high-quality labeled examples - ✅ Acceptance rate: 85%+ sustained - ✅ Cost per fix: >$0.20 (makes fine-tuning worth it) - ✅ Enterprise demand: 50+ customers

Decision: Evaluate in Q4 2026


Next Steps (This Week)

  1. ✅ Create database schema (schema-fix-suggestions.ts)
  2. ✅ Create data collector (fix-suggestion-collector.ts)
  3. Run migration: npx drizzle-kit push:pg
  4. Integrate into Apply Fix API: Collect user actions
  5. Integrate into AI fix generation: Collect suggestions
  6. Add reject endpoint: Collect rejections

Timeline: Complete by November 30, 2025


Questions & Assumptions

Assumptions

  1. pgvector performance: Cosine similarity search scales to 100K+ embeddings
  2. Embedding cost: OpenAI pricing remains stable (~$0.0001/1k tokens)
  3. Acceptance rate improvement: Users value code style consistency
  4. Repository size: Avg repo has <500 source files

Open Questions

  1. Multi-repo teams: Index all repos or just active ones?
     Decision: Index top-5 most active repos per team
  2. Embedding model: text-embedding-3-small vs text-embedding-3-large?
     Decision: Start with small (cheaper, faster), upgrade if needed
  3. Retrieval quality: How many similar fixes to retrieve (topK)?
     Decision: Start with topK=3, A/B test 3 vs 5 vs 10
  4. Private model: Which base model to fine-tune?
     Decision: Defer to Q4 2026, depends on availability and cost

Appendix: Technical Details

Database Schema

See: src/lib/db/schema-fix-suggestions.ts

API Endpoints

POST /api/teams/{id}/collect-fix-suggestion - Stores AI-generated suggestions - Called from AI fix generation pipeline

POST /api/teams/{id}/apply-fix - Updates user action to 'accepted' - Existing endpoint, add one line

POST /api/teams/{id}/reject-fix - Updates user action to 'rejected' - NEW endpoint (create in Phase 0; see the sketch below)

GET /api/teams/{id}/rag-analytics - Returns acceptance rates, false positive rates - Dashboard analytics
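A minimal sketch of the new reject-fix endpoint (Phase 0); import paths and the request body shape are assumptions:

// src/app/api/teams/[id]/reject-fix/route.ts (sketch)
import { NextResponse } from 'next/server';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db';
import { fixSuggestions } from '@/lib/db/schema-fix-suggestions';

export async function POST(req: Request, { params }: { params: { id: string } }) {
  const { fixSuggestionId } = await req.json();

  // params.id (team id) should scope this update in the real handler.
  // Mark the suggestion as rejected; a reject reason could be stored alongside it.
  await db
    .update(fixSuggestions)
    .set({ userAction: 'rejected' })
    .where(eq(fixSuggestions.id, fixSuggestionId));

  return NextResponse.json({ ok: true });
}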

pgvector Setup

-- Install extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Migrate embeddings from JSONB to vector
ALTER TABLE fix_embeddings
  ADD COLUMN embedding_vector vector(1536);

-- Populate from JSONB (one-time migration)
UPDATE fix_embeddings
SET embedding_vector = embedding::text::vector;

-- Create index (IMPORTANT for performance)
CREATE INDEX ON fix_embeddings
  USING ivfflat (embedding_vector vector_cosine_ops)
  WITH (lists = 100);

-- Drop old JSONB column
ALTER TABLE fix_embeddings DROP COLUMN embedding;

Document Owner: CTO
Last Updated: November 23, 2025
Next Review: January 6, 2026 (before Phase 8.5 kickoff)