RAG Implementation Plan¶
Created: November 23, 2025
Status: Phase 0 (Data Collection) → Ready to implement
Next Review: January 6, 2026 (before Phase 8.5)
Executive Summary¶
This document outlines the phased implementation of Retrieval-Augmented Generation (RAG) for CodeSlick's AI-powered fix suggestions.
Goal: Use RAG to improve fix quality, consistency, and team-specific customization WITHOUT replacing the core LLM.
Approach: Start with lightweight data collection (Phase 0), prove ROI with code style learning (Phase 1), then expand to repository context and documentation retrieval (Phases 2-3).
Timeline:
- Phase 0 (NOW): Data collection infrastructure ✅
- Phase 1 (Q1 2026): Lightweight RAG POC (2 weeks)
- Phase 2 (Q2 2026): Repository context (3 weeks)
- Phase 3 (Q3 2026): Documentation retrieval (2 weeks)
Problem Statement¶
Current Limitations¶
- No Repository Context (300-line limit)
  - AI analyzes files in isolation
  - Doesn't understand project structure, dependencies, or patterns
  - Fixes are generic (Stack Overflow-style)
- No Code Style Awareness
  - AI doesn't learn team preferences (async/await vs .then(), single vs double quotes)
  - Developers must manually edit fixes to match style
  - Lowers acceptance rate
- No Documentation Grounding
  - AI may suggest outdated fixes
  - Missing framework-specific best practices
  - No citations (OWASP, CVE, docs)
Impact on Metrics¶
Without RAG (Current):
- Acceptance rate: 70% (target: 85%+)
- False positive rate: 5% (target: <3%)
- Modification rate: 25% (fixes need manual edits)
- Team NPS: 7.5/10 (target: 8.5+)
With RAG (Projected):
- Acceptance rate: 85%+ (↑15pp)
- False positive rate: <3% (↓2pp)
- Modification rate: <10% (↓15pp)
- Team NPS: 8.5+/10 (↑1 point)
Solution: Three-Phased RAG Implementation¶
Overview¶
| Phase | Focus | Timeline | ROI | Risk |
|---|---|---|---|---|
| Phase 0 | Data Collection | NOW | Foundation | Low |
| Phase 1 | Code Style Learning | Q1 2026 (2 weeks) | High | Low |
| Phase 2 | Repository Context | Q2 2026 (3 weeks) | Very High | Medium |
| Phase 3 | Documentation Retrieval | Q3 2026 (2 weeks) | Medium | Low |
Why This Phasing?¶
- Phase 1 first → Creates data flywheel (the more fixes, the better the next fixes)
- Phase 2 second → Solves 300-line limit (biggest pain point for large files)
- Phase 3 last → Nice-to-have, not critical for PMF
Phase 0: Data Collection Infrastructure ✅ COMPLETE¶
Goal¶
Build infrastructure to collect high-quality labeled data for future RAG.
Implementation¶
Completed (Nov 23, 2025):
- ✅ Database schema (schema-fix-suggestions.ts)
- ✅ Data collector service (fix-suggestion-collector.ts)
- ✅ Helper functions (extractRepoContext, detectCodeStyle)
- ✅ Documentation (src/lib/rag/README.md)
Next Steps (This Week):
1. Run migration to create database tables
2. Integrate into Apply Fix API (collect when the user accepts a fix; see the sketch below)
3. Integrate into AI fix generation (collect when the AI creates a suggestion)
4. Add reject endpoint (collect when the user rejects a fix)
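A minimal integration sketch for step 2, assuming the existing Apply Fix handler; the handler and helper names (applyFix, applyFixToRepo, recordUserAction, ApplyFixRequest) are illustrative assumptions, and only FixSuggestionCollector.collectSuggestion is confirmed elsewhere in this plan, so adapt the call to the collector's actual API:
// Sketch: record the user action inside the existing Apply Fix handler.
// applyFixToRepo and recordUserAction are hypothetical names used for illustration.
import { FixSuggestionCollector } from '@/lib/rag/fix-suggestion-collector';
export async function applyFix(request: ApplyFixRequest) {
const result = await applyFixToRepo(request); // existing Apply Fix logic
// Fire-and-forget so collection never blocks the response (<50ms overhead budget)
void FixSuggestionCollector.recordUserAction({
fixSuggestionId: request.fixSuggestionId,
userAction: 'accepted',
}).catch((err) => console.error('fix-suggestion collection failed', err));
return result;
}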
Data Schema¶
The fix_suggestions table stores (see the illustrative sketch after this list):
- Vulnerability details (type, severity, CVSS, OWASP, CWE)
- Code context (original, fixed, language, framework)
- AI model details (provider, name, tokens, generation time)
- User actions (accepted, rejected, modified, ignored)
- RAG context:
  - repoContext: Dependencies, architecture, related files
  - codeStylePatterns: Quotes, async style, destructuring, etc.
  - relatedDocs: OWASP, CVE, framework docs links
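For orientation, a condensed Drizzle sketch of what such a table could look like; column names here are illustrative assumptions, and the authoritative definitions live in schema-fix-suggestions.ts:
// Illustrative only - see src/lib/db/schema-fix-suggestions.ts for the real schema
import { pgTable, uuid, text, integer, jsonb, timestamp } from 'drizzle-orm/pg-core';
export const fixSuggestions = pgTable('fix_suggestions', {
id: uuid('id').primaryKey().defaultRandom(),
teamId: uuid('team_id').notNull(),
vulnerabilityType: text('vulnerability_type').notNull(), // e.g. 'sql-injection'
severity: text('severity').notNull(),
language: text('language').notNull(),
framework: text('framework'),
originalCode: text('original_code').notNull(),
suggestedFix: text('suggested_fix').notNull(),
explanation: text('explanation'),
userAction: text('user_action'), // 'accepted' | 'rejected' | 'modified' | 'ignored'
acceptanceScore: integer('acceptance_score'),
repoContext: jsonb('repo_context'),
codeStylePatterns: jsonb('code_style_patterns'),
relatedDocs: jsonb('related_docs'),
createdAt: timestamp('created_at').defaultNow().notNull(),
});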
Success Criteria¶
- ✅ 100+ fix suggestions collected in first 2 weeks
- ✅ User actions captured (accept/reject/modify)
- ✅ Code style patterns detected automatically
- ✅ No performance impact on Apply Fix API (<50ms overhead)
Decision Gate: If data collection works reliably for 2 weeks → Proceed to Phase 1
Phase 1: Code Style Learning (Lightweight RAG POC)¶
Goal¶
Prove RAG value with minimal investment: Learn team code style from past accepted fixes.
Timeline¶
Duration: 2 weeks (Q1 2026)
Prerequisites: 100+ accepted fixes in database
Implementation¶
Week 1: Embeddings & Retrieval¶
Day 1-2: Add pgvector
-- Enable pgvector extension
CREATE EXTENSION vector;
-- Migrate fix_embeddings.embedding from JSONB to vector
ALTER TABLE fix_embeddings
ADD COLUMN embedding_vector vector(1536);
-- Create index for similarity search
CREATE INDEX ON fix_embeddings
USING ivfflat (embedding_vector vector_cosine_ops);
Day 3-4: Generate Embeddings
// src/lib/rag/embeddings-generator.ts
import OpenAI from 'openai';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed path to the Drizzle client
import { fixSuggestions, fixEmbeddings } from '@/lib/db/schema-fix-suggestions';
export async function generateFixEmbedding(fix: FixSuggestion): Promise<number[]> {
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Embedding text: combine context for best retrieval
const embeddingText = `
Language: ${fix.language}
Framework: ${fix.framework}
Vulnerability: ${fix.vulnerabilityType}
Original Code: ${fix.originalCode}
Fixed Code: ${fix.suggestedFix}
Explanation: ${fix.explanation}
`.trim();
const response = await openai.embeddings.create({
model: 'text-embedding-3-small', // $0.0001 per 1k tokens
input: embeddingText,
});
return response.data[0].embedding; // 1536 dimensions
}
// Batch process: Embed all past accepted fixes
export async function backfillEmbeddings() {
const acceptedFixes = await db.query.fixSuggestions.findMany({
where: eq(fixSuggestions.userAction, 'accepted'),
});
for (const fix of acceptedFixes) {
const embedding = await generateFixEmbedding(fix);
await db.insert(fixEmbeddings).values({
fixSuggestionId: fix.id,
embedding: embedding, // pgvector will handle this
embeddingText: `${fix.originalCode} → ${fix.suggestedFix}`,
embeddingType: 'fix_pair',
});
}
}
Day 5: Similarity Retrieval
// src/lib/rag/fix-retriever.ts
import { sql } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed path to the Drizzle client
import { generateFixEmbedding } from './embeddings-generator';
export async function retrieveSimilarFixes(
issue: VulnerabilityIssue,
teamId: string,
topK: number = 3
): Promise<FixSuggestion[]> {
// 1. Generate embedding for the current issue (cast: fixed code/explanation not known yet)
const currentEmbedding = await generateFixEmbedding({
language: issue.language,
framework: issue.framework,
vulnerabilityType: issue.type,
originalCode: issue.code,
} as FixSuggestion);
// 2. Find top-K similar past fixes using cosine similarity
// pgvector expects a '[...]' text literal, so serialize the embedding before casting
const embeddingLiteral = JSON.stringify(currentEmbedding);
const similarFixes = await db.execute(sql`
SELECT fs.*, fe.embedding_vector <=> ${embeddingLiteral}::vector AS distance
FROM fix_suggestions fs
JOIN fix_embeddings fe ON fe.fix_suggestion_id = fs.id
WHERE fs.team_id = ${teamId}
AND fs.user_action IN ('accepted', 'modified')
AND fs.language = ${issue.language}
AND fs.acceptance_score > 75
ORDER BY fe.embedding_vector <=> ${embeddingLiteral}::vector
LIMIT ${topK}
`);
return similarFixes;
}
Week 2: Integration & A/B Testing¶
Day 1-2: Update AI Prompt
// src/lib/ai/fix-generator.ts
import { retrieveSimilarFixes } from '@/lib/rag/fix-retriever';
export async function generateFixWithRAG(issue: VulnerabilityIssue, teamId: string) {
// Retrieve similar past fixes accepted by this team
const similarFixes = await retrieveSimilarFixes(issue, teamId, 3);
// Enhanced prompt with RAG context
const prompt = `
You are a security engineer fixing vulnerabilities in code.
**Past fixes in this repository** (use as style guide):
${similarFixes.map((fix, i) => `
Example ${i + 1}:
- File: ${fix.filePath}
- Vulnerability: ${fix.vulnerabilityType}
- Original Code:
${fix.originalCode}
- Fixed Code:
${fix.suggestedFix}
- User Action: ${fix.userAction} (acceptance score: ${fix.acceptanceScore}/100)
- Code Style: ${JSON.stringify(fix.codeStylePatterns)}
`).join('\n')}
**Current issue**:
- File: ${issue.filePath}
- Vulnerability: ${issue.type}
- Severity: ${issue.severity}
- Code:
${issue.code}
**Instructions**:
1. Fix the vulnerability following the style of past accepted fixes
2. Match the code style patterns (quotes, async style, destructuring, etc.)
3. Provide a diff and explanation
Respond in JSON format: { fixedCode, explanation, confidence }
`;
const response = await aiModel.chat({
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'user', content: prompt }],
});
return parseAIResponse(response);
}
Day 3-4: A/B Test
// Randomly assign 50% to RAG, 50% to baseline
const useRAG = Math.random() < 0.5;
const fix = useRAG
? await generateFixWithRAG(issue, teamId)
: await generateFixBaseline(issue);
// Store which version was used (have generateFixWithRAG return the retrieved-fix count
// alongside the fix so it can be logged here)
await FixSuggestionCollector.collectSuggestion({
...fix,
metadata: {
ragEnabled: useRAG,
similarFixesCount: useRAG ? fix.similarFixesCount : 0,
},
});
Day 5: Analyze Results
-- Compare acceptance rates: RAG vs Baseline
SELECT
metadata->>'ragEnabled' as rag_enabled,
COUNT(*) as total_suggestions,
SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) as accepted,
ROUND(100.0 * SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) / COUNT(*), 2) as acceptance_rate,
AVG(acceptance_score) as avg_acceptance_score
FROM fix_suggestions
WHERE created_at >= NOW() - INTERVAL '7 days'
GROUP BY metadata->>'ragEnabled';
Success Criteria¶
Metrics (measured over 2 weeks):
- ✅ Acceptance rate: 70% → 80%+ (↑10pp minimum)
- ✅ Modification rate: 25% → <15% (↓10pp minimum)
- ✅ User NPS: +0.5 points
- ✅ Cost per fix: <$0.10 additional (embeddings are cheap)
Qualitative:
- ✅ Fixes match team code style without manual edits
- ✅ No performance regression (<10s generation time maintained)
- ✅ User feedback: "Fixes feel more tailored to our codebase"
Decision Gate: If all criteria met → Proceed to Phase 2. If not → Investigate root cause before expanding.
Cost Analysis¶
Embedding Generation:
- Model: text-embedding-3-small ($0.0001 per 1k tokens)
- Avg fix: 500 tokens = $0.00005 per embedding
- 1000 fixes/month = $0.05/month
Storage (pgvector):
- 1536 dimensions × 4 bytes = 6.14 KB per embedding
- 1000 embeddings = 6.14 MB
- Cost: negligible
Retrieval:
- Cosine similarity search: <10ms (with ivfflat index)
- 1000 retrievals/month = negligible compute cost
Total Additional Cost: <$5/month (includes buffer)
Phase 2: Repository Context¶
Goal¶
Break the 300-line limit by retrieving relevant context from the entire repository.
Timeline¶
Duration: 3 weeks (Q2 2026)
Prerequisites: Phase 1 success validated
Implementation¶
Week 1: Repository Indexing¶
Day 1-2: Index Repository Structure
// src/lib/rag/repo-indexer.ts
export interface RepoIndex {
files: FileIndex[];
dependencies: Record<string, string>;
architecture: string; // "REST API", "GraphQL", "Microservices"
}
export interface FileIndex {
path: string;
language: string;
imports: string[]; // Imported modules
exports: string[]; // Exported functions/classes
functions: FunctionSignature[];
astHash: string; // For change detection
}
export async function indexRepository(
owner: string,
repo: string,
branch: string
): Promise<RepoIndex> {
// 1. Fetch all source files from GitHub
const files = await fetchAllSourceFiles(owner, repo, branch);
// 2. Parse each file to extract structure
const fileIndexes = await Promise.all(
files.map(async (file) => {
const ast = await parseFileAST(file.content, file.language);
return {
path: file.path,
language: file.language,
imports: extractImports(ast),
exports: extractExports(ast),
functions: extractFunctionSignatures(ast),
astHash: hashAST(ast),
};
})
);
// 3. Detect architecture pattern
const architecture = detectArchitecture(fileIndexes);
// 4. Extract dependencies
const dependencies = await extractDependencies(owner, repo, branch);
return {
files: fileIndexes,
dependencies,
architecture,
};
}
Day 3-4: Embed File Chunks
// Chunk strategy: Embed each file in 200-line chunks with 50-line overlap
export async function embedRepositoryFiles(repoIndex: RepoIndex): Promise<void> {
for (const file of repoIndex.files) {
const fileContent = await fetchFileContent(file.path);
const chunks = chunkFile(fileContent, 200, 50); // 200 lines, 50 overlap
for (const chunk of chunks) {
const embedding = await generateEmbedding({
path: file.path,
language: file.language,
imports: file.imports,
code: chunk.code,
});
await db.insert(repoEmbeddings).values({
repoId: repoIndex.id,
filePath: file.path,
chunkIndex: chunk.index,
chunkCode: chunk.code,
embedding: embedding,
});
}
}
}
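chunkFile is not defined above; a minimal line-based sketch of the 200-line / 50-line-overlap strategy could look like this (names and the FileChunk shape are assumptions):
// Sketch: split a file into fixed-size line chunks with overlap for embedding
interface FileChunk {
index: number;
startLine: number;
code: string;
}
export function chunkFile(content: string, chunkSize = 200, overlap = 50): FileChunk[] {
const lines = content.split('\n');
const step = chunkSize - overlap; // advance 150 lines per chunk
const chunks: FileChunk[] = [];
for (let start = 0, index = 0; start < lines.length; start += step, index++) {
chunks.push({
index,
startLine: start + 1,
code: lines.slice(start, start + chunkSize).join('\n'),
});
if (start + chunkSize >= lines.length) break; // final chunk reached end of file
}
return chunks;
}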
Day 5: Implement Caching
// Cache repo index for 24 hours (invalidate on push)
export async function getRepoIndex(
owner: string,
repo: string,
branch: string
): Promise<RepoIndex> {
const cacheKey = `repo:${owner}/${repo}:${branch}`;
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
const repoIndex = await indexRepository(owner, repo, branch);
await redis.setex(cacheKey, 86400, JSON.stringify(repoIndex)); // 24 hours
return repoIndex;
}
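The "invalidate on push" part can hang off the existing GitHub webhook; a rough sketch, where the handler name and changed-file plumbing are assumptions:
// Sketch: drop the cached index when a push lands, then re-index only changed files
export async function onRepoPush(owner: string, repo: string, branch: string, changedFiles: string[]) {
await redis.del(`repo:${owner}/${repo}:${branch}`); // same cache key as getRepoIndex
// changedFiles can be queued here for incremental re-embedding (see Phase 2 cost analysis)
}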
Week 2: Context Retrieval¶
Day 1-3: Retrieve Related Code
// src/lib/rag/context-retriever.ts
export async function retrieveRelevantContext(
issue: VulnerabilityIssue,
repoIndex: RepoIndex,
topK: number = 5
): Promise<CodeContext> {
// 1. Embed the current issue
const issueEmbedding = await generateEmbedding({
path: issue.filePath,
language: issue.language,
code: issue.code,
});
// 2. Find top-K similar code chunks in repo
const similarChunks = await db.execute(sql`
SELECT * FROM repo_embeddings
WHERE repo_id = ${repoIndex.id}
AND file_path != ${issue.filePath} -- Exclude current file
ORDER BY embedding <=> ${issueEmbedding}::vector
LIMIT ${topK}
`);
// 3. Also retrieve explicitly related files (imports, callers)
const relatedFiles = await findRelatedFiles(issue.filePath, repoIndex);
return {
similarCode: similarChunks,
imports: relatedFiles.imports,
callers: relatedFiles.callers,
architecture: repoIndex.architecture,
dependencies: repoIndex.dependencies,
};
}
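findRelatedFiles is referenced above but not shown; a rough sketch over the FileIndex data, where import resolution is simplified and sampleUsage extraction is omitted:
// Sketch: resolve imports and callers of the current file from the repo index
export function findRelatedFiles(filePath: string, repoIndex: RepoIndex) {
const current = repoIndex.files.find((f) => f.path === filePath);
// Modules the current file imports, resolved to repo files where possible
const imports = (current?.imports ?? []).flatMap((name) => {
const target = repoIndex.files.find((f) => f.path.includes(name));
return target ? [{ name, path: target.path, sampleUsage: '' }] : [];
});
// Files whose imports reference the current file (its callers)
const callers = repoIndex.files.filter((f) =>
f.imports.some((imp) => filePath.includes(imp))
);
return { imports, callers };
}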
Day 4-5: Update AI Prompt with Context
const context = await retrieveRelevantContext(issue, repoIndex);
const prompt = `
You are a security engineer fixing vulnerabilities.
**Repository Architecture**: ${context.architecture}
**Dependencies**: ${JSON.stringify(context.dependencies)}
**Related Code in Repository** (similar patterns):
${context.similarCode.map((chunk) => `
File: ${chunk.filePath}
Code:
${chunk.chunkCode}
`).join('\n')}
**Imported Modules** (how they're used elsewhere):
${context.imports.map((imp) => `
${imp.name} from ${imp.path}
Usage: ${imp.sampleUsage}
`).join('\n')}
**Current Issue**:
File: ${issue.filePath}
Line: ${issue.line}
Vulnerability: ${issue.type}
Code:
${issue.code}
Fix this vulnerability while maintaining consistency with the repository's architecture and existing patterns.
`;
Week 3: Integration & Testing¶
Day 1-2: Performance Optimization
// Parallel retrieval (don't block on embeddings)
const [similarFixes, repoContext] = await Promise.all([
retrieveSimilarFixes(issue, teamId, 3), // Phase 1
retrieveRelevantContext(issue, repoIndex, 5), // Phase 2
]);
// Combine contexts
const prompt = buildPrompt({
similarFixes, // Past team fixes
repoContext, // Repository structure
issue,
});
Day 3-5: End-to-End Testing
Test cases:
1. Large file (500+ lines) → Can AI now fix using context from repo?
2. Cross-file vulnerability (function called from multiple files) → Does AI understand callers?
3. Framework-specific pattern → Does AI use repo's existing auth/DB patterns?
Success Criteria¶
Metrics:
- ✅ Can analyze files >300 lines (by retrieving relevant chunks)
- ✅ Cross-file awareness: AI understands how functions are called elsewhere
- ✅ Acceptance rate: 80% → 85%+ (↑5pp)
- ✅ Generation time: <15s (including retrieval overhead)
Qualitative:
- ✅ Fixes match repository architecture (REST vs GraphQL vs Microservices)
- ✅ Fixes use existing helper functions (don't reinvent the wheel)
- ✅ Enterprise feedback: "Understands our entire codebase"
Cost Analysis¶
Repository Indexing:
- One-time per repo: ~500 files × 200 lines = 100K lines
- Embedding cost: 100K lines ≈ 2M tokens × $0.0001 = $0.20 per repo
Incremental Updates:
- Only re-index changed files (GitHub webhook)
- Avg 10 files/day = 2K lines = $0.004/day = $0.12/month
Storage:
- 500 files × 5 chunks/file × 6 KB/chunk = 15 MB per repo
- 100 repos = 1.5 GB = $0.50/month (S3/Postgres)
Retrieval:
- <50ms per query (with ivfflat index)
Total Additional Cost: ~$1-2/month per active repo
Phase 3: Documentation Retrieval¶
Goal¶
Ground AI fixes in official security guidelines and framework documentation.
Timeline¶
Duration: 2 weeks (Q3 2026)
Prerequisites: Phase 2 complete
Implementation¶
Week 1: Index Documentation¶
Day 1-2: Scrape & Index OWASP
// src/lib/rag/docs-indexer.ts
export async function indexOWASPDocs(): Promise<void> {
const owaspTopTen = await fetchOWASPTopTen2025();
for (const category of owaspTopTen) {
const embedding = await generateEmbedding({
title: category.title, // e.g., "A03:2025 - Injection"
description: category.description,
examples: category.examples,
mitigation: category.mitigation,
});
await db.insert(docEmbeddings).values({
docType: 'owasp',
docId: category.id,
title: category.title,
content: category.fullText,
embedding,
});
}
}
Day 3-4: Index Framework Docs
// Index popular framework security docs
const frameworks = [
'express', 'django', 'flask', 'spring-boot',
'react', 'vue', 'angular', 'next.js'
];
for (const framework of frameworks) {
const securityDocs = await fetchFrameworkSecurityDocs(framework);
// Embed and store...
}
Day 5: Index CVE Database
// Daily sync with NVD (National Vulnerability Database)
export async function syncCVEDatabase(): Promise<void> {
const recentCVEs = await fetchRecentCVEs(30); // Last 30 days
for (const cve of recentCVEs) {
const embedding = await generateEmbedding({
cveId: cve.id,
description: cve.description,
affectedPackages: cve.affectedPackages,
mitigation: cve.mitigation,
});
await db.insert(cveEmbeddings).values({
cveId: cve.id,
severity: cve.severity,
description: cve.description,
embedding,
});
}
}
Week 2: Retrieval & Integration¶
Day 1-3: Retrieve Relevant Docs
export async function retrieveRelevantDocs(
issue: VulnerabilityIssue,
framework?: string
): Promise<Documentation> {
const issueEmbedding = await generateEmbedding({
vulnerabilityType: issue.type,
description: issue.message,
code: issue.code,
});
// Retrieve top-3 OWASP docs
const owaspDocs = await retrieveSimilarDocs(issueEmbedding, 'owasp', 3);
// Retrieve top-2 framework docs (if framework detected)
const frameworkDocs = framework
? await retrieveSimilarDocs(issueEmbedding, framework, 2)
: [];
// Retrieve top-2 related CVEs
const cveDocs = await retrieveSimilarDocs(issueEmbedding, 'cve', 2);
return {
owasp: owaspDocs,
framework: frameworkDocs,
cve: cveDocs,
};
}
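retrieveSimilarDocs is assumed above; one cosine-similarity query filtered by document type covers the OWASP and framework cases, while CVEs (stored in cve_embeddings by the indexing code) would need their own branch or a unified table. A minimal sketch, with table and column names as assumptions:
// Sketch: top-K documentation lookup by cosine distance, filtered by doc type
async function retrieveSimilarDocs(embedding: number[], docType: string, topK: number) {
const embeddingLiteral = JSON.stringify(embedding); // pgvector accepts '[...]' literals
return db.execute(sql`
SELECT * FROM doc_embeddings
WHERE doc_type = ${docType}
ORDER BY embedding <=> ${embeddingLiteral}::vector
LIMIT ${topK}
`);
}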
Day 4-5: Update AI Prompt
const docs = await retrieveRelevantDocs(issue, issue.framework);
const prompt = `
You are a security engineer fixing vulnerabilities.
**Official Security Guidelines**:
${docs.owasp.map(doc => `
[${doc.title}](${doc.url})
${doc.description}
Recommended Fix: ${doc.mitigation}
`).join('\n')}
**Framework Best Practices** (${issue.framework}):
${docs.framework.map(doc => `
${doc.title}: ${doc.content}
`).join('\n')}
**Related CVEs**:
${docs.cve.map(doc => `
${doc.cveId} (${doc.severity}): ${doc.description}
Affected: ${doc.affectedPackages}
`).join('\n')}
**Current Issue**:
${issue.code}
Fix following official guidelines. Include citation links in your explanation.
`;
Success Criteria¶
Metrics:
- ✅ Fix explanations include citations (OWASP, CVE, docs)
- ✅ Suggestions align with framework best practices
- ✅ User trust: "Fixes feel authoritative" (qualitative feedback)
- ✅ Acceptance rate: 85% → 88%+ (↑3pp)
Enterprise Value:
- ✅ Compliance reports include OWASP/CVE mappings
- ✅ Option to add internal company security docs (self-hosted)
Cost Analysis¶
One-Time Indexing:
- OWASP Top 10: 10 categories × 5K tokens = $0.05
- Framework Docs: 8 frameworks × 10K tokens = $0.80
- CVE Database: 1000 recent CVEs × 1K tokens = $1.00
Total: $1.85 (one-time)
Daily Sync:
- New CVEs: ~5/day × 1K tokens = $0.0005/day ≈ $0.015/month
Storage:
- 1000 docs × 6 KB = 6 MB = negligible
Total Recurring Cost: <$1/month
Privacy & Security¶
Data Retention Policy¶
| Tier | Retention | Auto-Delete | Export |
|---|---|---|---|
| Free | 30 days | Yes | JSON |
| Team | 90 days | Yes | JSON, CSV |
| Enterprise | 1 year (configurable) | Optional | JSON, CSV, SQL dump |
PII Redaction¶
Automatic Redaction (before storage):
export function redactPII(code: string): string {
return code
.replace(/\b[\w._%+-]+@[\w.-]+\.[A-Z]{2,}\b/gi, '[EMAIL_REDACTED]')
.replace(/\b[A-Z0-9]{20,}\b/g, '[API_KEY_REDACTED]')
.replace(/\bsk_live_[A-Za-z0-9]+/g, '[STRIPE_KEY_REDACTED]')
.replace(/\bAKIA[0-9A-Z]{16}/g, '[AWS_KEY_REDACTED]');
}
On-Premise Deployment¶
Enterprise Option:
- Self-host RAG database in customer VPC
- Embeddings generated on-premise (local OpenAI proxy)
- No code leaves customer infrastructure
- Still benefit from public docs (OWASP, CVE)
Hybrid Mode:
- Private code → On-premise embeddings
- Public docs → CodeSlick cloud embeddings
GDPR Compliance¶
User Rights:
- Right to access: Export all stored suggestions via API
- Right to deletion: DELETE /api/teams/{id}/rag-data
- Right to portability: Download JSON dump
Data Processing Agreement:
- Code snippets = "pseudonymized data" (no PII)
- Encrypted at rest (AES-256)
- Encrypted in transit (TLS 1.3)
- SOC2 Type II compliant (by Q4 2026)
Cost-Benefit Analysis¶
Total Cost (per team, per month)¶
| Phase | Indexing | Retrieval | Storage | Total |
|---|---|---|---|---|
| Phase 0 (Data Collection) | $0 | $0 | $0.10 | $0.10 |
| Phase 1 (Code Style) | $0.05 | $0.02 | $0.15 | $0.22 |
| Phase 2 (Repo Context) | $0.20 | $0.10 | $0.50 | $0.80 |
| Phase 3 (Docs) | $0.10 | $0.05 | $0.05 | $0.20 |
Total Additional Cost: ~$1.50/team/month (all phases combined)
Revenue Impact¶
Without RAG (Current):
- Acceptance rate: 70%
- Churn rate: 5%/month
- MRR per team: €99
With RAG (Projected):
- Acceptance rate: 85% → Higher perceived value
- Churn rate: 3%/month → Better retention (-2pp)
- MRR per team: €99 (same price, higher value)
Churn Reduction Impact:
100 teams × €99/month = €9,900 MRR
Without RAG: 5% churn = -€495/month lost revenue
With RAG: 3% churn = -€297/month lost revenue
Net Impact: +€198/month retained revenue
Annual Impact: +€2,376/year per 100 teams
Cost: ~€150/month ≈ €1,800/year (100 teams × $1.50/month)
ROI: ~1.3× (€2,376 retained / €1,800 cost per year)
Non-Financial Benefits¶
- Competitive Moat: Data flywheel (more usage = better fixes)
- Enterprise Sales: "Learns your codebase" is a killer feature
- User Delight: Fixes that feel "magical" (exactly what they would've written)
- Reduced Support: Fewer "why did AI suggest this?" questions
Rollout Strategy¶
Gradual Rollout¶
Phase 1 (Code Style RAG):
- Week 1: Internal testing (CodeSlick team repos)
- Week 2: 10% of teams (early adopters)
- Week 3: 50% of teams (if metrics good)
- Week 4: 100% of teams (if no issues)
Phase 2 (Repo Context):
- Week 1: 5 beta teams (large repos >100 files)
- Week 2: 20% of teams
- Week 3: 100% of teams
Feature Flags:
const ragConfig = {
codeStyleRAG: process.env.ENABLE_CODE_STYLE_RAG === 'true',
repoContextRAG: process.env.ENABLE_REPO_CONTEXT_RAG === 'true',
docsRAG: process.env.ENABLE_DOCS_RAG === 'true',
};
// Per-team override
const teamConfig = await getTeamRAGConfig(teamId);
const useRAG = teamConfig?.enableRAG ?? ragConfig.codeStyleRAG;
Monitoring & Alerts¶
Key Metrics (track in real-time):
- Acceptance rate (RAG vs baseline)
- False positive rate
- Generation time (p50, p95, p99)
- Cost per fix
- User NPS (weekly survey)
Alerts:
- Acceptance rate drops >5pp → Investigate immediately
- Generation time >15s → Scale retrieval infrastructure
- Cost per fix >$0.50 → Optimize embeddings
Rollback Plan¶
If metrics regress:
1. Disable RAG via feature flag (instant rollback)
2. Investigate root cause (bad embeddings? retrieval quality?)
3. Fix issue in staging environment
4. Re-enable for 10% of teams (test fix)
5. Gradual rollout again
Success Metrics Summary¶
| Metric | Baseline | Phase 1 | Phase 2 | Phase 3 | Target |
|---|---|---|---|---|---|
| Acceptance Rate | 70% | 80% | 85% | 88% | 85%+ |
| False Positive Rate | 5% | 3% | 2.5% | 2% | <3% |
| Modification Rate | 25% | 15% | 10% | 8% | <10% |
| Generation Time | 8s | 9s | 12s | 13s | <15s |
| Cost per Fix | $0.05 | $0.10 | $0.20 | $0.25 | <$0.30 |
| User NPS | 7.5 | 8.0 | 8.5 | 9.0 | 8.5+ |
| Churn Rate | 5% | 4% | 3% | 2.5% | <3% |
Decision Gates¶
Phase 0 → Phase 1¶
Criteria:
- ✅ 100+ fix suggestions collected
- ✅ User actions tracked reliably
- ✅ Code style patterns detected automatically
- ✅ No performance impact
Decision: If criteria met → Proceed to Phase 1 (Q1 2026)
Phase 1 → Phase 2¶
Criteria:
- ✅ Acceptance rate: 70% → 80%+ (↑10pp)
- ✅ Modification rate: 25% → <15% (↓10pp)
- ✅ User NPS: +0.5 points
- ✅ Cost per fix: <$0.10
Decision: If criteria met → Proceed to Phase 2 (Q2 2026)
Phase 2 → Phase 3¶
Criteria:
- ✅ Can analyze files >300 lines
- ✅ Cross-file awareness validated
- ✅ Acceptance rate: 80% → 85%+ (↑5pp)
- ✅ Generation time: <15s
Decision: If criteria met → Proceed to Phase 3 (Q3 2026)
Phase 3 → Private Model¶
Criteria:
- ✅ 10,000+ high-quality labeled examples
- ✅ Acceptance rate: 85%+ sustained
- ✅ Cost per fix: >$0.20 (makes fine-tuning worth it)
- ✅ Enterprise demand: 50+ customers
Decision: Evaluate in Q4 2026
Next Steps (This Week)¶
- ✅ Create database schema (schema-fix-suggestions.ts)
- ✅ Create data collector (fix-suggestion-collector.ts)
- ⏳ Run migration: npx drizzle-kit push:pg
- ⏳ Integrate into Apply Fix API: Collect user actions
- ⏳ Integrate into AI fix generation: Collect suggestions
- ⏳ Add reject endpoint: Collect rejections
Timeline: Complete by November 30, 2025
Questions & Assumptions¶
Assumptions¶
- pgvector performance: Cosine similarity search scales to 100K+ embeddings
- Embedding cost: OpenAI pricing remains stable (~$0.0001/1k tokens)
- Acceptance rate improvement: Users value code style consistency
- Repository size: Avg repo has <500 source files
Open Questions¶
- Multi-repo teams: Index all repos or just active ones?
- Decision: Index top-5 most active repos per team
- Embedding model: text-embedding-3-small vs text-embedding-3-large?
- Decision: Start with small (cheaper, faster), upgrade if needed
- Retrieval quality: How many similar fixes to retrieve (topK)?
- Decision: Start with topK=3, A/B test 3 vs 5 vs 10
- Private model: Which base model to fine-tune?
- Decision: Defer to Q4 2026, depends on availability and cost
Appendix: Technical Details¶
Database Schema¶
See: src/lib/db/schema-fix-suggestions.ts
API Endpoints¶
POST /api/teams/{id}/collect-fix-suggestion
- Stores AI-generated suggestions
- Called from AI fix generation pipeline
POST /api/teams/{id}/apply-fix
- Updates user action to 'accepted'
- Existing endpoint, add one line
POST /api/teams/{id}/reject-fix
- Updates user action to 'rejected'
- NEW endpoint (create in Phase 0; see the sketch below)
GET /api/teams/{id}/rag-analytics
- Returns acceptance rates, false positive rates
- Dashboard analytics
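A sketch of the new reject endpoint, assuming a Next.js App Router handler and the Drizzle schema referenced above; the route path and db client path are assumptions, so adapt to the actual API framework:
// src/app/api/teams/[id]/reject-fix/route.ts (path assumes Next.js App Router)
import { NextResponse } from 'next/server';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed db client path
import { fixSuggestions } from '@/lib/db/schema-fix-suggestions';
export async function POST(request: Request) {
const { fixSuggestionId } = await request.json();
// Mark the suggestion as rejected so it is excluded from future retrieval
await db
.update(fixSuggestions)
.set({ userAction: 'rejected' })
.where(eq(fixSuggestions.id, fixSuggestionId));
return NextResponse.json({ ok: true });
}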
pgvector Setup¶
-- Install extension
CREATE EXTENSION vector;
-- Migrate embeddings from JSONB to vector
ALTER TABLE fix_embeddings
ADD COLUMN embedding_vector vector(1536);
-- Populate from JSONB (one-time migration)
UPDATE fix_embeddings
SET embedding_vector = embedding::text::vector;
-- Create index (IMPORTANT for performance)
CREATE INDEX ON fix_embeddings
USING ivfflat (embedding_vector vector_cosine_ops)
WITH (lists = 100);
-- Drop old JSONB column
ALTER TABLE fix_embeddings DROP COLUMN embedding;
Document Owner: CTO
Last Updated: November 23, 2025
Next Review: January 6, 2026 (before Phase 8.5 kickoff)