RAG Implementation Plan¶
Created: November 23, 2025
Status: Phase 0 (Data Collection) → Ready to implement
Next Review: January 6, 2026 (before Phase 8.5)
Executive Summary¶
This document outlines the phased implementation of Retrieval-Augmented Generation (RAG) for CodeSlick's AI-powered fix suggestions.
Goal: Use RAG to improve fix quality, consistency, and team-specific customization WITHOUT replacing the core LLM.
Approach: Start with lightweight data collection (Phase 0), prove ROI with code style learning (Phase 1), then expand to repository context and documentation retrieval (Phases 2-3).
Timeline:
- Phase 0 (NOW): Data collection infrastructure ✅
- Phase 1 (Q1 2026): Lightweight RAG POC (2 weeks)
- Phase 2 (Q2 2026): Repository context (3 weeks)
- Phase 3 (Q3 2026): Documentation retrieval (2 weeks)
Problem Statement¶
Current Limitations¶
- No Repository Context (300-line limit)
  - AI analyzes files in isolation
  - Doesn't understand project structure, dependencies, or patterns
  - Fixes are generic (Stack Overflow-style)
- No Code Style Awareness
  - AI doesn't learn team preferences (async/await vs .then(), single vs double quotes)
  - Developers must manually edit fixes to match style
  - Lowers acceptance rate
- No Documentation Grounding
  - AI may suggest outdated fixes
  - Missing framework-specific best practices
  - No citations (OWASP, CVE, docs)
Impact on Metrics¶
Without RAG (Current):
- Acceptance rate: 70% (target: 85%+)
- False positive rate: 5% (target: <3%)
- Modification rate: 25% (fixes need manual edits)
- Team NPS: 7.5/10 (target: 8.5+)
With RAG (Projected):
- Acceptance rate: 85%+ (↑15pp)
- False positive rate: <3% (↓2pp)
- Modification rate: <10% (↓15pp)
- Team NPS: 8.5+/10 (↑1 point)
Solution: Three-Phased RAG Implementation¶
Overview¶
| Phase | Focus | Timeline | ROI | Risk |
|---|---|---|---|---|
| Phase 0 | Data Collection | NOW | Foundation | Low |
| Phase 1 | Code Style Learning | Q1 2026 (2 weeks) | High | Low |
| Phase 2 | Repository Context | Q2 2026 (3 weeks) | Very High | Medium |
| Phase 3 | Documentation Retrieval | Q3 2026 (2 weeks) | Medium | Low |
Why This Phasing?¶
- Phase 1 first → Creates data flywheel (the more fixes, the better the next fixes)
- Phase 2 second → Solves 300-line limit (biggest pain point for large files)
- Phase 3 last → Nice-to-have, not critical for PMF
Phase 0: Data Collection Infrastructure ✅ COMPLETE¶
Goal¶
Build infrastructure to collect high-quality labeled data for future RAG.
Implementation¶
Completed (Nov 23, 2025):
- ✅ Database schema (schema-fix-suggestions.ts)
- ✅ Data collector service (fix-suggestion-collector.ts)
- ✅ Helper functions (extractRepoContext, detectCodeStyle)
- ✅ Documentation (src/lib/rag/README.md)
Next Steps (This Week):
1. Run migration to create database tables
2. Integrate into Apply Fix API (collect when the user accepts a fix; see the sketch below)
3. Integrate into AI fix generation (collect when the AI creates a suggestion)
4. Add reject endpoint (collect when the user rejects a fix)
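A minimal integration sketch for step 2, assuming the existing Apply Fix handler; the handler and helper names (applyFix, applyFixToRepo, recordUserAction, ApplyFixRequest) are illustrative assumptions, and only FixSuggestionCollector.collectSuggestion is confirmed elsewhere in this plan, so adapt the call to the collector's actual API:
// Sketch: record the user action inside the existing Apply Fix handler.
// applyFixToRepo and recordUserAction are hypothetical names used for illustration.
import { FixSuggestionCollector } from '@/lib/rag/fix-suggestion-collector';
export async function applyFix(request: ApplyFixRequest) {
const result = await applyFixToRepo(request); // existing Apply Fix logic
// Fire-and-forget so collection never blocks the response (<50ms overhead budget)
void FixSuggestionCollector.recordUserAction({
fixSuggestionId: request.fixSuggestionId,
userAction: 'accepted',
}).catch((err) => console.error('fix-suggestion collection failed', err));
return result;
}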
Data Schema¶
The fix_suggestions table stores (see the illustrative sketch after this list):
- Vulnerability details (type, severity, CVSS, OWASP, CWE)
- Code context (original, fixed, language, framework)
- AI model details (provider, name, tokens, generation time)
- User actions (accepted, rejected, modified, ignored)
- RAG context:
  - repoContext: Dependencies, architecture, related files
  - codeStylePatterns: Quotes, async style, destructuring, etc.
  - relatedDocs: OWASP, CVE, framework docs links
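For orientation, a condensed Drizzle sketch of what such a table could look like; column names here are illustrative assumptions, and the authoritative definitions live in schema-fix-suggestions.ts:
// Illustrative only - see src/lib/db/schema-fix-suggestions.ts for the real schema
import { pgTable, uuid, text, integer, jsonb, timestamp } from 'drizzle-orm/pg-core';
export const fixSuggestions = pgTable('fix_suggestions', {
id: uuid('id').primaryKey().defaultRandom(),
teamId: uuid('team_id').notNull(),
vulnerabilityType: text('vulnerability_type').notNull(), // e.g. 'sql-injection'
severity: text('severity').notNull(),
language: text('language').notNull(),
framework: text('framework'),
originalCode: text('original_code').notNull(),
suggestedFix: text('suggested_fix').notNull(),
explanation: text('explanation'),
userAction: text('user_action'), // 'accepted' | 'rejected' | 'modified' | 'ignored'
acceptanceScore: integer('acceptance_score'),
repoContext: jsonb('repo_context'),
codeStylePatterns: jsonb('code_style_patterns'),
relatedDocs: jsonb('related_docs'),
createdAt: timestamp('created_at').defaultNow().notNull(),
});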
Success Criteria¶
- ✅ 100+ fix suggestions collected in first 2 weeks
- ✅ User actions captured (accept/reject/modify)
- ✅ Code style patterns detected automatically
- ✅ No performance impact on Apply Fix API (<50ms overhead)
Decision Gate: If data collection works reliably for 2 weeks → Proceed to Phase 1
Phase 1: Code Style Learning (Lightweight RAG POC)¶
Goal¶
Prove RAG value with minimal investment: Learn team code style from past accepted fixes.
Timeline¶
Duration: 2 weeks (Q1 2026)
Prerequisites: 100+ accepted fixes in database
Implementation¶
Week 1: Embeddings & Retrieval¶
Day 1-2: Add pgvector
-- Enable pgvector extension
CREATE EXTENSION vector;
-- Migrate fix_embeddings.embedding from JSONB to vector
ALTER TABLE fix_embeddings
ADD COLUMN embedding_vector vector(1536);
-- Create index for similarity search
CREATE INDEX ON fix_embeddings
USING ivfflat (embedding_vector vector_cosine_ops);
Day 3-4: Generate Embeddings
// src/lib/rag/embeddings-generator.ts
import OpenAI from 'openai';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed path to the Drizzle client
import { fixSuggestions, fixEmbeddings } from '@/lib/db/schema-fix-suggestions';
export async function generateFixEmbedding(fix: FixSuggestion): Promise<number[]> {
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Embedding text: combine context for best retrieval
const embeddingText = `
Language: ${fix.language}
Framework: ${fix.framework}
Vulnerability: ${fix.vulnerabilityType}
Original Code: ${fix.originalCode}
Fixed Code: ${fix.suggestedFix}
Explanation: ${fix.explanation}
`.trim();
const response = await openai.embeddings.create({
model: 'text-embedding-3-small', // $0.0001 per 1k tokens
input: embeddingText,
});
return response.data[0].embedding; // 1536 dimensions
}
// Batch process: Embed all past accepted fixes
export async function backfillEmbeddings() {
const acceptedFixes = await db.query.fixSuggestions.findMany({
where: eq(fixSuggestions.userAction, 'accepted'),
});
for (const fix of acceptedFixes) {
const embedding = await generateFixEmbedding(fix);
await db.insert(fixEmbeddings).values({
fixSuggestionId: fix.id,
embedding: embedding, // pgvector will handle this
embeddingText: `${fix.originalCode} → ${fix.suggestedFix}`,
embeddingType: 'fix_pair',
});
}
}
Day 5: Similarity Retrieval
// src/lib/rag/fix-retriever.ts
import { sql } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed path to the Drizzle client
import { generateFixEmbedding } from './embeddings-generator';
export async function retrieveSimilarFixes(
issue: VulnerabilityIssue,
teamId: string,
topK: number = 3
): Promise<FixSuggestion[]> {
// 1. Generate embedding for the current issue (cast: fixed code/explanation not known yet)
const currentEmbedding = await generateFixEmbedding({
language: issue.language,
framework: issue.framework,
vulnerabilityType: issue.type,
originalCode: issue.code,
} as FixSuggestion);
// 2. Find top-K similar past fixes using cosine similarity
// pgvector expects a '[...]' text literal, so serialize the embedding before casting
const embeddingLiteral = JSON.stringify(currentEmbedding);
const similarFixes = await db.execute(sql`
SELECT fs.*, fe.embedding_vector <=> ${embeddingLiteral}::vector AS distance
FROM fix_suggestions fs
JOIN fix_embeddings fe ON fe.fix_suggestion_id = fs.id
WHERE fs.team_id = ${teamId}
AND fs.user_action IN ('accepted', 'modified')
AND fs.language = ${issue.language}
AND fs.acceptance_score > 75
ORDER BY fe.embedding_vector <=> ${embeddingLiteral}::vector
LIMIT ${topK}
`);
return similarFixes;
}
Week 2: Integration & A/B Testing¶
Day 1-2: Update AI Prompt
// src/lib/ai/fix-generator.ts
import { retrieveSimilarFixes } from '@/lib/rag/fix-retriever';
export async function generateFixWithRAG(issue: VulnerabilityIssue, teamId: string) {
// Retrieve similar past fixes accepted by this team
const similarFixes = await retrieveSimilarFixes(issue, teamId, 3);
// Enhanced prompt with RAG context
const prompt = `
You are a security engineer fixing vulnerabilities in code.
**Past fixes in this repository** (use as style guide):
${similarFixes.map((fix, i) => `
Example ${i + 1}:
- File: ${fix.filePath}
- Vulnerability: ${fix.vulnerabilityType}
- Original Code:
${fix.originalCode}
- Fixed Code:
${fix.suggestedFix}
- User Action: ${fix.userAction} (acceptance score: ${fix.acceptanceScore}/100)
- Code Style: ${JSON.stringify(fix.codeStylePatterns)}
`).join('\n')}
**Current issue**:
- File: ${issue.filePath}
- Vulnerability: ${issue.type}
- Severity: ${issue.severity}
- Code:
${issue.code}
**Instructions**:
1. Fix the vulnerability following the style of past accepted fixes
2. Match the code style patterns (quotes, async style, destructuring, etc.)
3. Provide a diff and explanation
Respond in JSON format: { fixedCode, explanation, confidence }
`;
const response = await aiModel.chat({
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'user', content: prompt }],
});
return parseAIResponse(response);
}
Day 3-4: A/B Test
// Randomly assign 50% to RAG, 50% to baseline
const useRAG = Math.random() < 0.5;
const fix = useRAG
? await generateFixWithRAG(issue, teamId)
: await generateFixBaseline(issue);
// Store which version was used (have generateFixWithRAG return the retrieved-fix count
// alongside the fix so it can be logged here)
await FixSuggestionCollector.collectSuggestion({
...fix,
metadata: {
ragEnabled: useRAG,
similarFixesCount: useRAG ? fix.similarFixesCount : 0,
},
});
Day 5: Analyze Results
-- Compare acceptance rates: RAG vs Baseline
SELECT
metadata->>'ragEnabled' as rag_enabled,
COUNT(*) as total_suggestions,
SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) as accepted,
ROUND(100.0 * SUM(CASE WHEN user_action = 'accepted' THEN 1 ELSE 0 END) / COUNT(*), 2) as acceptance_rate,
AVG(acceptance_score) as avg_acceptance_score
FROM fix_suggestions
WHERE created_at >= NOW() - INTERVAL '7 days'
GROUP BY metadata->>'ragEnabled';
Success Criteria¶
Metrics (measured over 2 weeks):
- ✅ Acceptance rate: 70% → 80%+ (↑10pp minimum)
- ✅ Modification rate: 25% → <15% (↓10pp minimum)
- ✅ User NPS: +0.5 points
- ✅ Cost per fix: <$0.10 additional (embeddings are cheap)
Qualitative:
- ✅ Fixes match team code style without manual edits
- ✅ No performance regression (<10s generation time maintained)
- ✅ User feedback: "Fixes feel more tailored to our codebase"
Decision Gate: If all criteria met → Proceed to Phase 2. If not → Investigate root cause before expanding.
Cost Analysis¶
Embedding Generation:
- Model: text-embedding-3-small ($0.0001 per 1k tokens)
- Avg fix: 500 tokens = $0.00005 per embedding
- 1000 fixes/month = $0.05/month
Storage (pgvector):
- 1536 dimensions × 4 bytes = 6.14 KB per embedding
- 1000 embeddings = 6.14 MB
- Cost: negligible
Retrieval:
- Cosine similarity search: <10ms (with ivfflat index)
- 1000 retrievals/month = negligible compute cost
Total Additional Cost: <$5/month (includes buffer)
Phase 2: Repository Context¶
Goal¶
Break the 300-line limit by retrieving relevant context from the entire repository.
Timeline¶
Duration: 3 weeks (Q2 2026)
Prerequisites: Phase 1 success validated
Implementation¶
Week 1: Repository Indexing¶
Day 1-2: Index Repository Structure
// src/lib/rag/repo-indexer.ts
export interface RepoIndex {
files: FileIndex[];
dependencies: Record<string, string>;
architecture: string; // "REST API", "GraphQL", "Microservices"
}
export interface FileIndex {
path: string;
language: string;
imports: string[]; // Imported modules
exports: string[]; // Exported functions/classes
functions: FunctionSignature[];
astHash: string; // For change detection
}
export async function indexRepository(
owner: string,
repo: string,
branch: string
): Promise<RepoIndex> {
// 1. Fetch all source files from GitHub
const files = await fetchAllSourceFiles(owner, repo, branch);
// 2. Parse each file to extract structure
const fileIndexes = await Promise.all(
files.map(async (file) => {
const ast = await parseFileAST(file.content, file.language);
return {
path: file.path,
language: file.language,
imports: extractImports(ast),
exports: extractExports(ast),
functions: extractFunctionSignatures(ast),
astHash: hashAST(ast),
};
})
);
// 3. Detect architecture pattern
const architecture = detectArchitecture(fileIndexes);
// 4. Extract dependencies
const dependencies = await extractDependencies(owner, repo, branch);
return {
files: fileIndexes,
dependencies,
architecture,
};
}
Day 3-4: Embed File Chunks
// Chunk strategy: Embed each file in 200-line chunks with 50-line overlap
export async function embedRepositoryFiles(repoIndex: RepoIndex): Promise<void> {
for (const file of repoIndex.files) {
const fileContent = await fetchFileContent(file.path);
const chunks = chunkFile(fileContent, 200, 50); // 200 lines, 50 overlap
for (const chunk of chunks) {
const embedding = await generateEmbedding({
path: file.path,
language: file.language,
imports: file.imports,
code: chunk.code,
});
await db.insert(repoEmbeddings).values({
repoId: repoIndex.id,
filePath: file.path,
chunkIndex: chunk.index,
chunkCode: chunk.code,
embedding: embedding,
});
}
}
}
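chunkFile is not defined above; a minimal line-based sketch of the 200-line / 50-line-overlap strategy could look like this (names and the FileChunk shape are assumptions):
// Sketch: split a file into fixed-size line chunks with overlap for embedding
interface FileChunk {
index: number;
startLine: number;
code: string;
}
export function chunkFile(content: string, chunkSize = 200, overlap = 50): FileChunk[] {
const lines = content.split('\n');
const step = chunkSize - overlap; // advance 150 lines per chunk
const chunks: FileChunk[] = [];
for (let start = 0, index = 0; start < lines.length; start += step, index++) {
chunks.push({
index,
startLine: start + 1,
code: lines.slice(start, start + chunkSize).join('\n'),
});
if (start + chunkSize >= lines.length) break; // final chunk reached end of file
}
return chunks;
}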
Day 5: Implement Caching
// Cache repo index for 24 hours (invalidate on push)
export async function getRepoIndex(
owner: string,
repo: string,
branch: string
): Promise<RepoIndex> {
const cacheKey = `repo:${owner}/${repo}:${branch}`;
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
const repoIndex = await indexRepository(owner, repo, branch);
await redis.setex(cacheKey, 86400, JSON.stringify(repoIndex)); // 24 hours
return repoIndex;
}
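The "invalidate on push" part can hang off the existing GitHub webhook; a rough sketch, where the handler name and changed-file plumbing are assumptions:
// Sketch: drop the cached index when a push lands, then re-index only changed files
export async function onRepoPush(owner: string, repo: string, branch: string, changedFiles: string[]) {
await redis.del(`repo:${owner}/${repo}:${branch}`); // same cache key as getRepoIndex
// changedFiles can be queued here for incremental re-embedding (see Phase 2 cost analysis)
}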
Week 2: Context Retrieval¶
Day 1-3: Retrieve Related Code
// src/lib/rag/context-retriever.ts
export async function retrieveRelevantContext(
issue: VulnerabilityIssue,
repoIndex: RepoIndex,
topK: number = 5
): Promise<CodeContext> {
// 1. Embed the current issue
const issueEmbedding = await generateEmbedding({
path: issue.filePath,
language: issue.language,
code: issue.code,
});
// 2. Find top-K similar code chunks in repo
const similarChunks = await db.execute(sql`
SELECT * FROM repo_embeddings
WHERE repo_id = ${repoIndex.id}
AND file_path != ${issue.filePath} -- Exclude current file
ORDER BY embedding <=> ${issueEmbedding}::vector
LIMIT ${topK}
`);
// 3. Also retrieve explicitly related files (imports, callers)
const relatedFiles = await findRelatedFiles(issue.filePath, repoIndex);
return {
similarCode: similarChunks,
imports: relatedFiles.imports,
callers: relatedFiles.callers,
architecture: repoIndex.architecture,
dependencies: repoIndex.dependencies,
};
}
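findRelatedFiles is referenced above but not shown; a rough sketch over the FileIndex data, where import resolution is simplified and sampleUsage extraction is omitted:
// Sketch: resolve imports and callers of the current file from the repo index
export function findRelatedFiles(filePath: string, repoIndex: RepoIndex) {
const current = repoIndex.files.find((f) => f.path === filePath);
// Modules the current file imports, resolved to repo files where possible
const imports = (current?.imports ?? []).flatMap((name) => {
const target = repoIndex.files.find((f) => f.path.includes(name));
return target ? [{ name, path: target.path, sampleUsage: '' }] : [];
});
// Files whose imports reference the current file (its callers)
const callers = repoIndex.files.filter((f) =>
f.imports.some((imp) => filePath.includes(imp))
);
return { imports, callers };
}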
Day 4-5: Update AI Prompt with Context
const context = await retrieveRelevantContext(issue, repoIndex);
const prompt = `
You are a security engineer fixing vulnerabilities.
**Repository Architecture**: ${context.architecture}
**Dependencies**: ${JSON.stringify(context.dependencies)}
**Related Code in Repository** (similar patterns):
${context.similarCode.map((chunk) => `
File: ${chunk.filePath}
Code:
${chunk.chunkCode}
`).join('\n')}
**Imported Modules** (how they're used elsewhere):
${context.imports.map((imp) => `
${imp.name} from ${imp.path}
Usage: ${imp.sampleUsage}
`).join('\n')}
**Current Issue**:
File: ${issue.filePath}
Line: ${issue.line}
Vulnerability: ${issue.type}
Code:
${issue.code}
Fix this vulnerability while maintaining consistency with the repository's architecture and existing patterns.
`;
Week 3: Integration & Testing¶
Day 1-2: Performance Optimization
// Parallel retrieval (don't block on embeddings)
const [similarFixes, repoContext] = await Promise.all([
retrieveSimilarFixes(issue, teamId, 3), // Phase 1
retrieveRelevantContext(issue, repoIndex, 5), // Phase 2
]);
// Combine contexts
const prompt = buildPrompt({
similarFixes, // Past team fixes
repoContext, // Repository structure
issue,
});
Day 3-5: End-to-End Testing
Test cases:
1. Large file (500+ lines) → Can AI now fix using context from repo?
2. Cross-file vulnerability (function called from multiple files) → Does AI understand callers?
3. Framework-specific pattern → Does AI use repo's existing auth/DB patterns?
Success Criteria¶
Metrics:
- ✅ Can analyze files >300 lines (by retrieving relevant chunks)
- ✅ Cross-file awareness: AI understands how functions are called elsewhere
- ✅ Acceptance rate: 80% → 85%+ (↑5pp)
- ✅ Generation time: <15s (including retrieval overhead)
Qualitative:
- ✅ Fixes match repository architecture (REST vs GraphQL vs Microservices)
- ✅ Fixes use existing helper functions (don't reinvent the wheel)
- ✅ Enterprise feedback: "Understands our entire codebase"
Cost Analysis¶
Repository Indexing:
- One-time per repo: ~500 files × 200 lines = 100K lines
- Embedding cost: 100K lines ≈ 2M tokens × $0.0001 = $0.20 per repo
Incremental Updates:
- Only re-index changed files (GitHub webhook)
- Avg 10 files/day = 2K lines = $0.004/day = $0.12/month
Storage:
- 500 files × 5 chunks/file × 6 KB/chunk = 15 MB per repo
- 100 repos = 1.5 GB = $0.50/month (S3/Postgres)
Retrieval:
- <50ms per query (with ivfflat index)
Total Additional Cost: ~$1-2/month per active repo
Phase 3: Documentation Retrieval¶
Goal¶
Ground AI fixes in official security guidelines and framework documentation.
Timeline¶
Duration: 2 weeks (Q3 2026)
Prerequisites: Phase 2 complete
Implementation¶
Week 1: Index Documentation¶
Day 1-2: Scrape & Index OWASP
// src/lib/rag/docs-indexer.ts
export async function indexOWASPDocs(): Promise<void> {
const owaspTopTen = await fetchOWASPTopTen2025();
for (const category of owaspTopTen) {
const embedding = await generateEmbedding({
title: category.title, // e.g., "A03:2025 - Injection"
description: category.description,
examples: category.examples,
mitigation: category.mitigation,
});
await db.insert(docEmbeddings).values({
docType: 'owasp',
docId: category.id,
title: category.title,
content: category.fullText,
embedding,
});
}
}
Day 3-4: Index Framework Docs
// Index popular framework security docs
const frameworks = [
'express', 'django', 'flask', 'spring-boot',
'react', 'vue', 'angular', 'next.js'
];
for (const framework of frameworks) {
const securityDocs = await fetchFrameworkSecurityDocs(framework);
// Embed and store...
}
Day 5: Index CVE Database
// Daily sync with NVD (National Vulnerability Database)
export async function syncCVEDatabase(): Promise<void> {
const recentCVEs = await fetchRecentCVEs(30); // Last 30 days
for (const cve of recentCVEs) {
const embedding = await generateEmbedding({
cveId: cve.id,
description: cve.description,
affectedPackages: cve.affectedPackages,
mitigation: cve.mitigation,
});
await db.insert(cveEmbeddings).values({
cveId: cve.id,
severity: cve.severity,
description: cve.description,
embedding,
});
}
}
Week 2: Retrieval & Integration¶
Day 1-3: Retrieve Relevant Docs
export async function retrieveRelevantDocs(
issue: VulnerabilityIssue,
framework?: string
): Promise<Documentation> {
const issueEmbedding = await generateEmbedding({
vulnerabilityType: issue.type,
description: issue.message,
code: issue.code,
});
// Retrieve top-3 OWASP docs
const owaspDocs = await retrieveSimilarDocs(issueEmbedding, 'owasp', 3);
// Retrieve top-2 framework docs (if framework detected)
const frameworkDocs = framework
? await retrieveSimilarDocs(issueEmbedding, framework, 2)
: [];
// Retrieve top-2 related CVEs
const cveDocs = await retrieveSimilarDocs(issueEmbedding, 'cve', 2);
return {
owasp: owaspDocs,
framework: frameworkDocs,
cve: cveDocs,
};
}
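retrieveSimilarDocs is assumed above; one cosine-similarity query filtered by document type covers the OWASP and framework cases, while CVEs (stored in cve_embeddings by the indexing code) would need their own branch or a unified table. A minimal sketch, with table and column names as assumptions:
// Sketch: top-K documentation lookup by cosine distance, filtered by doc type
async function retrieveSimilarDocs(embedding: number[], docType: string, topK: number) {
const embeddingLiteral = JSON.stringify(embedding); // pgvector accepts '[...]' literals
return db.execute(sql`
SELECT * FROM doc_embeddings
WHERE doc_type = ${docType}
ORDER BY embedding <=> ${embeddingLiteral}::vector
LIMIT ${topK}
`);
}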
Day 4-5: Update AI Prompt
const docs = await retrieveRelevantDocs(issue, issue.framework);
const prompt = `
You are a security engineer fixing vulnerabilities.
**Official Security Guidelines**:
${docs.owasp.map(doc => `
[${doc.title}](${doc.url})
${doc.description}
Recommended Fix: ${doc.mitigation}
`).join('\n')}
**Framework Best Practices** (${issue.framework}):
${docs.framework.map(doc => `
${doc.title}: ${doc.content}
`).join('\n')}
**Related CVEs**:
${docs.cve.map(doc => `
${doc.cveId} (${doc.severity}): ${doc.description}
Affected: ${doc.affectedPackages}
`).join('\n')}
**Current Issue**:
${issue.code}
Fix following official guidelines. Include citation links in your explanation.
`;
Success Criteria¶
Metrics:
- ✅ Fix explanations include citations (OWASP, CVE, docs)
- ✅ Suggestions align with framework best practices
- ✅ User trust: "Fixes feel authoritative" (qualitative feedback)
- ✅ Acceptance rate: 85% → 88%+ (↑3pp)
Enterprise Value:
- ✅ Compliance reports include OWASP/CVE mappings
- ✅ Option to add internal company security docs (self-hosted)
Cost Analysis¶
One-Time Indexing:
- OWASP Top 10: 10 categories × 5K tokens = $0.05
- Framework Docs: 8 frameworks × 10K tokens = $0.80
- CVE Database: 1000 recent CVEs × 1K tokens = $1.00
Total: $1.85 (one-time)
Daily Sync:
- New CVEs: ~5/day × 1K tokens = $0.0005/day ≈ $0.015/month
Storage:
- 1000 docs × 6 KB = 6 MB = negligible
Total Recurring Cost: <$1/month
Privacy & Security¶
Data Retention Policy¶
| Tier | Retention | Auto-Delete | Export |
|---|---|---|---|
| Free | 30 days | Yes | JSON |
| Team | 90 days | Yes | JSON, CSV |
| Enterprise | 1 year (configurable) | Optional | JSON, CSV, SQL dump |
PII Redaction¶
Automatic Redaction (before storage):
export function redactPII(code: string): string {
return code
.replace(/\b[\w._%+-]+@[\w.-]+\.[A-Z]{2,}\b/gi, '[EMAIL_REDACTED]')
.replace(/\b[A-Z0-9]{20,}\b/g, '[API_KEY_REDACTED]')
.replace(/\bsk_live_[A-Za-z0-9]+/g, '[STRIPE_KEY_REDACTED]')
.replace(/\bAKIA[0-9A-Z]{16}/g, '[AWS_KEY_REDACTED]');
}
On-Premise Deployment¶
Enterprise Option:
- Self-host RAG database in customer VPC
- Embeddings generated on-premise (local OpenAI proxy)
- No code leaves customer infrastructure
- Still benefit from public docs (OWASP, CVE)
Hybrid Mode:
- Private code → On-premise embeddings
- Public docs → CodeSlick cloud embeddings
GDPR Compliance¶
User Rights:
- Right to access: Export all stored suggestions via API
- Right to deletion: DELETE /api/teams/{id}/rag-data
- Right to portability: Download JSON dump
Data Processing Agreement:
- Code snippets = "pseudonymized data" (no PII)
- Encrypted at rest (AES-256)
- Encrypted in transit (TLS 1.3)
- SOC2 Type II compliant (by Q4 2026)
Cost-Benefit Analysis¶
Total Cost (per team, per month)¶
| Phase | Indexing | Retrieval | Storage | Total |
|---|---|---|---|---|
| Phase 0 (Data Collection) | $0 | $0 | $0.10 | $0.10 |
| Phase 1 (Code Style) | $0.05 | $0.02 | $0.15 | $0.22 |
| Phase 2 (Repo Context) | $0.20 | $0.10 | $0.50 | $0.80 |
| Phase 3 (Docs) | $0.10 | $0.05 | $0.05 | $0.20 |
Total Additional Cost: ~$1.50/team/month (all phases combined)
Revenue Impact¶
Without RAG (Current):
- Acceptance rate: 70%
- Churn rate: 5%/month
- MRR per team: €99
With RAG (Projected):
- Acceptance rate: 85% → Higher perceived value
- Churn rate: 3%/month → Better retention (-2pp)
- MRR per team: €99 (same price, higher value)
Churn Reduction Impact:
100 teams × €99/month = €9,900 MRR
Without RAG: 5% churn = -€495/month lost revenue
With RAG: 3% churn = -€297/month lost revenue
Net Impact: +€198/month retained revenue
Annual Impact: +€2,376/year per 100 teams
Cost: ~€150/month ≈ €1,800/year (100 teams × $1.50/month)
ROI: ~1.3× (€2,376 retained / €1,800 cost per year)
Non-Financial Benefits¶
- Competitive Moat: Data flywheel (more usage = better fixes)
- Enterprise Sales: "Learns your codebase" is a killer feature
- User Delight: Fixes that feel "magical" (exactly what they would've written)
- Reduced Support: Fewer "why did AI suggest this?" questions
Rollout Strategy¶
Gradual Rollout¶
Phase 1 (Code Style RAG):
- Week 1: Internal testing (CodeSlick team repos)
- Week 2: 10% of teams (early adopters)
- Week 3: 50% of teams (if metrics good)
- Week 4: 100% of teams (if no issues)
Phase 2 (Repo Context):
- Week 1: 5 beta teams (large repos >100 files)
- Week 2: 20% of teams
- Week 3: 100% of teams
Feature Flags:
const ragConfig = {
codeStyleRAG: process.env.ENABLE_CODE_STYLE_RAG === 'true',
repoContextRAG: process.env.ENABLE_REPO_CONTEXT_RAG === 'true',
docsRAG: process.env.ENABLE_DOCS_RAG === 'true',
};
// Per-team override
const teamConfig = await getTeamRAGConfig(teamId);
const useRAG = teamConfig?.enableRAG ?? ragConfig.codeStyleRAG;
Monitoring & Alerts¶
Key Metrics (track in real-time):
- Acceptance rate (RAG vs baseline)
- False positive rate
- Generation time (p50, p95, p99)
- Cost per fix
- User NPS (weekly survey)
Alerts:
- Acceptance rate drops >5pp → Investigate immediately
- Generation time >15s → Scale retrieval infrastructure
- Cost per fix >$0.50 → Optimize embeddings
Rollback Plan¶
If metrics regress:
1. Disable RAG via feature flag (instant rollback)
2. Investigate root cause (bad embeddings? retrieval quality?)
3. Fix issue in staging environment
4. Re-enable for 10% of teams (test fix)
5. Gradual rollout again
Success Metrics Summary¶
| Metric | Baseline | Phase 1 | Phase 2 | Phase 3 | Target |
|---|---|---|---|---|---|
| Acceptance Rate | 70% | 80% | 85% | 88% | 85%+ |
| False Positive Rate | 5% | 3% | 2.5% | 2% | <3% |
| Modification Rate | 25% | 15% | 10% | 8% | <10% |
| Generation Time | 8s | 9s | 12s | 13s | <15s |
| Cost per Fix | $0.05 | $0.10 | $0.20 | $0.25 | <$0.30 |
| User NPS | 7.5 | 8.0 | 8.5 | 9.0 | 8.5+ |
| Churn Rate | 5% | 4% | 3% | 2.5% | <3% |
Decision Gates¶
Phase 0 → Phase 1¶
Criteria:
- ✅ 100+ fix suggestions collected
- ✅ User actions tracked reliably
- ✅ Code style patterns detected automatically
- ✅ No performance impact
Decision: If criteria met → Proceed to Phase 1 (Q1 2026)
Phase 1 → Phase 2¶
Criteria:
- ✅ Acceptance rate: 70% → 80%+ (↑10pp)
- ✅ Modification rate: 25% → <15% (↓10pp)
- ✅ User NPS: +0.5 points
- ✅ Cost per fix: <$0.10
Decision: If criteria met → Proceed to Phase 2 (Q2 2026)
Phase 2 → Phase 3¶
Criteria:
- ✅ Can analyze files >300 lines
- ✅ Cross-file awareness validated
- ✅ Acceptance rate: 80% → 85%+ (↑5pp)
- ✅ Generation time: <15s
Decision: If criteria met → Proceed to Phase 3 (Q3 2026)
Phase 3 → Private Model¶
Criteria:
- ✅ 10,000+ high-quality labeled examples
- ✅ Acceptance rate: 85%+ sustained
- ✅ Cost per fix: >$0.20 (makes fine-tuning worth it)
- ✅ Enterprise demand: 50+ customers
Decision: Evaluate in Q4 2026
Next Steps (This Week)¶
- ✅ Create database schema (schema-fix-suggestions.ts)
- ✅ Create data collector (fix-suggestion-collector.ts)
- ⏳ Run migration: npx drizzle-kit push:pg
- ⏳ Integrate into Apply Fix API: Collect user actions
- ⏳ Integrate into AI fix generation: Collect suggestions
- ⏳ Add reject endpoint: Collect rejections
Timeline: Complete by November 30, 2025
Questions & Assumptions¶
Assumptions¶
- pgvector performance: Cosine similarity search scales to 100K+ embeddings
- Embedding cost: OpenAI pricing remains stable (~$0.0001/1k tokens)
- Acceptance rate improvement: Users value code style consistency
- Repository size: Avg repo has <500 source files
Open Questions¶
- Multi-repo teams: Index all repos or just active ones?
- Decision: Index top-5 most active repos per team
- Embedding model: text-embedding-3-small vs text-embedding-3-large?
- Decision: Start with small (cheaper, faster), upgrade if needed
- Retrieval quality: How many similar fixes to retrieve (topK)?
- Decision: Start with topK=3, A/B test 3 vs 5 vs 10
- Private model: Which base model to fine-tune?
- Decision: Defer to Q4 2026, depends on availability and cost
Appendix: Technical Details¶
Database Schema¶
See: src/lib/db/schema-fix-suggestions.ts
API Endpoints¶
POST /api/teams/{id}/collect-fix-suggestion
- Stores AI-generated suggestions
- Called from AI fix generation pipeline
POST /api/teams/{id}/apply-fix
- Updates user action to 'accepted'
- Existing endpoint, add one line
POST /api/teams/{id}/reject-fix
- Updates user action to 'rejected'
- NEW endpoint (create in Phase 0; see the sketch below)
GET /api/teams/{id}/rag-analytics
- Returns acceptance rates, false positive rates
- Dashboard analytics
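A sketch of the new reject endpoint, assuming a Next.js App Router handler and the Drizzle schema referenced above; the route path and db client path are assumptions, so adapt to the actual API framework:
// src/app/api/teams/[id]/reject-fix/route.ts (path assumes Next.js App Router)
import { NextResponse } from 'next/server';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db'; // assumed db client path
import { fixSuggestions } from '@/lib/db/schema-fix-suggestions';
export async function POST(request: Request) {
const { fixSuggestionId } = await request.json();
// Mark the suggestion as rejected so it is excluded from future retrieval
await db
.update(fixSuggestions)
.set({ userAction: 'rejected' })
.where(eq(fixSuggestions.id, fixSuggestionId));
return NextResponse.json({ ok: true });
}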
pgvector Setup¶
-- Install extension
CREATE EXTENSION vector;
-- Migrate embeddings from JSONB to vector
ALTER TABLE fix_embeddings
ADD COLUMN embedding_vector vector(1536);
-- Populate from JSONB (one-time migration)
UPDATE fix_embeddings
SET embedding_vector = embedding::text::vector;
-- Create index (IMPORTANT for performance)
CREATE INDEX ON fix_embeddings
USING ivfflat (embedding_vector vector_cosine_ops)
WITH (lists = 100);
-- Drop old JSONB column
ALTER TABLE fix_embeddings DROP COLUMN embedding;
Document Owner: CTO
Last Updated: November 23, 2025
Next Review: January 6, 2026 (before Phase 8.5 kickoff)