LLM Subjective Review System
An LLM-based blind review pipeline that evaluates code design quality in batches and merges the results of multiple review rounds.
Sub-problems
1. Review packet generation with file context
2. Batch splitting and parallel subagent execution
3. Evidence-weighted score merging across batches
4. Finding deduplication and concept matching
5. Dimension-specific prompt contract injection for review consistency
6. Score-independent evidence weighting prevents score-aware bias in merge
7. Dynamic score cap based on finding severity prevents high-score-with-findings contradiction
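Finding deduplication and concept matching (sub-problem 4) can be sketched as follows. This is a minimal illustration, not Desloppify's implementation: the `Finding` fields, the stopword list, and the idea of matching on a normalized token set are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical severity scale and stopword list, chosen only for this sketch.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}
STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and"}


@dataclass
class Finding:
    dimension: str
    summary: str
    severity: str
    files: set[str] = field(default_factory=set)


def concept_key(finding: Finding) -> tuple[str, frozenset[str]]:
    """Reduce a finding to (dimension, set of significant lowercase tokens)."""
    tokens = re.findall(r"[a-z0-9_]+", finding.summary.lower())
    return finding.dimension, frozenset(t for t in tokens if t not in STOPWORDS)


def dedupe(findings: list[Finding]) -> list[Finding]:
    """Merge findings that share a concept key.

    The merged finding keeps the first summary seen, the highest severity,
    and the union of all referenced files. Inputs are not mutated.
    """
    merged: dict[tuple[str, frozenset[str]], Finding] = {}
    for f in findings:
        key = concept_key(f)
        kept = merged.get(key)
        if kept is None:
            merged[key] = Finding(f.dimension, f.summary, f.severity, set(f.files))
            continue
        kept.files |= f.files
        if SEVERITY_RANK[f.severity] > SEVERITY_RANK[kept.severity]:
            kept.severity = f.severity
    return list(merged.values())
```

Two batches that report "The parser module leaks state" and "parser module leaks the state" under the same dimension collapse into one finding here. A real system would likely need fuzzier matching (for example, token overlap thresholds) than exact token-set equality.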
Solutions by project (1 solution)
Cross-project comparison
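| Dimension | Desloppify |
|---|---|
| Review method | Blind review by LLM subagents, run in parallel across 9 thematic batch categories |
| Evaluation dimensions | JSON-driven, configurable dimensions (15+, including abstraction_fitness and design_coherence) |
| Evaluation granularity | Per-dimension score from 0 to 100, evidence-weighted merging, and downward finding pressure |
| Iteration mechanism | Auto-resolve of stale findings, plus incremental reruns to fill in missing dimensions |
| Blind review isolation | Blind packet strips existing scores to eliminate LLM anchoring |
| Merge algorithm | 70% weighted mean + 30% lowest batch score + severity-based pressure penalty + dynamic score cap |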
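The merge algorithm row can be read as a small formula. The sketch below is one plausible interpretation, not the project's actual code: the 70/30 blend comes from the table, but the per-severity penalty sizes, the cap values, and the `BatchResult` shape are hypothetical. The evidence weight is assumed to be derived from evidence volume only, never from the score itself.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical values: the source names these mechanisms but gives no numbers.
PRESSURE_PENALTY = {"low": 1.0, "medium": 3.0, "high": 6.0}
SCORE_CAP = {"low": 95.0, "medium": 85.0, "high": 70.0}


@dataclass
class BatchResult:
    score: float            # 0-100 score for one dimension from one batch
    evidence_weight: float  # assumed to depend on evidence only, not on score
    severities: list[str]   # severities of this batch's findings


def merge_dimension(batches: list[BatchResult]) -> float:
    """Blend batch scores, apply finding pressure, then cap by worst severity."""
    if not batches:
        raise ValueError("no batches to merge")
    total_weight = sum(b.evidence_weight for b in batches)
    if total_weight <= 0:
        raise ValueError("evidence weights must sum to a positive value")

    weighted_mean = sum(b.score * b.evidence_weight for b in batches) / total_weight
    floor = min(b.score for b in batches)
    blended = 0.7 * weighted_mean + 0.3 * floor

    severities = [s for b in batches for s in b.severities]
    penalty = sum(PRESSURE_PENALTY[s] for s in severities)
    # The strictest cap among all findings wins; no findings means no cap.
    cap = min((SCORE_CAP[s] for s in severities), default=100.0)

    return max(0.0, min(blended - penalty, cap))
```

With two finding-free batches scoring 90 (weight 2) and 60 (weight 1), the weighted mean is 80 and the floor is 60, so the merged score is about 74 (0.7 × 80 + 0.3 × 60). Under these assumed caps, a single high-severity finding holds the merged score to at most 70 regardless of the batch scores, which is the contradiction the dynamic cap is meant to prevent.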
Best practices
1. Fail-closed import validation prevents low-quality reviews from being imported
2. Blind review isolation prevents score-aware bias
3. 70/30 weighted-mean/floor blend prevents a single outlier batch from dominating
4. Positive observation filter rejects non-defect findings at import time
5. Prompt contract ensures score-finding consistency across all subagents
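Practices 1, 2, and 4 can be illustrated together. The following is a rough sketch under stated assumptions: the packet and review are taken to be plain dicts, and the key names in `SCORE_KEYS`, the phrases in `POSITIVE_MARKERS`, and the `ImportRejected` exception are all hypothetical rather than taken from Desloppify.

```python
from __future__ import annotations

# Hypothetical field names and phrases, chosen only for this sketch.
SCORE_KEYS = {"score", "previous_score", "dimension_scores"}
POSITIVE_MARKERS = ("well designed", "good use of", "no issues")


class ImportRejected(ValueError):
    """Raised when a review result must not be imported."""


def make_blind_packet(packet):
    """Return a copy of the packet with score-bearing keys removed at every level."""
    if isinstance(packet, dict):
        return {k: make_blind_packet(v) for k, v in packet.items() if k not in SCORE_KEYS}
    if isinstance(packet, list):
        return [make_blind_packet(v) for v in packet]
    return packet


def validate_review(review: dict) -> dict:
    """Fail closed: raise on any malformed part instead of importing a partial review."""
    scores = review.get("scores")
    findings = review.get("findings")
    if not isinstance(scores, dict) or not isinstance(findings, list):
        raise ImportRejected("review needs 'scores' (dict) and 'findings' (list)")
    for dim, score in scores.items():
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if not is_number or not 0 <= score <= 100:
            raise ImportRejected(f"score for {dim!r} is not a number in 0-100")
    for f in findings:
        if not isinstance(f, dict):
            raise ImportRejected("finding is not an object")
        dim = f.get("dimension")
        if not isinstance(dim, str) or dim not in scores:
            raise ImportRejected("finding refers to an unscored dimension")
        text = str(f.get("summary", "")).strip().lower()
        # Positive observation filter: praise is not a defect, so reject it.
        if not text or any(marker in text for marker in POSITIVE_MARKERS):
            raise ImportRejected("finding is empty or a positive observation")
    return review
```

The design choice is that a single bad field rejects the whole review; skipping only the bad finding would be the fail-open alternative. Substring matching on praise phrases is a deliberately crude stand-in for whatever filter the real pipeline uses.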