Problem Domain / PD-502

Anti-Gaming Score Integrity

An integrity-protection mechanism that prevents AI agents or humans from raising quality scores by gaming the scoring system.

Sub-problems

1. Target score proximity detection

2. Blind review session isolation

3. Low-score evidence enforcement

4. Wontfix penalty in strict scoring

5. High-score missing justification detection

6. Unverified fix status exploitation

7. Evidence-free score inflation in batch merging

8. Provenance chain tampering for assessment trust escalation
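The first sub-problem, together with the two-level warn/penalized response this page attributes to Desloppify, can be sketched roughly as follows. This is a minimal illustration, not Desloppify's actual code: the names `GamingStatus`, `proximity_matches`, and `apply_gaming_response` are hypothetical, and the ±5% band is assumed here to mean 5 absolute points on a 0-100 scale.

```python
from enum import Enum


class GamingStatus(Enum):
    CLEAN = "clean"
    WARN = "warn"
    PENALIZED = "penalized"


def proximity_matches(scores: dict[str, float], target: float, band: float = 5.0) -> list[str]:
    """Return the dimensions whose score lies within `band` points of `target`.

    Assumption: the band is measured in absolute points, not relative percent.
    """
    return sorted(dim for dim, s in scores.items() if abs(s - target) <= band)


def apply_gaming_response(
    scores: dict[str, float], target: float, band: float = 5.0
) -> tuple[GamingStatus, dict[str, float]]:
    """Single match warns; multiple matches reset the matching dimensions to 0.0."""
    matches = proximity_matches(scores, target, band)
    if not matches:
        return GamingStatus.CLEAN, dict(scores)
    if len(matches) == 1:
        return GamingStatus.WARN, dict(scores)
    # Penalized: zero out only the suspicious dimensions so they must be re-reviewed.
    reset = {dim: (0.0 if dim in matches else s) for dim, s in scores.items()}
    return GamingStatus.PENALIZED, reset


status, adjusted = apply_gaming_response(
    {"naming": 94.0, "tests": 96.5, "docs": 70.0}, target=95.0
)
print(status.value, adjusted)
```

Two of the three dimensions sit inside the band around the target of 95, so the sketch returns the penalized status and zeroes those two while leaving `docs` untouched.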

Solutions by project (1 solution)

Signals

Side-by-side comparison

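Dimension: Desloppify
Detection mechanism: tolerance-band proximity detection (±5%) with a two-level warn → penalized response
Scoring channels: three modes (lenient/strict/verified_strict) run in parallel; wontfix and unverified fixes receive graded penalties
Evidence requirements: enforced in both directions; low scores (<85) require a finding, high scores (>85) require an issues_note, and violations are rejected fail-closed
Blind-review isolation: a four-layer provenance trust chain built on a SHA-256 packet hash, a runner allowlist, and attestation-phrase verification
Merge strategy: evidence-weighted merge (1+evidence+findings); finding pressure applies a penalty and a cap
Penalty mechanism: the penalized state resets the matching dimensions to 0.0 and forces a re-review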
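The bidirectional evidence requirement in the comparison above could look roughly like this. It is a sketch under stated assumptions: `Assessment`, `EvidenceError`, `validate_assessment`, and `import_assessments` are hypothetical names, and rejecting the whole batch on the first violation is one possible reading of "fail-closed". The source does not say what happens at exactly 85, so that score passes either check here.

```python
from dataclasses import dataclass, field

THRESHOLD = 85.0  # boundary from the comparison; behavior at exactly 85 is unspecified


@dataclass
class Assessment:
    dimension: str
    score: float
    findings: list[str] = field(default_factory=list)
    issues_note: str = ""


class EvidenceError(ValueError):
    """Raised when an assessment lacks the evidence its score requires."""


def validate_assessment(a: Assessment) -> None:
    # Low scores must point at at least one concrete finding.
    if a.score < THRESHOLD and not a.findings:
        raise EvidenceError(f"{a.dimension}: score {a.score} needs a finding")
    # High scores must explain which issues, if any, remain.
    if a.score > THRESHOLD and not a.issues_note.strip():
        raise EvidenceError(f"{a.dimension}: score {a.score} needs an issues_note")


def import_assessments(batch: list[Assessment]) -> list[Assessment]:
    """Fail closed (assumed semantics): any violation rejects the entire batch."""
    for a in batch:
        validate_assessment(a)
    return list(batch)


accepted = import_assessments([
    Assessment("naming", 70.0, findings=["inconsistent casing in module names"]),
    Assessment("tests", 92.0, issues_note="two flaky tests remain"),
])
print(len(accepted))
```

Validating every entry before returning anything means a partially valid import never reaches the score store, which is the point of failing closed rather than skipping bad rows.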

Best practices

1. Fail-closed import validation for review findings

2. Evidence-weighted assessment merging

3. Score-independent evidence weighting (1+evidence+findings) prevents hollow high scores from diluting substantive reviews

4. Three-mode scoring channels (lenient/strict/verified_strict) serve different trust levels without score manipulation

5. Bidirectional evidence mandate: low scores need findings, high scores need issue explanations

6. Two-stage gaming response: single match warns, multiple matches reset to zero
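To make the evidence-weighted merge concrete, here is a minimal sketch that reads `1+evidence+findings` as a per-review weight in a weighted average. That reading is an assumption, as are the names `Review` and `merge_scores` and the idea that `evidence` and `findings` are simple counts. The finding-pressure penalty and cap are not modeled, since the source gives no formula for them.

```python
from dataclasses import dataclass


@dataclass
class Review:
    score: float
    evidence_count: int = 0
    findings_count: int = 0

    @property
    def weight(self) -> int:
        # The weight depends only on evidence and findings, never on the score,
        # so claiming a higher score cannot buy more influence.
        return 1 + self.evidence_count + self.findings_count


def merge_scores(reviews: list[Review]) -> float:
    """Weighted mean of review scores using the 1+evidence+findings weight."""
    if not reviews:
        raise ValueError("nothing to merge")
    total_weight = sum(r.weight for r in reviews)
    return sum(r.score * r.weight for r in reviews) / total_weight


hollow = Review(score=100.0)  # no evidence, no findings: weight 1
substantive = Review(score=60.0, evidence_count=2, findings_count=1)  # weight 4
print(merge_scores([hollow, substantive]))  # 68.0, versus 80.0 for a plain mean
```

The evidence-free perfect score pulls the merged result up by only 8 points instead of 20, which is how score-independent weighting keeps hollow high scores from diluting a substantive review.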