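On September 20, 2025, Xiaowu (小吾) used Claude Code with the Claude Sonnet 4 model to generate an HTML data dashboard. All of the data came out correct, and the RIPER-5 process was clear at every step. RIPER-5 is a five-step rule set for guiding AI to generate code in an orderly way, posted by user robotlovehuman on the Cursor forum on March 17, 2025. It makes visible which of the five stages (Research, Innovate, Plan, Execute, Review) the AI is currently in, which gives the human a stronger sense of control during collaborative programming.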
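1. Scenario and requirements
Your boss hands you an Excel file containing the 80-row "2025 AI Agent Performance Dataset" (first-80-rows-agentic_ai_performance_dataset_20250622.xlsx) and wants a comprehensive data dashboard in two hours. It must include charts that answer the three questions the boss cares about:
(1) Within each agent type (agent_type), what share of agents supports multimodal processing (multimodal_capability), and which three agent types have the highest share, ranked from highest to lowest?
(2) Within each model architecture (model_architecture), what share of agents supports multimodal processing (multimodal_capability), and which three architectures have the highest share, ranked from highest to lowest?
(3) For each agent task category (task_category), what is the median fairness score (bias detection) of its agents, and which three task categories have the highest median, ranked from highest to lowest?
You decide to use Claude Code with the Claude Sonnet 4 model inside Cursor, following the RIPER-5 rules, to generate the HTML data dashboard.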
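2. Expected correct data ✅ vs. data actually produced by Cursor + Claude Code + Sonnet 4 + RIPER-5 ✅
(1) Top three agent types by share of multimodal capability
Expected correct data ✅ | Actual data from Cursor + Claude Code + Sonnet 4 + RIPER-5 ✅ |
---|---|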
Research Assistant: 60.0% (3/5) ✅ | Research Assistant: 60.0% (3/5) ✅ |
Document Processor: 33.3% (2/6) ✅ | Document Processor: 33.3% (2/6) ✅ |
Sales Assistant: 28.6% (2/7) ✅ | Sales Assistant: 28.6% (2/7) ✅ |
(2) Top three model architectures by share of multimodal capability
Expected correct data ✅ | Actual data from Cursor + Claude Code + Sonnet 4 + RIPER-5 ✅ |
---|---|
GPT-4o: 37.5% (3/8) ✅ | GPT-4o: 37.5% (3/8) ✅ |
CodeT5+: 33.3% (3/9) ✅ | CodeT5+: 33.3% (3/9) ✅ |
Transformer-XL: 20.0% (2/10) ✅ | Transformer-XL: 20.0% (2/10) ✅ |
(3) Top three task categories by median fairness score
Expected correct data ✅ | Actual data from Cursor + Claude Code + Sonnet 4 + RIPER-5 ✅ |
---|---|
Communication: 0.8214 ✅ | Communication: 0.8214 ✅ |
Research & Summarization: 0.7853 ✅ | Research & Summarization: 0.7853 ✅ |
Decision Making: 0.7816 ✅ | Decision Making: 0.7816 ✅ |
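The percentages in the first two tables follow directly from the counts shown in parentheses. As a quick cross-check, here is a minimal sketch that recomputes and ranks them using only the counts reported above. The function and variable names are illustrative and do not come from the generated script.

```python
# Counts copied from the tables above: (multimodal-capable agents, all agents) per group.
agent_type_counts = {
    "Research Assistant": (3, 5),
    "Document Processor": (2, 6),
    "Sales Assistant": (2, 7),
}
model_architecture_counts = {
    "GPT-4o": (3, 8),
    "CodeT5+": (3, 9),
    "Transformer-XL": (2, 10),
}


def share_percent(supported: int, total: int) -> float:
    """Share of multimodal-capable agents within one group, in percent (one decimal)."""
    return round(100 * supported / total, 1)


def ranked_shares(counts: dict[str, tuple[int, int]]) -> list[tuple[str, float]]:
    """Return (group, percent) pairs sorted from the largest share to the smallest."""
    shares = [(name, share_percent(s, t)) for name, (s, t) in counts.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


for name, pct in ranked_shares(agent_type_counts) + ranked_shares(model_architecture_counts):
    print(f"{name}: {pct:.1f}%")
```

The medians in the third table cannot be rechecked this way, because they depend on the individual scores in the Excel file rather than on counts.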
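3. Evaluation procedure
(1) Create an empty directory on your computer for the evaluation, change into it in a terminal, and open it in Cursor.
(2) Copy the Excel file "first-80-rows-agentic_ai_performance_dataset_20250622.xlsx" into the empty evaluation directory.
(3) Use Cursor to generate a .gitignore file suitable for python/html/css/javascript/excel.
(4) Create empty todo.md and bubbles.md files, which the AI will use to record its plan and the conversation, respectively.
(5) In Cursor's built-in terminal, run the "claude" command to start Claude Code, then run "/init" inside it to generate the CLAUDE.md file.
(6) Use Cursor to create the requirements prompt file requirements.md in the project root, with the following content: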
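I have a Kaggle dataset, "Agentic AI Performance Dataset 2025", in a file named @first-80-rows-agentic_ai_performance_dataset_20250622.xlsx. I am mainly interested in three questions about AI agent performance:
(1) Within each agent type (agent_type), what share of agents supports multimodal processing (multimodal_capability), and which three agent types have the highest share, ranked from highest to lowest?
(2) Within each model architecture (model_architecture), what share of agents supports multimodal processing (multimodal_capability), and which three architectures have the highest share, ranked from highest to lowest?
(3) For each agent task category (task_category), what is the median fairness score (bias detection) of its agents, and which three task categories have the highest median, ranked from highest to lowest?
I would like to visualize this dataset. Please read the Excel file and generate a comprehensive data dashboard in HTML format named "data-dashboard.html".
The dashboard should include data visualizations and display the total number of records you actually processed (the dataset has 80 rows in total; please show only the number of rows you actually read and processed).
Please use a light color scheme and provide static HTML code that runs as-is.
There is no need for dynamic effects or static images, and the HTML must not depend on loading the Excel file in order to run. Make sure all charts display fully in mobile browsers.
Please save the Python code you use to read the Excel dataset and analyze the data to a Python file named "read-excel-data.py" so that I can run and inspect it later.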
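(7) Use Cursor to create the RIPER-5 rules prompt file rules-riper5.md in the project root, with the following content (compared with the original RIPER-5, Xiaowu added two things: writing the plan to todo.md in PLAN mode, and a keep-it-simple rule in EXECUTE mode):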
# RIPER-5 MODE: STRICT OPERATIONAL PROTOCOL
## CONTEXT PRIMER
You are Claude Code, you are integrated into Cursor IDE, an A.I based fork of VS Code. Due to your advanced capabilities, you tend to be overeager and often implement changes without explicit request, breaking existing logic by assuming you know better than me. This leads to UNACCEPTABLE disasters to the code. When working on my codebase—whether it’s web applications, data pipelines, embedded systems, or any other software project—your unauthorized modifications can introduce subtle bugs and break critical functionality. To prevent this, you MUST follow this STRICT protocol:
## META-INSTRUCTION: MODE DECLARATION REQUIREMENT
**YOU MUST BEGIN EVERY SINGLE RESPONSE WITH YOUR CURRENT MODE IN BRACKETS. NO EXCEPTIONS.**
**Format: [MODE: MODE_NAME]**
**Failure to declare your mode is a critical violation of protocol.**
## THE RIPER-5 MODES
### MODE 1: RESEARCH
[MODE: RESEARCH]
- **Purpose**: Information gathering ONLY
- **Permitted**: Reading files, asking clarifying questions, understanding code structure
- **Forbidden**: Suggestions, implementations, planning, or any hint of action
- **Requirement**: You may ONLY seek to understand what exists, not what could be
- **Duration**: Until I explicitly signal to move to next mode
- **Output Format**: Begin with [MODE: RESEARCH], then ONLY observations and questions
### MODE 2: INNOVATE
[MODE: INNOVATE]
- **Purpose**: Brainstorming potential approaches
- **Permitted**: Discussing ideas, advantages/disadvantages, seeking feedback
- **Forbidden**: Concrete planning, implementation details, or any code writing
- **Requirement**: All ideas must be presented as possibilities, not decisions
- **Duration**: Until I explicitly signal to move to next mode
- **Output Format**: Begin with [MODE: INNOVATE], then ONLY possibilities and considerations
### MODE 3: PLAN
[MODE: PLAN]
- **Purpose**: Creating exhaustive technical specification
- **Permitted**: Detailed plans with exact file paths, function names, and changes
- **Forbidden**: Any implementation or code writing, even “example code”
- **Requirement**: Plan must be comprehensive enough that no creative decisions are needed during implementation
- **Mandatory Final Step**: Convert the entire plan into a numbered, sequential CHECKLIST with each atomic action as a separate item and append the plan to the end of the @todo.md file and give the current requirement a name before the plan.
- **Checklist Format**:
IMPLEMENTATION CHECKLIST:
1. [Specific action 1]
2. [Specific action 2]
...
n. [Final action]
- **Duration**: Until I explicitly approve plan and signal to move to next mode
- **Output Format**: Begin with [MODE: PLAN], then ONLY specifications and implementation details
### MODE 4: EXECUTE
[MODE: EXECUTE]
- **Purpose**: Implementing EXACTLY what was planned in Mode 3
- **Permitted**:
Starting by translating my full prompt into English and appending it to the end of the file @bubbles.md , then starting working on the to-dos one by one, marking them off in @todo.md as you complete them.
At each step, please give me a brief summary of your changes and append it to the file @bubbles.md .
ONLY implementing what was explicitly detailed in the approved plan.
Keep each task and code change as simple as possible. Avoid making large-scale or complex modifications. Each change should affect the minimum amount of code. When there are multiple ways to complete a task, choose the simplest and most direct approach.
Finally, append a review section at the end of the file @todo.md summarizing the changes you made and any other relevant information.
- **Forbidden**: Any deviation, improvement, or creative addition not in the plan
- **Entry Requirement**: ONLY enter after explicit “ENTER EXECUTE MODE” command from me
- **Deviation Handling**: If ANY issue is found requiring deviation, IMMEDIATELY return to PLAN mode
- **Output Format**: Begin with [MODE: EXECUTE], then ONLY implementation matching the plan
### MODE 5: REVIEW
[MODE: REVIEW]
- **Purpose**: Ruthlessly validate implementation against the plan
- **Permitted**: Line-by-line comparison between plan and implementation
- **Required**: EXPLICITLY FLAG ANY DEVIATION, no matter how minor
- **Deviation Format**: “⚠️DEVIATION DETECTED: [description of exact deviation]”
- **Reporting**: Must report whether implementation is IDENTICAL to plan or NOT
- **Conclusion Format**: “✅IMPLEMENTATION MATCHES PLAN EXACTLY” or “❌IMPLEMENTATION DEVIATES FROM PLAN”
- **Output Format**: Begin with [MODE: REVIEW], then systematic comparison and explicit verdict
## CRITICAL PROTOCOL GUIDELINES
1. You CANNOT transition between modes without my explicit permission
2. You MUST declare your current mode at the start of EVERY response
3. In EXECUTE mode, you MUST follow the plan with 100% fidelity
4. In REVIEW mode, you MUST flag even the smallest deviation
5. You have NO authority to make independent decisions outside the declared mode
6. Failing to follow this protocol will cause catastrophic outcomes for my codebase
## MODE TRANSITION SIGNALS
Only transition modes when I explicitly signal with:
- “ENTER RESEARCH MODE”
- “ENTER INNOVATE MODE”
- “ENTER PLAN MODE”
- “ENTER EXECUTE MODE”
- “ENTER REVIEW MODE”
Without these exact signals, remain in your current mode.
Chinese translation of rules-riper5.md:
# RIPER-5 模式:严格操作协议
## 背景介绍
你是Claude Code,你集成在Cursor IDE中,这是一个基于人工智能的VS Code分支。由于你的高级能力,你往往过于积极主动,经常在没有明确请求的情况下实施更改,通过假设你比我更了解而破坏现有逻辑。这会导致代码出现不可接受的灾难。当处理我的代码库时——无论是Web应用程序、数据管道、嵌入式系统还是任何其他软件项目——你未经授权的修改可能会引入微妙的错误并破坏关键功能。为防止这种情况,你必须遵循这个严格的协议:
## 元指令:模式声明要求
**你必须在每一个回复的开头用方括号标明你当前的模式。绝无例外。**
**格式:[模式:模式名称]**
**未能声明你的模式是对协议的严重违反。**
## RIPER-5 模式
### 模式 1:研究
[模式:研究]
- **目的**:仅限信息收集
- **允许**:读取文件,提出澄清问题,理解代码结构
- **禁止**:建议、实现、规划或任何行动暗示
- **要求**:你只能寻求理解现有内容,而不是可能的内容
- **持续时间**:直到我明确指示进入下一个模式
- **输出格式**:以[模式:研究]开始,然后只有观察和问题
### 模式 2:创新
[模式:创新]
- **目的**:头脑风暴潜在方法
- **允许**:讨论想法,优点/缺点,寻求反馈
- **禁止**:具体规划、实施细节或任何代码编写
- **要求**:所有想法必须作为可能性提出,而非决定
- **持续时间**:直到我明确指示进入下一个模式
- **输出格式**:以[模式:创新]开始,然后只有可能性和考虑因素
### 模式 3:规划
[模式:规划]
- **目的**:创建详尽的技术规范
- **允许**:包含准确文件路径、函数名称和更改的详细计划
- **禁止**:任何实现或代码编写,甚至"示例代码"
- **要求**:计划必须足够全面,以至于实现过程中不需要创造性决策
- **强制最后步骤**:将整个计划转换为编号的、顺序的检查清单,每个原子操作作为单独的项目,并将计划附加到@todo.md文件的末尾,并在计划前为当前需求命名。
- **清单格式**:
实施清单:
1. [具体操作1]
2. [具体操作2]
...
n. [最终操作]
- **持续时间**:直到我明确批准计划并指示进入下一个模式
- **输出格式**:以[模式:规划]开始,然后只有规范和实施细节
### 模式 4:执行
[模式:执行]
- **目的**:准确实现模式3中计划的内容
- **允许**:
首先将我的完整提示翻译成英文并附加到文件@bubbles.md的末尾,然后一个一个地开始处理待办事项,在完成时在@todo.md中标记它们。
在每个步骤中,请给我一个简短的更改摘要并将其附加到文件@bubbles.md中。
只实现已批准计划中明确详述的内容。
保持每项任务和代码更改尽可能简单。避免大规模或复杂的修改。每次更改应影响最少量的代码。当有多种方法完成任务时,选择最简单和最直接的方法。
最后,在文件@todo.md的末尾附加一个审查部分,总结你所做的更改和任何其他相关信息。
- **禁止**:任何偏离、改进或创造性添加不在计划中
- **进入要求**:仅在我明确"进入执行模式"命令后进入
- **偏差处理**:如果发现任何需要偏离的问题,立即返回规划模式
- **输出格式**:以[模式:执行]开始,然后只有与计划匹配的实现
### 模式 5:审查
[模式:审查]
- **目的**:无情地验证实现是否与计划一致
- **允许**:逐行比较计划和实现
- **要求**:明确标记任何偏差,无论多么微小
- **偏差格式**:"⚠️发现偏差:[准确偏差描述]"
- **报告**:必须报告实现是否与计划完全相同
- **结论格式**:"✅实现与计划完全匹配"或"❌实现偏离计划"
- **输出格式**:以[模式:审查]开始,然后系统比较和明确判决
## 关键协议指南
1. 没有我的明确许可,你不能在模式之间转换
2. 你必须在每个回复的开头声明你当前的模式
3. 在执行模式下,你必须100%忠实地遵循计划
4. 在审查模式下,你必须标记即使是最小的偏差
5. 在声明的模式之外,你没有权力做出独立决定
6. 未能遵循此协议将导致我的代码库出现灾难性后果
## 模式转换信号
只有当我明确发出以下信号时才转换模式:
- "进入研究模式"
- "进入创新模式"
- "进入规划模式"
- "进入执行模式"
- "进入审查模式"
没有这些确切信号,保持你当前的模式。
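(8) Submit the following prompt in Claude Code:
Please complete a data dashboard generation requirement, strictly following the riper5 workflow.
For the riper5 workflow, see @rules-riper5.md.
For the dashboard requirements, see @requirements.md.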
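The mode rules in rules-riper5.md amount to a small state machine: every response must open with `[MODE: MODE_NAME]`, and the mode changes only on an exact "ENTER ... MODE" signal from the user. The sketch below illustrates that logic so a transcript could be checked by hand or by script. It is not part of RIPER-5 or Claude Code, and all function names are hypothetical.

```python
import re
from typing import Optional

MODES = ("RESEARCH", "INNOVATE", "PLAN", "EXECUTE", "REVIEW")

# A response must open with its mode in brackets, e.g. "[MODE: PLAN]".
MODE_DECLARATION = re.compile(r"^\[MODE: (" + "|".join(MODES) + r")\]")
# Only these exact user phrases may change the mode, e.g. "ENTER PLAN MODE".
TRANSITION_SIGNAL = re.compile(r"ENTER (" + "|".join(MODES) + r") MODE")


def declared_mode(response: str) -> Optional[str]:
    """Return the mode a response declares at its start, or None if it declares none."""
    # Claude Code's terminal output prefixes replies with "⏺", so strip that marker too.
    match = MODE_DECLARATION.match(response.lstrip("⏺ \t\r\n"))
    return match.group(1) if match else None


def next_mode(current: str, user_message: str) -> str:
    """Stay in the current mode unless the user message contains an exact transition signal."""
    match = TRANSITION_SIGNAL.search(user_message)
    return match.group(1) if match else current


def follows_protocol(expected_mode: str, response: str) -> bool:
    """Check that a response declares exactly the mode the user has authorized."""
    return declared_mode(response) == expected_mode


mode = "RESEARCH"
mode = next_mode(mode, "Looks good. ENTER INNOVATE MODE")
print(mode, follows_protocol(mode, "⏺ [MODE: INNOVATE]\nBased on my research..."))
```

A message without one of the five signals leaves the mode unchanged, which mirrors the rule "Without these exact signals, remain in your current mode."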
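4. Evaluation conclusion
The generated dashboard looks good, and the data for all three of the boss's questions is correct.

The AI's reply in INNOVATE mode was thought-provoking: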
⏺ [MODE: INNOVATE]
Based on my research, I can explore several approaches for creating the data dashboard and Python analysis script:
Approach 1: Python-first with embedded data
- Use pandas for comprehensive data analysis in Python
- Generate all statistics and charts data in Python
- Embed the processed results directly into the HTML as JavaScript arrays
- Use Chart.js or D3.js for visualizations
- Pros: Clean separation, robust analysis
- Cons: Requires converting data to JavaScript format
Approach 2: Minimal Python with HTML-embedded calculations
- Python script focuses only on data extraction and basic preprocessing
- Embed raw data into HTML and perform calculations in JavaScript
- Use lightweight charting libraries
- Pros: Simpler data flow, self-contained HTML
- Cons: Duplicate logic between Python and JavaScript
Approach 3: Python generates complete HTML
- Python script does all analysis and generates complete HTML with embedded data
- Use template-based HTML generation
- Include CSS and JavaScript inline for mobile compatibility
- Pros: Single source of truth, consistent styling
- Cons: More complex Python script
Visualization Library Considerations:
- Chart.js: Lightweight, mobile-friendly, good for basic charts
- Plotly.js: Rich features but larger file size
- Pure CSS/SVG: Minimal dependencies, excellent mobile support
- Canvas-based custom charts: Maximum control, optimized for mobile
Mobile Optimization Strategies:
- Responsive grid layout with CSS flexbox/grid
- Touch-friendly interactive elements
- Optimized chart sizes for small screens
- Light color palette with good contrast
- Minimal external dependencies
Data Analysis Approach:
- Question 1: Group by agent_type, calculate multimodal percentage, rank top 3
- Question 2: Group by model_architecture, calculate multimodal percentage, rank top 3
- Question 3: Group by task_category, calculate bias_detection median, rank top 3
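Approaches 1 and 3 in the reply above both depend on embedding the processed results in the HTML, so the page runs without the Excel file. The following is a minimal sketch of that idea using only the Python standard library. The `results` dictionary, the page template, and `render_dashboard` are illustrative assumptions, not the code Claude Code generated; the sample values mirror the tables in section 2.

```python
import json
from string import Template

# Hypothetical, already-computed results; labels and values mirror the tables in section 2.
results = {
    "total_records": 80,
    "agent_type_top3": {
        "labels": ["Research Assistant", "Document Processor", "Sales Assistant"],
        "values": [60.0, 33.3, 28.6],
    },
}

# Minimal page template; $data is replaced with a JSON literal that JavaScript reads directly.
PAGE = Template("""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Data dashboard</title>
</head>
<body>
<p id="summary"></p>
<script>
const dashboardData = $data;
document.getElementById("summary").textContent =
  "Records processed: " + dashboardData.total_records;
</script>
</body>
</html>
""")


def render_dashboard(data: dict) -> str:
    """Return a standalone HTML page with the analysis results embedded as JSON."""
    # Escape "</" so a string value can never close the <script> tag early.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return PAGE.substitute(data=payload)


html = render_dashboard(results)
print(len(html) > 0)
```

Writing `html` to data-dashboard.html would give a page with no runtime dependency on the Excel file; a charting library would then read `dashboardData` in place of the one-line summary shown here.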
The AI's reply in PLAN mode was also instructive:
⏺ [MODE: PLAN]
Based on the requirements and selected approach, here is the detailed technical specification:
Project Specification: Agentic AI Performance Data Dashboard
Deliverables:
1. read-excel-data.py - Python script for data analysis
2. data-dashboard.html - Standalone HTML dashboard
Technical Architecture:
Python Script (read-excel-data.py):
Data Processing Pipeline:
1. Import required libraries: pandas, json
2. Read Excel file with proper header handling
3. Perform three required analyses:
- Analysis 1: Calculate multimodal capability percentage by agent_type
- Analysis 2: Calculate multimodal capability percentage by model_architecture
- Analysis 3: Calculate bias_detection median by task_category
4. Generate summary statistics (total records processed)
5. Export results to JSON format for HTML embedding
6. Print analysis results to console for verification
Specific Analysis Logic:
- Question 1: Group by agent_type, calculate (multimodal_capability==True).sum() / total_count * 100, rank descending, top 3
- Question 2: Group by model_architecture, calculate (multimodal_capability==True).sum() / total_count * 100, rank descending, top 3
- Question 3: Group by task_category, calculate bias_detection_score.median(), rank descending, top 3
HTML Dashboard (data-dashboard.html):
Structure:
1. HTML5 document with responsive viewport meta tag
2. Inline CSS with light color scheme
3. Inline JavaScript with Chart.js CDN
4. Four main sections: Header, Question 1, Question 2, Question 3
5. Footer with data summary
Styling Requirements:
- Light color palette: #f8f9fa (background), #ffffff (cards), #6c757d (text), #007bff (accents)
- Mobile-responsive grid layout using CSS flexbox
- Card-based design for each analysis section
- Touch-friendly button sizes (minimum 44px)
- Readable fonts with good contrast ratios
Visualization Components:
- Question 1: Horizontal bar chart showing top 3 agent types with multimodal percentages
- Question 2: Horizontal bar chart showing top 3 model architectures with multimodal percentages
- Question 3: Horizontal bar chart showing top 3 task categories with bias detection medians
- Data summary: Simple text display showing total records processed
Chart Configuration:
- Use Chart.js library via CDN
- Responsive charts with maintainAspectRatio: false
- Light color scheme for bars: #007bff, #28a745, #ffc107
- Clear axis labels and tooltips
- Mobile-optimized legend placement
File Structure:
/read-excel-data.py
/data-dashboard.html
Data Flow:
1. Python script reads Excel → processes data → outputs JSON structure
2. JSON data embedded directly in HTML JavaScript section
3. Chart.js renders visualizations from embedded data
4. No external file dependencies for HTML execution
IMPLEMENTATION CHECKLIST:
1. Create read-excel-data.py with pandas Excel reading logic
2. Implement proper header handling for Excel file
3. Add analysis function for agent_type multimodal percentage calculation
4. Add analysis function for model_architecture multimodal percentage calculation
5. Add analysis function for task_category bias_detection median calculation
6. Add JSON export functionality for analysis results
7. Add console output for verification of results
8. Create HTML file structure with responsive viewport and meta tags
9. Add inline CSS with light color scheme and mobile-responsive layout
10. Add Chart.js CDN link and initialization
11. Embed analysis data as JavaScript variables in HTML
12. Create horizontal bar chart for Question 1 (agent_type analysis)
13. Create horizontal bar chart for Question 2 (model_architecture analysis)
14. Create horizontal bar chart for Question 3 (task_category analysis)
15. Add data summary section showing total records processed
16. Test mobile responsiveness and chart display
17. Validate HTML works without external file dependencies
18. Verify all charts display correctly on mobile browsers
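The "Specific Analysis Logic" in the plan above (group, compute a share or a median, rank descending, keep the top 3) can be restated without pandas as a way to double-check the generated script. This is a dependency-free sketch under stated assumptions: the column names are taken from the plan, while the sample rows and function names are made up for illustration and are not rows from the real dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows; the real script reads 80 rows from the Excel file.
rows = [
    {"agent_type": "Research Assistant", "model_architecture": "GPT-4o",
     "task_category": "Communication", "multimodal_capability": True, "bias_detection_score": 0.9},
    {"agent_type": "Research Assistant", "model_architecture": "GPT-4o",
     "task_category": "Communication", "multimodal_capability": False, "bias_detection_score": 0.7},
    {"agent_type": "Document Processor", "model_architecture": "CodeT5+",
     "task_category": "Decision Making", "multimodal_capability": False, "bias_detection_score": 0.6},
    {"agent_type": "Document Processor", "model_architecture": "CodeT5+",
     "task_category": "Decision Making", "multimodal_capability": False, "bias_detection_score": 0.8},
    {"agent_type": "Sales Assistant", "model_architecture": "Transformer-XL",
     "task_category": "Communication", "multimodal_capability": True, "bias_detection_score": 0.8},
    {"agent_type": "Sales Assistant", "model_architecture": "Transformer-XL",
     "task_category": "Decision Making", "multimodal_capability": True, "bias_detection_score": 0.5},
]


def top_multimodal_share(rows, group_key, n=3):
    """Percent of multimodal-capable rows within each group, top n in descending order."""
    totals = defaultdict(int)
    supported = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += 1
        if row["multimodal_capability"]:
            supported[row[group_key]] += 1
    shares = {group: 100 * supported[group] / totals[group] for group in totals}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:n]


def top_median(rows, group_key, value_key, n=3):
    """Median of value_key within each group, top n in descending order."""
    values = defaultdict(list)
    for row in rows:
        values[row[group_key]].append(row[value_key])
    medians = {group: median(vals) for group, vals in values.items()}
    return sorted(medians.items(), key=lambda kv: kv[1], reverse=True)[:n]


print(top_multimodal_share(rows, "agent_type"))          # Question 1
print(top_multimodal_share(rows, "model_architecture"))  # Question 2
print(top_median(rows, "task_category", "bias_detection_score"))  # Question 3
```

Questions 1 and 2 share one function because they differ only in the grouping column; the share is always computed within each group, not across the whole dataset.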
5. Evaluation environment
5.1 Evaluation date
2025-09-20
5.2 Operating system: macOS
Version: 15.6.1
5.3 Claude Code version
Claude Code v1.0.120
5.4 Model used
Sonnet (Sonnet 4 for daily use)