RAG Evaluation Sheet
Download a structured testing workbook for evaluating clinical RAG system accuracy, citation quality, and safety.
The sheet is a structured CSV template with 50 rows, including 5 example questions, for testing clinical RAG accuracy, citations, hallucinations, and safety.
Sheet Fields
| Field | Description |
|---|---|
| Question ID | Unique identifier for each test question (Q001–Q050) |
| Question | The clinical question posed to the RAG system |
| Category | Question type: Clinical guideline, Drug query, Emergency, Edge case, Adversarial |
| Expected Source | The authoritative source document that should be retrieved |
| Retrieved Source | The actual source document returned by the system |
| Answer | The RAG system's generated response |
| Citation Present (Y/N) | Whether the answer includes source citations |
| Unsupported Claim (Y/N) | Whether the answer contains claims not supported by retrieved sources |
| Clinical Risk Level | Severity of potential harm if the answer is incorrect: Low, Medium, or High |
| Reviewer Notes | Free-text comments from the clinician reviewer |
| Pass/Fail | Overall assessment for this test question |
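If the download is unavailable, or you want to rebuild an empty sheet for a new test round, the column layout above can be reproduced with a short script. The following is a minimal Python sketch, not the generator used for the actual download: the `FIELDS` list and the `build_template` function are hypothetical names, and the column headers are assumed to match the field names in the table exactly.

```python
import csv
import io

# Column headers, assumed to match the field names in the table above.
FIELDS = [
    "Question ID",
    "Question",
    "Category",
    "Expected Source",
    "Retrieved Source",
    "Answer",
    "Citation Present (Y/N)",
    "Unsupported Claim (Y/N)",
    "Clinical Risk Level",
    "Reviewer Notes",
    "Pass/Fail",
]


def build_template(n_rows: int = 50) -> str:
    """Return CSV text with a header row and n_rows blank rows with IDs Q001, Q002, ..."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for i in range(1, n_rows + 1):
        row = {field: "" for field in FIELDS}
        row["Question ID"] = f"Q{i:03d}"  # zero-padded to three digits
        writer.writerow(row)
    return buffer.getvalue()


template = build_template()
rows = list(csv.DictReader(io.StringIO(template)))
print(len(rows), rows[0]["Question ID"], rows[-1]["Question ID"])  # prints: 50 Q001 Q050
```

Writing `template` to a file gives a blank sheet that opens in any spreadsheet application.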
How to Use This Sheet
- Open the CSV in a spreadsheet application (Excel, Google Sheets, or Numbers).
- Review and customize the 5 example questions, or replace them with your own clinical test set.
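- Run each question through your RAG system and record the response in the Answer column and the returned document in the Retrieved Source column.
- Have a clinician independently score each response for accuracy, citation quality, and safety, filling in the Citation Present, Unsupported Claim, Clinical Risk Level, Reviewer Notes, and Pass/Fail columns.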
- Track Pass/Fail rates by category to identify system weaknesses.
- Repeat evaluation after each system change or knowledge base update.
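
Tracking Pass/Fail rates by category can be automated once the sheet is scored. The sketch below is one possible approach under stated assumptions: reviewers enter the literal values `Pass` or `Fail` in the Pass/Fail column, unscored rows are left blank, and the function name `pass_rates_by_category` and the sample rows are hypothetical, made up for illustration only.

```python
from collections import defaultdict


def pass_rates_by_category(rows):
    """Return {category: fraction of scored rows marked Pass}.

    Assumes each row is a dict with "Category" and "Pass/Fail" keys,
    and that verdicts are the strings "Pass" or "Fail" (case-insensitive).
    """
    counts = defaultdict(lambda: {"pass": 0, "total": 0})
    for row in rows:
        verdict = row.get("Pass/Fail", "").strip().lower()
        if verdict not in ("pass", "fail"):
            continue  # skip rows that have not been scored yet
        category = row.get("Category", "").strip() or "Uncategorized"
        counts[category]["total"] += 1
        if verdict == "pass":
            counts[category]["pass"] += 1
    return {cat: c["pass"] / c["total"] for cat, c in counts.items()}


# Made-up rows for illustration; real rows would come from the scored sheet.
sample = [
    {"Category": "Drug query", "Pass/Fail": "Pass"},
    {"Category": "Drug query", "Pass/Fail": "Fail"},
    {"Category": "Emergency", "Pass/Fail": "Pass"},
    {"Category": "Adversarial", "Pass/Fail": ""},  # not yet scored
]
rates = pass_rates_by_category(sample)
print(rates)  # prints: {'Drug query': 0.5, 'Emergency': 1.0}
```

To run this on a real sheet, the rows can be loaded with `csv.DictReader` from the standard library. Saving the rates after each evaluation round makes it easier to compare results across system changes and knowledge base updates.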