RAG Evaluation Sheet

Download a structured testing workbook for evaluating clinical RAG system accuracy, citation quality, and safety.


A structured CSV template with 50 rows for testing clinical RAG accuracy, citations, hallucinations, and safety.

50 rows with 5 example questions included

Sheet Fields

Field                      Description
Question ID                Unique identifier for each test question (Q001–Q050)
Question                   The clinical question asked to the RAG system
Category                   Question type: Clinical guideline, Drug query, Emergency, Edge case, Adversarial
Expected Source            The authoritative source document that should be retrieved
Retrieved Source           The actual source document returned by the system
Answer                     The RAG system's generated response
Citation Present (Y/N)     Whether the answer includes source citations
Unsupported Claim (Y/N)    Whether the answer contains claims not supported by retrieved sources
Clinical Risk Level        Low / Medium / High (severity of potential harm if answer is incorrect)
Reviewer Notes             Free-text comments from the clinician reviewer
Pass/Fail                  Overall assessment for this test question
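
For teams that prefer to generate the sheet in code, the field list above can be sketched in Python as follows. This is a minimal sketch, not the downloadable file itself: it assumes the CSV column headers match the field names exactly as listed, and the function name build_template is hypothetical. It produces an empty sheet and does not include the 5 example questions.

```python
import csv
import io

# Column headers, assumed to match the field names listed above.
FIELDS = [
    "Question ID", "Question", "Category", "Expected Source",
    "Retrieved Source", "Answer", "Citation Present (Y/N)",
    "Unsupported Claim (Y/N)", "Clinical Risk Level",
    "Reviewer Notes", "Pass/Fail",
]


def build_template(n_rows=50):
    """Return CSV text with a header row and n_rows empty test rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for i in range(1, n_rows + 1):
        row = {field: "" for field in FIELDS}
        row["Question ID"] = f"Q{i:03d}"  # Q001 through Q050 by default
        writer.writerow(row)
    return buf.getvalue()
```

Writing the returned text to a file (for example with open("rag_eval.csv", "w", newline="")) gives a sheet that opens in any spreadsheet application.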

How to Use This Sheet

  1. Open the CSV in a spreadsheet application (Excel, Google Sheets, or Numbers).
  2. Review and customize the 5 example questions, or replace them with your own clinical test set.
  3. Run each question through your RAG system and record the response in the Answer field and the document it returned in the Retrieved Source field.
  4. Have a clinician independently score each response for accuracy, citation quality, and safety.
  5. Track Pass/Fail rates by category to identify system weaknesses.
  6. Repeat evaluation after each system change or knowledge base update.
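
Step 5 can be automated once the sheet is filled in. The sketch below computes the pass rate per category from the CSV text. It assumes the Pass/Fail column contains the literal values "Pass" or "Fail" (case-insensitive) and that the headers are named "Category" and "Pass/Fail"; the function name pass_rates_by_category is hypothetical. Rows that have not been scored yet are skipped.

```python
import csv
import io
from collections import defaultdict


def pass_rates_by_category(csv_text):
    """Return {category: fraction of scored rows marked Pass}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        verdict = (row.get("Pass/Fail") or "").strip().lower()
        if verdict not in ("pass", "fail"):
            continue  # row not scored yet, so leave it out of the rate
        category = (row.get("Category") or "").strip()
        totals[category] += 1
        if verdict == "pass":
            passes[category] += 1
    return {category: passes[category] / totals[category] for category in totals}
```

Running this after each evaluation round and comparing the per-category rates makes regressions easier to spot, for example a drop in the Adversarial category after a knowledge base update.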