This document outlines the required format for storing anonymized human subject data. The data is split into two files: one for individual participant data (human_data_ind.json) and one for mean/aggregated data (human_data_mean.json).
Both files must include two summary metadata fields at the root:
| Key | Type | Description |
|---|---|---|
| participants_info | Dictionary | Demographic summary: requires count (Integer), age (Number; mean age or range), and gender (Dictionary with category counts). Additional fields allowed (e.g., education, geography). |
| judgment_count | Integer | Total number of human ratings in this file (sum of lengths across all response arrays). |
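
The two root fields can be checked mechanically before a file is shared. The following is a minimal sketch, not part of the format itself: the function name check_root_metadata and the error messages are hypothetical, and it assumes age is stored as a single number, since this document does not say how an age range would be represented.

```python
def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _is_number(value) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_root_metadata(data: dict) -> list:
    """Return a list of problems found in the root metadata (empty if none)."""
    problems = []
    info = data.get("participants_info")
    if not isinstance(info, dict):
        problems.append("participants_info must be a dictionary")
    else:
        if not _is_int(info.get("count")):
            problems.append("participants_info.count must be an integer")
        # Assumption: age is a single number (e.g., the mean age).
        if not _is_number(info.get("age")):
            problems.append("participants_info.age must be a number")
        gender = info.get("gender")
        if not isinstance(gender, dict) or not all(_is_int(n) for n in gender.values()):
            problems.append("participants_info.gender must map categories to integer counts")
    if not _is_int(data.get("judgment_count")):
        problems.append("judgment_count must be an integer")
    return problems


example = {
    "participants_info": {
        "count": 104,
        "age": 35.27,
        "gender": {"male": 33, "female": 69, "other": 2},
    },
    "judgment_count": 11886,
}
print(check_root_metadata(example))  # prints []
```

Additional fields such as education or geography are ignored by this check, which matches the rule that extra fields are allowed.
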

human_data_ind.json (Individual Data)

This file contains the raw, anonymized responses for all participants, compiled by trial and question. The root of the JSON is an object in which every key other than the two metadata fields is a stimuli_id.
Each stimuli_id object contains keys that must match the tags defined in the trial's queries (see Section 3.2). The value for each tag is an array, where each item in the array is the response from a single participant.
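For example, with 4 participants, each response array contains 4 elements. The multi-slider type is the exception: its array holds a single object, and the per-participant values sit inside that object.
The element type of each response array depends on the query type defined in trial.jsonl: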
- single-slider: Array of Numbers if num_clicks is 1 or not specified. If num_clicks > 1, Array of Arrays of Numbers, where each inner array contains the N values (N = num_clicks) from one participant.
  "goal_confidence": [95, 88, 70, 92] (single click) or "confidence_ratings": [[65, 72, 68], [80, 75, 78]] (multi-click with num_clicks=3)
- multi-choice: Array of Objects, one per participant, each with idx and option_text.
  "agent_goal": [ { "idx": 0, "option_text": "Reaching the red box" }, { "idx": 1, "option_text": "Reaching the blue box" }, ... ]
- multi-slider: Array containing a single Object. The object's keys are the slider labels (from the option array in trial.jsonl), and its values are Arrays of Numbers (each array contains the responses from all participants for that one slider).
  "statement_rating": [ { "Statement 1": [80, 75], "Statement 2": [25, 30] } ]
- multi-select: Array of Arrays of Integers. Each inner array is a binary indicator vector (0 or 1 per option; several entries may be 1) of one participant's selections, in the order of the option array in trial.jsonl.
  "cities_visited": [ [0, 1, 1, 0], [1, 0, 0, 0] ]
- textbox: Array of Strings.
  "strategy_description": [ "The agent seemed...", "It was trying to...", "I think it wanted...", ... ]
- ranking: Array of Arrays of Objects. Each inner array contains the ranked items for one participant, with each object including idx, option_text, and rank.
  "goal_ranking": [ [ { "idx": 2, "option_text": "Reaching the exit", "rank": 1 }, { "idx": 0, "option_text": "Finding the key", "rank": 2 } ], [ ... ] ]

Example: human_data_ind.json

This example corresponds to the query tags defined in the Trial Schema examples, assuming data from 2 participants. The keys "trial_1_1", "agent_goal", "goal_confidence", "statement_rating", and "strategy_description" are all defined in the trial.jsonl file.
{
  "participants_info": {
    "count": 104,
    "age": 35.27,
    "gender": { "male": 33, "female": 69, "other": 2 }
  },
  "judgment_count": 11886,
  "trial_1_1": {
    "agent_goal": [
      { "idx": 0, "option_text": "Reach the red box" },
      { "idx": 1, "option_text": "Reach the blue box" }
    ],
    "goal_confidence": [95, 88],
    "statement_rating": [
      { "Statement 1": [80, 75], "Statement 2": [25, 30] }
    ],
    "strategy_description": [
      "The agent seemed to favor the red box.",
      "It hesitated before choosing the blue box."
    ]
  },
  "trial_1_2": {
    "agent_goal": [
      { "idx": 2, "option_text": "Stay near the center" },
      { "idx": 0, "option_text": "Reach the red box" }
    ],
    "goal_confidence": [70, 82]
  }
}
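
Because every tag under a stimuli_id should carry one response per participant, a quick consistency check is to count participants per tag. The sketch below is illustrative only: participant_count and counts_per_stimulus are hypothetical helper names, and the multi-slider case is detected from the shape of the data (a single object whose values are all arrays) rather than from the query type in trial.jsonl, which is an assumption that a real validator may want to replace with a lookup.

```python
RESERVED_KEYS = {"participants_info", "judgment_count"}


def participant_count(responses: list) -> int:
    """Infer how many participants contributed to one tag's response array."""
    # Shape heuristic for multi-slider: one object mapping each slider label
    # to an array of per-participant values.
    if (
        len(responses) == 1
        and isinstance(responses[0], dict)
        and responses[0]
        and all(isinstance(v, list) for v in responses[0].values())
    ):
        lengths = {len(v) for v in responses[0].values()}
        if len(lengths) != 1:
            raise ValueError("sliders disagree on the number of participants")
        return lengths.pop()
    # All other query types store one element per participant.
    return len(responses)


def counts_per_stimulus(data: dict) -> dict:
    """Map each stimuli_id to {tag: participant count}, skipping metadata keys."""
    return {
        stim_id: {tag: participant_count(resp) for tag, resp in tags.items()}
        for stim_id, tags in data.items()
        if stim_id not in RESERVED_KEYS
    }


ind_example = {
    "judgment_count": 11886,
    "trial_1_1": {
        "agent_goal": [
            {"idx": 0, "option_text": "Reach the red box"},
            {"idx": 1, "option_text": "Reach the blue box"},
        ],
        "goal_confidence": [95, 88],
        "statement_rating": [{"Statement 1": [80, 75], "Statement 2": [25, 30]}],
    },
}
print(counts_per_stimulus(ind_example))
# prints {'trial_1_1': {'agent_goal': 2, 'goal_confidence': 2, 'statement_rating': 2}}
```

A tag whose count differs from the others in the same stimulus is a likely sign of a dropped or duplicated response.
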

human_data_mean.json (Mean Data)

This file provides a convenient summary of the data, aggregated across all participants. The structure mirrors human_data_ind.json, but replaces the arrays of individual responses with aggregated statistics.
The structure for each stimuli_id will vary depending on the data type, but generally follows these principles:
- single-slider Data (e.g., goal_confidence): A single Number representing the "mean" of the numbers in the ind array.
- multi-choice Data (e.g., agent_goal): An object where keys are constructed from the tag and the option's 1-based index (e.g., tag_1, tag_2), and values are the counts or proportions.
- multi-slider Data (e.g., statement_rating): An object where keys are constructed from the tag and the option's 1-based index (e.g., tag_1, tag_2), and each value is a single Number representing the "mean".
- textbox or multi-select Data: Aggregations are optional and task-specific (e.g., keyword counts, selection frequencies).

Example: human_data_mean.json

This example shows the aggregated data for the human_data_ind.json example above. The keys are derived from the tags in trial.jsonl (e.g., agent_goal becomes agent_goal_1, agent_goal_2; statement_rating becomes statement_rating_1, statement_rating_2).
{
  "participants_info": {
    "count": 104,
    "age": 35.27,
    "gender": { "male": 33, "female": 69, "other": 2 }
  },
  "judgment_count": 11886,
  "trial_1_1": {
    "agent_goal": {
      "agent_goal_1": 0.5,
      "agent_goal_2": 0.5
    },
    "goal_confidence": 91.5,
    "statement_rating": {
      "statement_rating_1": 77.5,
      "statement_rating_2": 27.5
    }
  },
  "trial_1_2": {
    "agent_goal": {
      "agent_goal_1": 0.5,
      "agent_goal_3": 0.5
    },
    "goal_confidence": 76.0
  }
}
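
The aggregation rules can be sketched in a few lines of Python. This is one possible reading of the rules, not a reference implementation: the function names are hypothetical, multi-choice values are reported as proportions (counts would be equally valid), only options that were actually chosen get a key (as in the example above), the single-slider helper handles only the single-click case, and the multi-slider helper assumes the slider labels appear in the same order as the option array in trial.jsonl.

```python
from collections import Counter


def mean_single_slider(values: list) -> float:
    """Mean of one participant-indexed array of numbers (num_clicks == 1)."""
    return sum(values) / len(values)


def aggregate_multi_choice(tag: str, responses: list) -> dict:
    """Proportion of participants per chosen option, keyed by 1-based index."""
    counts = Counter(r["idx"] for r in responses)  # idx is 0-based in the ind file
    total = len(responses)
    return {f"{tag}_{idx + 1}": counts[idx] / total for idx in sorted(counts)}


def aggregate_multi_slider(tag: str, responses: list) -> dict:
    """Mean per slider, keyed by the slider's 1-based position."""
    sliders = responses[0]  # the array holds a single object
    # Assumption: key order in the object matches the option order in trial.jsonl.
    return {
        f"{tag}_{pos}": sum(vals) / len(vals)
        for pos, vals in enumerate(sliders.values(), start=1)
    }


trial_1_2_choices = [
    {"idx": 2, "option_text": "Stay near the center"},
    {"idx": 0, "option_text": "Reach the red box"},
]
print(mean_single_slider([95, 88]))
# prints 91.5
print(aggregate_multi_choice("agent_goal", trial_1_2_choices))
# prints {'agent_goal_1': 0.5, 'agent_goal_3': 0.5}
print(aggregate_multi_slider(
    "statement_rating",
    [{"Statement 1": [80, 75], "Statement 2": [25, 30]}],
))
# prints {'statement_rating_1': 77.5, 'statement_rating_2': 27.5}
```

These outputs match the trial_1_1 and trial_1_2 values in the example, which is a useful sanity check when generating human_data_mean.json from human_data_ind.json.
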