This document outlines the standardized file structure and JSON schemas required for submitting a dataset to the CogGym benchmark. The goal is to create a consistent, machine-readable format for diverse cognitive science experiments.
HEML (Human Experiment Markup Language) is a JSON-based markup language for describing the critical components of an online experiment. HEML specifies:
By adhering to HEML, experiments become portable, reproducible, and easy to render consistently across platforms.
This guide is structured to follow the logical components of an experiment:
Each dataset must be contained within a single directory, named according to the convention below. A study may have multiple experiments, each in its own subdirectory (exp1, exp2, etc.). If there is only one experiment, it should still be placed in an exp1 subdirectory. Every study directory must include a top-level README.md, and every experiment subdirectory must include its own README.md summarizing the experiment-specific contents.
[DatasetName]/
├── README.md
├── exp1/ (or exp1, exp2, exp3, etc. if multiple experiments)
│ ├── README.md
│ ├── trial.jsonl
│ ├── config.json
│ ├── instruction.jsonl
│ ├── human_data_ind.json
│ ├── human_data_mean.json
│ └── assets/
│ ├── stimulus_image_01.png
│ ├── stimulus_video_01.mp4
│ └── ...
├── exp2/ (optional, if study has multiple experiments)
│ ├── README.md
│ ├── trial.jsonl
│ ├── config.json
│ └── ...
The name of the root folder ([DatasetName]) should follow the following convention: [AuthorLastName][Year][FirstNonArticleWordOfTitle]
Examples:
Rabinowitz2018Minimal/ with at minimum an exp1/ subdirectoryEach experiment subdirectory (exp1, exp2, etc.) must contain the following files:
trial.jsonl: Definitions for all stimuli and experimental trials.config.json: Core configuration, metadata, and experiment flow for the experiment.instruction.jsonl: Defines the pre-experiment instruction and quiz steps.human_data_ind.json / human_data_mean.json: Anonymized human subject data.README.md: Required narrative summary describing the experiment, linking to relevant documentation, and noting unique considerations.assets/ (Optional Folder): This folder is required if and only if the experiment uses non-textual stimuli (e.g., images, audio, video).The following sections detail the required structure for each file. Please refer to them to ensure your files are compliant.
Provides high-level metadata about the experiment (name, description, DOI) and controls the experiment flow (e.g., block order, randomization). See Section 2 for the full specification.
Defines individual experimental trials, including stimuli (images, videos) and the questions asked (sliders, multiple-choice, etc.). See Section 3 for the full specification.
Defines the sequence of instruction pages, practice trials, and comprehension quizzes presented to the participant before the main task. See Section 4 for the full specification.
Specifies the format for both individual participant data (human_data_ind.json) and aggregated mean data (human_data_mean.json). See Section 5 for the full specification.