QuNex data hierarchy specification

Overview

This document provides an overview of the QuNex data hierarchy for an exemplar generic dataset analyzed using the QuNex toolset. The data hierarchy specification provides a clear and predictable structure so that both users and tools can reference data in consistent locations with a unified naming grammar, limiting the need for referential databases and metadata management. The data hierarchy is defined primarily for individual sessions. The group-level hierarchy (e.g. the analysis folder) is not fixed, which allows flexible internal organization according to specific analysis needs and goals.

Study folder structure

  • QuNex expects that each study contains a 'master' folder that contains all the data and results.

  • Such a study would by default contain the folder hierarchy defined in $TOOLS/python/qx_utilities/templates/study_folders_default.txt:

studyfolder                         # -- Overall study base folder
├── analysis                        # -- The base folder for group data analysis and results that follow basic (pre)processing
│   └── scripts                     # -- Code for analyses that use preprocessed data would go here
├── processing                      # -- The base folder for content related to data (pre)processing
│   ├── logs                        # -- Folder storing all study processing logs
│   │   ├── batchlogs               # -- Default log location for output captured from runs scheduled on cluster nodes
│   │   ├── comlogs                 # -- Default log location for detailed output generated by the executed command(s)
│   │   ├── runchecks               # -- Reports on completion of individual steps / presence of resulting files
│   │   └── runlogs                 # -- Default log location for the executed QuNex command(s) with supplied parameters
│   ├── lists                       # -- List files used for processing and analyses
│   ├── scripts                     # -- Code for data (pre)processing would go here
│   ├── scenes                      # -- Location for misc scenes used in analyses and QC
│   │   └── QC                      # -- Location for additional custom QC scenes and associated files
│   │       ├── T1w                 # -- T1w QC scenes and associated files
│   │       ├── T2w                 # -- T2w QC scenes and associated files
│   │       ├── myelin              # -- myelin QC scenes and associated files
│   │       ├── BOLD                # -- BOLD QC scenes and associated files
│   │       └── DWI                 # -- DWI QC scenes and associated files
│   └── batch.txt                   # -- Batch files describing processing and analysis parameters in a header followed by sessions' data information
├── info                            # -- The base folder that stores various information and materials
│   ├── demographics                # -- Participant demographic information
│   ├── tasks                       # -- Folder containing task-related information
│   ├── stimuli                     # -- Folder containing task-related stimuli if used
│   ├── bids                        # -- Study-level information in BIDS format
│   └── hcpls                       # -- HCPLS dataset related information
└── sessions                        # -- The base folder that stores individual sessions' data
    ├── inbox                       # -- Incoming data awaiting import, plus group .fidl and .conc files for processing and analyses
    │   ├── MR                      # -- Incoming MR data from the scanner
    │   ├── EEG                     # -- Incoming EEG data
    │   ├── BIDS                    # -- Incoming BIDS dataset
    │   ├── HCPLS                   # -- Incoming HCPLS dataset
    │   ├── behavior                # -- Incoming behavioral data
    │   ├── concs                   # -- .conc files for the sessions
    │   └── events                  # -- .fidl files for the sessions
    ├── archive                     # -- folder with raw zipped data from the scanner for backup
    │   ├── MR                      # -- Archive of raw MR data imported from the scanner
    │   ├── EEG                     # -- Archive of the raw EEG data
    │   ├── BIDS                    # -- Archive of the processed BIDS dataset(s)
    │   ├── HCPLS                   # -- Archive of the processed HCPLS dataset(s)
    │   └── behavior                # -- Archive of the raw behavioral data
    ├── specs                       # -- specifications files to be used on the sessions in the study that include mapping files that provide MR and/or EEG data mapping info
    │   ├── <pipeline>_mapping.txt  # -- These mapping files are used for mapping to pipeline specific structure
    │   └── parameters.txt          # -- batch acquisition parameter headers that are used to specify preprocessing parameters and to generate processing `batch.txt` files, which are ultimately stored in <study_name>/processing
    ├── QC                          # -- folder with group-level quality control data for all sessions
    └── <session_id>                # -- session specific folder

Note that you can specify a custom study folder structure. To do this, prepare your own folder structure specification and provide its path in the --folders parameter of the create_study command.
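To illustrate what a predictable study layout makes possible, the following minimal Python sketch creates a subset of the default folders and reports any that are missing. It is not the QuNex implementation: the `DEFAULT_FOLDERS` list and the functions `make_study_skeleton` and `missing_folders` are hypothetical names introduced here, and the list is hand-copied from the tree above rather than read from `study_folders_default.txt`, whose exact file format is not assumed.

```python
from pathlib import Path

# Hypothetical subset of the default layout shown above, as relative paths.
# The real template file may use a different format and contain more folders.
DEFAULT_FOLDERS = [
    "analysis/scripts",
    "processing/logs/batchlogs",
    "processing/logs/comlogs",
    "processing/logs/runchecks",
    "processing/logs/runlogs",
    "processing/lists",
    "processing/scripts",
    "processing/scenes/QC",
    "info/demographics",
    "sessions/inbox/MR",
    "sessions/archive/MR",
    "sessions/specs",
    "sessions/QC",
]


def make_study_skeleton(study_root, folders=DEFAULT_FOLDERS):
    """Create each listed folder under study_root and return the paths."""
    root = Path(study_root)
    created = []
    for relative in folders:
        target = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


def missing_folders(study_root, folders=DEFAULT_FOLDERS):
    """Return the listed folders that do not exist under study_root."""
    root = Path(study_root)
    return [relative for relative in folders if not (root / relative).is_dir()]
```

A check such as `missing_folders("/path/to/studyfolder")` could be run before processing to confirm that a custom structure still provides the locations your own scripts rely on.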

Session folder structure

Each session has a subfolder inside the main QuNex study structure that contains everything related to that session. Its folder structure is shown below.

studyfolder                         # -- Overall study base folder
...
└── sessions                        # -- The base folder that stores individual sessions' data
    ...
    └── <session_id>                # -- session specific processing and analysis folder (naming formula is "<subject id>[_<session name>]", e.g. "s12" or "s12_pre", "s12_post")
        ├── session.txt             # -- an information file describing the session's data generated following sorting of DICOMs
        ├── session_<pipeline>.txt  # -- an information file describing the session's data with mapping for further pipeline processing
        ├── inbox                   # -- folder with incoming raw data from the scanner
        ├── nii                     # -- folder with original data in NIfTI format after initial import from DICOMs, BIDS or HCPLS input
        ├── physio                  # -- folder with physiological recordings files
        ├── behavior                # -- folder with behavioral data
        ├── QC                      # -- folder with session-specific quality control data
        ├── bids                    # -- folder with subject and session specific data in BIDS format
        ├── dicom                   # -- folder with sorted dicom images along with a log txt file documenting what was acquired
        │   ├── 1                   # -- Hypothetical T1w scan DICOM folder
        │   ├── 2                   # -- Hypothetical T2w scan DICOM folder
        │   ├── 3                   # -- Hypothetical SpinEcho Field Map Phase Encoding Direction 1 (original)
        │   ├── 4                   # -- Hypothetical SpinEcho Field Map Phase Encoding Direction 2 (reversed)
        │   ├── 5                   # -- Hypothetical BOLD_1 acquisition
        │   ├── 6                   # -- Hypothetical BOLD_2 acquisition
        │   ├── 7                   # -- Hypothetical DWI acquisition
        │   ├── 8                   # -- Hypothetical DWI acquisition
        │   ├── 9                   # -- Hypothetical DWI acquisition
        │   └── 10                  # -- Hypothetical DWI acquisition
        ├── eyetracking             # -- Eye tracking data
        ├── EEG                     # -- EEG data
        │   ├── raw                 # -- converted unprocessed EEG data
        │   ├── preproc             # -- EEG data in different preprocessing stages
        │   └── results             # -- results of EEG data ready for further group based analyses
        ├── images                  # -- folder with processed neuroimaging data mapped for further analyses
        │   ├── functional          # -- holds BOLD and related files
        │   │   ├── concs           # -- conc files specifying fMRI files that constitute a series
        │   │   ├── events          # -- event (.fidl) files used for de-noising or task analyses
        │   │   ├── glm             # -- general linear model descriptions (e.g. for task analyses)
        │   │   └── movement        # -- motion regressors
        │   ├── ROI                 # -- holds any information on ROI used in preprocessing (e.g. nuisance ROI)
        │   │   └── nuisance        # -- maps used for definition of ROI used to extract nuisance regressors used for BOLD de-noising
        │   ├── segmentation        # -- holds any files related to segmentation (e.g. bold brain masks, freesurfer segmentations, hcp surface files, etc.)
        │   │   ├── boldmasks       # -- whole-brain masks for the BOLD data specifying actual coverage of the brain
        │   │   ├── freesurfer      # -- mapping of select freesurfer segmentation data as part of the HCP pipelines (detailed below)
        │   │   │   ├── mri         # -- freesurfer mapping following HCP pipelines
        │   │   │   │   └── orig    # -- freesurfer mapping following HCP pipelines
        │   │   │   └── surf        # -- freesurfer mapping following HCP pipelines
        │   │   └── hcp             # -- mapping of surface files from the HCP pipelines (detailed below)
        │   │       └── fsaverage_LR32k # -- HCP mapping of surfaces
        │   ├── structural          # -- holds structural images (T1w image and BOLD template image)
        │   └── diffusion           # -- holds processed DWI images
        └── hcp                     # -- folder with data for and from HCP pipeline
            └── <session_id>        # -- session id is repeated here for hcp folder mapping
                ├── BOLD_1                                          # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   ├── BOLD_1_nonlin_norm.wdir                     # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   ├── DistortionCorrectionAndEPIToT1wReg_FLIRTBBRAndFreeSurferBBRbased    # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   │   └── FieldMap                                # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   ├── MotionCorrection_FLIRTbased                 # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   │   └── BOLD_1_mc.mat                           # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   ├── MotionMatrices                              # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │   └── OneStepResampling                           # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │       ├── postvols                                # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                │       └── prevols                                 # -- HCP pipeline outputs (Initial processing for BOLD series 1)
                ├── BOLD_2                                          # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   ├── BOLD_2_nonlin_norm.wdir                     # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   ├── DistortionCorrectionAndEPIToT1wReg_FLIRTBBRAndFreeSurferBBRbased    # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   │   └── FieldMap                                # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   ├── MotionCorrection_FLIRTbased                 # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   │   └── BOLD_2_mc.mat                           # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   ├── MotionMatrices                              # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │   └── OneStepResampling                           # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │       ├── postvols                                # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                │       └── prevols                                 # -- HCP pipeline outputs (Initial processing for BOLD series 2)
                ├── Diffusion                                       # -- HCP pipeline outputs (Raw DWI images mapped for input)
                │   ├── data                                        # -- HCP pipeline outputs (DWI processing output)
                │   ├── eddy                                        # -- HCP pipeline outputs (DWI processing output)
                │   ├── rawdata                                     # -- HCP pipeline outputs (DWI processing output)
                │   ├── reg                                         # -- HCP pipeline outputs (DWI processing output)
                │   └── topup                                       # -- HCP pipeline outputs (DWI processing output)
                ├── MNINonLinear                                    # -- HCP pipeline outputs (Final HCP processing results)
                │   ├── fsaverage                                   # -- HCP pipeline outputs (Hi-res surfaces used for CIFTI format generation in atlas space)
                │   ├── fsaverage_LR32k                             # -- HCP pipeline outputs (Low-res surfaces used for CIFTI format generation in atlas space)
                │   ├── Native                                      # -- HCP pipeline outputs (Native-mesh surfaces and related files in MNI space)
                │   ├── Results                                     # -- HCP pipeline outputs (All images processed and mapped in MNI space)
                │   │   ├── BOLD_1                                  # -- HCP pipeline outputs (BOLD 1 images results in MNI space)
                │   │   │   ├── BOLD_1_hp2000.ica                   # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   ├── filtered_func_data.ica          # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   │   ├── report                      # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   │   └── stats                       # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   ├── fix                             # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   ├── mc                              # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   │   └── reg                             # -- HCP pipeline outputs (BOLD 1 images FIXICA denoising results)
                │   │   │   └── RibbonVolumeToSurfaceMapping        # -- HCP pipeline outputs (BOLD 1 images surface mapping outputs)
                │   │   ├── BOLD_2                                  # -- HCP pipeline outputs (BOLD 2 images results in MNI space)
                │   │   │   ├── BOLD_2_hp2000.ica                   # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   ├── filtered_func_data.ica          # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   │   ├── report                      # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   │   └── stats                       # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   ├── fix                             # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   ├── mc                              # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   │   └── reg                             # -- HCP pipeline outputs (BOLD 2 images FIXICA denoising results)
                │   │   │   └── RibbonVolumeToSurfaceMapping        # -- HCP pipeline outputs (BOLD 2 images surface mapping outputs)
                │   │   └── Tractography                            # -- HCP pipeline outputs (DWI tractography results in MNI space)
                │   │       ├── Mat1_logs                           # -- HCP pipeline outputs (DWI tractography results in MNI space)
                │   │       └── Mat3_logs                           # -- HCP pipeline outputs (DWI tractography results in MNI space)
                │   ├── ROIs                                        # -- HCP pipeline outputs (Structural ROIs for CIFTI file generation in MNI space)
                │   └── xfms                                        # -- HCP pipeline outputs (Transformation matrix files)
                ├── T1w                                             # -- HCP pipeline outputs (T1w processing inputs mapped here and all structural and DWI native space results)
                │   ├── ACPCAlignment                               # -- HCP pipeline outputs (T1w processing results)
                │   ├── BiasFieldCorrection_sqrtT1wXT1w             # -- HCP pipeline outputs (T1w processing results)
                │   ├── BrainExtraction_FNIRTbased                  # -- HCP pipeline outputs (T1w processing results)
                │   ├── Diffusion                                   # -- HCP pipeline outputs (DWI processing results for dtifit)
                │   ├── Diffusion.bedpostX                          # -- HCP pipeline outputs (DWI processing results for bedpostX)
                │   │   ├── logs                                    # -- HCP pipeline outputs (DWI processing results for bedpostX)
                │   │   │   ├── logs_gpu                            # -- HCP pipeline outputs (DWI processing results for bedpostX)
                │   │   │   └── monitor                             # -- HCP pipeline outputs (DWI processing results for bedpostX)
                │   │   └── xfms                                    # -- HCP pipeline outputs (Transformation matrix files)
                │   ├── fsaverage                                   # -- HCP pipeline outputs (Hi-res surfaces used for CIFTI format generation in native space)
                │   ├── fsaverage_LR32k                             # -- HCP pipeline outputs (Low-res surfaces used for CIFTI format generation in native space)
                │   ├── Native                                      # -- HCP pipeline outputs (Native-mesh surfaces and related files in session native space)
                │   ├── <session_id>                                # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── bem                                     # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── label                                   # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── mri                                     # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   │   ├── orig                                # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   │   └── transforms                          # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   │       └── bak                             # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── scripts                                 # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── src                                     # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── stats                                   # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── surf                                    # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── tmp                                     # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   ├── touch                                   # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   │   └── trash                                   # -- HCP pipeline outputs (FreeSurfer processing results in session native space)
                │   ├── Results                                     # -- HCP pipeline outputs (DWI tractography results in session native space)
                │   │   └── log_pretractographydense                # -- HCP pipeline outputs
                │   ├── ROIs                                        # -- HCP pipeline outputs (Structural ROIs for CIFTI file generation in native space)
                │   ├── T1w1_GradientDistortionUnwarp               # -- HCP pipeline outputs
                │   └── xfms                                        # -- HCP pipeline outputs (Transformation matrix files)
                ├── T2w                                             # -- HCP pipeline outputs (T2w processing inputs mapped here)
                │   ├── ACPCAlignment                               # -- HCP pipeline outputs
                │   ├── BrainExtraction_FNIRTbased                  # -- HCP pipeline outputs
                │   ├── T2w1_GradientDistortionUnwarp               # -- HCP pipeline outputs
                │   ├── T2wToT1wDistortionCorrectAndReg             # -- HCP pipeline outputs
                │   │   ├── FieldMap                                # -- HCP pipeline outputs
                │   │   └── T2w2T1w                                 # -- HCP pipeline outputs
                │   └── xfms                                        # -- HCP pipeline outputs (Transformation matrix files)
                └── unprocessed                                     # -- The folder holding unprocessed data
                    ├── T1w                                         # -- HCP pipeline unprocessed input folder: high-resolution T1w image(s)
                    ├── T2w                                         # -- HCP pipeline unprocessed input folder: high-resolution T2w image(s)
                    ├── SpinEchoFieldMap1                           # -- HCP pipeline unprocessed input folder: spin echo field map images
                    ├── BOLD_1                                      # -- HCP pipeline unprocessed input folder: bold 1 image
                    ├── BOLD_1_SBRef                                # -- HCP pipeline unprocessed input folder: bold 1 single-band reference image
                    ├── BOLD_2                                      # -- HCP pipeline unprocessed input folder: bold 2 image
                    ├── BOLD_2_SBRef                                # -- HCP pipeline unprocessed input folder: bold 2 single-band reference image
                    └── Diffusion                                   # -- HCP pipeline unprocessed input folder: diffusion weighted images
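Because the session hierarchy is fixed, locations inside it can be derived from the study folder and the session id alone. The Python sketch below shows one way to do this. The function names are hypothetical and not part of QuNex, and splitting the session id at the first underscore assumes that the subject id itself contains no underscore, which the naming formula above does not guarantee.

```python
from pathlib import Path


def split_session_id(session_id):
    """Split "<subject id>[_<session name>]" into (subject_id, session_name).

    Assumption: the subject id contains no underscore, so the first
    underscore separates the two parts. session_name is None when absent.
    """
    subject_id, separator, session_name = session_id.partition("_")
    return subject_id, (session_name if separator else None)


def session_dir(study_root, session_id):
    """Return the session specific folder inside the study."""
    return Path(study_root) / "sessions" / session_id


def hcp_dir(study_root, session_id):
    """Return the HCP folder; the session id is repeated below hcp."""
    return session_dir(study_root, session_id) / "hcp" / session_id


def hcp_bold_results_dir(study_root, session_id, bold_name="BOLD_1"):
    """Return the MNI space results folder for one BOLD run."""
    return hcp_dir(study_root, session_id) / "MNINonLinear" / "Results" / bold_name
```

For example, `hcp_bold_results_dir("/data/study", "s12_pre")` resolves to `/data/study/sessions/s12_pre/hcp/s12_pre/MNINonLinear/Results/BOLD_1`, where `/data/study` is a placeholder study location.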

Considerations for BOLD Naming Output Convention

  • Please note that the specific names of folders in the hcp hierarchy may vary based on provided batch parameters and structure options.

  • If hcp_filename is set to userdefined instead of the default automated, images and folders are named according to the filename specification in session_<pipeline>.txt.

  • For instance: rfMRI_REST2_PA and rfMRI_REST2_PA_SBRef instead of BOLD_1 and BOLD_1_SBRef.

  • Additionally, if the automated hcp_filename setting is used, the prefix for the resulting BOLD file names and folders can be defined using the --hcp_bold_prefix parameter. The default value is BOLD_.

  • For full details on how the HCP processing naming convention settings affect the file hierarchy, see the HCP File Naming Wiki section.
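The naming rules above can be summarized in a short Python sketch. It only mirrors the behavior described in this section and is not QuNex code: the function `bold_names` and its `user_filename` argument are hypothetical, while the `automated` and `userdefined` values and the `BOLD_` default prefix are taken from the text above.

```python
def bold_names(bold_number, hcp_filename="automated",
               hcp_bold_prefix="BOLD_", user_filename=None):
    """Return (bold_name, sbref_name) for one BOLD run.

    Hypothetical helper: with "automated" naming the name is the prefix
    followed by the BOLD number; with "userdefined" naming it is the
    filename given for that image in the session file.
    """
    if hcp_filename == "automated":
        base = f"{hcp_bold_prefix}{bold_number}"
    elif hcp_filename == "userdefined":
        if not user_filename:
            raise ValueError("userdefined naming requires a filename")
        base = user_filename
    else:
        raise ValueError(f"unknown hcp_filename value: {hcp_filename!r}")
    # The single-band reference image shares the base name plus a suffix.
    return base, f"{base}_SBRef"
```

With the defaults, `bold_names(1)` gives `BOLD_1` and `BOLD_1_SBRef`. Passing `hcp_filename="userdefined"` and `user_filename="rfMRI_REST2_PA"` gives `rfMRI_REST2_PA` and `rfMRI_REST2_PA_SBRef`, matching the example above.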