General information on running the HCP pipeline#
This part of the guide first presents general information on how to run the steps. For descriptions of the individual HCP pipeline steps, please refer to Running the HCP pipeline.
General HCP processing settings#
HCP processing mode#
HCP processing can be run in two modes, which are specified using the
--hcp_processing_mode parameter. The two modes are:
When the acquired data meets the requirements defined by the HCP, specifically the presence of high-resolution T1w and T2w (or FLAIR) structural images, field map images for the processing of functional images, multiband functional images, and diffusion images acquired with opposite phase encoding directions, then the processing can and should follow the steps described in the Glasser et al. (2016) paper. In this case the HCPStyleData processing mode should be used.
When any of the HCP acquisition requirements are not met (e.g. lack of a high-resolution T2w image), or when processing options incompatible with the HCP specification described in the Glasser et al. (2016) paper are to be used (e.g. slice timing correction of single-band functional images), then the LegacyStyleData processing mode can be indicated to enable the extended options.
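For illustration, the processing mode can be passed on the command line; the command, paths, and parameter values below are illustrative, not prescriptive:

```shell
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/study/sessions" \
    --batchfile="/data/study/processing/batch.txt" \
    --hcp_processing_mode="LegacyStyleData"
```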
HCP folder structure#
QuNex supports two folder structures for organizing and naming input files, specified using the hcp_folderstructure parameter. The two options are:
In this case, the folder structure used in the initial HCP Young Adults study is used. Specifically, the source files are stored in individual folders within the main hcp folder, in parallel with the working folders and the MNINonLinear folder with results. In addition, folders and files are specified using bold and strc tags in the filename, for functional bold images and structural images respectively.
In this case, the folder structure used in the HCP Life Span study is used. Specifically, the source files are all stored within their individual subfolders located in the joint unprocessed folder in the main hcp folder, parallel to the working folders and the MNINonLinear folder. This is the default option used by QuNex.
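As a rough sketch based on the description above (the session ID and exact set of working folders are illustrative), the default layout for a single session might look like:

```
<session id>/
  hcp/
    <session id>/
      unprocessed/
      T1w/
      MNINonLinear/
```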
HCP file naming#
QuNex supports two ways of naming the source and result files, which are defined using the hcp_filename parameter. The two options are:
In this case, all the image types are named automatically: T1w files are named T1w_MPR; T2w files are named T2w_SPC; GE field maps are named FieldMap_GE; magnitude and phase field map images are named FieldMap_Magnitude and FieldMap_Phase, respectively; functional images and their reference images are named BOLD_[N] and BOLD_[N]_SBRef, respectively; spin echo pairs are named BOLD_<LR/RL/AP/PA>_SB_SE; diffusion weighted images are named DWI. This is the default option used by QuNex.
In this case, images are named using their user-defined names, provided these are specified in the session_hcp.txt files and in the batch.txt file using the filename specification in the relevant sequence specification line, e.g.:

20: bold3:EMOTION : tfMRI_EMOTION_PA : se(2) : phenc(PA) : EchoSpacing(0.0005800090) : filename(tfMRI_EMOTION_PA)
General information on running HCP preprocessing steps#
To enable efficient HCP preprocessing QuNex utilizes a processing engine. As noted before, to successfully run the HCP preprocessing steps, the following has to be accomplished first:
The files need to be mapped to the right folder structure.
All the information on sessions and their data needs to be compiled into a batch file.
All the relevant parameters need to be compiled and specified either in a batch file or as command line arguments.
The command needs to be run with the right scheduler parameters or executed locally from the command line.
Steps for mapping the data and compiling the batch.txt file are described in the sections above. The relevant image parameters need to be added to the start of the batch.txt file, either manually after the information has been compiled, or by writing them into a sessions/specs/parameters.txt file to be automatically prepended when the create_batch command is used. In both cases the parameters are specified in the same manner (see the batch file specification for a general description); briefly, each parameter is added on a separate line as:
_<parameter name> : <parameter value>
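As a sketch, the parameters section at the top of a batch.txt file might then look like this (the chosen parameters and values are illustrative):

```
_hcp_processing_mode : HCPStyleData
_parsessions         : 2
_log                 : keep
```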
Specific examples for each of the steps will be provided below. If passed in a command line, the format is:
qunex <command> \
    --<parameter name>="<parameter value>" \
    --<parameter name>="<parameter value>"
Do take care to put parameter values in double quotes if they include whitespace characters.
The last element to consider is the scheduler. The commands can be either run locally (via the command line) or passed to cluster nodes using a batch scheduler. By default, commands are run locally. If a scheduler is to be used, then a scheduler settings string needs to be specified using the --scheduler parameter.
Running commands locally#
When a command is run locally, QuNex will use a pool of processors to run a number of sessions in parallel. The number of sessions run concurrently is specified using the --parsessions flag/parameter (the default value is 1). Specifically, if
parsessions is set to
5, QuNex will start running processing of the first five sessions listed in the
batch.txt file concurrently. As soon as a session is processed and the relevant process is freed, the next session listed in the batch.txt file will start processing, until either all the sessions in the batch.txt file have been processed or the number of sessions specified using the
--nprocess parameter have been processed. --nprocess is set to 0 by default, in which case all the sessions listed in the batch.txt file are processed. Some commands, such as hcp_fmri_surface, additionally allow parallel processing of bold images for a particular session using the --parelements flag/parameter (if there are several bold images that need processing for a session, and the settings allow such processing, these images can be processed in parallel).
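A hedged sketch of a local run combining these parameters (the paths are illustrative):

```shell
qunex hcp_fmri_surface \
    --sessionsfolder="/data/study/sessions" \
    --batchfile="/data/study/processing/batch.txt" \
    --parsessions=2 \
    --parelements=4
```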
Running commands using a scheduler#
QuNex currently supports PBS and SLURM schedulers. For more specific information about the scheduler settings, please consult the relevant instructions for PBS (e.g. the PBS User's Guide or the qsub manual page) or SLURM (the sbatch command documentation). Information on how to format the scheduler string to specify the desired options is easiest to access by running qunex schedule. Briefly, the string starts with the name of the scheduler to use, followed by a comma-separated list of arguments and their values:

<scheduler name>,<parameter>=<value>,<parameter>=<value>
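For instance, a SLURM scheduler string might look like the following (the resource values are illustrative assumptions, not recommendations):

```
SLURM,time=04:00:00,cpus-per-task=2,mem-per-cpu=2500,partition=day
```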
Also, when running a command using the
--scheduler parameter, the
parsessions parameter specifies how many sessions will be scheduled to run on each node in parallel. If
parsessions is set to
5, the QuNex scheduler engine will take the first five sessions listed in the
batch.txt file and submit a job to spawn itself on a node with those five sessions. It will then take the next five sessions specified in the
batch.txt file and submit another job to spawn itself on another node with these five sessions. And so on until either the list is exhausted or
nprocess sessions have been submitted to be processed. In this way, the scheduler functionality in QuNex allows for flexible and massively parallel runs of sessions across multiple nodes, while at the same time utilizing all the CPU cores per user specifications.
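Putting it together, a scheduled run might be invoked as follows (the scheduler settings and paths are illustrative):

```shell
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/study/sessions" \
    --batchfile="/data/study/processing/batch.txt" \
    --parsessions=5 \
    --scheduler="SLURM,time=24:00:00,cpus-per-task=5,mem-per-cpu=8000"
```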
Just like with local execution of commands, some commands, such as hcp_fmri_surface, allow parallel processing of bold images for a particular session using the --parelements flag/parameter.
Once the command ends, its success is validated by running a command completion test, i.e. testing for the presence of the last file that should be generated by the command. The completion check is always run.
Logging of HCP pipeline functions#
Multiple log files are created when a command is run. The progress of the command execution is both printed to the terminal as well as saved to a runlog file. This log will list the exact command that was run. In turn, it will list for each session the relevant information for the files and settings and report the success or failure of processing that session. Last, it will print a short summary of the success in processing each session, one session per line. By default runlogs are stored in
processing/logs/runlogs. They are given unique names compiled from the name of the command run and the time at which it was run.
In addition to runlog files, detailed information about processing of each session is stored in comlog files. These files are generated each time a process is started. Comlog files are saved in
processing/logs/comlogs folder. When the command is started, the files are named using the following specification:

tmp_<command_name>_<session code>_<date>_<hour>.<minute>.<microsecond>.log
If commands are run for individual files separately (e.g. each BOLD is run separately) then the specification is:
tmp_<command_name>_<file name>_<session code>_<date>_<hour>.<minute>.<microsecond>.log
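For instance, a comlog for one BOLD image might be named as follows (the command, file name, session code, and timestamp are illustrative):

```
tmp_hcp_fmri_volume_bold1_OP101_2024-06-12_14.30.002117.log
```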
After the command has completed, the outcome is reflected in the log files and depends on a) the level of checking indicated when running the command (the hcp_<short process name>_check parameter described above), and b) the log parameter. The possible results are:
- The command-specific test file (the last file that would be generated by the command) is not present, indicating that the command has not completed successfully. In this case the start of the log filename will be changed from tmp to error.
- All the required tests have completed successfully. If the log parameter is set to 'keep', the start of the log filename will be changed from tmp to done; if the log parameter is set to 'remove', the log file will be removed.
- The test for the final file to be generated indicates that the command has completed; however, not all files specified in the file check list have been found. In this case the start of the log filename will be changed from tmp to incomplete.
It is advisable to always save comlog files. This can be achieved as a default setting by specifying the
log parameter in the batch.txt file. The
log parameter specifies whether to remove ('remove') the comlog files after
successful completion or to keep them in one or several locations ('keep', 'study', 'session', 'hcp').
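For example, to keep comlog files by default, a line such as the following (illustrative) can be added to the parameters section of the batch.txt file:

```
_log : keep
```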
One can also use
logfolder parameter to specify the location where the runlog and comlog files are to be saved, if the desired location is other than the default. Note that if one runs a command within the processing folder or its subfolders, the folder where the command is run is used as the log folder.
Please see Logging and log files for more information.
Executing a test run#
Prior to running each of the above commands, the user can append the
--test flag / parameter, which will execute a 'dry' test run. If this parameter is set, each of the commands performs a number of tests to check whether the command might have already been run and successfully completed, and whether all the required files are available. These tests are reported in runlogs and printed to the standard output in the terminal. Before running any of the above commands, it is advisable to test-run the command in this way (i.e. to do all the checks without actually running the command). If all the tested prerequisites are met, the session is reported as ready to run, both in the individual session report and in the final report section of the runlog. Examples are provided in the Examples section of the
hcp_pre_freesurfer command reference.
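A hedged sketch of such a test run (the command and paths are illustrative):

```shell
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/study/sessions" \
    --batchfile="/data/study/processing/batch.txt" \
    --test
```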
Running multiple variants of HCP preprocessing#
Sometimes the user might want to run HCP pipelines with different settings and keep the results in parallel folders. To achieve this, use the
hcp_suffix parameter when setting up the data and when running the HCP minimal processing pipeline steps. If the hcp_suffix parameter is specified, the processing will be run on data in the <session id>/hcp/<session id><hcp_suffix> folder.
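As an illustration (the suffix and paths are assumptions), one could run a variant with:

```shell
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/study/sessions" \
    --batchfile="/data/study/processing/batch.txt" \
    --hcp_suffix="_test"
```

In this case, processing would run on data in the <session id>/hcp/<session id>_test folder, in parallel with the unsuffixed results.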