# General information on running the HCP pipeline

This part of the guide presents general information on how to run the HCP pipeline steps. For the individual HCP pipeline steps please refer to [Running the HCP pipeline](../UsageDocs/RunningPreprocessHCP).

## General HCP processing settings

### HCP processing mode

HCP processing can be run in two modes, which are specified using the `--hcp_processing_mode` parameter. The two modes are:

* `HCPStyleData`

    When the acquired data meets the requirements defined by the HCP, specifically, the presence of high-resolution T1w and T2w (or FLAIR) structural images, the presence of field map images for the processing of functional images, multiband functional images, and diffusion images acquired with opposite phase encoding directions, then the processing can and should follow the steps described in the [Glasser et al. (2016)](https://www.nature.com/articles/nn.4361) paper. In this case the `HCPStyleData` processing mode should be used.

* `LegacyStyleData`

    When any of the HCP acquisition requirements are not met (e.g. the lack of a high-resolution T2w image), or when processing options incompatible with the HCP specification described in the [Glasser et al. (2016)](https://www.nature.com/articles/nn.4361) paper are to be used (e.g. slice timing correction of single-band functional images), then the `LegacyStyleData` processing mode can be specified to enable the extended options.

### HCP folder structure

QuNex supports two folder structures for organizing and naming input files, specified using the `hcp_folderstructure` parameter. The two options are:

* `hcpya`

    The folder structure used in the original HCP Young Adult study. Specifically, the source files are stored in individual folders within the main `hcp` folder, in parallel with the working folders and the `MNINonLinear` folder with results. In addition, folders and files are specified using the `fncb` and `strc` tags in the filename, for functional BOLD images and structural images, respectively.

* `hcpls`

    The folder structure used in the HCP Lifespan study. Specifically, the source files are all stored within their individual subfolders located in the joint `unprocessed` folder in the main `hcp` folder, parallel to the working folders and the `MNINonLinear` folder. This is the default option used by QuNex.

### HCP file naming

QuNex supports two ways of naming the source and results files, which are defined using the `hcp_filename` parameter. The two options are:

* `automated`

    All image types are named automatically: T1w files are named `T1w_MPR`; T2w files are named `T2w_SPC`; GE field maps are named `FieldMap_GE`; magnitude and phase field map images are named `FieldMap_Magnitude` and `FieldMap_Phase`, respectively; functional images and their reference images are named `BOLD_[N]` and `BOLD_[N]_SBRef`, respectively; spin echo pairs are named `BOLD__SB_SE`; diffusion weighted images are named `DWI`. This is the default option used by QuNex.

* `userdefined`

    Images are named using their user-defined names, provided in the `session_hcp.txt` files and in the `batch.txt` file using the `filename` specification in the relevant sequence specification line, e.g.:

    `20: bold3:EMOTION : tfMRI_EMOTION_PA : se(2) : phenc(PA) : EchoSpacing(0.0005800090) : filename(tfMRI_EMOTION_PA)`
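For example, these three settings can be passed on the command line together with any HCP step. A minimal sketch (the study paths are hypothetical, and `hcpls` and `automated` are the defaults, shown here only for completeness):

```bash
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/studies/myStudy/sessions" \
    --batchfile="/data/studies/myStudy/processing/batch.txt" \
    --hcp_processing_mode="HCPStyleData" \
    --hcp_folderstructure="hcpls" \
    --hcp_filename="automated"
```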
### General information on running HCP preprocessing steps

To enable efficient HCP preprocessing, QuNex utilizes a processing engine. As noted before, to successfully run the HCP preprocessing steps, the following has to be accomplished first:

1. The files need to be mapped to the right folder structure.
2. All the information on sessions and their data needs to be compiled into a batch file.
3. All the relevant parameters need to be compiled and specified either in the batch file or as command line arguments.
4. The command needs to be run with the right scheduler parameters or executed locally from the command line.

Mapping the data and compiling the `batch.txt` file are described in the sections above. The relevant image parameters need to be added to the start of the `batch.txt` file, either manually after the information has been compiled, or by writing them into a `sessions/specs/parameters.txt` file that is automatically prepended when the [`create_batch`](../../api/gmri/create_batch.rst) command is used. In both cases the parameters are specified in the same manner (see [batch file specification](../Overview/file_batch_txt) for a general description). Briefly, each parameter is added on a separate line as:

```
_<parameter> : <value>
```

Specific examples for each of the steps will be provided below. If passed on the command line, the format is:

```bash
qunex <command> \
    --<parameter>="<value>" \
    --<parameter>="<value>"
```

Do take care to put parameter values in double quotes if they include whitespace characters.
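Continuing the example above, the same general HCP settings could be specified once at the top of the `batch.txt` file (the values are illustrative):

```
_hcp_processing_mode  : HCPStyleData
_hcp_folderstructure  : hcpls
_hcp_filename         : automated
```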
The last element to consider is the scheduler. Commands can either be run locally (via the command line) or passed to cluster nodes using a batch scheduler. By default, commands are run locally. If a scheduler is to be used, a scheduler settings string needs to be specified using the `--scheduler` flag/parameter.

#### Running commands locally

When a command is run locally, QuNex will use a pool of processors to run a number of sessions in parallel. How many sessions are run concurrently is specified using the `--parsessions` flag/parameter (the default value is 1). Specifically, if `parsessions` is set to `5`, QuNex will start processing the first five sessions listed in the `batch.txt` file concurrently. As soon as a session is processed and the relevant process is freed, the next session listed in `batch.txt` will start processing, until either all the sessions in the `batch.txt` file have been processed or the number of sessions specified using the `--nprocess` parameter has been processed. `--nprocess` is set to 0 by default, in which case all the sessions listed in the `batch.txt` file are processed.

The commands [`hcp_fmri_volume`](../../api/gmri/hcp_fmri_volume.rst) and [`hcp_fmri_surface`](../../api/gmri/hcp_fmri_surface.rst) additionally allow parallel processing of BOLD images within a session using the `--parelements` flag/parameter (if there are several BOLD images that need processing for a session, and the settings allow such processing, these images can be processed in parallel).

#### Running commands using a scheduler

QuNex currently supports the PBS and SLURM schedulers. For more specific information about the scheduler settings, please consult the relevant instructions for PBS (e.g. the [PBS User's Guide](http://www.pbsworks.com/pdfs/PBSUserGuide13.0.pdf) or the [qsub manual page](https://www.jlab.org/hpc/PBS/qsub.html)) or SLURM (the [sbatch command](https://slurm.schedmd.com/sbatch.html)). Information on how to format the scheduler string to specify the desired options is easiest to access by running `qunex schedule`.

Briefly, the string starts with the name of the scheduler to use, followed by a comma-separated list of arguments and their values:

```
--scheduler="<scheduler>,<option>=<value>,<option>=<value>,<option>=<value>"
```

A specific scheduler string example for SLURM is:

```
--scheduler="SLURM,jobname=hcp_pre_freesurfer,time=24:00:00,cpus-per-task=2,mem-per-cpu=1500,partition=day"
```

When running a command with the `--scheduler` parameter, the `parsessions` parameter specifies how many sessions will be run on each node in parallel. If `parsessions` is set to `5`, the QuNex scheduler engine will take the first five sessions listed in the `batch.txt` file and submit a job to spawn itself on a node with those five sessions. It will then take the next five sessions specified in the `batch.txt` file and submit another job to spawn itself on another node with those five sessions, and so on, until either the list is exhausted or `nprocess` sessions have been submitted for processing. In this way, the scheduler functionality in QuNex allows for flexible and massively parallel runs of sessions across multiple nodes, while at the same time utilizing all the CPU cores per user specifications. Just as with local execution, the `hcp_fmri_volume` and `hcp_fmri_surface` commands allow parallel processing of BOLD images within a session using the `parelements` parameter/flag.
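To illustrate, a local run that processes four sessions concurrently, with two BOLD images per session processed in parallel, might look like this (a sketch; the paths are hypothetical):

```bash
qunex hcp_fmri_volume \
    --sessionsfolder="/data/studies/myStudy/sessions" \
    --batchfile="/data/studies/myStudy/processing/batch.txt" \
    --parsessions=4 \
    --parelements=2
```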
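The same run can instead be submitted to a cluster by adding a scheduler string, for example using the SLURM settings shown above:

```bash
qunex hcp_fmri_volume \
    --sessionsfolder="/data/studies/myStudy/sessions" \
    --batchfile="/data/studies/myStudy/processing/batch.txt" \
    --parsessions=4 \
    --parelements=2 \
    --scheduler="SLURM,jobname=hcp_fmri_volume,time=24:00:00,cpus-per-task=2,mem-per-cpu=1500,partition=day"
```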
#### Completion testing

Once a command ends, its success is validated by running a command completion test, i.e. by testing for the presence of the last file that should be generated by the command. The completion check is always run.

#### Logging of HCP pipeline functions

Multiple log files are created when a command is run. The progress of the command execution is both printed to the terminal and saved to a *runlog* file. This log lists the exact command that was run. In turn, it lists for each session the relevant information on the files and settings, and reports the success or failure of processing that session. Last, it prints a short summary of the success of processing each session, one session per line. By default, runlogs are stored in `processing/logs/runlogs` and are given unique names compiled using the following specification:

```
Log-<command>-<date>_<hour>.<minute>.<second>.log
```

In addition to *runlog* files, detailed information about the processing of each session is stored in *comlog* files. These files are generated each time a process is started and are saved in the `processing/logs/comlogs` folder. When the command is started, the files are named using the following specification:

```
tmp_<command>_<session id>_<date>_<hour>.<minute>.<second>.log
```

If commands are run for individual files separately (e.g. each BOLD is run separately), then the specification is:

```
tmp_<command>_<session id>_<image>_<date>_<hour>.<minute>.<second>.log
```

After the command has completed, the outcome is reflected in the log files and depends on (a) the level of checking indicated when running the command (the `hcp_<step>_check` parameter described above), and (b) the `log` parameter. The possible results are:

* `error`

    The command-specific test file (the last file that would be generated by the command) is not present, indicating that the command has not completed successfully. In this case the start of the log filename will be changed from `tmp` to `error`.

* `done`

    All the required tests have completed successfully. If the `log` parameter is set to `keep`, the start of the log filename will be changed from `tmp` to `done`. If the `log` parameter is set to `remove`, the log file will be removed.

* `incomplete`

    The test for the final file to be generated indicates that the command has completed; however, not all files specified in the file check list have been found. In this case the start of the log filename will be changed from `tmp` to `incomplete`.

It is advisable to always save *comlog* files. This can be achieved as a default setting by specifying the `log` parameter in the `batch.txt` file. The `log` parameter specifies whether to remove (`remove`) the *comlog* files after successful completion, or to keep them in one or several locations (`keep`, `study`, `session`, `hcp`). One can also use the `logfolder` parameter to specify where the *runlog* and *comlog* files are to be saved if the desired location differs from the default. Note that if a command is run within the `processing` folder or its subfolders, the folder from which the command is run is used as the `logfolder` location. Please see [Logging and log files](../Overview/Logging) for more information.

#### Executing a test run

Before running any of the above commands it is advisable to do a test run first, i.e. to perform all the checks without actually running the command. This can be achieved by appending the `--test` flag/parameter, which executes a 'dry' test run. If this parameter is set, each command performs a number of tests to establish whether the command might already have been run and successfully completed, and whether all the required files are available. These tests are reported in the *runlogs* and printed to the standard output in the terminal. If all the tested prerequisites are met, the session is reported as ready to run, both in the individual session report and in the final report section of the *runlog*. Examples are provided in the Examples section of the [`hcp_pre_freesurfer`](../../api/gmri/hcp_pre_freesurfer.rst) command reference.

#### Running multiple variants of HCP preprocessing

Sometimes the user might want to run the HCP pipelines with different settings and keep the results in parallel folders. To achieve this, use the `hcp_suffix` parameter both when setting up the data and when running the HCP minimal preprocessing pipeline steps. If the `hcp_suffix` parameter is specified, the processing will be run on data in the `<session id>/hcp/<session id><hcp_suffix>` folder.
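For example, to run a legacy-style variant alongside existing results, the same suffix would be passed at data setup and at each processing step (a sketch; the suffix value and paths are arbitrary placeholders):

```bash
qunex hcp_pre_freesurfer \
    --sessionsfolder="/data/studies/myStudy/sessions" \
    --batchfile="/data/studies/myStudy/processing/batch.txt" \
    --hcp_suffix="legacy" \
    --hcp_processing_mode="LegacyStyleData"
```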