import_hcp

qx_utilities.hcp.import_hcp.import_hcp(sessionsfolder=None, inbox=None, sessions=None, action='link', overwrite='no', archive='move', hcplsname=None, nameformat=None, filesort=None, processed_data=None)

import_hcp [sessionsfolder=.] [inbox=<sessionsfolder>/inbox/HCPLS] [sessions=""] [action=link] [overwrite=no] [archive=move] [hcplsname=<inbox folder name>] [nameformat='(?P<subject_id>[^/]+?)_(?P<session_name>[^/]+?)/unprocessed/(?P<data>.*)'] [filesort=<file sorting option>] [processed_data=<path to hcp processed data>]

Maps HCPLS data to the QuNex Suite file structure.

Parameters

--sessionsfolder (str, default '.'):

The sessions folder to which all the sessions are to be mapped. It should be a folder within the <study folder>.

--inbox (str, default <sessionsfolder>/inbox/HCPLS):

The location of the HCPLS dataset. It can be any of the following: the HCPLS dataset top folder, a folder that contains the HCPLS dataset, a path to a compressed .zip or .tar.gz package that contains a single session or a multi-session dataset, or a folder that contains a compressed package. For instance, the user can specify "<path>/<hcpfs_file>.zip" or "<path>" to a folder that contains multiple packages. The default location where the command will look for an HCPLS dataset is "<sessionsfolder>/inbox/HCPLS".

--sessions (str, default detailed below):

An optional parameter that specifies a comma or pipe separated list of sessions from the inbox folder to be processed. Regular expression patterns can be used. If provided, only packets or folders within the inbox that match the list of sessions will be processed. If inbox is a file, sessions will not be applied. If inbox is a valid HCPLS datastructure folder, then the sessions will be matched against the <subject id>[_<session name>]. Note: a session will match if the string is found within the package name or the session id. So 'HCPA' will match any zip file that contains the string 'HCPA' or any session id that contains 'HCPA'!
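The matching rule described above can be illustrated with a minimal Python sketch. It assumes that each entry in the list is applied as a regular expression search anywhere in the package name or session id; the helper names `split_session_filters` and `matches_any` and the candidate names are hypothetical and are not part of QuNex.

```python
import re

def split_session_filters(sessions):
    """Split a comma- or pipe-separated --sessions value into patterns."""
    return [p.strip() for p in re.split(r"[,|]", sessions) if p.strip()]

def matches_any(name, patterns):
    """Return True if any pattern is found anywhere within name."""
    return any(re.search(p, name) for p in patterns)

# Hypothetical filter list and candidate names (session ids and a package name).
patterns = split_session_filters("HCA6086369_V1_MR,HCA6166973_V1_MR|HCD00")
candidates = [
    "HCA6086369_V1_MR",
    "HCD0001305_V1_MR",
    "HCA9999999_V1_MR",
    "study_HCD00_batch.zip",
]

# Keep only the candidates that contain at least one of the patterns.
selected = [c for c in candidates if matches_any(c, patterns)]
print(selected)
```

Because the search is not anchored, 'HCD00' selects both the session id that begins with it and the package name that merely contains it, which is the behaviour the note warns about.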

--action (str, default 'link'):

How to map the files to QuNex structure. The following actions are supported:

  • 'link' ... files will be mapped by creating hard links if possible, otherwise they will be copied

  • 'copy' ... files will be copied

  • 'move' ... files will be moved.
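The three actions can be sketched in Python as follows. This is an illustration of the "hard link if possible, otherwise copy" idea under stated assumptions, not the QuNex implementation; `map_file` is a hypothetical helper, and it does not apply the overwrite policy described below.

```python
import os
import shutil
import tempfile

def map_file(src, dst, action="link"):
    """Map src to dst using one of the documented actions (illustrative only)."""
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    if action == "link":
        try:
            os.link(src, dst)       # hard link; fails e.g. across filesystems
            return "linked"
        except OSError:
            shutil.copy2(src, dst)  # fall back to a regular copy
            return "copied"
    if action == "copy":
        shutil.copy2(src, dst)
        return "copied"
    if action == "move":
        shutil.move(src, dst)
        return "moved"
    raise ValueError(f"unsupported action: {action}")

# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "inbox", "image.nii.gz")
    os.makedirs(os.path.dirname(source))
    with open(source, "w") as handle:
        handle.write("placeholder")
    target = os.path.join(tmp, "sessions", "image.nii.gz")
    print(map_file(source, target, action="link"))
```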

--overwrite (str, default 'no'):

The parameter specifies what should be done with data that already exists in the locations to which HCPLS data would be mapped. Options are:

  • 'no' ... do not overwrite the data and skip processing of the session

  • 'yes' ... remove existing files in the nii folder and redo the mapping.

--archive (str, default 'move'):

What to do with the files after they were mapped. Options are:

  • 'leave' ... leave the specified archive where it is

  • 'move' ... move the specified archive to <sessionsfolder>/archive/HCPLS

  • 'copy' ... copy the specified archive to <sessionsfolder>/archive/HCPLS

  • 'delete' ... delete the archive after processing if no errors were identified.

Please note that the archive parameter can interact with the action parameter. If action is set to 'move', the files are moved out of the source during mapping, so they will be missing when archive is set to 'move' or 'copy'.

--hcplsname (str, default detailed below):

The optional name of the HCPLS dataset. If not provided it will be set to the name of the inbox folder or the name of the compressed package.

--nameformat (str, default '(?P<subject_id>[^/]+?)_(?P<session_name>[^/]+?)/unprocessed/(?P<data>.*)'):

An optional parameter that contains a regular expression pattern with named fields used to extract the subject and session information based on the file paths and names. The pattern has to return the groups named:

  • 'subject_id' ... the id of the subject

  • 'session_name' ... the name of the session

  • 'data' ... the rest of the path with the sequence related files.
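To see how the default pattern splits a path, the following Python sketch applies it with `re.search` to a made-up relative path. Whether QuNex uses `re.search` or another matching call is an assumption here, and the example path and the helper `parse_path` are hypothetical.

```python
import re

# Default nameformat as documented above.
NAMEFORMAT = (
    r"(?P<subject_id>[^/]+?)_(?P<session_name>[^/]+?)"
    r"/unprocessed/(?P<data>.*)"
)

def parse_path(path, nameformat=NAMEFORMAT):
    """Return the named groups for path, or None if the pattern does not match."""
    match = re.search(nameformat, path)
    return match.groupdict() if match else None

# Hypothetical file path inside an HCPLS dataset.
info = parse_path("HCA6086369_V1_MR/unprocessed/T1w/image.nii.gz")
print(info)
# {'subject_id': 'HCA6086369', 'session_name': 'V1_MR', 'data': 'T1w/image.nii.gz'}

# The session identifier combines the first two groups.
session_id = f"{info['subject_id']}_{info['session_name']}"
print(session_id)  # HCA6086369_V1_MR
```

Because both name groups are non-greedy, the subject id ends at the first underscore and the session name takes the remainder of the folder name. A dataset whose subject ids contain underscores would need a custom pattern.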

--filesort (str, default 'name_type_se'):

An optional parameter that specifies how the files should be sorted before they are mapped to the nii folder and included in session_hcp.txt. The sorting is specified by a string of sort keys separated by '_'. The available sort keys are:

  • 'name' ... sort by the name of the file

  • 'type' ... sort by the type of the file (T1w, T2w, rfMRI, tfMRI, Diffusion)

  • 'se' ... sort by the number of the related pair of the SE fieldmap images.

The files will be sorted in the order of the listed keys.

NOTE:

  1. SE field map pair will always come before the first image in the sorted list that references it.

  2. Diffusion images will always be listed jointly in a fixed order.
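The way a key string such as 'name_type_se' translates into a sort order can be sketched with a multi-key sort in Python. This is only an illustration: the record layout (`name`, `type`, `se`), the helper `sort_files`, and the assumption that types are ranked in the order listed above are all hypothetical, and the two special rules in the note are not modelled.

```python
# Assumed ranking of file types, taken from the order in which they are listed.
TYPE_ORDER = ["T1w", "T2w", "rfMRI", "tfMRI", "Diffusion"]

def sort_files(files, filesort="name_type_se"):
    """Sort file records by the keys named in filesort, in the given order."""
    key_functions = {
        "name": lambda f: f["name"],
        "type": lambda f: TYPE_ORDER.index(f["type"]),
        "se": lambda f: f["se"],
    }
    keys = filesort.split("_")
    return sorted(files, key=lambda f: tuple(key_functions[k](f) for k in keys))

# Hypothetical file records; 'se' is the number of the related SE fieldmap pair.
files = [
    {"name": "rfMRI_REST1_AP", "type": "rfMRI", "se": 2},
    {"name": "T1w_MPR", "type": "T1w", "se": 1},
    {"name": "tfMRI_CARIT_PA", "type": "tfMRI", "se": 1},
    {"name": "T2w_SPC", "type": "T2w", "se": 1},
]

by_type = [f["name"] for f in sort_files(files, "type_name")]
by_se = [f["name"] for f in sort_files(files, "se_name")]
print(by_type)
print(by_se)
```

With these records, sorting by 'type_name' places the rfMRI file before the tfMRI file, whereas 'se_name' places it last because it refers to the second SE pair.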

--processed_data (str):

Path to the folder with processed data. If provided, the command will copy that data along with the unprocessed data. When onboarding multiple sessions, mark the portion of the path that should be replaced with the session id using <session_id>, e.g.: --processed_data=/archive/fMRI/hca/<session_id>.
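The placeholder substitution can be pictured with a short Python sketch. It assumes a plain string replacement of <session_id>; the helper `processed_path` and the session ids are hypothetical.

```python
def processed_path(template, session_id):
    """Fill the <session_id> placeholder in a --processed_data template."""
    return template.replace("<session_id>", session_id)

template = "/archive/fMRI/hca/<session_id>"
for session in ["HCA6086369_V1_MR", "HCA6166973_V1_MR"]:
    print(processed_path(template, session))
# /archive/fMRI/hca/HCA6086369_V1_MR
# /archive/fMRI/hca/HCA6166973_V1_MR
```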

Output files

After running the import_hcp command the HCPLS dataset will be mapped to the QuNex folder structure and image files will be prepared for further processing along with required metadata.

  • The original HCPLS session-level data is stored in:

    <sessionsfolder>/<session>/hcpls
    
  • Image files mapped to new names for QuNex are stored in:

    <sessionsfolder>/<session>/nii
    
  • The full description of the mapped files is in:

    <sessionsfolder>/<session>/session.txt
    
  • The output log of HCPLS mapping is in:

    <sessionsfolder>/<session>/hcpls/hcpls2nii.log
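
For scripts that need to check the results of an import, the locations listed above can be assembled as in the following Python sketch. The helper `session_outputs` and the example folder are hypothetical; only the relative layout is taken from the list above.

```python
from pathlib import PurePosixPath

def session_outputs(sessionsfolder, session):
    """Return the documented output locations for one imported session."""
    root = PurePosixPath(sessionsfolder) / session
    return {
        "hcpls": root / "hcpls",                   # original HCPLS data
        "nii": root / "nii",                       # renamed image files
        "session_file": root / "session.txt",      # description of mapped files
        "log": root / "hcpls" / "hcpls2nii.log",   # mapping log
    }

# Hypothetical sessions folder and session id.
outputs = session_outputs("/data/study/sessions", "HCA6086369_V1_MR")
for label, path in outputs.items():
    print(f"{label}: {path}")
```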
    

Notes

The import_hcp command consists of two steps:

  1. Mapping HCPLS dataset to QuNex Suite folder structure:

    The inbox parameter specifies the location of the HCPLS dataset. This path is inspected for an HCPLS-compliant dataset. The path can point to a folder with an extracted HCPLS dataset, a .zip or .tar.gz archive, or a folder containing one or more .zip or .tar.gz archives. In the initial step, each file found will be assigned to a specific session.

    <hcpls_dataset_name> can be provided as a hcplsname parameter to the command call. If hcplsname is not provided, the name will be set to the name of the parent folder or the name of the compressed archive.

    The files identified as belonging to a specific session will be mapped to folder:

    <sessions_folder>/<subject>_<session>/hcpls
    

    The <subject>_<session> string will be used as the identifier for the session in all the following steps. If the folder for the session does not exist, it will be created.

    When the files are mapped, their filenames will be preserved.

  2. Mapping image files to QuNex Suite nii folder:

    For each session separately, images from the hcpls folder are mapped to the nii folder and an appropriate session.txt file is created per the standard QuNex specification.

    The second step is achieved by running map_hcpls2nii on each session folder. This step is run automatically, but can be invoked independently if mapping of HCPLS dataset to QuNex Suite folder structure was already completed. For detailed information about this step, please review map_hcpls2nii inline help.

Please see map_hcpls2nii inline documentation!

Importing only specific sessions:

If only specific sessions are to be imported, then the --sessions parameter can be used. Specifically, --sessions is an optional parameter that should specify a comma or pipe separated list of sessions from the inbox folder to be processed. Regular expression patterns can be used. If provided, only packets or folders within the inbox folder that match the list of sessions will be processed. If inbox is a file, sessions will not be applied. If inbox is a valid HCPLS datastructure folder, then the sessions will be matched against the <subject id>[_<session name>].

Note: a session will match if the string is found within the package name or the session id. So 'HCPA' will match any zip file that contains the string 'HCPA' or any session id that contains 'HCPA'!

Examples

qunex import_hcp \
   --sessionsfolder="<absolute path to study folder>/sessions" \
   --inbox="<absolute path to folder with HCP dataset>" \
   --archive=move \
   --overwrite=yes

The above command would map the entire HCP dataset located at the specified location into the relevant sessions’ folders (creating them when needed), organize the MR image files in the sessions’ nii folder, and prepare the session_hcp.txt file for further processing. Any preexisting data for the sessions present in the HCP dataset would be removed and replaced. By default, the HCP files would be hard-linked to the new location.

qunex import_hcp \
   --sessionsfolder="<absolute path to study folder>/sessions" \
   --inbox="<absolute path to folder with HCP dataset>" \
   --action='copy' \
   --archive='leave' \
   --overwrite=no

The above command would map the entire HCP dataset located at the specified location into the relevant session folders (creating them when needed), organize the MR image files in the sessions’ nii folder, and prepare the session_hcp.txt file for further processing. If HCP mapped data already exist for any of the sessions, that session will be skipped. The files would be mapped to their destinations by creating a copy rather than hard-linking them.

qunex import_hcp \
   --sessionsfolder="<absolute path to study folder>/sessions" \
   --sessions="HCA6086369_V1_MR,HCA6166973_V1_MR,HCD00" \
   --inbox="<absolute path to folder with HCP dataset>" \
   --action='copy' \
   --archive='leave' \
   --overwrite=no

The above example additionally specifies that only sessions ‘HCA6086369_V1_MR’ and ‘HCA6166973_V1_MR’, and any session whose id contains ‘HCD00’, should be imported.

qunex import_hcp \
    --sessionsfolder=myStudy/sessions \
    --inbox=HCPLS \
    --overwrite=yes \
    --hcplsname=hcpls
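
When imports are launched from a script, a call like the last one can be assembled programmatically. The Python sketch below only builds and prints the command line using parameters documented above; the helper `build_import_hcp_command` is hypothetical, and running the result (for example with `subprocess.run`) assumes that `qunex` is available on the PATH.

```python
import shlex

def build_import_hcp_command(**params):
    """Build an argument list for 'qunex import_hcp' from keyword parameters."""
    args = ["qunex", "import_hcp"]
    for name, value in params.items():
        args.append(f"--{name}={value}")
    return args

cmd = build_import_hcp_command(
    sessionsfolder="myStudy/sessions",
    inbox="HCPLS",
    overwrite="yes",
    hcplsname="hcpls",
)

# Print a shell-quoted version of the command for inspection or logging.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems when values such as a custom nameformat contain characters that are special to the shell.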