.. highlight:: shell

============
Get started
============

The command-line usage
--------------------------

After :ref:`installing iGV_snapshot_maker`, run this command in the terminal to verify the installation:

.. code-block:: console

    igv_snapshot_maker -h
    usage: IGV_snapshot_maker [-h] [-o output directory] [-e Extend +/- N bp]
                              [-g genome] [--igv IGV_CMD] [-m IGV memory MB]
                              -i Input file [-n]
                              [-b Target OS[Mac/Win] original_prefix new_prefix]
                              [-c config YAML file]

    IGV_snapshot_maker.py v0.1.0-dev: Genenerate IGV snapshots

    optional arguments:
      -h, --help            show this help message and exit
      -o output directory, --output output directory
                            Output directory for snapshots
      -e Extend +/- N bp, --extend Extend +/- N bp
                            Extend N (N=100 by default) base pairs in two
                            directions in IGV window
      -g genome             Name of the reference genome, Defaults to hg19
      --igv IGV_CMD         The command to run IGV (at CCAD)
      -m IGV memory (MB), --mem IGV memory (MB)
                            Amount of memory to allocate to IGV, in Megabytes
                            (MB)
      -i Input file, --input Input file
                            Input file in YAML format
      -n, --norun           Do not run the batch script
      -b Target OS[Mac/Win] original_prefix new_prefix, --binding Target OS[Mac/Win] original_prefix new_prefix
                            Replace the original path prefix with new path
                            prefix after binding at the target OS.
      -c config YAML file, --config config YAML file
                            IGV setting in YAML format

Prepare the YAML input file
---------------------------

The only required input is a YAML file that specifies the BAM files and the regions of interest for the IGV snapshots. The file contains a list of entries, each corresponding to one IGV session. Each entry has the following attributes:

.. list-table:: Attributes of each entry specified in the YAML input file
   :widths: 15 15 40 30
   :header-rows: 1

   * - Attribute
     - Type
     - Description
     - Examples
   * - name
     - string
     - Unique identifier for the session name (and folder)
     - cdRCC_1929_03_T01
   * - bam_files
     - list of strings
     - Absolute file paths to the BAM files
     - - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0421.bam
       - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0401.bam
   * - snapshots
     - list of items
     - Each item contains 5 attributes of a region of interest: name, chr, start, stop, and ext.
     - - chr: '1'
       - ext: 200
       - name: cdRCC_1929_03_T01_INTER_SV00035_BP1
       - start: 104423883
       - stop: 104423984

An example of the YAML input file:

.. code-block:: YAML

    ---
    - bam_files:
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0421.bam
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0401.bam
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0403.bam
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0402.bam
      name: cdRCC_1929_03_T01
      snapshots:
      - chr: '1'
        ext: 200
        name: cdRCC_1929_03_T01_INTER_SV00035_BP1
        start: 104423883
        stop: 104423984
      - chr: '8'
        ext: 200
        name: cdRCC_1929_03_T01_INTER_SV00035_BP2
        start: 33776273
        stop: 33776374
    - bam_files:
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK7006_2000.bam
      - /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK4017_0401.bam
      name: pRCC1_1654_01_T01
      snapshots:
      - chr: '6'
        ext: 200
        name: pRCC1_1654_01_T01_INTRA_SV00060_BP1
        start: 136376293
        stop: 136376293

YAML libraries exist for common programming languages such as Perl, Python, and R, so users with some programming experience can easily generate a YAML input file that specifies the regions of interest. We may also provide additional helper scripts, upon request, to convert other input formats to the YAML input file.

Run igv_snapshot_maker
----------------------

Users usually prefer to run igv_snapshot_maker on the server, where the BAM files are easily accessible. In that case, IGV and the Unix command ``xvfb-run`` must be installed on the server so that the IGV snapshots can be generated without a display.

Users may also review the regions of interest interactively in IGV if the snapshots generated by igv_snapshot_maker do not fully meet their needs. As a general solution, it is easier to mount the network drive where the BAM files are located than to transfer the large BAM files from the remote server to the local computer.

igv_snapshot_maker generates three types of IGV batch scripts in its output:

.. list-table:: Three types of IGV batch scripts generated by igv_snapshot_maker
   :widths: 25 50 25
   :header-rows: 1

   * - Name
     - Description
     - Examples
   * - .bat
     - IGV batch script, named after the session, that generates all the snapshots for the session on the (remote) server.
     - cdRCC_1929_03_T01.bat
   * - _ROIs.bat
     - IGV batch script that lists all the regions of interest for interactive inspection on the (local) desktop/laptop.
     - cdRCC_1929_03_T01_ROIs.bat
   * - .bat
     - IGV batch script, named after the snapshot, that regenerates that specific snapshot on the (local) desktop/laptop.
     - cdRCC_1929_03_T01_INTER_SV00035_BP1.bat

The BAM file paths differ among these batch scripts, because mounting the network drive changes the path to the BAM files. For example:

+ On the server side: /data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0421.bam
+ On the local computer:

  + Mac: /Volumes/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0421.bam
  + Windows: T:\\data\\DCEG_pRCC_SV\\EAGLE_Kidney_BAM\\GPK0149_0421.bam

Users may specify the binding path with the command-line option ``-b``, followed by three parameters: the target OS (Mac or Win), the original path prefix, and the new path prefix. For instance,

.. code-block:: console

    # For Windows machines
    igv_snapshot_maker -n -b Win '^/' 'T:\\' --igv "igv -m 20g " -i input.yaml

    # For Mac machines
    igv_snapshot_maker -n -b Mac '^/data' '/Volumes' --igv "igv -m 20g " -i input.yaml

    # The output from this dry-run (due to -n option)
    tree IGV_Snapshots/
    IGV_Snapshots/
    ├── cdRCC_1929_03_T01
    │   ├── cdRCC_1929_03_T01.bat
    │   ├── cdRCC_1929_03_T01_INTER_SV00035_BP1.bat
    │   ├── cdRCC_1929_03_T01_INTER_SV00035_BP2.bat
    │   └── cdRCC_1929_03_T01_ROIs.bat
    └── pRCC1_1654_01_T01
        ├── pRCC1_1654_01_T01.bat
        ├── pRCC1_1654_01_T01_INTRA_SV00060_BP1.bat
        └── pRCC1_1654_01_T01_ROIs.bat

    2 directories, 7 files

Run igv_snapshot_maker at Biowulf
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

xvfb is already installed on most Linux systems, including Biowulf and CCAD.

.. code-block:: console

    # Start an interactive session at Biowulf first
    sinteractive --cpus-per-task=4 --mem=32g --gres=lscratch:20

    # Then load igv in the interactive session
    module load igv

    igv_snapshot_maker -g hg19 -i pRCC_SV.yaml -o pRCC_mac -c IGV_config.yaml -b Mac '^/data' '/Volumes' --igv "igv -m 20g "

Example input files and config files are available on GitHub.
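
To preview how a ``-b`` binding would rewrite the BAM paths before running the tool, the prefix replacement can be pictured as a regular-expression substitution. The sketch below is only an illustration and not the code used by igv_snapshot_maker: the helper name ``rebind_path`` is hypothetical, and the conversion of forward slashes to backslashes for Windows is an assumption inferred from the example paths above.

.. code-block:: python

    import re


    def rebind_path(path, target_os, original_prefix, new_prefix):
        """Hypothetical helper: map a server-side path to its mounted location."""
        # A function is used as the replacement so that backslashes in
        # new_prefix are inserted literally rather than parsed as escapes.
        rebound = re.sub(original_prefix, lambda _match: new_prefix, path, count=1)
        if target_os == "Win":
            # Assumption: Windows paths use backslashes as separators.
            rebound = rebound.replace("/", "\\")
        return rebound


    server_path = "/data/DCEG_pRCC_SV/EAGLE_Kidney_BAM/GPK0149_0421.bam"

    # Same three parameters as the -b option: target OS, original prefix, new prefix
    print(rebind_path(server_path, "Mac", r"^/data", "/Volumes"))
    print(rebind_path(server_path, "Win", r"^/", "T:\\"))

With the server-side path from the example, the two calls should print the Mac and Windows locations listed earlier for the mounted network drive.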