Table of VDJtools modules
The VDJtools software package contains a comprehensive set of immune repertoire post-analysis routines, which are subdivided into several analysis modules. Each module's section provides the command line usage syntax and parameter descriptions for each routine, together with example output and its description.
Summary statistics, spectratyping, etc
- CalcBasicStats Computes summary statistics for samples: read counts, mean clonotype sizes, number of non-functional clonotypes, etc
- CalcSegmentUsage Computes Variable (V) and Joining (J) segment usage profiles
- CalcSpectratype Computes spectratype, the distribution of clonotype abundance by CDR3 sequence length
- PlotFancySpectratype Plots spectratype explicitly showing top N clonotypes
- PlotFancyVJUsage Plots the frequency of different V-J pairings
- PlotSpectratypeV Plots distribution of V segment abundance by resulting CDR3 sequence length
Repertoire richness and diversity
Clonotype sharing between samples
Filtering and resampling
- Correct Performs a frequency-based erroneous clonotype correction
- Decontaminate Filters possible cross-sample contaminations in a set of samples
- DownSample Performs down-sampling, i.e. takes a subset of random reads from sample(s)
- FilterNonFunctional Filters non-functional clonotypes
- SelectTop Selects a fixed number of top (most abundant) clonotypes from sample(s)
- FilterByFrequency Filters clonotypes based on a specified frequency threshold
- ApplySampleAsFilter Filters clonotypes that are present in a specified sample from sample(s)
- FilterBySegment Filters clonotypes according to their V/D/J segment
Clonotype table operations
Functional annotation of clonotype tables (antigen specificity, amino acid properties, etc)
- CalcCdrAAProfile Builds a profile of CDR3 regions (V germline, V-D junction, ...) using a set of amino-acid physical properties
- Annotate Computes a set of basic (insert size, ...) and amino acid physical properties (GRAVY, ...) for clonotypes
- ScanDatabase (Available only up to v1.0.5, use VDJdb) Queries a database containing clonotypes of known antigen specificity.
Some useful utilities
Each routine generates comprehensive tabular output, and some also
produce optional graphical output. For graphical output, the
corresponding R script is stored in the analysis folder, with the
arguments that were used given as comments at the beginning of the
script. Users can thus uncomment the script arguments, modify the script
and re-run it. This behavior can be disabled by running VDJtools with the
discard_scripts argument placed before the routine name.
By default, all graphical output is generated in PDF format. To generate
PNG images instead, use the --plot-type png option.
When running routines that output clonotype tables, consider the following:
- Joint and pooled samples are stored in VDJtools format
- Samples produced using the ScanDatabase (available only up to v1.0.5, use VDJdb) or Annotate routines are in VDJtools format and include additional annotation columns. Annotation columns are retained when running most VDJtools routines
- When loading a joint/pooled sample into VDJtools, clonotype abundance vectors, incidence counts, etc. will be treated as clonotype-level annotations
- Annotation columns will not be preserved when joining/pooling annotated samples. A workaround is to use the ApplySampleAsFilter routine
When importing a table generated by one of the VDJtools routines into R, use the following command to parse it correctly:
read.table("some_table.txt", header=T, quote="", sep = "\t")
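For readers who post-process tables in Python rather than R, a roughly equivalent sketch is shown below. It uses only the standard csv module with quoting disabled, which mirrors quote="" in the R call. The helper name read_vdjtools_table and the demo column names are illustrative assumptions, not part of VDJtools.

```python
import csv
import io


def read_vdjtools_table(handle):
    """Parse a tab-delimited table with a header row, keeping quote characters literal."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


# Illustrative in-memory table; the column names are assumptions for this example only.
demo = io.StringIO('sample_id\tnote\nA1\t5" primer\n')
rows = read_vdjtools_table(demo)
print(rows[0]["note"])
```

Disabling quote handling matters because a stray quote character inside a field would otherwise be read as the start of a quoted string and could merge several rows into one.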
There are several parameters that are commonly used among analysis routines:
||Brings up the help message for selected routine|
||path||Path to metadata file. Should point to a tab-delimited file with the first two columns containing sample path and sample id respectively, and the remaining columns containing user-specified data. See Metadata section|
||If this option is supported by the routine and is not set, all statistics will be weighted by clonotype frequency|
||string||Overlap type, which specifies which clonotype features (CDR3 sequence, V/J segments, hypermutations) will be compared when checking whether two clonotypes match. See the list of clonotype matching strategies below for allowed values.|
||[plotting] Enable plotting for routines that support it.|
||<pdf|png>||[plotting] Specifies whether to generate a PDF or PNG file. While PNG files are easier to embed, PDF plots have superior quality.|
||string||[plotting] Name of the sample metadata column that should be treated as factor. If the name contains spaces, the argument should be surrounded with double quotes, e.g.
||[plotting] Treat the factor as numeric?|
||string||[plotting] Name of the sample metadata column that should be treated as label. If the name contains spaces, the argument should be surrounded with double quotes, e.g.
||path||Compress resulting clonotype tables using GZIP.|
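The metadata layout described in the table above (sample path, then sample id, then user-defined columns, all tab-delimited) can be produced with a few lines of Python. This is a minimal sketch: the function name write_metadata, the header names file_name and sample_id, and the sample paths are assumptions made for illustration, so check the Metadata section for the exact header convention.

```python
import csv
import io


def write_metadata(handle, samples, extra_columns=()):
    """Write a tab-delimited metadata table: path, sample id, then user columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Header names are assumptions; see the Metadata section for the real convention.
    writer.writerow(["file_name", "sample_id", *extra_columns])
    for path, sample_id, *extra in samples:
        writer.writerow([path, sample_id, *extra])


# Hypothetical samples with one user-defined column that could later serve as a factor.
buffer = io.StringIO()
write_metadata(
    buffer,
    [("samples/A1.txt", "A1", "control"), ("samples/A2.txt", "A2", "treated")],
    extra_columns=["group"],
)
metadata_text = buffer.getvalue()
print(metadata_text)
```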
Some VDJtools routines require a clonotype matching strategy to be defined when computing clonotype sharing between samples. This parameter is also used when collapsing clonotype tables. A common case is estimating the extent of convergent recombination, that is, the number of distinct CDR3 nucleotide sequences per CDR3 amino acid sequence, which requires collapsing the clonotype table by identical CDR3aa field.
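The convergent recombination estimate mentioned above can be sketched in Python as a simple grouping step. This is not the VDJtools implementation; the function name, the dictionary field names cdr3nt and cdr3aa, and the toy sequences are assumptions chosen for illustration.

```python
from collections import defaultdict


def convergence_per_cdr3aa(clonotypes):
    """Map each CDR3 amino acid sequence to its number of distinct CDR3 nucleotide variants."""
    variants = defaultdict(set)
    for clonotype in clonotypes:
        variants[clonotype["cdr3aa"]].add(clonotype["cdr3nt"])
    return {aa: len(nts) for aa, nts in variants.items()}


# Toy records with made-up sequences; the field names are assumptions.
toy = [
    {"cdr3nt": "TGTGCCAGC", "cdr3aa": "CAS"},
    {"cdr3nt": "TGCGCCAGC", "cdr3aa": "CAS"},
    {"cdr3nt": "TGTGCCTGG", "cdr3aa": "CAW"},
]
convergence = convergence_per_cdr3aa(toy)
print(convergence)
```

Here the first two records encode the same amino acid sequence with different nucleotides, so that sequence is counted with two variants.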
The list of strategies is defined below.
|strict||CDR3nt (AND) V (AND) J (AND) SHMs||Require full match for receptor nucleotide sequence|
|ntV||CDR3nt (AND) V|
|ntVJ||CDR3nt (AND) V (AND) J|
|aaV||CDR3aa (AND) V|
|aaVJ||CDR3aa (AND) V (AND) J|
|aa!nt||CDR3aa (AND) ((NOT) CDR3nt)||Removes nearly all contamination bias from overlap results. Should not be used for samples from the same donor or for tracking experiments|
As somatic hypermutations (SHMs) are currently not supported by VDJtools,
the strict and ntVJ options are identical. See the VDJtools clonotype
specification for details.
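The rules in the strategy table can be read as comparisons over selected clonotype fields. The Python sketch below illustrates that reading under stated assumptions: the function names, the field names (cdr3nt, cdr3aa, v, j) and the toy records are hypothetical, SHMs are not modelled, and the code is not taken from VDJtools.

```python
# Fields compared by each strategy, following the table above.
STRATEGY_FIELDS = {
    "strict": ("cdr3nt", "v", "j"),  # SHMs are not modelled, so this equals ntVJ
    "ntV": ("cdr3nt", "v"),
    "ntVJ": ("cdr3nt", "v", "j"),
    "aaV": ("cdr3aa", "v"),
    "aaVJ": ("cdr3aa", "v", "j"),
}


def clonotypes_match(a, b, strategy):
    """Return True if clonotypes a and b match under the named strategy."""
    if strategy == "aa!nt":
        # Same amino acid sequence encoded by a different nucleotide sequence.
        return a["cdr3aa"] == b["cdr3aa"] and a["cdr3nt"] != b["cdr3nt"]
    fields = STRATEGY_FIELDS[strategy]
    return all(a[f] == b[f] for f in fields)


# Toy clonotypes with made-up values.
first = {"cdr3nt": "TGTGCCAGC", "cdr3aa": "CAS", "v": "TRBV5-1", "j": "TRBJ2-7"}
second = {"cdr3nt": "TGCGCCAGC", "cdr3aa": "CAS", "v": "TRBV5-1", "j": "TRBJ1-1"}

for name in ("strict", "ntV", "ntVJ", "aaV", "aaVJ", "aa!nt"):
    print(name, clonotypes_match(first, second, name))
```

With these toy records only aaV and aa!nt report a match, since the two clonotypes share the amino acid sequence and V segment but differ in nucleotide sequence and J segment.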