Don’t count… quantify!

•free, fast, low memory use
•accounts for experimental attributes and biases typical of RNA-seq data, including positional biases in coverage, sequence-specific biases at the 5′ and 3′ ends of sequenced fragments, fragment-level GC bias, strand-specific protocols, and the fragment length distribution
•input: a set of target transcripts

MAPPING-BASED MODE
•Mapping instead of alignment
•Salmon index
•Decoy-aware transcriptome
•Outputs a .sf file

Building index:
> ./bin/salmon index -t transcripts.fa -i transcripts_index --decoys decoys.txt -k 31

Paired-end reads:
> ./bin/salmon quant -i transcripts_index -l <LIBTYPE> -1 reads1.fq -2 reads2.fq --validateMappings -o transcripts_quant

Single-end reads:
> ./bin/salmon quant -i transcripts_index -l <LIBTYPE> -r reads.fq --validateMappings -o transcripts_quant

Multiple samples:
Zipped files:

ALIGNMENT-BASED MODE
•Pre-aligned data
•.bam files
•Outputs a .sf file

> ./bin/salmon quant -t transcripts.fa -l <LIBTYPE> -a aln.bam -o salmon_quant

LIBTYPE
•automatic library type detection in alignment-based mode
•argument -l A
•the library type string consists of three parts:
1.the relative orientation of the reads (only if the library is paired-end)
2.the strandedness of the library
3.the directionality of the reads (only if the library is stranded)

LIBTYPE EXAMPLES
[Figure: library type examples]

THREADS
•alignment-based mode: 8-12 threads for maximum speed (4 threads for BAM decompression, the rest for quantification)
•mapping-based mode: more threads = faster quantification
•-p argument (default = maximum number of available threads)

OUTPUT
[Figure: output table]

TPM
•transcripts per million (also written “transcripts per kilobase million”)
•how to calculate:
1.divide the read count of each gene by the length of that gene in kilobases; this gives reads per kilobase (RPK)
2.sum all the RPK values in the sample and divide the sum by 1,000,000 to get the “per million” scaling factor
3.divide each RPK value by the scaling factor; the result is TPM

SOURCES
•https://combine-lab.github.io/salmon/
•https://salmon.readthedocs.io/en/latest/salmon.html
•https://nf-co.re/modules/salmon_quant#input
•https://sailfish.readthedocs.io/en/develop/salmon.html
•https://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/
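
TPM: WORKED EXAMPLE
The three TPM steps can be checked by hand on a toy table. The sketch below is only an illustration of the arithmetic, not how Salmon computes its own TPM column: the function name tpm_from_counts, the three-column input layout (name, length in bases, read count) and the transcript names and numbers are all made up for this example.

```shell
# Hypothetical helper: reads "name length_in_bases read_count" lines on stdin
# and prints "name<TAB>TPM". LC_ALL=C keeps the decimal point locale-independent.
tpm_from_counts() {
  LC_ALL=C awk '
    {
      name[NR] = $1
      rpk[NR]  = $3 / ($2 / 1000)    # step 1: reads per kilobase (RPK)
      total   += rpk[NR]             # running sum of all RPK values
    }
    END {
      scale = total / 1000000        # step 2: "per million" scaling factor
      for (i = 1; i <= NR; i++)
        printf "%s\t%.1f\n", name[i], rpk[i] / scale   # step 3: TPM
    }'
}

# Toy data (invented): RPK values are 5, 5 and 30, so the RPK sum is 40.
printf 'txA 2000 10\ntxB 4000 20\ntxC 1000 30\n' | tpm_from_counts
```

With this toy input the RPK values are 5, 5 and 30, the scaling factor is 40 / 1,000,000, and the printed TPM values are 125000.0, 125000.0 and 750000.0, which add up to one million as TPM values in a sample always do.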