Import transcript isoform quantification into EZbakRData object
ImportIsoformQuant.Rd
A convenient wrapper to tximport()
for importing isoform quantification
data into an EZbakRData object. You need to run this before running
EstimateIsoformFractions
.
Usage
ImportIsoformQuant(
obj,
files,
quant_tool = c("none", "salmon", "sailfish", "alevin", "piscem", "kallisto", "rsem",
"stringtie"),
txIn = TRUE,
...
)
Arguments
- obj
An
EZbakRData
object.- files
A named vector of paths to all transcript quantification files that you would like to import. This will be passed as the first argument of
tximport::tximport()
(also namedfiles
). The names of this vector should be the same as the sample names as they appear in the metadf of theEZbakRData
object.- quant_tool
String denoting the type of software used to generate the abundances. Will get passed to the
type
argument oftximport::tximport()
. As described in the documentation fortximport
'Options are "salmon", "sailfish", "alevin", "piscem", "kallisto", "rsem", "stringtie", or "none". This argument is used to autofill the arguments below (geneIdCol, etc.) "none" means that the user will specify these columns. Be aware that specifying type other than "none" will ignore the arguments below (geneIdCol, etc.)'. Referenced 'arguments below' can be specified as part of...
.- txIn
Whether or now you are providing isoform level quantification files. Alternative (
txIn = FALSE
) is gene-level quantification. InImportIsoformQuant
,txIn
gets passed to BOTH thetxIn
andtxOut
parameters intximport()
.- ...
Additional arguments to be passed to
tximport::tximport()
. Especially relevant if you setquant_tool
to "none".
Value
An EZbakRData
object with an additional element in the readcounts
list named "isform_quant_<quant_tool>". It contains TPM, expected_count,
and effective length information for each transcript_id and each sample.
Examples
# Dependencies for example
library(dplyr)
library(data.table)
#>
#> Attaching package: ‘data.table’
#> The following objects are masked from ‘package:dplyr’:
#>
#> between, first, last
# Simulate and analyze data
simdata <- EZSimulate(30)
ezbdo <- EZbakRData(simdata$cB, simdata$metadf)
ezbdo <- EstimateFractions(ezbdo)
#> Estimating mutation rates
#> Summarizing data for feature(s) of interest
#> Averaging out the nucleotide counts for improved efficiency
#> Estimating fractions
#> Processing output
# Hack to generate example quantification files
savedir <- tempdir()
rsem_data <- tibble(
transcript_id = paste0("tscript_feature", 1:30),
gene_id = paste0("feature", 1:30),
length = 1000,
effective_length = 1000,
expected_count = 1000,
TPM = 10,
FPKM = 10,
IsoPct = 1
)
fwrite(rsem_data, file.path(savedir, "Sample_1.isoforms.results"), sep = '\t')
files <- file.path(savedir,"Sample_1.isoforms.results")
names(files) <- "Sample_1"
# Read in file
ezbdo <- ImportIsoformQuant(ezbdo, files, quant_tool = "rsem")
#> reading in files with read.delim (install 'readr' package for speed up)
#> 1
#>