Simulation of transcript isoform kinetic parameters.
SimulateIsoforms() performs a simple simulation of isoform-specific kinetic parameters to showcase and test EstimateIsoformFractions(). It assumes that a set of reads (a fraction of the total, set by the funique parameter) maps uniquely to a given isoform, while the remaining reads are ambiguous among all isoforms of that gene. The mutational content of these reads is simulated as in SimulateOneRep().
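The read-level idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the binomial draws, the hypothetical fraction_new value, and the nreads count are assumptions made for illustration, while funique, readlength, Ucont, pnew, and pold mirror the arguments documented on this page.

```r
set.seed(42)

# Hypothetical inputs for the reads of a single isoform (illustrative only)
nreads       <- 1000   # assumed number of reads from this isoform
funique      <- 0.2    # fraction of reads that map uniquely
readlength   <- 200    # every read has exactly this length
Ucont        <- 0.25   # probability that a nucleotide is a U
pnew         <- 0.1    # T-to-C mutation rate in new reads
pold         <- 0.002  # T-to-C mutation rate in old reads
fraction_new <- 0.4    # assumed fraction of reads from newly made RNA

# Flag each read as uniquely mapping (TRUE) or ambiguous (FALSE)
is_unique <- rbinom(nreads, size = 1, prob = funique) == 1

# Flag each read as new (TRUE) or old (FALSE)
is_new <- rbinom(nreads, size = 1, prob = fraction_new) == 1

# Number of Us (sequenced as Ts) in each read
nT <- rbinom(nreads, size = readlength, prob = Ucont)

# T-to-C mutation count per read: new reads use pnew, old reads use pold
TC <- rbinom(nreads, size = nT, prob = ifelse(is_new, pnew, pold))

reads <- data.frame(unique = is_unique, new = is_new, nT = nT, TC = TC)
head(reads)
```

In this sketch only the reads flagged as unique would be assigned to a specific isoform; the rest would carry a gene-level label shared by all isoforms.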
Usage
SimulateIsoforms(
  nfeatures,
  nt = NULL,
  seqdepth = nfeatures * 2500,
  label_time = 4,
  sample_name = "sampleA",
  feature_prefix = "Gene",
  pnew = 0.1,
  pold = 0.002,
  funique = 0.2,
  readlength = 200,
  Ucont = 0.25,
  avg_numiso = 2,
  psynthdiff = 0.5,
  logkdeg_mean = -1.9,
  logkdeg_sd = 0.7,
  logksyn_mean = 2.3,
  logksyn_sd = 0.7
)
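A call using the signature above might look like the following sketch. The argument names come from the Usage block; the specific values are arbitrary, and the package name EZbakR is an assumption that is not stated on this page, so the call is guarded and skipped if no such package is installed.

```r
# Argument names are taken from the Usage signature; values are illustrative
sim_args <- list(
  nfeatures  = 50,
  seqdepth   = 50 * 2500,  # same rule as the default: nfeatures * 2500
  label_time = 4,
  funique    = 0.2,
  avg_numiso = 2
)

# Assumption: SimulateIsoforms() is exported by a package named EZbakR
if (requireNamespace("EZbakR", quietly = TRUE)) {
  sim <- do.call(EZbakR::SimulateIsoforms, sim_args)
  # Inspect the top-level structure of whatever is returned
  str(sim, max.level = 1)
}
```

Arguments left out of sim_args fall back to the defaults listed in the Usage block.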
Arguments
- nfeatures
Number of "features" to simulate data for. Each feature will have a simulated number of transcript isoforms.
- nt
(Optional) A vector giving the number of isoforms to simulate for each of the nfeatures features. The vector can be either of length 1, in which case that many isoforms are simulated for every feature, or of length equal to nfeatures.
- seqdepth
Total number of sequencing reads to simulate.
- label_time
Length of s^4U feed to simulate.
- sample_name
Character vector to assign to the sample column of the output simulated data table (the cB table).
- feature_prefix
Prefix used to name features: the i-th feature is named paste0(feature_prefix, i). Shows up in the feature column of the output simulated data table.
- pnew
Probability that a T is mutated to a C if a read is new.
- pold
Probability that a T is mutated to a C if a read is old.
- funique
Fraction of reads that uniquely "map" to a single isoform.
- readlength
Length of simulated reads. In this simple simulation, all reads are simulated as being exactly this length.
- Ucont
Probability that a nucleotide in a simulated read is a U.
- avg_numiso
Average number of isoforms for each feature. Feature-specific isoform counts are drawn from a Poisson distribution with this average. NOTE: to ensure that all features have multiple isoforms, the number of isoforms drawn from the Poisson distribution is incremented by 2. Thus, the actual average number of isoforms per feature is avg_numiso + 2.
- psynthdiff
Proportion of genes for which all isoform abundance differences are synthesis driven. For the remaining genes, isoform abundance differences are driven by differences in isoform kdegs.
- logkdeg_mean
meanlog of the log-normal distribution from which kdegs are simulated.
- logkdeg_sd
sdlog of the log-normal distribution from which kdegs are simulated.
- logksyn_mean
meanlog of the log-normal distribution from which ksyns are simulated.
- logksyn_sd
sdlog of the log-normal distribution from which ksyns are simulated.
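The way these arguments could combine is sketched below in base R. This is an illustration under stated assumptions, not the function's actual code: it assumes that "synthesis driven" means the isoforms of a gene share one kdeg and differ in ksyn (and the reverse otherwise), and it uses the common steady-state relation fraction new = 1 - exp(-kdeg * label_time), which this page does not state.

```r
set.seed(1)

# Values mirror the documented defaults; nfeatures is chosen arbitrarily
nfeatures    <- 5
avg_numiso   <- 2
psynthdiff   <- 0.5
logkdeg_mean <- -1.9
logkdeg_sd   <- 0.7
logksyn_mean <- 2.3
logksyn_sd   <- 0.7
label_time   <- 4

# Isoform counts: Poisson draw incremented by 2, as described for avg_numiso
nt <- rpois(nfeatures, lambda = avg_numiso) + 2

# Decide per feature whether isoform differences are synthesis driven
synth_driven <- rbinom(nfeatures, size = 1, prob = psynthdiff) == 1

sim <- do.call(rbind, lapply(seq_len(nfeatures), function(i) {
  n <- nt[i]
  if (synth_driven[i]) {
    # Assumption: shared kdeg, isoform-specific ksyn
    kdeg <- rep(rlnorm(1, meanlog = logkdeg_mean, sdlog = logkdeg_sd), n)
    ksyn <- rlnorm(n, meanlog = logksyn_mean, sdlog = logksyn_sd)
  } else {
    # Assumption: isoform-specific kdeg, shared ksyn
    kdeg <- rlnorm(n, meanlog = logkdeg_mean, sdlog = logkdeg_sd)
    ksyn <- rep(rlnorm(1, meanlog = logksyn_mean, sdlog = logksyn_sd), n)
  }
  data.frame(
    feature      = paste0("Gene", i),
    isoform      = seq_len(n),
    kdeg         = kdeg,
    ksyn         = ksyn,
    # Assumed steady-state model for the fraction of new RNA
    fraction_new = 1 - exp(-kdeg * label_time)
  )
}))

sim
```

With these assumptions, every feature has at least two isoforms, and the steady-state abundance of each isoform would be proportional to ksyn / kdeg.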