Create a Censored Student’s T Distribution

Description

Class and methods for left-, right-, and interval-censored t distributions using the workflow from the distributions3 package.

Usage

CensoredStudentsT(df, location = 0, scale = 1, left = -Inf, right = Inf)

Arguments

df numeric. The degrees of freedom of the underlying uncensored t distribution. Can be any positive number, with df = Inf corresponding to the normal distribution.
location numeric. The location parameter of the underlying uncensored t distribution, typically written \(\mu\) in textbooks. Can be any real number, defaults to 0.
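scale numeric. The scale parameter of the underlying uncensored t distribution, typically written \(\sigma\) in textbooks. It coincides with the standard deviation only in the limiting case df = Inf; for finite df > 2 the standard deviation of the uncensored variable is \(\sigma \sqrt{df/(df - 2)}\). Can be any positive number, defaults to 1.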
left numeric. The left censoring point. Can be any real number, defaults to -Inf (uncensored). If set to a finite value, the distribution has a point mass at left whose probability equals the cumulative distribution function of the uncensored t distribution at this point.
right numeric. The right censoring point. Can be any real number, defaults to Inf (uncensored). If set to a finite value, the distribution has a point mass at right whose probability equals 1 minus the cumulative distribution function of the uncensored t distribution at this point.
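
As a rough illustration of the two point masses, the following base-R sketch evaluates them for a latent t distribution with df = 5, location = 2, and scale = 2 that is censored to [0, 5]. It relies only on stats::pt and standardizes by hand; the names p_left, p_right, and p_inner are made up for this sketch and are not part of crch or distributions3.

```r
## latent (uncensored) parameters and censoring points
df <- 5
location <- 2
scale <- 2
left <- 0
right <- 5

## mass at the left censoring point: latent CDF evaluated at `left`
p_left <- pt((left - location) / scale, df = df)

## mass at the right censoring point: 1 minus the latent CDF at `right`
p_right <- pt((right - location) / scale, df = df, lower.tail = FALSE)

## the remaining probability is spread continuously over (left, right)
p_inner <- 1 - p_left - p_right

c(left = p_left, inner = p_inner, right = p_right)
```

The three values sum to 1, which is one way to check that censoring only moves probability to the boundaries and does not discard it.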

Details

The constructor function CensoredStudentsT sets up a distribution object that represents the censored t probability distribution by its parameters: the degrees of freedom df, the latent location parameter location = \(\mu\) and the latent scale parameter scale = \(\sigma\) (i.e., the parameters of the underlying uncensored t variable), the left censoring point (with -Inf corresponding to uncensored), and the right censoring point (with Inf corresponding to uncensored).

The censored t distribution has probability density function (PDF) \(f(x)\):

\[
f(x) =
\begin{cases}
T((\mathit{left} - \mu)/\sigma) & \text{if } x \le \mathit{left} \\
1 - T((\mathit{right} - \mu)/\sigma) & \text{if } x \ge \mathit{right} \\
\tau((x - \mu)/\sigma)/\sigma & \text{otherwise}
\end{cases}
\]

where \(T\) and \(\tau\) are the cumulative distribution function and probability density function of the standard t distribution with df degrees of freedom, respectively.
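
The piecewise definition above can be sketched in base R as follows. The function dct_sketch is a hypothetical helper written for this illustration with stats::pt and stats::dt; it is not the implementation used by crch, and it does no argument checking.

```r
## hypothetical helper (not the crch implementation): censored t "density"
## following the three cases of f(x)
dct_sketch <- function(x, df, location = 0, scale = 1,
                       left = -Inf, right = Inf) {
  ifelse(
    x <= left,
    ## point mass at the left censoring point
    pt((left - location) / scale, df = df),
    ifelse(
      x >= right,
      ## point mass at the right censoring point
      pt((right - location) / scale, df = df, lower.tail = FALSE),
      ## rescaled latent t density in between
      dt((x - location) / scale, df = df) / scale
    )
  )
}

## uncensored, left-censored at 0, and interval-censored in [0, 5]
d <- dct_sketch(
  x        = c(0, 0, 1),
  df       = 5,
  location = c(0, 1, 2),
  scale    = c(1, 1, 2),
  left     = c(-Inf, 0, 0),
  right    = c(Inf, Inf, 5)
)
d
```

With these inputs the sketch should agree with the pdf(X, x) values reported in the examples on this page. Note that the second value is a probability (the mass at left = 0), while the first and third are ordinary density values.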

All parameters can also be vectors, so that it is possible to define a vector of censored t distributions with potentially different parameters. All parameters need to have the same length or must be scalars (i.e., of length 1) which are then recycled to the length of the other parameters.
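
The recycling rule can be pictured with a base-R data frame, which also recycles length-1 columns to the common length. This is only an analogy chosen for illustration; it is an assumption of this sketch and not a statement about how CensoredStudentsT stores its parameters.

```r
## the scalar df is recycled against the length-3 vectors
pars <- data.frame(
  df       = 5,
  location = c(0, 1, 2),
  scale    = c(1, 1, 2),
  left     = c(-Inf, 0, 0),
  right    = c(Inf, Inf, 5)
)
pars

## lengths 2 and 3 cannot be reconciled, so base R raises an error here
bad <- tryCatch(
  data.frame(df = c(5, 10), location = c(0, 1, 2)),
  error = function(e) "error: incompatible lengths"
)
bad
```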

For CensoredStudentsT distribution objects, methods are available for the standard generics provided in the distributions3 package: pdf and log_pdf for the (log-)density (PDF), cdf for the cumulative distribution function (CDF), quantile for quantiles, random for simulating random variables, crps for the continuous ranked probability score (CRPS), and support for the support interval (minimum and maximum). Internally, these methods rely on the usual d/p/q/r functions provided for the censored t distribution in the crch package (see dct) and on the crps_ct function from the scoringRules package. The methods is_discrete and is_continuous can be used to query whether the distributions are discrete on the entire support (always FALSE) or continuous on the entire support (only TRUE if there is no censoring, i.e., if both left and right are infinite).
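
To make the relation between the censored and the latent distribution more concrete, here is a minimal base-R sketch of the CDF, quantile function, random number generation, and the continuity check. The helpers pct_sketch, qct_sketch, and rct_sketch are hypothetical names introduced for this illustration; they are built from stats::pt, stats::qt, and stats::rt and are not the functions from crch.

```r
## hypothetical helpers, not the crch implementation

## CDF: latent CDF inside [left, right), 0 below left, 1 from right onwards
pct_sketch <- function(q, df, location = 0, scale = 1,
                       left = -Inf, right = Inf) {
  p <- pt((q - location) / scale, df = df)
  p <- p * (q >= left)              # no mass below the left censoring point
  pmax(p, as.numeric(q >= right))   # all mass reached at the right point
}

## quantiles: latent quantiles clamped to [left, right]
qct_sketch <- function(p, df, location = 0, scale = 1,
                       left = -Inf, right = Inf) {
  pmin(pmax(location + scale * qt(p, df = df), left), right)
}

## random numbers: latent draws clamped to [left, right]
rct_sketch <- function(n, df, location = 0, scale = 1,
                       left = -Inf, right = Inf) {
  pmin(pmax(location + scale * rt(n, df = df), left), right)
}

df <- 5
location <- c(0, 1, 2)
scale <- c(1, 1, 2)
left <- c(-Inf, 0, 0)
right <- c(Inf, Inf, 5)

p_vals <- pct_sketch(c(0, 0, 1), df, location, scale, left, right)
q_lo <- qct_sketch(0.05, df, location, scale, left, right)
q_hi <- qct_sketch(0.95, df, location, scale, left, right)

set.seed(1)
r <- rct_sketch(1000, df, location = 2, scale = 2, left = 0, right = 5)
range(r)       # draws never leave [0, 5]
mean(r == 0)   # share of draws piled up at the left censoring point

## continuous on the entire support only without any censoring
left == -Inf & right == Inf
```

Clamping is what produces the point masses: every latent draw or quantile that falls outside [left, right] is moved onto the nearest censoring point.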

See the examples below for an illustration of the workflow for the class and methods.

Value

A CensoredStudentsT distribution object.

See Also

dct, StudentsT, TruncatedStudentsT, CensoredNormal, CensoredLogistic

Examples

library("crch")


## package and random seed
library("distributions3")
set.seed(6020)

## three censored t distributions:
## - uncensored standard t with 5 degrees of freedom
## - left-censored at zero with 5 df, latent location = 1 and scale = 1
## - interval-censored in [0, 5] with 5 df, latent location = 2 and scale = 2
X <- CensoredStudentsT(
  df       = c(   5,   5, 5),
  location = c(   0,   1, 2),
  scale    = c(   1,   1, 2),
  left     = c(-Inf,   0, 0),
  right    = c( Inf, Inf, 5)
)

X
#> [1] "CensoredStudentsT distribution (df = 5, location = 0, scale = 1, left = -Inf, right = Inf)"
#> [2] "CensoredStudentsT distribution (df = 5, location = 1, scale = 1, left =    0, right = Inf)"
#> [3] "CensoredStudentsT distribution (df = 5, location = 2, scale = 2, left =    0, right =   5)"

## compute mean of the censored distribution
mean(X)
#> [1] 0.000000 1.147911 2.135302
## higher moments (variance, skewness, kurtosis) are not implemented yet

## support interval (minimum and maximum)
support(X)
#>       min max
#> [1,] -Inf Inf
#> [2,]    0 Inf
#> [3,]    0   5

## simulate random variables
random(X, 5)
#>            r_1        r_2        r_3        r_4       r_5
#> [1,] -0.329754 -0.7100405 0.01721632 -0.2439421 0.4039513
#> [2,]  1.880227  1.2620058 1.04606093  1.0363624 2.3830650
#> [3,]  1.840700  0.1924168 1.99666405  0.0000000 1.6668390

## histograms of 1,000 simulated observations
x <- random(X, 1000)
hist(x[1, ], main = "uncensored")

hist(x[2, ], main = "left-censored at zero")

hist(x[3, ], main = "interval-censored in [0, 5]")

## probability density function (PDF) and log-density (or log-likelihood)
x <- c(0, 0, 1)
pdf(X, x)
#> [1] 0.3796067 0.1816087 0.1639593
pdf(X, x, log = TRUE)
#> [1] -0.9686196 -1.7059007 -1.8081373
log_pdf(X, x)
#> [1] -0.9686196 -1.7059007 -1.8081373

## cumulative distribution function (CDF)
cdf(X, x)
#> [1] 0.5000000 0.1816087 0.3191494

## quantiles
quantile(X, 0.5)
#> [1] 0 1 2

## cdf() and quantile() are inverses (except at censoring points)
cdf(X, quantile(X, 0.5))
#> [1] 0.5 0.5 0.5
quantile(X, cdf(X, 1))
#> [1] 1 1 1

## if length(X) = length(x), all methods above can be applied either
## elementwise or for all combinations of X and x;
## a matrix result can be enforced via drop = FALSE
p <- c(0.05, 0.5, 0.95)
quantile(X, p, elementwise = FALSE)
#>         q_0.05 q_0.5   q_0.95
#> [1,] -2.015048     0 2.015048
#> [2,]  0.000000     1 3.015048
#> [3,]  0.000000     2 5.000000
quantile(X, p, elementwise = TRUE)
#> [1] -2.015048  1.000000  5.000000
quantile(X, p, elementwise = TRUE, drop = FALSE)
#>       quantile
#> [1,] -2.015048
#> [2,]  1.000000
#> [3,]  5.000000

## compare theoretical and empirical mean from 1,000 simulated observations
cbind(
  "theoretical" = mean(X),
  "empirical" = rowMeans(random(X, 1000))
)
#>      theoretical  empirical
#> [1,]    0.000000 0.07350449
#> [2,]    1.147911 1.12643481
#> [3,]    2.135302 2.14667243

## evaluate continuous ranked probability score (CRPS) using scoringRules

library("scoringRules")
crps(X, x)
#> [1] 0.2570254 0.5906767 0.6655784
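
As a cross-check of the CRPS values, the score can also be approximated from its integral definition, \(\int (F(z) - 1\{z \ge y\})^2 dz\), where \(F\) is the censored CDF. The sketch below is an assumption-laden illustration in base R: crps_ct_sketch is a hypothetical helper, it uses stats::integrate rather than the closed-form expression in scoringRules, and it assumes that the observation y lies within [left, right]. Outside that interval the integrand is zero, so only the latent CDF between the censoring points is needed.

```r
## hypothetical helper: CRPS of a censored t by numerical integration,
## assuming left <= y <= right (not the scoringRules implementation)
crps_ct_sketch <- function(y, df, location = 0, scale = 1,
                           left = -Inf, right = Inf) {
  ## latent CDF and survival function
  Tcdf <- function(z) pt((z - location) / scale, df = df)
  Tsurv <- function(z) pt((z - location) / scale, df = df, lower.tail = FALSE)

  ## squared CDF between left and y
  below <- if (left < y) {
    integrate(function(z) Tcdf(z)^2, left, y, rel.tol = 1e-8)$value
  } else 0

  ## squared survival function between y and right
  above <- if (y < right) {
    integrate(function(z) Tsurv(z)^2, y, right, rel.tol = 1e-8)$value
  } else 0

  below + above
}

## same three distributions and observations as in the examples
scores <- mapply(
  crps_ct_sketch,
  y        = c(0, 0, 1),
  df       = 5,
  location = c(0, 1, 2),
  scale    = c(1, 1, 2),
  left     = c(-Inf, 0, 0),
  right    = c(Inf, Inf, 5)
)
scores
```

Up to numerical integration error, the result should match the crps(X, x) output shown above.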