proscore.Rd
Generic function and default method for computing various kinds of scores for fitted or predicted probability distributions from (regression) models.
proscore(object, newdata = NULL, ...)
# S3 method for default
proscore(
object,
newdata = NULL,
na.action = na.pass,
type = c("logs", "crps"),
aggregate = TRUE,
drop = FALSE,
...
)
a fitted model object. For the default
method this
needs to have a prodist
and a
newresponse
method.
optionally, a data frame in which to look for variables with which to predict and from which to obtain the response variable. If omitted, the original observations are used.
further parameters passed to the aggregate
function (if any).
function determining what should be done with missing
values in newdata
. The default is to employ NA
.
character specifying the type of score to compute. Avaible types:
"logs"
(or equivalently "log-score"
),
"loglikelihood"
(or equivalently "log_pdf"
),
"CRPS"
(or equivalently "RPS"
), "MAE"
, "MSE"
,
"DSS"
(or equivalently "Dawid-Sebastiani"
).
Upper or lower case spellings can be used interchangably, hyphens or underscores
can be included or omitted. Setting type = NULL
yields all available scores.
logical or function to be used for aggregating scores across
observations. Setting aggregate = TRUE
(the default) corresponds to using mean
.
logical. Should scores be returned in a data frame (default) or (if possible) dropped to a vector?
Either a data.frame
of scores (if drop = FALSE
, default) or
a named numeric vector (if drop = TRUE
and the scores are not a matrix).
The names are the type
specified by the user (i.e., are not canonicalized
by partial matching etc.).
The function proscore
provides a unified framework for scoring
probabilistic forecasts (in-sample or out-of-sample). The following scores
are currently available, using the following notation: \(Y\) is the predicted
random variable with cumulative distribution function \(F(\cdot)\) and probability
density function \(f(\cdot)\). The corresponding expectation and variance are denoted
by \(E(Y)\) and \(V(Y)\). The actual observation is \(y\).
Log-score: Also known as logarithmic score. This is the negative log-likelihood where the negative sign has the effect that smaller values indicate a better fit.
$$ -\log f(y) $$
Log-likelihood: Also known as log-density. Clearly, this is equivalent to the log-score above but using the conventional sign where bigger values indicate a better fit.
$$ \log f(y) $$
Continuous ranked probability score (CRPS):
$$ \int_{-\infty}^{\infty} \left( F(x) - 1(x \ge y) \right)^2 \mbox{d} x $$
where \(1(\cdot)\) denotes the indicator function.
In case of a discrete rather than a continuous distribution, the ranked probability score (RPS) is defined analogously using the sum rather than the integral. In other words it is then the sum of the squared deviations between the predicted cumulative probabilities \(F(x)\) and the ideal step function for the actual observation \(y\).
Mean absolute error (MAE):
$$ \left| y - E(Y) \right| $$
Mean squared error (MSE):
$$ \left( y - E(Y) \right)^2 $$
Dawid-Sebastiani score (DSS):
$$ \frac{\left( y - E(Y) \right)^2}{V(Y)} + \log(V(Y)) $$
Internally, the default proscore
method first computes the fitted/predicted
probability distribution object using prodist
(corresponding to \(Y\) above) and then obtains the corresponding observation
\(y\) using newresponse
. Subsequently, the scores are evaluated
using either the log_pdf
method,
crps
method, or simply the mean
. Finally,
the resulting individual scores per observation can be returned as a full
data frame, or aggregated (e.g., by using mean
, sum
, or summary
, etc.).
## Poisson regression model for FIFA 2018 data:
## number of goals scored by each team in each game, explained by
## predicted ability difference of the competing teams
data("FIFA2018", package = "distributions3")
m <- glm(goals ~ difference, data = FIFA2018, family = poisson)
## default: in-sample mean log-score and CRPS
proscore(m)
#> logs crps
#> 1 1.388258 0.5619936
## element-wise score using a new data set (final of the tournament)
final <- tail(FIFA2018, 2)
proscore(m, newdata = final, aggregate = FALSE)
#> logs crps
#> 127 2.891437 1.7744030
#> 128 1.741564 0.7205361
## replicate in-sample log-likelihood
proscore(m, type = "loglik", aggregate = sum)
#> loglik
#> 1 -177.6971
logLik(m)
#> 'log Lik.' -177.6971 (df=2)
## compute mean of all available scores
proscore(m, type = NULL)
#> logs loglikelihood crps mae mse dss
#> 1 1.388258 -1.388258 0.5619936 0.8320441 1.162032 1.085419
## upper vs. lower case spelling is matched internally but preserved in output
proscore(m, type = c("logs", "crps"))
#> logs crps
#> 1 1.388258 0.5619936
proscore(m, type = c("Log-score", "CRPS"))
#> Log-score CRPS
#> 1 1.388258 0.5619936
## least-squares regression for speed and breaking distance of cars
data("cars", package = "datasets")
m <- lm(dist ~ speed, data = cars)
## replicate in-sample log-likelihood and residual sum of squares
## (aka deviance) by taking the sum (rather than the mean) of the
## log-density and squared errors, respectively
proscore(m, type = c("loglik", "MSE"), aggregate = sum)
#> loglik MSE
#> 1 -206.5784 11353.52
logLik(m)
#> 'log Lik.' -206.5784 (df=3)
deviance(m)
#> [1] 11353.52