- New `PoissonBinomial()` distribution, a generalization of the binomial distribution. The Poisson binomial is characterized by n independent Bernoulli trials but with potentially different success probabilities. The `d`/`p`/`q`/`r` functions employ the efficient implementation from the PoissonBinomial package, if available. If it is not available, fallback computations based on a normal approximation are provided, by default with a warning (#100).
- The `prodist()` methods for various count regression objects now distinguish between computations for the classic pscl package and the newer countreg package (currently on R-Forge, soon to be released to CRAN).
- The `simulate()` method for `distribution` objects is now better aligned with `simulate.lm()` in base R: It now always returns a `data.frame` with a `seed` attribute.
- New `simulate()` default method which leverages `prodist()` and subsequently uses the `simulate()` method for `distribution` objects.
- New `prodist()` method for `distribution` objects which just returns the unmodified `distribution` object itself.
- The `format()` method, and hence the `print()` method, for `distribution` objects has been simplified. For example, now `Normal(mu = 0, sigma = 1)` is used instead of `Normal distribution (mu = 0, sigma = 1)` in order to yield a more compact output, especially for vectors of distributions (#101).
- Added an `as.character()` method which essentially calls `format(..., digits = 15, drop0trailing = TRUE)`. This mimics the behavior and precision of base R for real vectors. Note that this enables using `match()` for distribution objects.
- Added a `duplicated()` method which relies on the corresponding method for the `data.frame` of parameters in a distribution.
- Enabled the inclusion of `distribution` vectors as columns in `tibble` data objects, see `?vec_proxy.distribution` for further details and a practical example.
- Fixed errors in the notation of the cumulative distribution function in the documentation of `HurdlePoisson()` and `HurdleNegativeBinomial()` (by @dkwhu in #94 and #96).
- The `prodist()` method for `glm` objects can now also handle `family` specifications from `MASS::negative.binomial(theta)` with fixed `theta` (reported by Christian Kleiber).
- Replace `ellipsis` dependency by `rlang` as the former will be deprecated/archived (by @olivroy in #105).
- Further small improvements in methods and manual pages.
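
To make the Poisson binomial entry above more concrete, the following base R sketch computes the exact probability mass function by sequential convolution and compares it with a normal approximation. The helper name `dpoisbinom_sketch()` and the continuity correction are assumptions made for illustration only; this is not the implementation used by `PoissonBinomial()` or by the PoissonBinomial package.

```r
# Hypothetical helper (not part of distributions3): exact PMF of the number
# of successes in independent Bernoulli trials with probabilities `p`.
dpoisbinom_sketch <- function(p) {
  pmf <- 1  # with zero trials, P(X = 0) = 1
  for (prob in p) {
    # Add one trial: either the count stays the same or increases by one
    pmf <- c(pmf * (1 - prob), 0) + c(0, pmf * prob)
  }
  pmf  # element k + 1 holds P(X = k) for k = 0, ..., length(p)
}

p <- c(0.5, 0.3, 0.8)
pmf <- dpoisbinom_sketch(p)

# Moments of the Poisson binomial: sum(p) and sum(p * (1 - p))
mu <- sum(p)
sigma <- sqrt(sum(p * (1 - p)))

# Exact CDF versus a normal approximation with continuity correction
# (the correction is an assumption of this sketch)
k <- 0:length(p)
cdf_exact <- cumsum(pmf)
cdf_approx <- pnorm(k + 0.5, mean = mu, sd = sigma)
round(cbind(k, cdf_exact, cdf_approx), 3)

# With equal probabilities the result reduces to the binomial distribution
all.equal(dpoisbinom_sketch(rep(0.3, 5)), dbinom(0:5, size = 5, prob = 0.3))
```

With only three trials the approximation is rough, which is one reason a warning about the fallback is useful.
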
- New generics `is_discrete()` and `is_continuous()` with methods for all distribution objects in the package. The `is_discrete()` methods return `TRUE` for every distribution that is discrete on the entire support and `FALSE` otherwise. Analogously, `is_continuous()` returns `TRUE` for every distribution that is continuous on the entire support and `FALSE` otherwise. Thus, for mixed discrete-continuous distributions both methods should yield `FALSE` (#90).
- New logical argument `elementwise = NULL` in `apply_dpqr()` and hence inherited in `cdf()`, `pdf()`, `log_pdf()`, and `quantile()`. It provides type-safety when applying one of the functions to a vector of distributions `d` and a numeric argument `x` where both `d` and `x` are of length n > 1. By setting `elementwise = TRUE` the function is applied element-by-element, also yielding a vector of length n. By setting `elementwise = FALSE` the function is applied for all combinations yielding an n-by-n matrix. The default `elementwise = NULL` corresponds to `FALSE` if `d` and `x` are of different lengths and `TRUE` if they are of the same length n > 1 (#87).
- Extended support for various count data distributions, now encompassing both the Poisson and negative binomial distributions along with various adjustments for zero counts (hurdle, inflation, and truncation, respectively). More details are provided in the following items (#86).
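
As a minimal sketch of the `elementwise` argument and the new generics described above, assuming the distributions3 package is installed, the two modes can be compared for three normal distributions evaluated at three points. The object names `d`, `x`, `p_elem`, and `p_all` are illustrative only.

```r
library(distributions3)

# Three normal distributions and three evaluation points (same length n = 3)
d <- Normal(mu = c(0, 1, 2), sigma = 1)
x <- c(0, 1, 2)

# Element-by-element: the i-th distribution is evaluated at the i-th point
p_elem <- cdf(d, x, elementwise = TRUE)

# All combinations: every distribution is evaluated at every point
p_all <- cdf(d, x, elementwise = FALSE)

length(p_elem)  # vector of length n
dim(p_all)      # n-by-n matrix

# Discrete versus continuous distributions
is_discrete(Poisson(lambda = 3))
is_continuous(Normal(mu = 0, sigma = 1))
```

Because each point here equals the mean of its distribution, the element-by-element probabilities are all 0.5, and they coincide with the diagonal of the all-combinations matrix.
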
- New `d`/`p`/`q`/`r` functions for `hnbinom`, `zinbinom`, `ztnbinom`, and `ztpois` similar to the corresponding `nbinom` and `pois` functions from base R.
- New `HurdleNegativeBinomial()`, `ZINegativeBinomial()`, `ZTNegativeBinomial()`, and `ZTPoisson()` distribution constructors along with the corresponding S3 methods for the "usual" generics (except `skewness()` and `kurtosis()`).
- New `prodist()` methods for extracting the fitted/predicted probability distributions from models estimated by `hurdle()`, `zeroinfl()`, and `zerotrunc()` from either the `pscl` package or the `countreg` package.
- Added argument `prodist(..., sigma = "ML")` to the `lm` method for extracting the fitted/predicted probability distribution from a linear regression model. In the previous version the `prodist()` method always used the least-squares estimate of the error variance (= residual sum of squares divided by the residual degrees of freedom, n - k), as also reported by the `summary()` method. Now the default is to use the maximum-likelihood estimate instead (residual sum of squares divided by the number of observations, n) which is consistent with the `logLik()` method. The previous behavior can be obtained by specifying `sigma = "OLS"` (#91).
- Similarly to the `lm` method, the `glm` method `prodist(..., dispersion = NULL)` now, by default, uses the `dispersion` estimate that matches the `logLik()` output. This is based on the deviance divided by the number of observations, n. Alternatively, `dispersion = "Chisquared"` uses the estimate employed in the `summary()` method, based on the Chi-squared statistic divided by the residual degrees of freedom, n - k.
- Small improvements in methods for various distribution objects: Added `support()` method for GEV-based distributions (`GEV()`, `GP()`, `Gumbel()`, `Frechet()`). Added a `random()` method for the `Tukey()` distribution (using the inversion method).
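
The difference between the two variance estimates mentioned for the `lm` and `glm` methods can be checked with base R alone. The sketch below uses the built-in `cars` data purely as an illustration; it reproduces the estimators described above by hand and does not call `prodist()` itself.

```r
# Linear model on a built-in data set (illustrative choice)
m <- lm(dist ~ speed, data = cars)
n <- nobs(m)
rss <- sum(residuals(m)^2)

# Least-squares estimate: RSS / (n - k), as reported by summary()
sigma_ols <- sqrt(rss / df.residual(m))
all.equal(sigma_ols, summary(m)$sigma)

# Maximum-likelihood estimate: RSS / n, consistent with logLik()
sigma_ml <- sqrt(rss / n)
ll <- sum(dnorm(cars$dist, mean = fitted(m), sd = sigma_ml, log = TRUE))
all.equal(ll, as.numeric(logLik(m)))

# The analogous quantities for a Gaussian GLM
g <- glm(dist ~ speed, data = cars, family = gaussian)
disp_dev <- deviance(g) / nobs(g)  # deviance / n
disp_chisq <- sum(residuals(g, type = "pearson")^2) / df.residual(g)  # Chi-squared / (n - k)
all.equal(disp_chisq, summary(g)$dispersion)
all.equal(disp_dev, sigma_ml^2)
```

The ML estimate is always the smaller of the two, so predictive distributions based on it are slightly narrower than those based on the OLS estimate.
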
- Vectorized univariate distribution objects by Moritz Lang and Achim Zeileis (#71 and #82). This allows representation of fitted probability distributions from regression models. New helper functions are provided to help setting up such distribution objects in a unified way. In particular, `apply_dpqr()` helps to apply the standard `d`/`p`/`q`/`r` functions available in base R and many packages. The accompanying manual page provides some worked examples and further guidance.
- New vignette (by Achim Zeileis) on using `distributions3` to go from basic probability theory to probabilistic regression models. Illustrated with Poisson GLMs for the number of goals per team in the 2018 FIFA World Cup explained by the teams' ability differences. (#74)
- New generic function `prodist()` to extract fitted (in-sample) or predicted (out-of-sample) probability distributions from model objects like `lm`, `glm`, or `arima`. (#83)
- Extended support for count data distributions (by Achim Zeileis): Alternative parameterization for negative binomial distribution (commonly used in regression models), zero-inflated Poisson, and zero-hurdle Poisson. (#80 and #81)
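
As a brief sketch of how vectorized distribution objects and `prodist()` fit together, assuming the distributions3 package is installed, a Poisson GLM can be turned into a vector of fitted distributions. The `cars` data and the object names are illustrative choices only, and the Poisson model is used for demonstration rather than as a recommended model for these data.

```r
library(distributions3)

# Poisson GLM on a built-in data set (illustrative only)
m <- glm(dist ~ speed, data = cars, family = poisson)

# In-sample: one fitted Poisson distribution per observation
d <- prodist(m)
length(d)
head(mean(d))

# Out-of-sample: predicted distribution for a new observation
nd <- data.frame(speed = 10)
d_new <- prodist(m, newdata = nd)

# Probability of a response of at most 20 at speed = 10
p20 <- cdf(d_new, 20)
p20
```

The means of the fitted distributions coincide with `fitted(m)`, so the distribution vector carries the same point predictions plus the full predictive uncertainty implied by the model.
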
- Added a plotting generic for univariate distributions (@paulnorthrop, PR #56)
- Added support for the Generalised Extreme Value (GEV), Frechet, Gumbel, reversed Weibull and Generalised Pareto (GP) distributions (@paulnorthrop, PR #52)
- Added support for the Erlang distribution (@ellessenne, PR #54)
- Various minor bug fixes
- Rename to `distributions3` for CRAN
- Added a `NEWS.md` file to track changes to the package.
- Initial release