| Title: | Projecting Customer Retention Based on Fader and Hardie Probability Models |
|---|---|
| Description: | Project customer retention based on Beta Geometric, Beta Discrete Weibull and Latent Class Discrete Weibull models. This package is based on Fader and Hardie (2007) <doi:10.1002/dir.20074> and Fader et al. (2018) <doi:10.1016/j.intmar.2018.01.002>. |
| Authors: | Srihari Jaganathan |
| Maintainer: | Srihari Jaganathan <[email protected]> |
| License: | GPL-3 |
| Version: | 0.3 |
| Built: | 2026-05-20 10:16:59 UTC |
| Source: | https://github.com/sriharitn/foretell |
BdW is a beta discrete Weibull model based on the Fader and Hardie probability-based projection methodology. The survivor function for BdW is \(S(t) = B(a, b + t^c) / B(a, b)\), where \(B(\cdot, \cdot)\) is the beta function.
    BdW(surv_value, h, lower = c(0.001, 0.001, 0.001),
        upper = c(10000, 10000, 10000), subjects = 1000)
| Argument | Description |
|---|---|
| surv_value | A numeric vector of historical customer retention percentages. It should start at 100, and all later values should be at least 0 and less than 100. |
| h | Forecasting horizon. |
| lower | Lower limits for the parameters, used in the optimization. |
| upper | Upper limits for the parameters, used in the optimization. |
| subjects | Total number of customers or subjects; default 1000. |
| Component | Description |
|---|---|
| fitted | Fitted values based on historical data. |
| projected | Projected values for the forecasting horizon h. |
| max.likelihood | Maximum likelihood of the beta discrete Weibull model. |
| params | The a and b parameters of the beta distribution and the c parameter, from maximum likelihood estimation. |
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
Fader P, Hardie B, Liu Y, Davin J, Steenburgh T. "How to Project Customer Retention" Revisited: The Role of Duration Dependence. Journal of Interactive Marketing. 2018;43:1-16.
    surv_value <- c(100, 86.9, 74.3, 65.3, 59.3)
    h <- 6
    BdW(surv_value, h)
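As a rough illustration of the BdW survivor function, the sketch below evaluates \(S(t) = B(a, b + t^c) / B(a, b)\) in base R with lbeta(). The helper name bdw_surv and the parameter values are hypothetical and are not part of the package; BdW() itself estimates a, b and c from the data.

```r
# Hypothetical helper (not part of foretell): BdW survivor function
# S(t) = B(a, b + t^c) / B(a, b), evaluated on the log scale for stability.
bdw_surv <- function(t, a, b, c) {
  exp(lbeta(a, b + t^c) - lbeta(a, b))
}

# Illustrative parameter values only (not fitted to any data).
t <- 0:10
s_bdw <- bdw_surv(t, a = 0.7, b = 3.8, c = 1.2)

# With c = 1 the curve coincides with the beta geometric survivor function.
s_bg <- bdw_surv(t, a = 0.7, b = 3.8, c = 1)

round(100 * s_bdw, 1)  # survival in percent, starting at 100
```

Working with lbeta() rather than beta() avoids underflow when b + t^c becomes large.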
Fits the Beta-Discrete-Weibull (BDW) survival model to a monotonically
nonincreasing survival series using unconstrained maximum likelihood
estimation. Internal reparameterization ensures \(a > 0\), \(b > 0\), and
\(c > 0\) (e.g., via \(c = \exp(\theta_c)\)), with optional one-and-done mass
\(d\) and a cure fraction. When \(c = 1\), the BDW reduces to the shifted
Beta-Geometric (sBG).
    bdw1c(surv_value, h, N0 = NULL, one_and_done = FALSE, cure = FALSE,
          starts_m = 0.6, starts_q = 0.1, starts_c = 1, starts_d = 0.05,
          starts_cure = 0.05, compute_se = TRUE,
          input = c("auto", "percent", "count"),
          percent_tol = 1e-08, boundary_tol = 0.001)
| Argument | Description |
|---|---|
| surv_value | Numeric survival series (percent starting at 100, or counts). |
| h | Integer forecast horizon (>= 0). |
| N0 | Cohort size if 'input = "percent"'. |
| one_and_done | Logical; include one-and-done parameter \(d\). |
| cure | Logical; include cure fraction parameter (fraction never churning). |
| starts_m, starts_q | Starting values for \(m\) and \(q\). |
| starts_c | Starting value for \(c\). |
| starts_d, starts_cure | Starting values for \(d\) and the cure fraction. |
| compute_se | Logical; compute robust SEs via Hessian + delta method. |
| input | One of '"auto"', '"percent"', '"count"'. If '"auto"', infer from 'surv_value' and 'N0'. |
| percent_tol, boundary_tol | Tolerances for monotonicity checks and boundary SE skipping. |
The baseline BDW survival (no one-and-done, no cure) at discrete times \(t = 0, 1, 2, \dots\) is
\[ S_0(t) = \frac{B(a, b + t^c)}{B(a, b)}. \]
With one-and-done mass \(d\) and a cure fraction, survival for \(t \ge 1\) is adjusted from this baseline.
Parameters \((a, b, c)\) (and optionally \(d\) and the cure fraction) are estimated by maximizing the
likelihood implied by the observed survival series using an internal unconstrained
parameterization. If compute_se = TRUE, standard errors on the original scale
are obtained via the delta method.
Beta reparameterization by mean and polarization index:
The model uses a numerically stable and interpretable reparameterization of
the Beta distribution in terms of a **mean** \(m\)
and a **polarization (concentration) index** \(q\).
Standard Beta density:
\[ f(p \mid a, b) = \frac{p^{a-1} (1 - p)^{b-1}}{B(a, b)}, \quad 0 < p < 1. \]
Mapping to mean and polarization:
\[ m = \frac{a}{a + b}, \qquad q = \frac{1}{a + b + 1}. \]
Here, \(m\) is the expected probability, and \(q\) summarizes concentration:
small \(q\) means large \(a + b\) (high concentration); large \(q\)
means small \(a + b\) (diffuse).
Inverse mapping (used internally for estimation):
\[ a = m \left( \frac{1}{q} - 1 \right), \qquad b = (1 - m) \left( \frac{1}{q} - 1 \right). \]
This transformation is bijective for \(0 < m < 1\), \(0 < q < 1\),
and guarantees \(a > 0\), \(b > 0\).
Starting values:
The arguments starts_m and starts_q provide starting values for
\(m\) and \(q\), respectively. They are converted to \((a, b)\) via the
inverse mapping above.
This parameterization typically improves optimization stability and makes starting values more interpretable.
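The conversion between the two parameterizations can be sketched in a few lines of base R. The helper names ab_to_mq and mq_to_ab are hypothetical (they are not exported by the package), and the sketch assumes the mapping \(m = a/(a+b)\), \(q = 1/(a+b+1)\).

```r
# Hypothetical helpers (not exported by foretell): convert between the Beta
# shape parameters (a, b) and the mean / polarization pair (m, q), assuming
# m = a / (a + b) and q = 1 / (a + b + 1).
ab_to_mq <- function(a, b) {
  c(m = a / (a + b), q = 1 / (a + b + 1))
}

mq_to_ab <- function(m, q) {
  s <- 1 / q - 1            # s = a + b, positive whenever 0 < q < 1
  c(a = m * s, b = (1 - m) * s)
}

# Default starting values from the usage: starts_m = 0.6, starts_q = 0.1.
ab0 <- mq_to_ab(m = 0.6, q = 0.1)
ab0

# Round trip back to (m, q).
mq0 <- ab_to_mq(ab0[["a"]], ab0[["b"]])
mq0
```

Because both m and q live in (0, 1), each can be optimized on a logit scale without any box constraints.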
An object of class bdw_fit with the elements shown below:
| Element | Description |
|---|---|
| y | Numeric vector of observed survival values on the input scale. |
| input | Character scalar, one of "input", "percent", "count", or "prob"; echoes how y was interpreted. |
| N0 | Scalar. Reference cohort size used for count/percent scaling. |
| flags | Named logical vector with model switches, e.g. c(one_and_done = TRUE/FALSE, cure = TRUE/FALSE). |
| coef_params | Named numeric vector of parameters on the natural scale. For sBG: c(a,b) and optionally d (one-and-done) and cure. For BDW: c(a,b,c) and optionally d, cure. |
| coef_repar | Named numeric vector with the mean-polarization reparameterization, typically c(m,q) (and c for BDW if modeled on the log/exp scale). |
| logits | Named numeric vector of unconstrained optimization variables (e.g., theta_m, theta_q, theta_d, theta_cure, theta_c). |
| fitted | Numeric vector on the probability scale; the in-sample fit. |
| projected | Numeric vector (fit + horizon) on the probability scale. |
| logLik | Maximized log-likelihood. |
| convergence | Optimizer return code (0 indicates successful convergence). |
| optim_out | Raw optimizer list (as returned by optim()); useful for debugging. |
| vcov_theta | Variance-covariance matrix on the unconstrained scale (logits). |
| vcov_params | Variance-covariance matrix for coef_params (delta-method mapped from vcov_theta). |
| vcov_repar | Variance-covariance matrix for coef_repar (e.g., m,q). |
| se_params | Named vector of standard errors for coef_params. |
| se_repar | Named vector of standard errors for coef_repar. |
| se_note | Character note if SEs are approximate/unstable (e.g., near-PD fix). |
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
Fader P, Hardie B, Liu Y, Davin J, Steenburgh T. "How to Project Customer Retention" Revisited: The Role of Duration Dependence. Journal of Interactive Marketing. 2018;43:1-16.
    ## Not run:
    N0 <- 500
    S <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1)
    fit <- bdw1c(S, h = 6, N0 = N0, input = "percent")
    summary(fit)
    plot(fit, scale = "percent")
    ## End(Not run)
Provides functions for the probability mass function (PMF), cumulative distribution function (CDF), quantile function, and random variate generation for the BdW distribution.
    dbdw(x, shape1, shape2, shape3, log = FALSE)
    pbdw(x, shape1, shape2, shape3, lower.tail = TRUE, log.p = FALSE)
    qbdw(p, shape1, shape2, shape3, lower.tail = TRUE, log.p = FALSE)
    rbdw(n, shape1, shape2, shape3)
| Argument | Description |
|---|---|
| x | Vector of non-negative integers for dbdw and pbdw. |
| p | Vector of probabilities (0 <= p <= 1) for qbdw. |
| n | Number of random variates to generate for rbdw. |
| shape1 | First shape parameter "a" (must be > 0). |
| shape2 | Second shape parameter "b" (must be > 0). |
| shape3 | Third shape parameter "c" (must be > 0). |
| log | Logical; if TRUE, probabilities are returned on the log scale (for dbdw). |
| lower.tail | Logical; if TRUE (default), probabilities are P(X <= x), otherwise P(X > x) (for pbdw and qbdw). |
| log.p | Logical; if TRUE, probabilities are on the log scale (for pbdw and qbdw). |
dbdw: A numeric vector of PMF values.
pbdw: A numeric vector of CDF values.
qbdw: A numeric vector of quantile values.
rbdw: A numeric vector of random variates.
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
Fader P, Hardie B, Liu Y, Davin J, Steenburgh T. "How to Project Customer Retention" Revisited: The Role of Duration Dependence. Journal of Interactive Marketing. 2018;43:1-16.
    # PMF example
    dbdw(1:5, shape1 = 2, shape2 = 3, shape3 = 0.5)
    # CDF example
    pbdw(1:5, shape1 = 2, shape2 = 3, shape3 = 0.5)
    # Quantile example
    qbdw(c(0.1, 0.5, 0.9), shape1 = 2, shape2 = 3, shape3 = 0.5)
    # Random variates
    rbdw(10, shape1 = 2, shape2 = 3, shape3 = 0.5)
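For readers curious how a quantile function and a random generator can be derived from a discrete CDF, the sketch below uses plain inversion in base R. It is not the package's implementation: the helper names are hypothetical, and it assumes lifetimes take the values 1, 2, 3, ... with \(F(x) = 1 - B(a, b + x^c)/B(a, b)\).

```r
# Hypothetical base R sketch (not the package's implementation).
# Assumes lifetimes take values 1, 2, 3, ... with CDF F(x) = 1 - S(x).
bdw_cdf <- function(x, a, b, c) {
  1 - exp(lbeta(a, b + x^c) - lbeta(a, b))
}

# Quantile by inversion: the smallest integer x with F(x) >= p.
# x_max caps the search because the distribution can be heavy tailed.
bdw_quantile <- function(p, a, b, c, x_max = 1e5) {
  vapply(p, function(p_i) {
    x <- 1
    while (bdw_cdf(x, a, b, c) < p_i && x < x_max) x <- x + 1
    x
  }, numeric(1))
}

# Random variates by inverse transform sampling.
bdw_random <- function(n, a, b, c) {
  bdw_quantile(runif(n), a, b, c)
}

probs <- c(0.1, 0.5, 0.9)
q_vals <- bdw_quantile(probs, a = 2, b = 3, c = 0.5)

set.seed(1)
r_vals <- bdw_random(10, a = 2, b = 3, c = 0.5)
```

A linear search is adequate for illustration; a bisection over x would scale better for probabilities very close to 1.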
BG is a beta geometric model based on the Fader and Hardie probability-based projection methodology. The survivor function for BG is \(S(t) = B(a, b + t) / B(a, b)\), where \(B(\cdot, \cdot)\) is the beta function.
    BG(surv_value, h, lower = c(0.001, 0.001), subjects = 1000)
| Argument | Description |
|---|---|
| surv_value | A numeric vector of historical customer retention percentages. It should start at 100, and all later values should be at least 0 and less than 100. |
| h | Forecasting horizon. |
| lower | Lower limits for the parameters, used in the optimization. |
| subjects | Total number of customers or subjects; default 1000. |
| Component | Description |
|---|---|
| fitted | Fitted values based on historical data. |
| projected | Projected values for the forecasting horizon h. |
| max.likelihood | Maximum likelihood of the beta geometric model. |
| params | The a and b parameters of the beta distribution, from maximum likelihood estimation. |
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
    surv_value <- c(100, 86.9, 74.3, 65.3, 59.3)
    h <- 6
    BG(surv_value, h)
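Once a and b are known, a beta geometric projection can be built from period-by-period retention rates, as described by Fader and Hardie (2007). The sketch below is a minimal base R version; the function name bg_project and the parameter values are hypothetical, and BG() remains the way to estimate a and b.

```r
# Hypothetical sketch (not the package's code): beta geometric survival
# built from the period-by-period retention rates
#   r_t = (b + t - 1) / (a + b + t - 1),   S(t) = r_1 * r_2 * ... * r_t.
bg_project <- function(a, b, n_periods) {
  t <- seq_len(n_periods)
  retention <- (b + t - 1) / (a + b + t - 1)
  c(1, cumprod(retention))          # S(0) = 1 followed by S(1), ..., S(n)
}

# Illustrative parameters only; BG() estimates a and b from the data.
surv <- bg_project(a = 0.7, b = 3.8, n_periods = 10)
round(100 * surv, 1)

# The product form agrees with the closed form B(a, b + t) / B(a, b).
closed_form <- exp(lbeta(0.7, 3.8 + 0:10) - lbeta(0.7, 3.8))
```

The retention rates rise with t, which is how the model reproduces the flattening of observed retention curves.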
A dataset containing customer retention.
    data(customer_retention)
A data frame with 13 observations and 3 variables:
Time in years
% of regular customers surviving
% of high_end customers surviving
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
exltrend generates Microsoft(R) Excel(R)-style linear, logarithmic, exponential, polynomial (order 2) and power trends.
    exltrend(surv_value, h)
| Argument | Description |
|---|---|
| surv_value | A numeric vector of historical customer retention percentages. It should start at 100, and all later values should be at least 0 and less than 100. |
| h | Forecasting horizon. |
| Component | Description |
|---|---|
| fitted | A data frame of fitted values based on historical data for linear (lin.p), exponential (exp.p), logarithmic (log.p), polynomial of order 2 (poly.p) and power (pow.p) trends. |
| projected | A data frame of projected values for the forecasting horizon h. |
    surv_value <- c(100, 86.9, 74.3, 65.3, 59.3)
    h <- 6
    exltrend(surv_value, h)
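The five trend types can be approximated with ordinary least squares in base R, which may help when checking what such trendlines do. This is a hypothetical sketch, not the code behind exltrend(): the function name trend_sketch and its column names are made up, and it assumes the time index runs 1, 2, ..., n so that log(x) is defined.

```r
# Hypothetical base R sketch of Excel-style trendlines (not exltrend() itself).
# Assumes the time index runs 1, 2, ..., n so that log(x) is defined.
trend_sketch <- function(y, h) {
  n <- length(y)
  d <- data.frame(x = seq_len(n), y = y)
  new_x <- data.frame(x = seq_len(n + h))

  fits <- list(
    lin  = lm(y ~ x, data = d),                 # y = b0 + b1 * x
    log  = lm(y ~ log(x), data = d),            # y = b0 + b1 * ln(x)
    poly = lm(y ~ x + I(x^2), data = d),        # y = b0 + b1 * x + b2 * x^2
    exp  = lm(log(y) ~ x, data = d),            # ln(y) = b0 + b1 * x
    pow  = lm(log(y) ~ log(x), data = d)        # ln(y) = b0 + b1 * ln(x)
  )

  pred <- sapply(fits, predict, newdata = new_x)
  # Back-transform the two models fitted on the log scale.
  pred[, c("exp", "pow")] <- exp(pred[, c("exp", "pow")])

  list(fitted = as.data.frame(pred[seq_len(n), , drop = FALSE]),
       projected = as.data.frame(pred[n + seq_len(h), , drop = FALSE]))
}

surv_value <- c(100, 86.9, 74.3, 65.3, 59.3)
out <- trend_sketch(surv_value, h = 6)
```

Curve-fitting trends of this kind have no constraint keeping projections between 0 and 100, which is one motivation for the probability models in this package.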
Fit a geometric survival model (optionally with one-and-done and cure), using logit parameterization, robust SEs from the Hessian, and S3 methods for printing, summarizing, predicting, plotting.
    geom1c(surv_value, h, N0 = NULL, one_and_done = FALSE, cure = FALSE,
           starts_p = 0.15, starts_d = 0.05, starts_cure = 0.05,
           compute_se = TRUE, input = c("auto", "percent", "count"),
           percent_tol = 1e-08, boundary_tol = 0.001)
| Argument | Description |
|---|---|
| surv_value | Numeric survival series (percent starting at 100, or counts). |
| h | Integer forecast horizon (>= 0). |
| N0 | Cohort size if 'input = "percent"'. |
| one_and_done | Logical; include one-and-done parameter \(d\). |
| cure | Logical; include cure fraction parameter. |
| starts_p | Starting value for \(p\). |
| starts_d, starts_cure | Starting values for \(d\) and the cure fraction. |
| compute_se | Logical; compute robust SEs via Hessian + delta method. |
| input | One of '"auto"', '"percent"', '"count"'. |
| percent_tol, boundary_tol | Tolerances for monotonicity checks and boundary SE skipping. |
Let \(p\) be the per-period churn probability in a geometric model.
The baseline geometric survival (no one-and-done, no cure) at discrete times
\(t = 0, 1, 2, \dots\) is
\[ S_0(t) = (1 - p)^t. \]
With one-and-done mass \(d\) and a cure fraction, survival for \(t \ge 1\) is adjusted from this baseline.
Parameters are estimated by maximizing the likelihood implied by the observed
survival series using an internal unconstrained parameterization for numerical
stability (e.g., a logit transform for \(p\)). If compute_se = TRUE,
standard errors on the original scale are obtained via the delta method from
the Hessian on the working scale.
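The sketch below shows the baseline geometric curve and one plausible way a one-and-done mass and a cure fraction could enter it. The adjusted formula is an assumption made for illustration only and may differ from what geom1c() uses internally; the function names are hypothetical.

```r
# Baseline geometric survival: S0(t) = (1 - p)^t for t = 0, 1, 2, ...
geom_surv <- function(t, p) (1 - p)^t

# ASSUMPTION (not taken from the package): a fraction `cure` never churns,
# a fraction `d` of the remaining customers is lost in the first period,
# and everyone else follows the baseline curve.
geom_surv_adj <- function(t, p, d = 0, cure = 0) {
  s <- cure + (1 - cure) * (1 - d) * geom_surv(t, p)
  s[t == 0] <- 1                      # the whole cohort is alive at t = 0
  s
}

t <- 0:10
s_base <- geom_surv(t, p = 0.15)
s_adj  <- geom_surv_adj(t, p = 0.15, d = 0.05, cure = 0.05)

# A logit working scale keeps p inside (0, 1) during optimization.
theta_p <- qlogis(0.15)
p_back  <- plogis(theta_p)
```

Under this assumed form the adjusted curve levels off at the cure fraction instead of decaying to zero.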
An object of class geom_fit with the elements shown below:
| Element | Description |
|---|---|
| y | Numeric vector of observed survival values on the input scale. |
| input | Character scalar, one of "input", "percent", "count", or "prob"; echoes how y was interpreted. |
| N0 | Scalar. Reference cohort size used for count/percent scaling. |
| flags | Named logical vector with model switches, e.g. c(one_and_done = TRUE/FALSE, cure = TRUE/FALSE). |
| coef_params | Named numeric vector of parameters on the natural scale. For Geom: p and optionally d (one-and-done) and cure. |
| fitted | Numeric vector on the probability scale; the in-sample fit. |
| projected | Numeric vector (fit + horizon) on the probability scale. |
| logLik | Maximized log-likelihood. |
| convergence | Optimizer return code (0 indicates successful convergence). |
| optim_out | Raw optimizer list (as returned by optim()); useful for debugging. |
| vcov_theta | Variance-covariance matrix on the unconstrained scale (logits). |
| vcov_params | Variance-covariance matrix for coef_params (delta-method mapped from vcov_theta). |
| vcov_repar | Variance-covariance matrix for coef_repar (e.g., p). |
| se_params | Named vector of standard errors for coef_params. |
| se_repar | Named vector of standard errors for coef_repar. |
| se_note | Character note if SEs are approximate/unstable (e.g., near-PD fix). |
    ## Not run:
    N0 <- 500
    S <- c(100, 94, 90, 87, 85, 84, 83, 82, 81, 80, 79)
    fit <- geom1c(S, h = 6, N0 = N0, input = "percent")
    summary(fit)
    plot(fit, scale = "percent")
    ## End(Not run)
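LCW is a latent class Weibull model based on the Fader and Hardie probability-based projection methodology. The survivor function for LCW is \(S(t) = w\,S(t \mid t_1, c_1) + (1 - w)\,S(t \mid t_2, c_2)\), a weighted mixture of two discrete Weibull survivor functions.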
    LCW(surv_value, h,
        lower = c(0.001, 0.001, 0.001, 0.001, 0.001),
        upper = c(0.99999, 10000, 0.999999, 10000, 0.99999),
        subjects = 1000)
| Argument | Description |
|---|---|
| surv_value | A numeric vector of historical customer retention percentages. It should start at 100, and all later values should be at least 0 and less than 100. |
| h | Forecasting horizon. |
| lower | Lower limits for the parameters, used in the optimization. |
| upper | Upper limits for the parameters, used in the optimization. |
| subjects | Total number of customers or subjects; default 1000. |
| Component | Description |
|---|---|
| fitted | Fitted values based on historical data. |
| projected | Projected values for the forecasting horizon h. |
| max.likelihood | Maximum likelihood of the LCW model. |
| params | The t1, c1, t2, c2 and w parameters from maximum likelihood estimation. |
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
Fader P, Hardie B, Liu Y, Davin J, Steenburgh T. "How to Project Customer Retention" Revisited: The Role of Duration Dependence. Journal of Interactive Marketing. 2018;43:1-16.
    surv_value <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1, 46.8, 44.5, 42.7, 40.9, 39.4)
    h <- 6
    LCW(surv_value, h)
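A two-class latent class model is a weighted mixture of two component survival curves. The base R sketch below illustrates the idea under a stated assumption: each class follows a discrete Weibull survivor function of the form \((1 - \theta)^{t^c}\), as in Fader et al. (2018). Whether LCW() uses exactly this parameterization for t1 and t2 is not confirmed here, and the helper names and values are hypothetical.

```r
# Hypothetical sketch (not the package's code). ASSUMPTION: each latent class
# has a discrete Weibull survivor function (1 - theta)^(t^c) with
# 0 < theta < 1, and w is the share of customers in the first class.
dw_surv <- function(t, theta, c) (1 - theta)^(t^c)

lcw_surv <- function(t, t1, c1, t2, c2, w) {
  w * dw_surv(t, t1, c1) + (1 - w) * dw_surv(t, t2, c2)
}

# Illustrative values only: a fast-churning class and a slow-churning class.
t <- 0:12
s_fast <- dw_surv(t, theta = 0.30, c = 0.8)
s_slow <- dw_surv(t, theta = 0.05, c = 1.1)
s_mix  <- lcw_surv(t, t1 = 0.30, c1 = 0.8, t2 = 0.05, c2 = 1.1, w = 0.4)

round(100 * s_mix, 1)
```

The mixed curve always lies between the two class curves, with w controlling how close it sits to the first class.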
A dataset containing drug persistency of patients in different therapeutic classes.
    data(persistency_data)
A data frame with 334 observations and 3 variables:
Type of therapy. Unique values include: "Hypertension", "Occular Hypertension",
"Statin", "Insulin", "Epilepsy", "RA", "Osteoporosis", "Alzheimer", "ADHD",
"Atrial Fibrillation". See references below.
Data was extracted using https://automeris.io/WebPlotDigitizer/
and discretized using akima package.
Time Period
% Patients retained
A data frame with 334 rows and 3 variables
Hypertension: Solomon M, Goldman D, Joyce G, Escarce J. Cost Sharing and the Initiation of Drug Therapy for the Chronically Ill. Archives of Internal Medicine. 2009;169(8):740-748.
Occular Hypertension: Campbell J, Schwartz G, LaBounty B, Kowalski J, Patel. Patient adherence and persistence with topical ocular hypotensive therapy in real-world practice: a comparison of bimatoprost 0.01% and travoprost Z 0.004% ophthalmic solutions. Clinical Ophthalmology. 2014;8:927-935.
Statin: Kiss Z, Nagy L, Reiber I, Paragh G, Molnar M, Rokszin G et al. Persistence with statin therapy in Hungary. Archives of Medical Science. 2013;9(3):409-417.
Insulin: Roussel R, Charbonnel B, Behar M, Gourmelen J, Emery C, Detournay B. Persistence with Insulin Therapy in Patients with Type 2 Diabetes in France: An Insurance Claims Study. Diabetes Therapy. 2016;7(3):537-549.
Epilepsy: Lai E, Hsieh C, Su C, Yang Y, Huang C, Lin S et al. Comparative persistence of antiepileptic drugs in patients with epilepsy: A STROBE-compliant retrospective cohort study. Medicine. 2016;95(35):e4481.
RA: Neovius M, Arkema E, Olsson H, Eriksson J, Kristensen L, Simard J et al. Drug survival on TNF inhibitors in patients with rheumatoid arthritis comparison of adalimumab, etanercept and infliximab. Annals of the Rheumatic Diseases. 2013;74(2):354-360.
Osteoporosis: Kishimoto H, Maehara M. Compliance and persistence with daily, weekly, and monthly bisphosphonates for osteoporosis in Japan: analysis of data from the CISA. Archives of Osteoporosis. 2015;10(27):1-6.
Alzheimer: Suh D, Thomas S, Valiyeva E, Arcona S, Vo L. Drug persistency of two cholinesterase inhibitors: rivastigmine versus donepezil in elderly patients with Alzheimer's disease. Drugs & Aging. 2005;22(8):695-707.
ADHD: Beau-Lejdstrom R, Douglas I, Evans S, Smeeth L. Latest trends in ADHD drug prescribing patterns in children in the UK: prevalence, incidence and persistence. BMJ Open. 2016;6(6):1-8.
Atrial Fibrillation: Gomes T, Mamdani M, Holbrook A, Paterson J, Juurlink D. Persistence With Therapy Among Patients Treated With Warfarin for Atrial Fibrillation. Archives of Internal Medicine. 2012;172(21):1687-1689.
Fits the shifted Beta-Geometric (sBG) survival model to a monotonically nonincreasing survival series using unconstrained optimization (internal reparameterization ensures \(a>0, b>0\); optional one-and-done mass \(d\) and cure fraction).
    sbg1c(surv_value, h, N0 = NULL, one_and_done = FALSE, cure = FALSE,
          starts_m = 0.6, starts_q = 0.1, starts_d = 0.05, starts_cure = 0.05,
          compute_se = TRUE, input = c("auto", "percent", "count"),
          percent_tol = 1e-08, boundary_tol = 0.001)
| Argument | Description |
|---|---|
| surv_value | Numeric survival series (percent starting at 100, or counts). |
| h | Integer forecast horizon (>= 0). |
| N0 | Cohort size if 'input = "percent"'. |
| one_and_done | Logical; include one-and-done parameter \(d\). |
| cure | Logical; include cure fraction parameter (fraction never churning). |
| starts_m, starts_q | Starting values for \(m\) and \(q\). |
| starts_d, starts_cure | Starting values for \(d\) and the cure fraction. |
| compute_se | Logical; compute robust SEs via Hessian + delta method. |
| input | One of '"auto"', '"percent"', '"count"'. If '"auto"', infer from 'surv_value' and 'N0'. |
| percent_tol, boundary_tol | Tolerances for monotonicity checks and boundary SE skipping. |
The baseline sBG survival (no one-and-done, no cure) at discrete times \(t = 0, 1, 2, \dots\) is
\[ S_0(t) = \frac{B(a, b + t)}{B(a, b)}. \]
With one-and-done mass \(d\) and a cure fraction, survival for \(t \ge 1\) is adjusted from this baseline.
Parameters \((a, b)\) (and optionally \(d\) and the cure fraction) are estimated by maximizing the
likelihood implied by the observed survival series using an internal unconstrained
parameterization. If compute_se = TRUE, standard errors on the original scale
are obtained via the delta method.
Beta reparameterization by mean and polarization index:
The model uses a numerically stable and interpretable reparameterization of
the Beta distribution in terms of a **mean** \(m\)
and a **polarization (concentration) index** \(q\).
Standard Beta density:
\[ f(p \mid a, b) = \frac{p^{a-1} (1 - p)^{b-1}}{B(a, b)}, \quad 0 < p < 1. \]
Mapping to mean and polarization:
\[ m = \frac{a}{a + b}, \qquad q = \frac{1}{a + b + 1}. \]
Here, \(m\) is the expected probability, and \(q\) summarizes concentration:
small \(q\) means large \(a + b\) (high concentration); large \(q\)
means small \(a + b\) (diffuse).
Inverse mapping (used internally for estimation):
\[ a = m \left( \frac{1}{q} - 1 \right), \qquad b = (1 - m) \left( \frac{1}{q} - 1 \right). \]
This transformation is bijective for \(0 < m < 1\), \(0 < q < 1\),
and guarantees \(a > 0\), \(b > 0\).
Starting values:
The arguments starts_m and starts_q provide starting values for
\(m\) and \(q\), respectively. They are converted to \((a, b)\) via the
inverse mapping above.
This parameterization typically improves optimization stability and makes starting values more interpretable.
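To make the idea of unconstrained estimation concrete, the sketch below fits the baseline sBG model by maximum likelihood with optim(), optimizing over the logits of the mean and polarization index. It is a simplified, hypothetical illustration rather than the code inside sbg1c(): it assumes \(m = a/(a+b)\) and \(q = 1/(a+b+1)\), a cohort of 1000 customers, no one-and-done or cure component, and the cohort likelihood described by Fader and Hardie (2007).

```r
# Hypothetical sketch (not sbg1c() itself): maximum likelihood for the sBG
# model on an unconstrained scale, theta = (logit(m), logit(q)).
S_obs <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1) / 100
N0 <- 1000                                   # assumed cohort size
n_periods <- length(S_obs) - 1

sbg_surv <- function(t, a, b) exp(lbeta(a, b + t) - lbeta(a, b))

theta_to_ab <- function(theta) {
  m <- plogis(theta[1])                      # mean, in (0, 1)
  q <- plogis(theta[2])                      # polarization, in (0, 1)
  s <- 1 / q - 1                             # a + b
  c(a = m * s, b = (1 - m) * s)
}

neg_loglik <- function(theta) {
  ab <- theta_to_ab(theta)
  S_fit <- sbg_surv(0:n_periods, ab[["a"]], ab[["b"]])
  lost <- -diff(N0 * S_obs)                  # customers lost in each period
  kept <- N0 * S_obs[n_periods + 1]          # customers still active at the end
  -(sum(lost * log(-diff(S_fit))) + kept * log(S_fit[n_periods + 1]))
}

# Starting values m = 0.6, q = 0.1 mirror the defaults starts_m and starts_q.
fit <- optim(qlogis(c(0.6, 0.1)), neg_loglik)
ab_hat <- theta_to_ab(fit$par)
S_hat <- sbg_surv(0:n_periods, ab_hat[["a"]], ab_hat[["b"]])
```

Because plogis() maps any real number into (0, 1), the optimizer never needs bounds, and the back-transformed a and b are positive by construction.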
An object of class sbg_fit with the elements shown below:
| Element | Description |
|---|---|
| y | Numeric vector of observed survival values on the input scale. |
| input | Character scalar, one of "input", "percent", "count", or "prob"; echoes how y was interpreted. |
| N0 | Scalar. Reference cohort size used for count/percent scaling. |
| flags | Named logical vector with model switches, e.g. c(one_and_done = TRUE/FALSE, cure = TRUE/FALSE). |
| coef_params | Named numeric vector of parameters on the natural scale. For sBG: c(a,b) and optionally d (one-and-done) and cure. For BDW: c(a,b,c) and optionally d, cure. |
| coef_repar | Named numeric vector with the mean-polarization reparameterization, typically c(m,q) (and c for BDW if modeled on the log/exp scale). |
| logits | Named numeric vector of unconstrained optimization variables (e.g., theta_m, theta_q, theta_d, theta_cure, theta_c). |
| fitted | Numeric vector on the probability scale; the in-sample fit. |
| projected | Numeric vector (fit + horizon) on the probability scale. |
| logLik | Maximized log-likelihood. |
| convergence | Optimizer return code (0 indicates successful convergence). |
| optim_out | Raw optimizer list (as returned by optim()); useful for debugging. |
| vcov_theta | Variance-covariance matrix on the unconstrained scale (logits). |
| vcov_params | Variance-covariance matrix for coef_params (delta-method mapped from vcov_theta). |
| vcov_repar | Variance-covariance matrix for coef_repar (e.g., m,q). |
| se_params | Named vector of standard errors for coef_params. |
| se_repar | Named vector of standard errors for coef_repar. |
| se_note | Character note if SEs are approximate/unstable (e.g., near-PD fix). |
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
    ## Not run:
    N0 <- 500
    S <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1)
    fit <- sbg1c(S, h = 6, N0 = N0, input = "percent")
    summary(fit)
    plot(fit, scale = "percent")
    ## End(Not run)
Provides functions for the probability mass function (PMF), cumulative distribution function (CDF), quantile function, and random variate generation for the SBG distribution.
    dsbg(x, shape1, shape2, log = FALSE)
    psbg(x, shape1, shape2, lower.tail = TRUE, log.p = FALSE)
    qsbg(p, shape1, shape2, lower.tail = TRUE, log.p = FALSE)
    rsbg(n, shape1, shape2)
| Argument | Description |
|---|---|
| x | Vector of non-negative integers for dsbg and psbg. |
| p | Vector of probabilities (0 <= p <= 1) for qsbg. |
| n | Number of random variates to generate for rsbg. |
| shape1 | First shape parameter "a" (must be > 0). |
| shape2 | Second shape parameter "b" (must be > 0). |
| log | Logical; if TRUE, probabilities are returned on the log scale (for dsbg). |
| lower.tail | Logical; if TRUE (default), probabilities are P(X <= x), otherwise P(X > x) (for psbg and qsbg). |
| log.p | Logical; if TRUE, probabilities are on the log scale (for psbg and qsbg). |
The Shifted Beta Geometric distribution with two shape parameters shape1 (\(a\)) and shape2 (\(b\)) has the following CDF:
\[ F(x) = 1 - \frac{B(a, b + x)}{B(a, b)} \]
for \(x = 0, 1, 2, \dots\), \(a > 0\) and \(b > 0\).
The Shifted Beta Geometric (sBG) distribution is a probability mixture model of Beta and Geometric distributions. Introduced by Fader and Hardie, sBG models customer retention by assuming that individuals have heterogeneous dropout probabilities. These probabilities are drawn from a Beta distribution, and each customer's retention follows a geometric process. This combination captures variability in churn behavior across a population, making it well suited for analyzing survival, customer lifetime and retention data.
dsbg: A numeric vector of PMF values.
psbg: A numeric vector of CDF values.
qsbg: A numeric vector of quantile values.
rsbg: A numeric vector of random variates.
Fader P, Hardie B. How to project customer retention. Journal of Interactive Marketing. 2007;21(1):76-90.
Fader P, Hardie B, Liu Y, Davin J, Steenburgh T. "How to Project Customer Retention" Revisited: The Role of Duration Dependence. Journal of Interactive Marketing. 2018;43:1-16.
    # PMF example
    dsbg(1:5, shape1 = 2, shape2 = 3)
    # CDF example
    psbg(1:5, shape1 = 2, shape2 = 3)
    # Quantile example
    qsbg(c(0.1, 0.5, 0.9), shape1 = 2, shape2 = 3)
    # Random variates
    rsbg(10, shape1 = 2, shape2 = 3)
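The sBG probabilities can also be computed without any special functions, using the forward recursion given by Fader and Hardie (2007). The base R sketch below is a hypothetical illustration (the function name sbg_pmf is made up) and is not claimed to match how dsbg() is implemented.

```r
# Hypothetical sketch (not the package's code): sBG probabilities by the
# forward recursion of Fader and Hardie (2007),
#   P(T = 1) = a / (a + b),
#   P(T = t) = (b + t - 2) / (a + b + t - 1) * P(T = t - 1),  t = 2, 3, ...
sbg_pmf <- function(t_max, a, b) {
  p <- numeric(t_max)
  p[1] <- a / (a + b)
  for (t in seq_len(t_max)[-1]) {
    p[t] <- (b + t - 2) / (a + b + t - 1) * p[t - 1]
  }
  p
}

pmf <- sbg_pmf(5, a = 2, b = 3)
cdf <- cumsum(pmf)                            # P(T <= t)
surv <- 1 - cdf                               # P(T > t)

# Closed-form check: S(t) = B(a, b + t) / B(a, b).
surv_closed <- exp(lbeta(2, 3 + 1:5) - lbeta(2, 3))
```

The recursion is convenient in spreadsheets and in code alike, since each probability needs only the previous one.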
Computes uncertainty bands for a fitted bdw_fit object, returning
up to three band types:
fundamental - process (sampling) variability at fixed parameters
estimation - parameter uncertainty mapped to the mean survival curve
both - predictive bands (parameter draw + one process draw)
    sim.bdw1c(object, B1 = 1000, B2 = 0, level = 0.95,
              scale = c("percent", "count", "prob"),
              seed = NULL, verbose = FALSE)
| Argument | Description |
|---|---|
| object | A fitted bdw_fit object. |
| B1 | Integer; number of parameter draws for estimation and both bands. |
| B2 | Integer; averages process noise in the estimation bands (set B2 = 0 to skip). |
| level | Confidence level for bands (default 0.95). |
| scale | Output scale: "percent", "count", or "prob". |
| seed | Optional integer seed for reproducibility. |
| verbose | Logical; print minimal progress messages. |
Set B2 = 0 to skip process simulation inside the estimation bands (fast,
pure estimation-only); set B2 > 0 to average process noise in the estimation bands.
A list with data frames named fundamental, estimation,
and both (whichever were computed). Each data frame has columns:
time, S_hat, lower, upper, part.
King G, Tomz M, Wittenberg J. Making the most of statistical analyses: Improving interpretation and presentation. American Journal of Political Science. 2000;44(2):347-361.
    ## Not run:
    N0 <- 400
    S <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1)
    fit <- bdw1c(S, h = 6, N0 = N0, input = "percent")
    b <- sim.bdw1c(fit, B1 = 200, B2 = 50, level = 0.90, scale = "percent", seed = 123)
    plot(fit, scale = "percent", bands = b)
    ## End(Not run)
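Fundamental bands describe how much a finite cohort would vary even if the parameters were known exactly. One way to sketch this in base R is to thin the cohort period by period with binomial draws. The code below is a hypothetical illustration, not the algorithm inside sim.bdw1c(); the parameter values, cohort size and object names are made up.

```r
# Hypothetical sketch (not sim.bdw1c() itself): process variability at fixed
# parameters. Survivors are thinned period by period with a binomial draw whose
# success probability is the conditional retention rate S(t) / S(t - 1).
set.seed(123)
a <- 0.7; b <- 3.8; c_par <- 1.2              # illustrative values only
N0 <- 400; horizon <- 12; n_sims <- 1000; level <- 0.90

t <- 0:horizon
S <- exp(lbeta(a, b + t^c_par) - lbeta(a, b))
retention <- S[-1] / S[-length(S)]

sims <- replicate(n_sims, {
  alive <- numeric(horizon + 1)
  alive[1] <- N0
  for (k in seq_len(horizon)) {
    alive[k + 1] <- rbinom(1, size = alive[k], prob = retention[k])
  }
  100 * alive / N0                            # percent scale
})

alpha <- (1 - level) / 2
bands <- data.frame(
  time  = t,
  S_hat = 100 * S,
  lower = apply(sims, 1, quantile, probs = alpha),
  upper = apply(sims, 1, quantile, probs = 1 - alpha)
)
```

These bands narrow as the cohort size grows, which is why they are reported separately from parameter uncertainty.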
Computes uncertainty bands for a fitted geom_fit object, returning
up to three band types:
fundamental - process (sampling) variability at fixed parameters
estimation - parameter uncertainty mapped to the mean survival curve
both - predictive bands (parameter draw + one process draw)
    sim.geom1c(object, B1 = 1000, B2 = 0, level = 0.95,
               scale = c("percent", "count", "prob"),
               seed = NULL, verbose = FALSE)
| Argument | Description |
|---|---|
| object | A fitted geom_fit object. |
| B1 | Integer; number of draws for parameter and fundamental uncertainty. |
| B2 | Integer; averages process noise in the estimation bands (set B2 = 0 to skip). |
| level | Confidence level for bands (default 0.95). |
| scale | Output scale: "percent", "count", or "prob". |
| seed | Optional integer seed for reproducibility. |
| verbose | Logical; print minimal progress messages. |
Set B2 = 0 to skip process simulation inside the estimation bands (fast,
pure estimation-only). Fundamental bands require B2 > 0.
A list with data frames named fundamental, estimation,
and both (whichever were computed). Each data frame has columns:
time, S_hat, lower, upper, part.
King G, Tomz M, Wittenberg J. Making the most of statistical analyses: Improving interpretation and presentation. American Journal of Political Science. 2000;44(2):347-361.
    ## Not run:
    N0 <- 400
    S <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1)
    fit <- geom1c(S, h = 6, N0 = N0, input = "percent")
    b <- sim.geom1c(fit, B1 = 200, B2 = 50, level = 0.90, scale = "percent", seed = 123)
    plot(fit, scale = "percent", bands = b)
    ## End(Not run)
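Estimation bands follow the simulation idea of King, Tomz and Wittenberg (2000): draw parameters from their approximate sampling distribution, compute the survival curve for each draw, and take quantiles. The base R sketch below shows this for the geometric model. It is a hypothetical illustration rather than the internals of sim.geom1c(), and the estimate and standard error on the logit scale are made-up numbers.

```r
# Hypothetical sketch (not sim.geom1c() itself): parameter uncertainty only.
# ASSUMPTION: the estimate and its standard error on the logit scale are given;
# the numbers below are made up for illustration.
set.seed(123)
theta_hat <- qlogis(0.12)                     # logit of the churn probability
se_theta  <- 0.08
B1 <- 1000; horizon <- 12; level <- 0.90

t <- 0:horizon
theta_draws <- rnorm(B1, mean = theta_hat, sd = se_theta)
p_draws <- plogis(theta_draws)

# One survival curve per parameter draw (rows = time, columns = draws).
curves <- sapply(p_draws, function(p) (1 - p)^t)

alpha <- (1 - level) / 2
bands <- data.frame(
  time  = t,
  S_hat = 100 * (1 - plogis(theta_hat))^t,
  lower = 100 * apply(curves, 1, quantile, probs = alpha),
  upper = 100 * apply(curves, 1, quantile, probs = 1 - alpha)
)
```

Drawing on the logit scale and transforming back keeps every simulated churn probability inside (0, 1), so the resulting bands never leave the valid range.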
Computes uncertainty bands for a fitted sbg_fit object, returning
up to three band types:
fundamental - process (sampling) variability at fixed parameters
estimation - parameter uncertainty mapped to the mean survival curve
both - predictive bands (parameter draw + one process draw)
    sim.sbg1c(object, B1 = 1000, B2 = 0, level = 0.95,
              scale = c("percent", "count", "prob"),
              seed = NULL, verbose = FALSE)
| Argument | Description |
|---|---|
| object | A fitted sbg_fit object. |
| B1 | Integer; number of parameter draws for estimation and both bands. |
| B2 | Integer; averages process noise in the estimation bands (set B2 = 0 to skip). |
| level | Confidence level for bands (default 0.95). |
| scale | Output scale: "percent", "count", or "prob". |
| seed | Optional integer seed for reproducibility. |
| verbose | Logical; print minimal progress messages. |
Set B2 = 0 to skip process simulation inside the estimation bands (fast,
pure estimation-only); set B2 > 0 to average process noise in the estimation bands.
A list with data frames named fundamental, estimation,
and both (whichever were computed). Each data frame has columns:
time, S_hat, lower, upper, part.
King G, Tomz M, Wittenberg J. Making the most of statistical analyses: Improving interpretation and presentation. American Journal of Political Science. 2000;44(2):347-361.
    ## Not run:
    N0 <- 400
    S <- c(100, 86.9, 74.3, 65.3, 59.3, 55.1, 51.7, 49.1)
    fit <- sbg1c(S, h = 6, N0 = N0, input = "percent")
    b <- sim.sbg1c(fit, B1 = 200, B2 = 50, level = 0.90, scale = "percent", seed = 123)
    plot(fit, scale = "percent", bands = b)
    ## End(Not run)