LSurvival
Survival analysis functions in Julia for time-to-event outcomes that can include:
- Loss-to-follow-up/right censoring
- Late entry/left truncation (not available in Survival.jl)
- "Person-period" data structures (not available in Survival.jl)
- Observation weights (not available in Survival.jl)
- Competing risks (not available in Survival.jl)
Capabilities include estimators for
- Kaplan-Meier non-parametric conditional risk functions
- Aalen-Johansen non-parametric cause-specific unconditional risk functions
- Cox proportional hazards model (Efron's or Breslow's methods for ties)
- Parametric survival models
Convenience functions enable:
- Non-parametric bootstrapping, cluster-bootstrapping, jackknife
- Estimating baseline hazards from a Cox Model
- Estimating cause-specific risk from an exhaustive set of Cox models for competing risk outcomes
- Simple simulation of competing and non-competing survival events
- Martingale, score, Schoenfeld, and dfbeta residuals
- Cluster robust variance estimation (Cox models)
Plans to include:
- Parametric survival models: more distributions
- Stratification in Cox models
- Parametric survival model diagnostics
The package has been tuned to follow the "survival" package from R in terms of specific estimators/results.
Report issues here
Installation
using Pkg; Pkg.add(url = "https://github.com/alexpkeil1/LSurvival.jl")Quick examples
Single event type: Cox model and Kaplan-Meier curve
# generate some data under a discrete hazards model
using Random
id, int, out, data = LSurvival.dgm(MersenneTwister(1212), 100, 20)
data[:, 1] = round.(data[:, 1], digits = 3)
d, X = data[:, 4], data[:, 1:3]
tab = ( in = int, out = out, d=d, x=X[:,1], z1=X[:,2], z2=X[:,3])
wt = ones(length(d)) # weights of 1.0 just to demonstrate usage
coxph(@formula(Surv(in, out, d)~x+z1+z2), tab, ties = "efron", wts = wt)Output:
Maximum partial likelihood estimates (alpha=0.05):
───────────────────────────────────────────────────────────
ln(HR) StdErr LCI UCI Z P(>|Z|)
───────────────────────────────────────────────────────────
x 1.12624 0.392651 0.356659 1.89582 2.87 0.0041
z1 0.434587 0.27122 -0.0969944 0.966168 1.60 0.1091
z2 1.70434 0.222878 1.26751 2.14117 7.65 <1e-13
───────────────────────────────────────────────────────────
Partial log-likelihood (null): -361.948
Partial log-likelihood (fitted): -332.063
LRT p-value (χ²=59.77, df=3): 6.5858e-13
Newton-Raphson iterations: 5# can also be done if there is no late entry
coxph(@formula(Surv(out, d)~x+z1+z2), tab, ties = "efron", wts = wt)
# can also be done if there is no late entry and no right censoring (i.e. all times are failure times)
coxph(@formula(Surv(out)~x+z1+z2), tab, ties = "efron", wts = wt)
# Kaplan-Meier estimator of the cumulative risk/survival
kaplan_meier(int, outt, d)Competing event analysis: Aalen-Johansen and Cox-model-based estimators of the cumulative risk/survival
# Aalen-Johansen estimator: marginal cause-specific risks
res_aj = aalen_johansen(enter, t, event; wts = wt);
res_aj
# Cox-model estimator: cause-specific risks at given levels of covariates
fit1 = fit(PHModel, X, enter, t, (event .== 1), ties = "efron", wts = wt)
n2idx = findall(event .!= 1)
fit2 = fit(PHModel, X[n2idx,:], enter[n2idx], t[n2idx], (event[n2idx] .== 2), ties = "efron", wts = wt[n2idx])
# risk at average levels of `x` and `z`
res_cph = risk_from_coxphmodels([fit1,fit2], coef_vectors=[coef(fit1), coef(fit2)], pred_profile=mean(X, dims=1))
# compare to Aalen-Johansen fit
res_aj
# this approach operates on left censored outcomes (which operate in the background in model fitting)
LSurvivalResp(enter, t, d, origintime=0)
LSurvivalCompResp(enter, t, event) # automatically infers origin
# can use the ID type to refer to units with multiple observations
id, int, outt, data = dgm(MersenneTwister(), 1000, 10; regimefun = int_0)
LSurvivalResp(int, outt, data[:,4], ID.(id))Index of functions
LSurvival.AbstractLSurvivalParmsLSurvival.AbstractLSurvivalRespLSurvival.AbstractNPSurvLSurvival.AbstractPHLSurvival.AbstractPSModelLSurvival.AbstractSurvDistLSurvival.IDLSurvival.LSurvivalCompRespLSurvival.LSurvivalRespLSurvival.PHModelLSurvival.PHSurvLSurvival.StrataLSurvival._update_PHParms!LSurvival.aalen_johansenLSurvival.bootstrapLSurvival.bootstrapLSurvival.bootstrapLSurvival.bootstrapLSurvival.bootstrapLSurvival.bootstrapLSurvival.bootstrapLSurvival.calcpLSurvival.cdfchisqLSurvival.cdfnormLSurvival.coxphLSurvival.cumhazLSurvival.ddloglik!LSurvival.dgmLSurvival.dgm_compriskLSurvival.dgm_phmodelLSurvival.dloglik!LSurvival.dlpdf_gengammaLSurvival.dlsurv_gengammaLSurvival.jackknifeLSurvival.jackknifeLSurvival.jackknifeLSurvival.kaplan_meierLSurvival.lgh!LSurvival.lgh_breslow!LSurvival.lgh_efron!LSurvival.loglikLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdfLSurvival.lpdf_gradientLSurvival.lpdf_gradientLSurvival.lpdf_gradientLSurvival.lpdf_gradientLSurvival.lpdf_gradientLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_hessianLSurvival.lpdf_weibullLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurvLSurvival.lsurv_gradientLSurvival.lsurv_gradientLSurvival.lsurv_gradientLSurvival.lsurv_gradientLSurvival.lsurv_gradientLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.lsurv_hessianLSurvival.qstdnormLSurvival.qweibullLSurvival.randweibullLSurvival.risk_from_coxphmodelsLSurvival.survivaldataRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeRecipesBase.apply_recipeStatsAPI.confintStatsAPI.confintStatsAPI.fitStatsAPI.fitStatsAPI.fitStatsAPI.fitStatsAPI.fit!StatsAPI.loglikelihoodStatsAPI.loglikelihoodStatsAPI.nullloglikelihoodStatsAPI.nullloglikelihoodStatsAPI.predictStatsAPI.residualsStatsAPI.stderrorStatsAPI.stderrorStatsAPI.vcov
Function help
LSurvival.AbstractLSurvivalParms — Type
AbstractLsurvParms
Abstract type representing a model predictors and coefficient parameters
LSurvival.AbstractLSurvivalResp — Type
AbstractLsurvResp
Abstract type representing a model response vector
LSurvival.AbstractNPSurv — Type
Abstract type for non-parametric survival models, including Kaplan-Meier, Aalen Johansen, and Cox-model based estimates of survival using an Aalen-Johansen-like estimator
LSurvival.AbstractPH — Type
Abstract type for proportional hazards models
LSurvival.AbstractPSModel — Type
AbstractPS
Abstract type for parametric survival models
LSurvival.AbstractSurvDist — Type
AbstractSurvDist
Abstract type for parametric survival distributions
LSurvival.ID — Type
Type for identifying individuals in survival outcomes.
Used for the id argument in
- Outcome types: LSurvivalResp, LSurvivalCompResp
- Model types: PHModel, KMRisk, AJRisk
Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.
[ID(i) for i in 1:10]LSurvival.LSurvivalCompResp — Type
Outcome type for competing risk survival outcomes subject to left truncation and right censoring (not generally needed for users)
Parameters
enterTime at observation startexitTime at observation endyevent occurrence in observationwtsobservation weightseventtimesunique event timesoriginorigin on the time scaleidperson level identifier (must be wrapped in ID() function)eventtypesvector of unique event typeseventmatrixmatrix of indicators on the observation levelSignatures:
struct LSurvivalCompResp{
E<:AbstractVector,
X<:AbstractVector,
Y<:AbstractVector,
W<:AbstractVector,
T<:Real,
I<:AbstractLSurvivalID,
V<:AbstractVector,
M<:AbstractMatrix,
} <: AbstractLSurvivalResp
enter::E
exit::X
y::Y
wts::W
eventtimes::X
origin::T
id::Vector{I}
eventtypes::V
eventmatrix::M
end LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
wts::W,
id::Vector{I}
) LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
id::Vector{I}
) LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
wts::W,
) LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
) LSurvivalCompResp(
exit::X,
y::Y,
) where {X<:Vector,Y<:Union{Vector{<:Real},BitVector}}LSurvival.LSurvivalResp — Type
Outcome type for survival outcome subject to left truncation and right censoring.
Will not generally be needed by users
Parameters
enter: Time at observation startexit: Time at observation endy: event occurrence in observationwts: observation weightseventtimes: unique event timesorigin: origin on the time scaleid: person level identifier (must be wrapped in ID() function)
struct LSurvivalResp{
E<:AbstractVector,
X<:AbstractVector,
Y<:AbstractVector,
W<:AbstractVector,
T<:Real,
I<:AbstractLSurvivalID,
} <: AbstractLSurvivalResp
enter::E
exit::X
y::Y
wts::W
eventtimes::E
origin::T
id::Vector{I}
end
LSurvivalResp(
enter::E,
exit::X,
y::Y,
wts::W,
id::Vector{I},
) where {
E<:Vector,
X<:Vector,
Y<:Union{Vector{<:Real},BitVector},
W<:Vector,
I<:AbstractLSurvivalID,
} LSurvivalResp(
enter::E,
exit::X,
y::Y,
id::Vector{I},
)
LSurvivalResp(
y::Vector{Y},
wts::W,
id::Vector{I},
) where {Y<:AbstractSurvTime,W<:Vector,I<:AbstractLSurvivalID} LSurvivalResp(
enter::E,
exit::X,
y::Y,
) where {E<:Vector,X<:Vector,Y<:Union{Vector{<:Real},BitVector}} LSurvivalResp(exit::X, y::Y) where {X<:Vector,Y<:Vector}Examples
# no late entry
LSurvivalResp([.5, .6], [1,0])
LSurvival.PHModel — Type
PHModel: Mutable object type for proportional hazards regression (not generally needed for users)
Parameters
RSurvival responseP# parameterstiesString: "efron" or "breslow"fitBool: logical for whether the model has been fittedbhAbstractMatrix: baseline hazard estimatesSignatures
mutable struct PHModel{G<:LSurvivalResp,L<:AbstractLSurvivalParms} <: AbstractPH
R::G # Survival response
P::L # parameters
ties::String #"efron" or"breslow"
fit::Bool
bh::AbstractMatrix
end
PHModel(
R::G,
P::L,
ties::String,
fit::Bool,
) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
PHModel(R::G, P::L, ties::String) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
PHModel(R::G, P::L) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}Methods: fit, coef, confint, std_err, show
Example
using LSurvival
using Random
import LSurvival: _stepcox!, dgm_comprisk
z,x,t,d, event,wt = dgm_comprisk(MersenneTwister(1212), 100);
enter = zeros(length(t));
X = hcat(x,z);
R = LSurvivalResp(enter, t, Int.(d), wt)
P = PHParms(X)
mf = PHModel(R,P)
LSurvival._fit!(mf)LSurvival.PHSurv — Type
Mutable type for proportional hazards models (not generally needed by users)
PHSsurv: Object type for proportional hazards regression
surv::Vector{Float64} risk::Matrix{Float64} basehaz::Vector{Float64} event::Vector{Float64}
fitlist: vector of PHSurv objects (Cox model fits)eventtypes: vector of unique event typestimes: unique event timessurv: Overall survival at each timerisk: Cause-specific risk at each time (1 for each outcome type)basehaz: baseline hazard for a specific event typeevent: value of event type that occurred at each time
Methods: fit, show
mutable struct PHSurv{G<:Array{T} where {T<:PHModel}} <: AbstractNPSurv
fitlist::G
eventtypes::AbstractVector
times::AbstractVector
surv::Vector{Float64}
risk::Matrix{Float64}
basehaz::Vector{Float64}
event::Vector{Float64}
end
PHSurv(fitlist::Array{T}, eventtypes) where {T<:PHModel}
PHSurv(fitlist::Array{T}) where {T<:PHModel}LSurvival.Strata — Type
Type for identifying individuals in survival outcomes. Used for the strata argument in PHModel (not yet implemented)
Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.
[Strata(i) for i in 1:10]LSurvival._update_PHParms! — Method
Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Breslow's or Efron's partial likelihood.
Updates over all observations
Signature
_update_PHParms!(
m::M,
# big indexes
ne::I,
caseidxs::Vector{Vector{T}},
risksetidxs::Vector{Vector{T}},
) where {M<:AbstractPH,I<:Int,T<:Int}updatePHParms!(m, risksetidxs, caseidxs, ne, den)
LSurvival.aalen_johansen — Method
Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001; absolute tolerance for defining tied event timeskeepy = true; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)or, equivalently:
aalen_johansen(enter, t, event, wts=wt)LSurvival.bootstrap — Method
Bootstrapping coefficients of a proportional hazards model
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)Returns:
- If using
bootstrap(m): a single bootstrap draw - If using
bootstrap(m, 10)(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing
Bootstrap Cox model coefficients
LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)
mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)
function stddev_finite(x)
n = length(x)
mnx = sum(x)/n
ret = sum((x .- mnx) .^ 2)
ret /= n-1
sqrt(ret)
end
# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)
LSurvival.bootstrap — Method
Bootstrap methods for Aalen-Johansen cumulative risk estimator
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::AJSurv)
bootstrap(m::AJSurv)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::AJSurv, iter::Int; kwargs...)
bootstrap(m::AJSurv, iter::Int; kwargs...)Returns:
- If using
bootstrap(m): a single bootstrap draw - If using
bootstrap(m, 10)(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival
using Random
z, x, t, d, event, wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 100)
id = 1:length(x)
enter = zeros(length(t))
aj1 = aalen_johansen(enter, t, event, id=ID.(id), wts=wt)
aj2 = bootstrap(aj1, keepy=false);
ajboot = bootstrap(aj1, 10, keepy=false);
aj1
aj1.R
aj2.R
LSurvival.bootstrap — Method
Bootstrap methods for Kaplan-Meier survival curve estimator
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::KMSurv)
bootstrap(m::KMSurv)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::KMSurv, iter::Int; kwargs...)
bootstrap(m::KMSurv, iter::Int; kwargs...)Returns:
- If using
bootstrap(m): a single bootstrap draw - If using
bootstrap(m, 10)(e.g.): 10 bootstrap draws of the survival probability at the end of follow up
using LSurvival
using Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
wts = rand(length(d))
km1 = kaplan_meier(int, outt, d, id=ID.(id), wts=wts)
km2 = bootstrap(km1, keepy=false)
km3 = bootstrap(km1, 10, keepy=false)
km1
km1.R
km2.R
LSurvival.bootstrap — Method
Bootstrapping coefficients of a proportional hazards model
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)Returns:
- If using
bootstrap(m): a single bootstrap draw - If using
bootstrap(m, 10)(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing
Bootstrap Cox model coefficients
LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)
mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)
function stddev_finite(x)
n = length(x)
mnx = sum(x)/n
ret = sum((x .- mnx) .^ 2)
ret /= n-1
sqrt(ret)
end
# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)
LSurvival.bootstrap — Method
Bootstrap sampling of a proportional hazards predictor object
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Mod = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B)
LSurvival.bootstrap — Method
Bootstrapping sampling of a competing risk survival response
Signatures
bootstrap(rng::MersenneTwister, R::T) where {T<:LSurvivalCompResp}
bootstrap(R::T) where {T<:LSurvivalCompResp}z,x,t,d,event,weights =
LSurvival.dgm_comprisk(MersenneTwister(1212), 300)
enter = zeros(length(event))
# survival outcome:
R = LSurvivalCompResp(enter, t, event, weights, ID.(collect(1:length(t)))) # specification with ID only
bootstrap(R) # note that entire observations/clusters identified by id are keptLSurvival.bootstrap — Method
Bootstrapping sampling of a survival response
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID onlyLSurvival.calcp — Method
Two-tailed p-value for a (null) standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
calcp(z) = 1.0 - SpecialFunctions.erf(abs(z) / sqrt(2))
calcp(1.96)LSurvival.cdfchisq — Method
quantile function for a chi-squared distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Chi-squared_distribution
Source code, example:
cdfchisq(df, x) = SpecialFunctions.gamma_inc(df / 2, x / 2, 0)[1]
cdfchisq(3, 3.45)LSurvival.cdfnorm — Method
quantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
Source code, example:
cdfnorm(z) = 0.5 * (1 + SpecialFunctions.erf(z / sqrt(2)))
cdfnorm(1.96)LSurvival.coxph — Method
Fit method for AbstractPH objects (Cox models)
Keyword arguments (used here, and passed on to internal structs)
ties"breslow" or "efron" (default)wtsobservation weightsties"breslow" or "efron" (default)offsetnot currently used at allfitargsarguments passed to other structs, which includeidcluster or individual level ID (defaults to a unique value for each row of data) see note below on IDcontrastsStatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
eps(default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is <eps.getbasehaz(default: true): estimate baseline hazardstart(default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default withcoxph.keepx(default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).keepy(default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).bootstrap_sample(default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).bootstrap_rng(default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures
fit(::Type{M},
X::AbstractMatrix,#{<:FP},
enter::AbstractVector{<:Real},
exit::AbstractVector{<:Real},
y::Union{AbstractVector{<:Real},BitVector}
;
ties ="breslow",
wts::AbstractVector{<:Real} = similar(y, 0),
offset::AbstractVector{<:Real} = similar(y, 0),
fitargs...) where {M<:AbstractPH} coxph(f::FormulaTerm, data; kwargs...) coxph(X, enter, exit, y, args...; kwargs...) using LSurvival, Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
X = hcat(x,rand(length(x)));
m = fit(PHModel, X, enter, t, d, ties="efron")
m2 = fit(PHModel, X, enter, t, d, ties="breslow")
coeftable(m)Note on use of id keyword
id is not needed in person-period structure data for standard estimates or confidence intervals
using Random, LSurvival
id, int, outt, dat =
LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
data = (
int = int,
outt = outt,
d = dat[:,4] .== 1,
x = dat[:,1],
z = dat[:,2]
)
f = @formula(Surv(int, outt,d)~x+z)
coxph(f, data)BUT, you must specify id to get appropriate robust variance and some other statistics.
Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1)
bic(ft)
nobs(ft)
dof_residual(ft)
# lrtest is another one
stderror(ft) # model based
stderror(ft, type="robust") # robust standard error, based on dfbeta residuals
ft
# now using "clustered" data with multiple observations per individual
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
bic(ft2) # CORRECT
nobs(ft2) # CORRECT
dof_residual(ft2) # CORRECT
stderror(ft2) # model based (CORRECT)
stderror(ft2, type="robust") # robust standard error, based on `id` level dfbeta residuals (CORRECT)
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2 # CORRECT (compare to `ft` object)NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
bic(ft2w) # INCORRECT
nobs(ft2w) # INCORRECT
dof_residual(ft2w) # INCORRECT
stderror(ft2w) # model based (CORRECT)
stderror(ft2w, type="robust") # robust variance (INCORRECT)
ft2w # the coefficient table now shows incorrect confidence intervals and test statistics
LSurvival.cumhaz — Method
Cumulative hazard functions for sets of covariate values.
i.e. for a cox model λ(t|X) = λ_0(t)exp(Xβ)
estimate of Λ(t | X = newX)
= Λ(t | X = newX)
= ∑ λ(t | X = newX)
= ∑ λ_0(t)exp(newXβ)
where λ_0(t) is an estimate of the baseline hazardThis function does not accomodate late entry
using LSurvival
dt = (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,3,0,3,0,3],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,1,0,1,0,1,0,1,0,0],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dt, id=ID.(dt.id))
ft3 = coxph(@formula(Surv(enter, exit, status) ~ x),dt)
newids = ID.([1,2,3,4,5])
newX = [1,1,1,0,0][:,:]
entertime = [0,0,0,0,0]
exittime = [8,8,9,10,9]
try
cumhaz(ft2) # throws an error due to late entry in initial fit
catch e
println(e)
end
cumhaz(ft2,newX,entertime,exittime,newids) # warns about carrying forward covariate values
cumhaz(ft2,newX) # no warningLSurvival.ddloglik! — Method
Hessian contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
LSurvival.dgm — Method
Generating discrete survival data without competing risks
Signatures
dgm(rng::MersenneTwister, n::Int, maxT:Int; afun = int_0, yfun = yprob, lfun = lprob)
dgm(n::Int, maxT::Int; kwargs...)Usage: dgm(rng, n, maxT;afun=int0, yfun=yprob, lfun=lprob) dgm(n, maxT;afun=int0, yfun=yprob, lfun=lprob)
Where afun, yfun, and lfun are all functions that take arguments v,l,a and output time-specific values of a, y, and l respectively Example:
expit(mu) = inv(1.0+exp(-mu))
function aprob(v,l,a)
expit(-1.0 + 3*v + 2*l)
end
function lprob(v,l,a)
expit(-3 + 2*v + 0*l + 0*a)
end
function yprob(v,l,a)
expit(-3 + 2*v + 0*l + 2*a)
end
# 10 individuals followed for up to 5 times
LSurvival.dgm(10, 5;afun=aprob, yfun=yprob, lfun=lprob)LSurvival.dgm_comprisk — Method
Generating continuous survival data with competing risks
Signatures
dgm_comprisk(rng::MersenneTwister, n::Int)
dgm_comprisk(n::Int) - rng = random number generator
- n = sample sizeExample:
using LSurvival
# 100 individuals with two competing events
z,x,t,d,event,weights = LSurvival.dgm_comprisk(100)
LSurvival.dgm_phmodel — Method
Proportional hazards model from a Weibull distribution with scale parameter λ
dgm_phmodel(rng::MersenneTwister, n::Int;
λ=1.25,
β=[0.0, 0.0]
)keyword parameters:
- λ: Weibull scale parameter
- β: vector of regression coefficients
rng = MersenneTwister()
X, t, d, _ = dgm_phmodel(2000; λ=1.25,β=[1.0, -0.5])
coxph(@formula(Surv(t0,t,d)~x+z), (t=t,t0=t.*0,d=d,x=X[:,1],z=X[:,2]))LSurvival.dloglik! — Method
Gradient contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
LSurvival.dlpdf_gengamma — Method
α=0.1 ρ =-1.2 κ=1.9 t = 2.0
exp((log(t) - α)exp(-ρ) - ρ) - exp(κ - ρ) (α - log(t))exp(κ - ρ) + (log(t) - α)exp((log(t) - α)exp(-ρ) - ρ) - 1 (log(t) - α)exp(κ - ρ) - exp(κ)SpecialFunctions.digamma(exp(κ))
LSurvival.dlsurv_gengamma — Method
α=0.1 ρ =-1.2 κ=1.9 t = 2.0 dlsurv_gengamma(α, ρ, κ, t; fd = 1e-14)
LSurvival.jackknife — Method
Obtain jackknife (leave-one-out) estimates from a Aalen-Johansen risk curve (risk at end of follow-up) by refitting the model n times
Signatures
jackknife(m::M;kwargs...) where {M<:AJSurv}LSurvival.jackknife — Method
Obtain jackknife (leave-one-out) estimates from a Kaplan-Meier survival curve (survival at end of follow-up) by refitting the model n times
Signatures
jackknife(m::M;kwargs...) where {M<:KMSurv}using LSurvival, Random, StatsBase
dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])
dat1clust = (
id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)
m = kaplan_meier(dat1.time, dat1.status)
a = aalen_johansen(dat1.time, dat1.status)
mc = kaplan_meier(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
ac = aalen_johansen(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
jka = jackknife(a);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkcLSurvival.jackknife — Method
Obtain jackknife (leave-one-out) estimates from a Cox model by refitting the model n times
using LSurvival, Random, StatsBase
id, int, outt, data =
LSurvival.dgm(MersenneTwister(112), 100, 10; afun = LSurvival.int_0)
data[:, 1] = round.(data[:, 1], digits = 3)
d, X = data[:, 4], data[:, 1:3]
wt = rand(length(d))
wt ./= (sum(wt) / length(wt))
m = coxph(X,int, outt,d, wts=wt, id=ID.(id))
jk = jackknife(m);
bs = bootstrap(MersenneTwister(12321), m, 1000);
N = nobs(m)
#comparing estimate with jackknife estimate with bootstrap mean
hcat(coef(m), mean(jk, dims=1)[1,:], mean(bs, dims=1)[1,:])
semb = stderror(m)
sebs = std(bs, dims=1)
sero = stderror(m, type="robust")
sejk = stderror(m, type="jackknife")
sejk_manual = std(jk, dims=1, corrected=false) .* sqrt(N-1)
sqrt.(diag(LSurvival.jackknife_vcov(m)))
hcat(semb, sebs[1,:], sejk, sejk_manual[1,:], sero)
dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])
dat1clust = (
id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)
m = coxph(@formula(Surv(time, status)~x),dat1)
mc = coxph(@formula(Surv(enter, exit, status)~x),dat1clust, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkcLSurvival.kaplan_meier — Method
Kaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)or, equivalently:
kaplan_meier(enter, t, d, wts=wt)LSurvival.lgh! — Method
dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0]) enter = zeros(length(dat1.time)) t = dat1.time d = dat1.status X = hcat(ones(length(dat1.x)), dat1.x) wt = ones(length(t))
dist = Exponential() P = PSParms(X[:,1:1], extraparms=length(dist)-1) P = PSParms(X, extraparms=length(dist)-1) P.B P.grad R = LSurvivalResp(dat1.time, dat1.status) # specification with ID only m = PSModel(R,P,dist)
λ=1 θ = rand(2) lgh!(m, θ) θ .+= inv(-m.P.hess) * m.P.grad * λ
lgh!(m, θ .+ [-.00, 0, 0]) lgh!(m, θ .+ [-.01, 0.05, 0.05])
m.P._LL
LSurvival.lgh_breslow! — Method
Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Breslow's partial likelihood.
Updates over all observations
Signature
lgh_breslow!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}LSurvival.lgh_efron! — Method
Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Efron's partial likelihood.
Updates over all observations
Signature
lgh_efron!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}LSurvival.loglik — Method
Log likelihood contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
m.P._B
LSurvival.lpdf — Method
Log-likelihood calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)LSurvival.lpdf — Method
Log probability distribution function: Exponential distribution
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf(d, t)LSurvival.lpdf — Method
log probability distribution for generalized gamma regression
LSurvival.lpdf — Method
log probability distribution function, generalized gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf — Method
log probability distribution for Gamma regression
LSurvival.lpdf — Method
log probability distribution function, Gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf — Method
Log likelihood calculation for Lognormal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf(d, vcat(θ,ρ), t, x)LSurvival.lpdf — Method
log probability distribution function: Weibull distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf — Method
Log-likelihood calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf(d, vcat(θ,ρ), t, x)LSurvival.lpdf — Method
Log probability distribution function: Weibull distribution
location scale representation (Klein Moeschberger ch 12)
α=0.1 # location
ρ=-1.2 # log(scale)
time=2
lpdf(Weibull(α, ρ), time)LSurvival.lpdf_gradient — Method
Gradient calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)LSurvival.lpdf_gradient — Method
log probability distribution gradient for generalized gamma regression analytic gradient
LSurvival.lpdf_gradient — Method
log probability distribution gradient for Gamma regression analytic gradient
LSurvival.lpdf_gradient — Method
Gradient calculation for Lognormal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_gradient(d, vcat(θ,ρ), t, x)LSurvival.lpdf_gradient — Method
Gradient calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
#α = dot(β,x)
d = Weibull()
lpdf_gradient(d, vcat(θ,ρ), t, x)LSurvival.lpdf_hessian — Method
Hessian calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_hessian(d, θ, t, x)LSurvival.lpdf_hessian — Method
Hessian calculation for Weibull distribution: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lpdf_hessian(d, t)LSurvival.lpdf_hessian — Method
Hessian calculation for generalized gamma regression: PDF
placeholder function: returns nothingLSurvival.lpdf_hessian — Method
Hessian calculation for generalized gamma distribution: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian — Method
Hessian calculation for Gamma regression: PDF
placeholder function: returns nothingLSurvival.lpdf_hessian — Method
Hessian calculation for Gamma distribution: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian — Method
Hessian calculation for Log-normal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_hessian(d, vcat(θ,ρ), t, x)LSurvival.lpdf_hessian — Method
Hessian calculation for Log-normal distribution: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lpdf_hessian(d, t)LSurvival.lpdf_hessian — Method
Hessian calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf_hessian(d, vcat(θ,ρ), t, x)LSurvival.lpdf_weibull — Method
α = -1.2 ρ = 1.8 t = 4.3 z = (log(t) - α) * exp(-ρ) z - exp(z) - ρ - log(t)
LSurvival.lsurv — Method
Log-likelihood calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv(d, θ, t, x)LSurvival.lsurv — Method
Log survival function: Exponential distribution
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv(d, t)LSurvival.lsurv — Method
log survival distribution for generalized gamma regression
LSurvival.lsurv — Method
log probability distribution function, generalized gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv — Method
log survival distribution for Gamma regression
LSurvival.lsurv — Method
log probability distribution function, Gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv — Method
Log likelihood calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv(d, vcat(θ,ρ), t, x)LSurvival.lsurv — Method
log probability distribution function: Weibull distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv — Method
Log-likelihood calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv(d, vcat(θ,ρ), t, x)LSurvival.lsurv — Method
Log survival distribution function: Weibull distribution
location, log(scale) representation (Klein Moeschberger ch 12)
α=0.1 # location
ρ=-1.2 # log(scale)
time=2
lsurv(Weibull(α, ρ), time)LSurvival.lsurv_gradient — Method
Gradient calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_gradient(d, θ, t, x)LSurvival.lsurv_gradient — Method
log survival distribution gradient for generalized gamma regression uses finite differences
LSurvival.lsurv_gradient — Method
log survival distribution gradient for Gamma regression uses finite differences
LSurvival.lsurv_gradient — Method
Gradient calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_gradient(d, vcat(θ,ρ), t, x)LSurvival.lsurv_gradient — Method
Gradient calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_gradient(d, vcat(θ,ρ), t, x)LSurvival.lsurv_hessian — Method
Hessian calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_hessian(d, θ, t, x)LSurvival.lsurv_hessian — Method
Hessian calculation for Exponential distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lsurv_hessian(d, t)LSurvival.lsurv_hessian — Method
Hessian calculation for generalized gamma regression: Survival
placeholder function: returns nothingLSurvival.lsurv_hessian — Method
Hessian calculation for generalized gamma distribution: Survival
placeholder function: returns nothingLSurvival.lsurv_hessian — Method
Hessian calculation for Gamma regression: Survival
placeholder function: returns nothingLSurvival.lsurv_hessian — Method
Hessian calculation for Gamma distribution: Survival
placeholder function: returns nothingLSurvival.lsurv_hessian — Method
Hessian calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_hessian(d, vcat(θ,ρ), t, x)LSurvival.lsurv_hessian — Method
Hessian calculation for Log-normal distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lsurv_hessian(d, t)LSurvival.lsurv_hessian — Method
Hessian calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_hessian(d, vcat(θ,ρ), t, x)LSurvival.lsurv_hessian — Method
Hessian calculation for Weibull distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull(α, ρ)
lsurv_hessian(d, t)LSurvival.qstdnorm — Method
quantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
Source code, example: qstdnorm(p) = sqrt(2) * SpecialFunctions.erfinv(2.0 * p - 1.0)
qstdnorm(.975)LSurvival.qweibull — Method
Quantile function for the Weibull distribution
\[F(t) = egin{cases} 1 - e^{{-t/ρ}^{α}}&, t ≥ 0\ 0&, t < 0 nd{cases} Q(p) = ρ * (log(1/(1-p))^{1/α})\]
lightweight function used for simulation
Note that there is no checking that parameters α,ρ are positively bound, and p ∈ (0,1), and errors will be given if this is not the case
Signature:
qweibull(p::Real,α::Real,ρ::Real)Source code, example:
qweibull(p, α, ρ) = ρ * ((-log1p(-p))^(1 / α))
# cross reference the approach in the Distributions package
quantile(Distributions.Weibull(.75, 1.1), .3)
LSurvival.qweibull(0.3, .75, 1.1)LSurvival.randweibull — Method
Random draw from Weibull distribution
lightweight function used for simulation
Note that there is no checking that parameters α,ρ are positively bound, and errors will be given if this is not the case
Signatures:
randweibull(rng::MersenneTwister,α::Real,ρ::Real)
randweibull(α::Real,ρ::Real)Source code, example:
randweibull(rng, α, ρ) = qweibull(rand(rng), α, ρ)
randweibull(α, ρ) = randweibull(MersenneTwister(), α, ρ)
# cross reference the approach in the Distributions package
rand(Distributions.Weibull(.75, 1.1))
randweibull(.75, 1.1)LSurvival.risk_from_coxphmodels — Method
Survival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}Optional keywords
coef_vectors= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])LSurvival.survivaldata — Method
Loading example survival analysis datasets
using LSurvival, Plots # note Plots does not install by default
heartdata, heartmeta = survivaldata("heart")
ft = coxph(@formula(Surv(start, stop, event)~surgery), heartdata);
# plot baseline cumulative hazard
basehazplot(ft)
# plot Schoenfeld residuals
coxdx(ft) RecipesBase.apply_recipe — Method
Recipe for aalen-johansen risk curve
using Plots, LSurvival
res = z, x, outt, d, event, weights = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
c = fit(AJSurv, int, outt, event)
#risk2 = aalen_johansen(int, outt, event)
plot(c)RecipesBase.apply_recipe — Method
Plotting a kaplan meier curve
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)
plot(k)RecipesBase.apply_recipe — Method
function name(::Type{T}) where {T}
#https://stackoverflow.com/questions/70043313/get-simple-name-of-type-in-julia
isempty(T.parameters) ? T : T.name.wrapper
end
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = survreg(@formula(Surv(enter, exit, status)~x), dat2)
# density function
aftdist(fte, label="X=0")
aftdist!(fte, covlevels=[1.0, 2.0], color="red", label="X=1")
# Survival function
aftdist(fte, type="surv", label="X=0")
aftdist!(fte, type="surv", covlevels=[1.0, 2.0], color="red", label="X=1")
# hazard function
aftdist(fte, type="haz", label="X=0")
aftdist!(fte, type="haz", covlevels=[1.0, 2.0], color="red", label="X=1")
# Cumulative incidence/risk function
aftdist(fte, type="risk", label="X=0")
aftdist!(fte, type="risk", covlevels=[1.0, 2.0], color="red", label="X=1")
RecipesBase.apply_recipe — Method
Plotting baseline hazard for a Cox model
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2, maxiter=0)
ftb = coxph(@formula(Surv(enter, exit, status)~x), dat2, ties="breslow", maxiter=0)
plot(fte, label="Efron")
plot!(ftb, label="Breslow")RecipesBase.apply_recipe — Method
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)
coxdx(fte)
RecipesBase.apply_recipe — Method
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)
coxinfluence(fte, type="jackknife", par=1)
coxinfluence!(fte, type="dfbeta", color=:red, par=1)
RecipesBase.apply_recipe — Method
Plotting baseline hazard for a Cox model
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)
lognlogplot(k)RecipesBase.apply_recipe — Method
Plotting LSurvivalResp objects (outcomes in cox models, kaplan meier curves, parametric survival models)
Recipe for plotting time-to-event outcomes
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
plot([[R.enter[i], R.exit[i]] for i in eachindex(R.enter)], [[i, i] for i in values(R.id)])RecipesBase.apply_recipe — Method
Recipe for cox-model based risk curves
using Plots, LSurvival, Random, StatsBase
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
X = hcat(z, x)
int = zeros(length(d)) # no late entry
ft1 = fit(PHModel, X, int, outt, d .* (event .== 1), wts=wts)
ft2 = fit(PHModel, X, int, outt, d .* (event .== 2), wts=wts)
c = risk_from_coxphmodels([ft1, ft2], pred_profile = mean(X, dims=1))
plot(c)StatsAPI.confint — Method
Greenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function
Signatures:
StatsBase.stderror(m::AJSurv)
StatsBase.confint(m:AJSurv; level=0.95, method="normal")Keyword arguments
- method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)StatsAPI.confint — Method
Greenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve
Signatures:
StatsBase.stderror(m::KMSurv)
StatsBase.confint(m:KMSurv; level=0.95, method="normal")Keyword arguments
method:
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformationStatsAPI.fit! — Method
Kaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)or, equivalently:
kaplan_meier(enter, t, d, wts=wt)Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001; absolute tolerance for defining tied event timeskeepy = true; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)or, equivalently:
aalen_johansen(enter, t, event, wts=wt)Survival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}Optional keywords
coef_vectors= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])StatsAPI.fit — Method
Survival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}Optional keywords
coef_vectors= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])StatsAPI.fit — Method
Fit method for AbstractPH objects (Cox models)
Keyword arguments (used here, and passed on to internal structs)
ties"breslow" or "efron" (default)wtsobservation weightsties"breslow" or "efron" (default)offsetnot currently used at allfitargsarguments passed to other structs, which includeidcluster or individual level ID (defaults to a unique value for each row of data) see note below on IDcontrastsStatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
eps(default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is <eps.getbasehaz(default: true): estimate baseline hazardstart(default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default withcoxph.keepx(default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).keepy(default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).bootstrap_sample(default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).bootstrap_rng(default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures
fit(::Type{M},
X::AbstractMatrix,#{<:FP},
enter::AbstractVector{<:Real},
exit::AbstractVector{<:Real},
y::Union{AbstractVector{<:Real},BitVector}
;
ties ="breslow",
wts::AbstractVector{<:Real} = similar(y, 0),
offset::AbstractVector{<:Real} = similar(y, 0),
fitargs...) where {M<:AbstractPH} coxph(f::FormulaTerm, data; kwargs...) coxph(X, enter, exit, y, args...; kwargs...) using LSurvival, Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
X = hcat(x,rand(length(x)));
m = fit(PHModel, X, enter, t, d, ties="efron")
m2 = fit(PHModel, X, enter, t, d, ties="breslow")
coeftable(m)Note on use of id keyword
id is not needed in person-period structure data for standard estimates or confidence intervals
using Random, LSurvival
id, int, outt, dat =
LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
data = (
int = int,
outt = outt,
d = dat[:,4] .== 1,
x = dat[:,1],
z = dat[:,2]
)
f = @formula(Surv(int, outt,d)~x+z)
coxph(f, data)BUT, you must specify id to get appropriate robust variance and some other statistics.
Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1)
bic(ft)
nobs(ft)
dof_residual(ft)
# lrtest is another one
stderror(ft) # model based
stderror(ft, type="robust") # robust standard error, based on dfbeta residuals
ft
# now using "clustered" data with multiple observations per individual
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
bic(ft2) # CORRECT
nobs(ft2) # CORRECT
dof_residual(ft2) # CORRECT
stderror(ft2) # model based (CORRECT)
stderror(ft2, type="robust") # robust standard error, based on `id` level dfbeta residuals (CORRECT)
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2 # CORRECT (compare to `ft` object)NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
bic(ft2w) # INCORRECT
nobs(ft2w) # INCORRECT
dof_residual(ft2w) # INCORRECT
stderror(ft2w) # model based (CORRECT)
stderror(ft2w, type="robust") # robust variance (INCORRECT)
ft2w # the coefficient table now shows incorrect confidence intervals and test statistics
StatsAPI.fit — Method
Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001; absolute tolerance for defining tied event timeskeepy = true; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)or, equivalently:
aalen_johansen(enter, t, event, wts=wt)StatsAPI.fit — Method
Kaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)or, equivalently:
kaplan_meier(enter, t, d, wts=wt)StatsAPI.loglikelihood — Method
Maximum log partial likelihood for a fitted AbstractPH model Efron or Breslow (depending on the ties` parameter)
StatsAPI.loglikelihood — Method
Maximum log likelihood for a fitted PSModel model
StatsAPI.nullloglikelihood — Method
Null log-partial likelihood for a fitted AbstractPH model Efron or Breslow (depending on the ties` parameter)
Note: this is just the log partial likelihood at the initial values of the model, which default to 0. If initial values are non-null, then this function no longer validly returns the null log-partial likelihood.
StatsAPI.nullloglikelihood — Method
Null log-partial likelihood for a fitted PSModel model
Note: this is just the log partial likelihood at the initial values of the model, which default to 0. If initial values are non-null, then this function no longer validly returns the null log-partial likelihood.
StatsAPI.predict — Method
Cumulative probability of an outcome by some discrete time t
i.e. for a cox model λ(t|X) = λ0(t)exp(Xβ) estimate of F(t | X=newX) = 1 - exp(-Λ(t | X = newX)) = 1- exp(-∑ λ(t | X = newX)) = 1- exp(-∑ λ0(t)exp(newXβ)) where λ_0(t) is an estimate of the baseline hazard
using LSurvival
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
# use the `id` parameter with the ID struct
newids = ID.([1,2,3,4,5])
newX = [1,1,1,0,0][:,:]
entertime = [0,0,0,0,0]
exittime = [8,8,9,10,9]
try
predict(ft2) # throws an error due to late entry in initial fit
catch e
println(e)
end
predict(ft2,newX,entertime,exittime,newids)[1] # warns about carrying forward covariate values
preds, times = predict(ft2,newX) # no warning
hcat(times, preds)
# id specific cumulative hazard
# contrast with competing-risk methods from risk_from_coxphmodels,
# which don't perform as well in small samples (overall survival is identical for cheng method)
risk_from_coxphmodels([ft2], pred_profile=[1][:,:], method="cheng")
risk_from_coxphmodels([ft2], pred_profile=[1][:,:], method="aalen")
StatsAPI.residuals — Method
####################################################################
Cox proportional hazards model residuals:
Signature
residuals(m::M; type = "martingale") where {M<:PHModel}where type is one of
martingaleschoenfeldscoredfbetajackknifedfbetas(scaled dfbeta)scaled_schoenfeldorschoenfelds(scaled Schoenfeld)Residuals from the residuals function are designed to exactly emulate those from the
survivalpackage in R. Currently, they are validated for single observation data (e.g. one data row per individual).####################################################################
Martingale residuals: Observed versus expected
# example from https://cran.r-project.org/web/packages/survival/vignettes/validate.pdf
# by Terry Therneau
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
# Nelson-Aalen type estimator for Breslow partial likelihood
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
residuals(ft, type="martingale") dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
# Fleming-Harrington type estimator for Efron partial likelihood
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron")
residuals(ft, type="martingale")
####################################################################
Score residuals: Per observation contribution to score function
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
S = residuals(ft, type="score")[:]
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron", maxiter=0)
S = residuals(ft, type="score")[:]####################################################################
Schoenfeld residuals: Per time contribution to score function
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow", maxiter=0)
X = ft.P.X
M = residuals(ft, type="martingale")
S = residuals(ft, type="schoenfeld")[:]
Ss = residuals(ft, type="scaled_schoenfeld")[:]####################################################################
dfbeta residuals: influence of individual observations on each parameter
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
residuals(ft, type="dfbeta")
# can also calculate from score residuals and Hessian matrix
L = residuals(ft, type="score") # n X p
H = ft.P._hess # p X p
dfbeta = L*inv(H)
robVar = dfbeta'dfbeta
sqrt(robVar)
using the id keyword argument
see help for LSurvival.vcov for what happens when id keyword is not used
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id), ties="breslow")
# note these are still on the observation level (not the id level)!
residuals(ft2, type="dfbeta")
# getting id level dfbeta residuals
dfbeta = residuals(ft2, type="dfbeta")
id = values(ft2.R.id)
D = reduce(vcat, [sum(dfbeta[findall(id .== i),:], dims=1) for i in unique(id)])
D'D
vcov(ft, type="robust")
vcov(ft2, type="robust")####################################################################
jackknife residuals: influence of individual observations on each parameter using leave-one-out estimates
note there are other definitions of jackknife residuals See Chapter 7.1 of "Extending the Cox Model" by Therneau and Grambsch for an example of the type of jackknife residuals used here
Jackknife residuals $r_i$ for $i \in 1:n$ are given as the difference between the maximum partial likelihood estimate and the jackknife estimates for each observation
\[r_i = \hat\beta - \hat\beta_{(-i)}\]
where $\beta_{(-i)}$ is the maximum partial likelihood estimate of the log-hazard ratio vector obtained from a dataset in which observations belonging to individual i are removed
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
#jackknife(ft)
residuals(ft, type="jackknife")
StatsAPI.stderror — Method
Greenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function
Signatures:
StatsBase.stderror(m::AJSurv)
StatsBase.confint(m:AJSurv; level=0.95, method="normal")Keyword arguments
- method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)StatsAPI.stderror — Method
Greenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve
Signatures:
StatsBase.stderror(m::KMSurv)
StatsBase.confint(m:KMSurv; level=0.95, method="normal")Keyword arguments
method:
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformationStatsAPI.vcov — Method
Covariance matrix for Cox proportional hazards models
Keyword arguments
typenothing or "robust": determines whether model based or robust (dfbeta based) variance is returned.See ?residuals for info on
dfbetaresiduals
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, id=ID.(collect(1:6)))
vcov(ft) # model based
vcov(ft, type="robust") # robust variance, based on dfbeta residuals
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ftcluster robust standard errors using the id keyword argument
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
vcov(ft2) # model based
vcov(ft2, type="robust") # robust variance, based on dfbeta residuals
stderror(ft2, type="robust") # robust variance, based on dfbeta residuals
confint(ft2, type="robust") # robust variance, based on dfbeta residuals
nobs(ft2) # id argument yields correct value of number of independent observations
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2 NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
vcov(ft2w) # model based (CORRECT)
vcov(ft2w, type="robust") # robust variance (INCORRECT)
nobs(ft2w)
ft2wImplementation details and further help
- Likelihood functions for time-to-event observations subject to left-truncation and right censoring
- Non-parametric survival/risk estimation: Kaplan-Meier and Aalen-Johansen
- Cox models
- Parametric survival/risk estimation with Weibull AFT models