LSurvival
Survival analysis functions in Julia for time-to-event outcomes that can include:
- Loss-to-follow-up/right censoring
- Late entry/left truncation (not available in Survival.jl)
- "Person-period" data structures (not available in Survival.jl)
- Observation weights (not available in Survival.jl)
- Competing risks (not available in Survival.jl)
Capabilities include estimators for
- Kaplan-Meier non-parametric conditional risk functions
- Aalen-Johansen non-parametric cause-specific unconditional risk functions
- Cox proportional hazards model (Efron's or Breslow's methods for ties)
- Parametric survival models
Convenience functions enable:
- Non-parametric bootstrapping, cluster-bootstrapping, jackknife
- Estimating baseline hazards from a Cox Model
- Estimating cause-specific risk from an exhaustive set of Cox models for competing risk outcomes
- Simple simulation of competing and non-competing survival events
- Martingale, score, Schoenfeld, and dfbeta residuals
- Cluster robust variance estimation (Cox models)
Plans to include:
- Parametric survival models: more distributions
- Stratification in Cox models
- Parametric survival model diagnostics
The package has been tuned to follow the "survival" package from R in terms of specific estimators/results.
Report issues here
Installation
using Pkg; Pkg.add(url = "https://github.com/alexpkeil1/LSurvival.jl")
Quick examples
Single event type: Cox model and Kaplan-Meier curve
tab = ( in = int, out = out, d=d, x=X[:,1], z1=X[:,2], z2=X[:,3])
coxph(@formula(Surv(in, out, d)~x+z1+z2), tab, ties = "efron", wts = wt)
Output:
Maximum partial likelihood estimates (alpha=0.05):
─────────────────────────────────────────────────────────
ln(HR) StdErr LCI UCI Z P(>|Z|)
─────────────────────────────────────────────────────────
x 1.60222 0.530118 0.563208 2.64123 3.02 0.0025
z1 0.305929 0.43838 -0.55328 1.16514 0.70 0.4853
z2 1.98011 0.325314 1.34251 2.61771 6.09 <1e-08
─────────────────────────────────────────────────────────
Partial log-likelihood (null): -150.162
Partial log-likelihood (fitted): -131.512
LRT p-value (X^2=37.3, df=3): 3.9773e-08
Newton-Raphson iterations: 5
# can also be done if there is no late entry
coxph(@formula(Surv(out, d)~x+z1+z2), tab, ties = "efron", wts = wt)
# can also be done if there is no late entry and no right censoring (i.e. all times are failure times)
coxph(@formula(Surv(out)~x+z1+z2), tab, ties = "efron", wts = wt)
# Kaplan-Meier estimator of the cumulative risk/survival
kaplan_meier(int, outt, d)
Competing event analysis: Aalen-Johansen and Cox-model-based estimators of the cumulative risk/survival
# Aalen-Johansen estimator: marginal cause-specific risks
res_aj = aalen_johansen(enter, t, event; wts = wt);
res_aj
# Cox-model estimator: cause-specific risks at given levels of covariates
fit1 = fit(PHModel, X, enter, t, (event .== 1), ties = "efron", wts = wt)
n2idx = findall(event .!= 1)
fit2 = fit(PHModel, X[n2idx,:], enter[n2idx], t[n2idx], (event[n2idx] .== 2), ties = "efron", wts = wt[n2idx])
# risk at average levels of `x` and `z`
res_cph = risk_from_coxphmodels([fit1,fit2], coef_vectors=[coef(fit1), coef(fit2)], pred_profile=mean(X, dims=1))
# compare to Aalen-Johansen fit
res_aj
# this approach operates on left censored outcomes (which operate in the background in model fitting)
LSurvivalResp(enter, t, d, origintime=0)
LSurvivalCompResp(enter, t, event) # automatically infers origin
# can use the ID type to refer to units with multiple observations
id, int, outt, data = dgm(MersenneTwister(), 1000, 10; regimefun = int_0)
LSurvivalResp(int, outt, data[:,4], ID.(id))
Index of functions
LSurvival.AbstractLSurvivalParms
LSurvival.AbstractLSurvivalResp
LSurvival.AbstractNPSurv
LSurvival.AbstractPH
LSurvival.AbstractPSModel
LSurvival.AbstractSurvDist
LSurvival.ID
LSurvival.LSurvivalCompResp
LSurvival.LSurvivalResp
LSurvival.PHModel
LSurvival.PHSurv
LSurvival.Strata
LSurvival._update_PHParms!
LSurvival.aalen_johansen
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.calcp
LSurvival.cdfchisq
LSurvival.cdfnorm
LSurvival.coxph
LSurvival.ddloglik!
LSurvival.dgm
LSurvival.dgm_comprisk
LSurvival.dgm_phmodel
LSurvival.dloglik!
LSurvival.dlpdf_gengamma
LSurvival.dlsurv_gengamma
LSurvival.jackknife
LSurvival.jackknife
LSurvival.jackknife
LSurvival.kaplan_meier
LSurvival.lgh!
LSurvival.lgh_breslow!
LSurvival.lgh_efron!
LSurvival.loglik
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_weibull
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.qstdnorm
LSurvival.qweibull
LSurvival.randweibull
LSurvival.risk_from_coxphmodels
LSurvival.survivaldata
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
StatsAPI.confint
StatsAPI.confint
StatsAPI.confint
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit!
StatsAPI.loglikelihood
StatsAPI.loglikelihood
StatsAPI.nullloglikelihood
StatsAPI.nullloglikelihood
StatsAPI.residuals
StatsAPI.stderror
StatsAPI.stderror
StatsAPI.vcov
Function help
LSurvival.AbstractLSurvivalParms
— TypeAbstractLsurvParms
Abstract type representing a model predictors and coefficient parameters
LSurvival.AbstractLSurvivalResp
— TypeAbstractLsurvResp
Abstract type representing a model response vector
LSurvival.AbstractNPSurv
— TypeAbstract type for non-parametric survival models, including Kaplan-Meier, Aalen Johansen, and Cox-model based estimates of survival using an Aalen-Johansen-like estimator
LSurvival.AbstractPH
— TypeAbstract type for proportional hazards models
LSurvival.AbstractPSModel
— TypeAbstractPS
Abstract type for parametric survival models
LSurvival.AbstractSurvDist
— TypeAbstractSurvDist
Abstract type for parametric survival distributions
LSurvival.ID
— TypeType for identifying individuals in survival outcomes.
Used for the id argument in
- Outcome types: LSurvivalResp, LSurvivalCompResp
- Model types: PHModel, KMRisk, AJRisk
Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.
[ID(i) for i in 1:10]
LSurvival.LSurvivalCompResp
— TypeOutcome type for competing risk survival outcomes subject to left truncation and right censoring (not generally needed for users)
Parameters
enter
Time at observation startexit
Time at observation endy
event occurrence in observationwts
observation weightseventtimes
unique event timesorigin
origin on the time scaleid
person level identifier (must be wrapped in ID() function)eventtypes
vector of unique event typeseventmatrix
matrix of indicators on the observation levelSignatures:
struct LSurvivalCompResp{
E<:AbstractVector,
X<:AbstractVector,
Y<:AbstractVector,
W<:AbstractVector,
T<:Real,
I<:AbstractLSurvivalID,
V<:AbstractVector,
M<:AbstractMatrix,
} <: AbstractLSurvivalResp
enter::E
exit::X
y::Y
wts::W
eventtimes::X
origin::T
id::Vector{I}
eventtypes::V
eventmatrix::M
end
LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
wts::W,
id::Vector{I}
)
LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
id::Vector{I}
)
LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
wts::W,
)
LSurvivalCompResp(
enter::E,
exit::X,
y::Y,
)
LSurvivalCompResp(
exit::X,
y::Y,
) where {X<:Vector,Y<:Union{Vector{<:Real},BitVector}}
LSurvival.LSurvivalResp
— TypeOutcome type for survival outcome subject to left truncation and right censoring.
Will not generally be needed by users
Parameters
enter
: Time at observation startexit
: Time at observation endy
: event occurrence in observationwts
: observation weightseventtimes
: unique event timesorigin
: origin on the time scaleid
: person level identifier (must be wrapped in ID() function)
struct LSurvivalResp{
E<:AbstractVector,
X<:AbstractVector,
Y<:AbstractVector,
W<:AbstractVector,
T<:Real,
I<:AbstractLSurvivalID,
} <: AbstractLSurvivalResp
enter::E
exit::X
y::Y
wts::W
eventtimes::E
origin::T
id::Vector{I}
end
LSurvivalResp(
enter::E,
exit::X,
y::Y,
wts::W,
id::Vector{I},
) where {
E<:Vector,
X<:Vector,
Y<:Union{Vector{<:Real},BitVector},
W<:Vector,
I<:AbstractLSurvivalID,
}
LSurvivalResp(
enter::E,
exit::X,
y::Y,
id::Vector{I},
)
LSurvivalResp(
y::Vector{Y},
wts::W,
id::Vector{I},
) where {Y<:AbstractSurvTime,W<:Vector,I<:AbstractLSurvivalID}
LSurvivalResp(
enter::E,
exit::X,
y::Y,
) where {E<:Vector,X<:Vector,Y<:Union{Vector{<:Real},BitVector}}
LSurvivalResp(exit::X, y::Y) where {X<:Vector,Y<:Vector}
Examples
# no late entry
LSurvivalResp([.5, .6], [1,0])
LSurvival.PHModel
— TypePHModel: Mutable object type for proportional hazards regression (not generally needed for users)
Parameters
R
Survival responseP
# parametersties
String: "efron" or "breslow"fit
Bool: logical for whether the model has been fittedbh
AbstractMatrix: baseline hazard estimatesSignatures
mutable struct PHModel{G<:LSurvivalResp,L<:AbstractLSurvivalParms} <: AbstractPH
R::G # Survival response
P::L # parameters
ties::String #"efron" or"breslow"
fit::Bool
bh::AbstractMatrix
end
PHModel(
R::G,
P::L,
ties::String,
fit::Bool,
) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
PHModel(R::G, P::L, ties::String) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
PHModel(R::G, P::L) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
Methods: fit, coef, confint, std_err, show
Example
using LSurvival
using Random
import LSurvival: _stepcox!, dgm_comprisk
z,x,t,d, event,wt = dgm_comprisk(MersenneTwister(1212), 100);
enter = zeros(length(t));
X = hcat(x,z);
R = LSurvivalResp(enter, t, Int.(d), wt)
P = PHParms(X)
mf = PHModel(R,P)
LSurvival._fit!(mf)
LSurvival.PHSurv
— TypeMutable type for proportional hazards models (not generally needed by users)
PHSsurv: Object type for proportional hazards regression
surv::Vector{Float64} risk::Matrix{Float64} basehaz::Vector{Float64} event::Vector{Float64}
fitlist
: vector of PHSurv objects (Cox model fits)eventtypes
: vector of unique event typestimes
: unique event timessurv
: Overall survival at each timerisk
: Cause-specific risk at each time (1 for each outcome type)basehaz
: baseline hazard for a specific event typeevent
: value of event type that occurred at each time
Methods: fit, show
mutable struct PHSurv{G<:Array{T} where {T<:PHModel}} <: AbstractNPSurv
fitlist::G
eventtypes::AbstractVector
times::AbstractVector
surv::Vector{Float64}
risk::Matrix{Float64}
basehaz::Vector{Float64}
event::Vector{Float64}
end
PHSurv(fitlist::Array{T}, eventtypes) where {T<:PHModel}
PHSurv(fitlist::Array{T}) where {T<:PHModel}
LSurvival.Strata
— TypeType for identifying individuals in survival outcomes. Used for the strata argument in PHModel (not yet implemented)
Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.
[Strata(i) for i in 1:10]
LSurvival._update_PHParms!
— MethodUpdate the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Breslow's or Efron's partial likelihood.
Updates over all observations
Signature
_update_PHParms!(
m::M,
# big indexes
ne::I,
caseidxs::Vector{Vector{T}},
risksetidxs::Vector{Vector{T}},
) where {M<:AbstractPH,I<:Int,T<:Int}
updatePHParms!(m, risksetidxs, caseidxs, ne, den)
LSurvival.aalen_johansen
— MethodAalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0)
; vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]
; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001
; absolute tolerance for defining tied event timeskeepy = true
; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001
; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)
or, equivalently:
aalen_johansen(enter, t, event, wts=wt)
LSurvival.bootstrap
— MethodBootstrapping coefficients of a proportional hazards model
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)
Returns:
- If using
bootstrap(m)
: a single bootstrap draw - If using
bootstrap(m, 10)
(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing
Bootstrap Cox model coefficients
LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])
using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)
mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)
function stddev_finite(x)
n = length(x)
mnx = sum(x)/n
ret = sum((x .- mnx) .^ 2)
ret /= n-1
sqrt(ret)
end
# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)
LSurvival.bootstrap
— MethodBootstrap methods for Aalen-Johansen cumulative risk estimator
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::AJSurv)
bootstrap(m::AJSurv)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::AJSurv, iter::Int; kwargs...)
bootstrap(m::AJSurv, iter::Int; kwargs...)
Returns:
- If using
bootstrap(m)
: a single bootstrap draw - If using
bootstrap(m, 10)
(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival
using Random
z, x, t, d, event, wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 100)
id = 1:length(x)
enter = zeros(length(t))
aj1 = aalen_johansen(enter, t, event, id=ID.(id), wts=wt)
aj2 = bootstrap(aj1, keepy=false);
ajboot = bootstrap(aj1, 10, keepy=false);
aj1
aj1.R
aj2.R
LSurvival.bootstrap
— MethodBootstrap methods for Kaplan-Meier survival curve estimator
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::KMSurv)
bootstrap(m::KMSurv)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::KMSurv, iter::Int; kwargs...)
bootstrap(m::KMSurv, iter::Int; kwargs...)
Returns:
- If using
bootstrap(m)
: a single bootstrap draw - If using
bootstrap(m, 10)
(e.g.): 10 bootstrap draws of the survival probability at the end of follow up
using LSurvival
using Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
wts = rand(length(d))
km1 = kaplan_meier(int, outt, d, id=ID.(id), wts=wts)
km2 = bootstrap(km1, keepy=false)
km3 = bootstrap(km1, 10, keepy=false)
km1
km1.R
km2.R
LSurvival.bootstrap
— MethodBootstrapping coefficients of a proportional hazards model
Signatures
# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)
Returns:
- If using
bootstrap(m)
: a single bootstrap draw - If using
bootstrap(m, 10)
(e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)
# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing
Bootstrap Cox model coefficients
LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])
using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)
mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)
function stddev_finite(x)
n = length(x)
mnx = sum(x)/n
ret = sum((x .- mnx) .^ 2)
ret /= n-1
sqrt(ret)
end
# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)
LSurvival.bootstrap
— MethodBootstrap sampling of a proportional hazards predictor object
using LSurvival, Random
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
P = PHParms(X)
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Mod = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B)
LSurvival.bootstrap
— MethodBootstrapping sampling of a competing risk survival response
Signatures
bootstrap(rng::MersenneTwister, R::T) where {T<:LSurvivalCompResp}
bootstrap(R::T) where {T<:LSurvivalCompResp}
z,x,t,d,event,weights =
LSurvival.dgm_comprisk(MersenneTwister(1212), 300)
enter = zeros(length(event))
# survival outcome:
R = LSurvivalCompResp(enter, t, event, weights, ID.(collect(1:length(t)))) # specification with ID only
bootstrap(R) # note that entire observations/clusters identified by id are kept
LSurvival.bootstrap
— MethodBootstrapping sampling of a survival response
id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)
d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))
# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id)) # specification with ID only
LSurvival.calcp
— MethodTwo-tailed p-value for a (null) standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
calcp(z) = 1.0 - SpecialFunctions.erf(abs(z) / sqrt(2))
calcp(1.96)
LSurvival.cdfchisq
— Methodquantile function for a chi-squared distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Chi-squared_distribution
Source code, example:
cdfchisq(df, x) = SpecialFunctions.gamma_inc(df / 2, x / 2, 0)[1]
cdfchisq(3, 3.45)
LSurvival.cdfnorm
— Methodquantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
Source code, example:
cdfnorm(z) = 0.5 * (1 + SpecialFunctions.erf(z / sqrt(2)))
cdfnorm(1.96)
LSurvival.coxph
— MethodFit method for AbstractPH objects (Cox models)
Keyword arguments (used here, and passed on to internal structs)
ties
"breslow" or "efron" (default)wts
observation weightsties
"breslow" or "efron" (default)offset
not currently used at allfitargs
arguments passed to other structs, which includeid
cluster or individual level ID (defaults to a unique value for each row of data) see note below on IDcontrasts
StatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
eps
(default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is <eps
.getbasehaz
(default: true): estimate baseline hazardstart
(default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default withcoxph.
keepx
(default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).keepy
(default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).bootstrap_sample
(default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).bootstrap_rng
(default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures
fit(::Type{M},
X::AbstractMatrix,#{<:FP},
enter::AbstractVector{<:Real},
exit::AbstractVector{<:Real},
y::Union{AbstractVector{<:Real},BitVector}
;
ties ="breslow",
wts::AbstractVector{<:Real} = similar(y, 0),
offset::AbstractVector{<:Real} = similar(y, 0),
fitargs...) where {M<:AbstractPH}
coxph(f::FormulaTerm, data; kwargs...)
coxph(X, enter, exit, y, args...; kwargs...)
using LSurvival, Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
X = hcat(x,rand(length(x)));
m = fit(PHModel, X, enter, t, d, ties="efron")
m2 = fit(PHModel, X, enter, t, d, ties="breslow")
coeftable(m)
Note on use of id
keyword
id
is not needed in person-period structure data for standard estimates or confidence intervals
using Random, LSurvival
id, int, outt, dat =
LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
data = (
int = int,
outt = outt,
d = dat[:,4] .== 1,
x = dat[:,1],
z = dat[:,2]
)
f = @formula(Surv(int, outt,d)~x+z)
coxph(f, data)
BUT, you must specify id
to get appropriate robust variance and some other statistics.
Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1)
bic(ft)
nobs(ft)
dof_residual(ft)
# lrtest is another one
stderror(ft) # model based
stderror(ft, type="robust") # robust standard error, based on dfbeta residuals
ft
# now using "clustered" data with multiple observations per individual
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
bic(ft2) # CORRECT
nobs(ft2) # CORRECT
dof_residual(ft2) # CORRECT
stderror(ft2) # model based (CORRECT)
stderror(ft2, type="robust") # robust standard error, based on `id` level dfbeta residuals (CORRECT)
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2 # CORRECT (compare to `ft` object)
NOTE THE FOLLOWING IS INCORRECT because the id
keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
bic(ft2w) # INCORRECT
nobs(ft2w) # INCORRECT
dof_residual(ft2w) # INCORRECT
stderror(ft2w) # model based (CORRECT)
stderror(ft2w, type="robust") # robust variance (INCORRECT)
ft2w # the coefficient table now shows incorrect confidence intervals and test statistics
LSurvival.ddloglik!
— MethodHessian contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
LSurvival.dgm
— MethodGenerating discrete survival data without competing risks
Signatures
dgm(rng::MersenneTwister, n::Int, maxT:Int; afun = int_0, yfun = yprob, lfun = lprob)
dgm(n::Int, maxT::Int; kwargs...)
Usage: dgm(rng, n, maxT;afun=int0, yfun=yprob, lfun=lprob) dgm(n, maxT;afun=int0, yfun=yprob, lfun=lprob)
Where afun, yfun, and lfun are all functions that take arguments v,l,a and output time-specific values of a, y, and l respectively Example:
expit(mu) = inv(1.0+exp(-mu))
function aprob(v,l,a)
expit(-1.0 + 3*v + 2*l)
end
function lprob(v,l,a)
expit(-3 + 2*v + 0*l + 0*a)
end
function yprob(v,l,a)
expit(-3 + 2*v + 0*l + 2*a)
end
# 10 individuals followed for up to 5 times
LSurvival.dgm(10, 5;afun=aprob, yfun=yprob, lfun=lprob)
LSurvival.dgm_comprisk
— MethodGenerating continuous survival data with competing risks
Signatures
dgm_comprisk(rng::MersenneTwister, n::Int)
dgm_comprisk(n::Int)
- rng = random number generator
- n = sample size
Example:
using LSurvival
# 100 individuals with two competing events
z,x,t,d,event,weights = LSurvival.dgm_comprisk(100)
LSurvival.dgm_phmodel
— MethodProportional hazards model from a Weibull distribution with scale parameter λ
dgm_phmodel(rng::MersenneTwister, n::Int;
λ=1.25,
β=[0.0, 0.0]
)
keyword parameters:
- λ: Weibull scale parameter
- β: vector of regression coefficients
rng = MersenneTwister()
X, t, d, _ = dgm_phmodel(2000; λ=1.25,β=[1.0, -0.5])
coxph(@formula(Surv(t0,t,d)~x+z), (t=t,t0=t.*0,d=d,x=X[:,1],z=X[:,2]))
LSurvival.dloglik!
— MethodGradient contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
LSurvival.dlpdf_gengamma
— Methodα=0.1 ρ =-1.2 κ=1.9 t = 2.0
exp((log(t) - α)exp(-ρ) - ρ) - exp(κ - ρ) (α - log(t))exp(κ - ρ) + (log(t) - α)exp((log(t) - α)exp(-ρ) - ρ) - 1 (log(t) - α)exp(κ - ρ) - exp(κ)SpecialFunctions.digamma(exp(κ))
LSurvival.dlsurv_gengamma
— Methodα=0.1 ρ =-1.2 κ=1.9 t = 2.0 dlsurv_gengamma(α, ρ, κ, t; fd = 1e-14)
LSurvival.jackknife
— MethodObtain jackknife (leave-one-out) estimates from a Aalen-Johansen risk curve (risk at end of follow-up) by refitting the model n times
Signatures
jackknife(m::M;kwargs...) where {M<:AJSurv}
LSurvival.jackknife
— MethodObtain jackknife (leave-one-out) estimates from a Kaplan-Meier survival curve (survival at end of follow-up) by refitting the model n times
Signatures
jackknife(m::M;kwargs...) where {M<:KMSurv}
using LSurvival, Random, StatsBase
dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])
dat1clust = (
id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)
m = kaplan_meier(dat1.time, dat1.status)
a = aalen_johansen(dat1.time, dat1.status)
mc = kaplan_meier(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
ac = aalen_johansen(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
jka = jackknife(a);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkc
LSurvival.jackknife
— MethodObtain jackknife (leave-one-out) estimates from a Cox model by refitting the model n times
using LSurvival, Random, StatsBase
id, int, outt, data =
LSurvival.dgm(MersenneTwister(112), 100, 10; afun = LSurvival.int_0)
data[:, 1] = round.(data[:, 1], digits = 3)
d, X = data[:, 4], data[:, 1:3]
wt = rand(length(d))
wt ./= (sum(wt) / length(wt))
m = coxph(X,int, outt,d, wts=wt, id=ID.(id))
jk = jackknife(m);
bs = bootstrap(MersenneTwister(12321), m, 1000);
N = nobs(m)
#comparing estimate with jackknife estimate with bootstrap mean
hcat(coef(m), mean(jk, dims=1)[1,:], mean(bs, dims=1)[1,:])
semb = stderror(m)
sebs = std(bs, dims=1)
sero = stderror(m, type="robust")
sejk = stderror(m, type="jackknife")
sejk_manual = std(jk, dims=1, corrected=false) .* sqrt(N-1)
sqrt.(diag(LSurvival.jackknife_vcov(m)))
hcat(semb, sebs[1,:], sejk, sejk_manual[1,:], sero)
dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])
dat1clust = (
id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)
m = coxph(@formula(Surv(time, status)~x),dat1)
mc = coxph(@formula(Surv(enter, exit, status)~x),dat1clust, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkc
LSurvival.kaplan_meier
— MethodKaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
or, equivalently:
kaplan_meier(enter, t, d, wts=wt)
LSurvival.lgh!
— Methoddat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0]) enter = zeros(length(dat1.time)) t = dat1.time d = dat1.status X = hcat(ones(length(dat1.x)), dat1.x) wt = ones(length(t))
dist = Exponential() P = PSParms(X[:,1:1], extraparms=length(dist)-1) P = PSParms(X, extraparms=length(dist)-1) P.B P.grad R = LSurvivalResp(dat1.time, dat1.status) # specification with ID only m = PSModel(R,P,dist)
λ=1 θ = rand(2) lgh!(m, θ) θ .+= inv(-m.P.hess) * m.P.grad * λ
lgh!(m, θ .+ [-.00, 0, 0]) lgh!(m, θ .+ [-.01, 0.05, 0.05])
m.P._LL
LSurvival.lgh_breslow!
— MethodUpdate the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Breslow's partial likelihood.
Updates over all observations
Signature
lgh_breslow!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}
LSurvival.lgh_efron!
— MethodUpdate the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).
Uses Efron's partial likelihood.
Updates over all observations
Signature
lgh_efron!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}
LSurvival.loglik
— MethodLog likelihood contribution for an observation in a parametric survival model
d = m.d
i = 1
enter = m.R.enter[i]
exit = m.R.exit[i]
y = m.R.y[i]
wts = m.R.wts[i]
x = m.P.X[i,:]
θ=[1,0,.4]
m.P._B
LSurvival.lpdf
— MethodLog-likelihood calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)
LSurvival.lpdf
— MethodLog probability distribution function: Exponential distribution
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf(d, t)
LSurvival.lpdf
— Methodlog probability distribution for generalized gamma regression
LSurvival.lpdf
— Methodlog probability distribution function, generalized gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf
— Methodlog probability distribution for Gamma regression
LSurvival.lpdf
— Methodlog probability distribution function, Gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf
— MethodLog likelihood calculation for Lognormal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf(d, vcat(θ,ρ), t, x)
LSurvival.lpdf
— Methodlog probability distribution function: Weibull distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lpdf
— MethodLog-likelihood calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf(d, vcat(θ,ρ), t, x)
LSurvival.lpdf
— MethodLog probability distribution function: Weibull distribution
location scale representation (Klein Moeschberger ch 12)
α=0.1 # location
ρ=-1.2 # log(scale)
time=2
lpdf(Weibull(α, ρ), time)
LSurvival.lpdf_gradient
— MethodGradient calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)
LSurvival.lpdf_gradient
— Methodlog probability distribution gradient for generalized gamma regression analytic gradient
LSurvival.lpdf_gradient
— Methodlog probability distribution gradient for Gamma regression analytic gradient
LSurvival.lpdf_gradient
— MethodGradient calculation for Lognormal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_gradient(d, vcat(θ,ρ), t, x)
LSurvival.lpdf_gradient
— MethodGradient calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
#α = dot(β,x)
d = Weibull()
lpdf_gradient(d, vcat(θ,ρ), t, x)
LSurvival.lpdf_hessian
— MethodHessian calculation for Exponential regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_hessian(d, θ, t, x)
LSurvival.lpdf_hessian
— MethodHessian calculation for Weibull distribution: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lpdf_hessian(d, t)
LSurvival.lpdf_hessian
— MethodHessian calculation for generalized gamma regression: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian
— MethodHessian calculation for generalized gamma distribution: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian
— MethodHessian calculation for Gamma regression: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian
— MethodHessian calculation for Gamma distribution: PDF
placeholder function: returns nothing
LSurvival.lpdf_hessian
— MethodHessian calculation for Log-normal regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_hessian(d, vcat(θ,ρ), t, x)
LSurvival.lpdf_hessian
— MethodHessian calculation for Log-normal distribution: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lpdf_hessian(d, t)
LSurvival.lpdf_hessian
— MethodHessian calculation for weibull regression: PDF
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf_hessian(d, vcat(θ,ρ), t, x)
LSurvival.lpdf_weibull
— Methodα = -1.2 ρ = 1.8 t = 4.3 z = (log(t) - α) * exp(-ρ) z - exp(z) - ρ - log(t)
LSurvival.lsurv
— MethodLog-likelihood calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv(d, θ, t, x)
LSurvival.lsurv
— MethodLog survival function: Exponential distribution
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv(d, t)
LSurvival.lsurv
— Methodlog survival distribution for generalized gamma regression
LSurvival.lsurv
— Methodlog probability distribution function, generalized gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv
— Methodlog survival distribution for Gamma regression
LSurvival.lsurv
— Methodlog probability distribution function, Gamma distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv
— MethodLog likelihood calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv(d, vcat(θ,ρ), t, x)
LSurvival.lsurv
— Methodlog probability distribution function: Weibull distribution
Location scale representation (Klein Moeschberger ch 12)
LSurvival.lsurv
— MethodLog-likelihood calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv(d, vcat(θ,ρ), t, x)
LSurvival.lsurv
— MethodLog survival distribution function: Weibull distribution
location, log(scale) representation (Klein Moeschberger ch 12)
α=0.1 # location
ρ=-1.2 # log(scale)
time=2
lsurv(Weibull(α, ρ), time)
LSurvival.lsurv_gradient
— MethodGradient calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_gradient(d, θ, t, x)
LSurvival.lsurv_gradient
— Methodlog survival distribution gradient for generalized gamma regression uses finite differences
LSurvival.lsurv_gradient
— Methodlog survival distribution gradient for Gamma regression uses finite differences
LSurvival.lsurv_gradient
— MethodGradient calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_gradient(d, vcat(θ,ρ), t, x)
LSurvival.lsurv_gradient
— MethodGradient calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_gradient(d, vcat(θ,ρ), t, x)
LSurvival.lsurv_hessian
— MethodHessian calculation for Exponential regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_hessian(d, θ, t, x)
LSurvival.lsurv_hessian
— MethodHessian calculation for Exponential distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lsurv_hessian(d, t)
LSurvival.lsurv_hessian
— MethodHessian calculation for generalized gamma regression: Survival
placeholder function: returns nothing
LSurvival.lsurv_hessian
— MethodHessian calculation for generalized gamma distribution: Survival
placeholder function: returns nothing
LSurvival.lsurv_hessian
— MethodHessian calculation for Gamma regression: Survival
placeholder function: returns nothing
LSurvival.lsurv_hessian
— MethodHessian calculation for Gamma distribution: Survival
placeholder function: returns nothing
LSurvival.lsurv_hessian
— MethodHessian calculation for Log-normal regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_hessian(d, vcat(θ,ρ), t, x)
LSurvival.lsurv_hessian
— MethodHessian calculation for Log-normal distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lsurv_hessian(d, t)
LSurvival.lsurv_hessian
— MethodHessian calculation for weibull regression: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_hessian(d, vcat(θ,ρ), t, x)
LSurvival.lsurv_hessian
— MethodHessian calculation for Weibull distribution: Survival
β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull(α, ρ)
lsurv_hessian(d, t)
LSurvival.qstdnorm
— Methodquantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution
Source code, example:
qstdnorm(p) = sqrt(2) * SpecialFunctions.erfinv(2.0 * p - 1.0)
qstdnorm(.975)
LSurvival.qweibull
— MethodQuantile function for the Weibull distribution
\[F(t) = egin{cases} 1 - e^{{-t/ρ}^{α}}&, t ≥ 0\ 0&, t < 0 nd{cases} Q(p) = ρ * (log(1/(1-p))^{1/α})\]
lightweight function used for simulation
Note that there is no checking that parameters α,ρ are positively bound, and p ∈ (0,1), and errors will be given if this is not the case
Signature:
qweibull(p::Real,α::Real,ρ::Real)
Source code, example:
qweibull(p, α, ρ) = ρ * ((-log1p(-p))^(1 / α))
# cross reference the approach in the Distributions package
quantile(Distributions.Weibull(.75, 1.1), .3)
LSurvival.qweibull(0.3, .75, 1.1)
LSurvival.randweibull
— MethodRandom draw from Weibull distribution
lightweight function used for simulation
Note that there is no checking that parameters α,ρ are positively bound, and errors will be given if this is not the case
Signatures:
randweibull(rng::MersenneTwister,α::Real,ρ::Real)
randweibull(α::Real,ρ::Real)
Source code, example:
randweibull(rng, α, ρ) = qweibull(rand(rng), α, ρ)
randweibull(α, ρ) = randweibull(MersenneTwister(), α, ρ)
# cross reference the approach in the Distributions package
rand(Distributions.Weibull(.75, 1.1))
randweibull(.75, 1.1)
LSurvival.risk_from_coxphmodels
— MethodSurvival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}
Optional keywords
coef_vectors
= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile
= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])
LSurvival.survivaldata
— MethodLoading example survival analysis datasets
using LSurvival, Plots # note Plots does not install by default
heartdata, heartmeta = survivaldata("heart")
ft = coxph(@formula(Surv(start, stop, event)~surgery), heartdata);
# plot baseline cumulative hazard
basehazplot(ft)
# plot Schoenfeld residuals
coxdx(ft)
RecipesBase.apply_recipe
— MethodRecipe for aalen-johansen risk curve
using Plots, LSurvival
res = z, x, outt, d, event, weights = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
c = fit(AJSurv, int, outt, event)
#risk2 = aalen_johansen(int, outt, event)
plot(c)
RecipesBase.apply_recipe
— MethodPlotting a kaplan meier curve
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)
plot(k)
RecipesBase.apply_recipe
— Methodfunction name(::Type{T}) where {T}
#https://stackoverflow.com/questions/70043313/get-simple-name-of-type-in-julia
isempty(T.parameters) ? T : T.name.wrapper
end
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = survreg(@formula(Surv(enter, exit, status)~x), dat2)
# density function
aftdist(fte, label="X=0")
aftdist!(fte, covlevels=[1.0, 2.0], color="red", label="X=1")
# Survival function
aftdist(fte, type="surv", label="X=0")
aftdist!(fte, type="surv", covlevels=[1.0, 2.0], color="red", label="X=1")
# hazard function
aftdist(fte, type="haz", label="X=0")
aftdist!(fte, type="haz", covlevels=[1.0, 2.0], color="red", label="X=1")
# Cumulative incidence/risk function
aftdist(fte, type="risk", label="X=0")
aftdist!(fte, type="risk", covlevels=[1.0, 2.0], color="red", label="X=1")
RecipesBase.apply_recipe
— MethodPlotting baseline hazard for a Cox model
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2, maxiter=0)
ftb = coxph(@formula(Surv(enter, exit, status)~x), dat2, ties="breslow", maxiter=0)
plot(fte, label="Efron")
plot!(ftb, label="Breslow")
RecipesBase.apply_recipe
— Method
using Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)
coxdx(fte)
RecipesBase.apply_recipe
— Methodusing Plots, LSurvival
dat2 = (
enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)
coxinfluence(fte, type="jackknife", par=1)
coxinfluence!(fte, type="dfbeta", color=:red, par=1)
RecipesBase.apply_recipe
— MethodPlotting baseline hazard for a Cox model
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)
lognlogplot(k)
RecipesBase.apply_recipe
— MethodPlotting LSurvivalResp
objects (outcomes in cox models, kaplan meier curves, parametric survival models)
Recipe for plotting time-to-event outcomes
using Plots, LSurvival
dat4 = (
id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
plot([[R.enter[i], R.exit[i]] for i in eachindex(R.enter)], [[i, i] for i in values(R.id)])
RecipesBase.apply_recipe
— MethodRecipe for cox-model based risk curves
using Plots, LSurvival, Random, StatsBase
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
X = hcat(z, x)
int = zeros(length(d)) # no late entry
ft1 = fit(PHModel, X, int, outt, d .* (event .== 1), wts=wts)
ft2 = fit(PHModel, X, int, outt, d .* (event .== 2), wts=wts)
c = risk_from_coxphmodels([ft1, ft2], pred_profile = mean(X, dims=1))
plot(c)
StatsAPI.confint
— MethodGreenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function
Signatures:
StatsBase.stderror(m::AJSurv)
StatsBase.confint(m:AJSurv; level=0.95, method="normal")
Keyword arguments
- method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)
StatsAPI.confint
— MethodGreenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve
Signatures:
StatsBase.stderror(m::KMSurv)
StatsBase.confint(m:KMSurv; level=0.95, method="normal")
Keyword arguments
method
:
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformation
StatsAPI.confint
— Methodusing LSurvival
dat1= (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time, status) ~ x),dat1, keepx=true)
# model-based variance
confint(ft)
# robust variance
confint(ft, type="robust")
for cluster confidence intervals
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id), keepx=true)
# model-based variance
confint(ft2)
# robust variance
confint(ft2, type="robust")
StatsAPI.fit!
— MethodKaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
or, equivalently:
kaplan_meier(enter, t, d, wts=wt)
Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0)
; vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]
; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001
; absolute tolerance for defining tied event timeskeepy = true
; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001
; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)
or, equivalently:
aalen_johansen(enter, t, event, wts=wt)
Survival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}
Optional keywords
coef_vectors
= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile
= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])
StatsAPI.fit
— MethodSurvival curve estimation using multiple cox models
Signatures
risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}
fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}
Optional keywords
coef_vectors
= nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]pred_profile
= nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]
using LSurvival
using Random
# event variable is coded 0[referent],1,2
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
nidx = findall(event .!= 1)
ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))
# risk at referent levels of `x` and `z`
risk_from_coxphmodels([ft1,ft2])
# risk at average levels of `x` and `z`
mnx = sum(x)/length(x)
mnz = sum(z)/length(z)
# equivalent
fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])
StatsAPI.fit
— MethodFit method for AbstractPH objects (Cox models)
Keyword arguments (used here, and passed on to internal structs)
ties
"breslow" or "efron" (default)wts
observation weightsties
"breslow" or "efron" (default)offset
not currently used at allfitargs
arguments passed to other structs, which includeid
cluster or individual level ID (defaults to a unique value for each row of data) see note below on IDcontrasts
StatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
eps
(default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is <eps
.getbasehaz
(default: true): estimate baseline hazardstart
(default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default withcoxph.
keepx
(default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).keepy
(default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).bootstrap_sample
(default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).bootstrap_rng
(default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures
fit(::Type{M},
X::AbstractMatrix,#{<:FP},
enter::AbstractVector{<:Real},
exit::AbstractVector{<:Real},
y::Union{AbstractVector{<:Real},BitVector}
;
ties ="breslow",
wts::AbstractVector{<:Real} = similar(y, 0),
offset::AbstractVector{<:Real} = similar(y, 0),
fitargs...) where {M<:AbstractPH}
coxph(f::FormulaTerm, data; kwargs...)
coxph(X, enter, exit, y, args...; kwargs...)
using LSurvival, Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
X = hcat(x,rand(length(x)));
m = fit(PHModel, X, enter, t, d, ties="efron")
m2 = fit(PHModel, X, enter, t, d, ties="breslow")
coeftable(m)
Note on use of id
keyword
id
is not needed in person-period structure data for standard estimates or confidence intervals
using Random, LSurvival
id, int, outt, dat =
LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
data = (
int = int,
outt = outt,
d = dat[:,4] .== 1,
x = dat[:,1],
z = dat[:,2]
)
f = @formula(Surv(int, outt,d)~x+z)
coxph(f, data)
BUT, you must specify id
to get appropriate robust variance and some other statistics.
Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1)
bic(ft)
nobs(ft)
dof_residual(ft)
# lrtest is another one
stderror(ft) # model based
stderror(ft, type="robust") # robust standard error, based on dfbeta residuals
ft
# now using "clustered" data with multiple observations per individual
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
# use the `id` parameter with the ID struct
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
bic(ft2) # CORRECT
nobs(ft2) # CORRECT
dof_residual(ft2) # CORRECT
stderror(ft2) # model based (CORRECT)
stderror(ft2, type="robust") # robust standard error, based on `id` level dfbeta residuals (CORRECT)
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2 # CORRECT (compare to `ft` object)
NOTE THE FOLLOWING IS INCORRECT because the id
keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
bic(ft2w) # INCORRECT
nobs(ft2w) # INCORRECT
dof_residual(ft2w) # INCORRECT
stderror(ft2w) # model based (CORRECT)
stderror(ft2w, type="robust") # robust variance (INCORRECT)
ft2w # the coefficient table now shows incorrect confidence intervals and test statistics
StatsAPI.fit
— MethodAalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
wts::Vector{<:Real} = similar(enter, 0)
; vector of case weights (or zero length vector) for each observationid::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]
; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)atol = 0.00000001
; absolute tolerance for defining tied event timeskeepy = true
; keep the outcome vector after fitting (may save memory with large datasets)eps = 0.00000001
; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
# event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)
or, equivalently:
aalen_johansen(enter, t, event, wts=wt)
StatsAPI.fit
— MethodKaplan-Meier estimator for cumulative conditional risk
Signatures
StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}
kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
; kwargs...)
Keyword arguments
- wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
- id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
- atol = 0.00000001; absolute tolerance for defining tied event times
- censval = 0; value of the outcome to be considered a censored event
- keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
- eps = 0.00000001; deprecated (replaced by atol)
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
or, equivalently:
kaplan_meier(enter, t, d, wts=wt)
StatsAPI.loglikelihood
— MethodMaximum log partial likelihood for a fitted AbstractPH
model Efron or Breslow (depending on the ties
` parameter)
StatsAPI.loglikelihood
— MethodMaximum log likelihood for a fitted PSModel
model
StatsAPI.nullloglikelihood
— MethodNull log-partial likelihood for a fitted AbstractPH
model Efron or Breslow (depending on the ties
` parameter)
Note: this is just the log partial likelihood at the initial values of the model, which default to 0. If initial values are non-null, then this function no longer validly returns the null log-partial likelihood.
StatsAPI.nullloglikelihood
— MethodNull log-partial likelihood for a fitted PSModel
model
Note: this is just the log partial likelihood at the initial values of the model, which default to 0. If initial values are non-null, then this function no longer validly returns the null log-partial likelihood.
StatsAPI.residuals
— Method####################################################################
Cox proportional hazards model residuals:
Signature
residuals(m::M; type = "martingale") where {M<:PHModel}
where type is one of
martingale
schoenfeld
score
dfbeta
jackknife
dfbetas
(scaled dfbeta)scaled_schoenfeld
orschoenfelds
(scaled Schoenfeld)Residuals from the residuals function are designed to exactly emulate those from the
survival
package in R. Currently, they are validated for single observation data (e.g. one data row per individual).####################################################################
Martingale residuals: Observed versus expected
# example from https://cran.r-project.org/web/packages/survival/vignettes/validate.pdf
# by Terry Therneau
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
# Nelson-Aalen type estimator for Breslow partial likelihood
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
residuals(ft, type="martingale")
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
# Fleming-Harrington type estimator for Efron partial likelihood
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron")
residuals(ft, type="martingale")
####################################################################
Score residuals: Per observation contribution to score function
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
S = residuals(ft, type="score")[:]
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron", maxiter=0)
S = residuals(ft, type="score")[:]
####################################################################
Schoenfeld residuals: Per time contribution to score function
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow", maxiter=0)
X = ft.P.X
M = residuals(ft, type="martingale")
S = residuals(ft, type="schoenfeld")[:]
Ss = residuals(ft, type="scaled_schoenfeld")[:]
####################################################################
dfbeta residuals: influence of individual observations on each parameter
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
residuals(ft, type="dfbeta")
# can also calculate from score residuals and Hessian matrix
L = residuals(ft, type="score") # n X p
H = ft.P._hess # p X p
dfbeta = L*inv(H)
robVar = dfbeta'dfbeta
sqrt(robVar)
using the id
keyword argument
see help for LSurvival.vcov for what happens when id
keyword is not used
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id), ties="breslow")
# note these are still on the observation level (not the id level)!
residuals(ft2, type="dfbeta")
# getting id level dfbeta residuals
dfbeta = residuals(ft2, type="dfbeta")
id = values(ft2.R.id)
D = reduce(vcat, [sum(dfbeta[findall(id .== i),:], dims=1) for i in unique(id)])
D'D
vcov(ft, type="robust")
vcov(ft2, type="robust")
####################################################################
jackknife residuals: influence of individual observations on each parameter using leave-one-out estimates
note there are other definitions of jackknife residuals See Chapter 7.1 of "Extending the Cox Model" by Therneau and Grambsch for an example of the type of jackknife residuals used here
Jackknife residuals $r_i$ for $i \in 1:n$ are given as the difference between the maximum partial likelihood estimate and the jackknife estimates for each observation
\[r_i = \hat\beta - \hat\beta_{(-i)}\]
where $\beta_{(-i)}$ is the maximum partial likelihood estimate of the log-hazard ratio vector obtained from a dataset in which observations belonging to individual i
are removed
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
#jackknife(ft)
residuals(ft, type="jackknife")
StatsAPI.stderror
— MethodGreenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function
Signatures:
StatsBase.stderror(m::AJSurv)
StatsBase.confint(m:AJSurv; level=0.95, method="normal")
Keyword arguments
- method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)
StatsAPI.stderror
— MethodGreenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve
Signatures:
StatsBase.stderror(m::KMSurv)
StatsBase.confint(m:KMSurv; level=0.95, method="normal")
Keyword arguments
method
:
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals
using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformation
StatsAPI.vcov
— MethodCovariance matrix for Cox proportional hazards models
Keyword arguments
type
nothing or "robust": determines whether model based or robust (dfbeta based) variance is returned.See ?residuals for info on
dfbeta
residuals
using LSurvival
dat1 = (
time = [1,1,6,6,8,9],
status = [1,0,1,1,0,1],
x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, id=ID.(collect(1:6)))
vcov(ft) # model based
vcov(ft, type="robust") # robust variance, based on dfbeta residuals
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft
cluster robust standard errors using the id
keyword argument
dat1clust= (
id = [1,2,3,3,4,4,5,5,6,6],
enter = [0,0,0,1,0,1,0,1,0,1],
exit = [1,1,1,6,1,6,1,8,1,9],
status = [1,0,0,1,0,1,0,0,0,1],
x = [1,1,1,1,0,0,0,0,0,0]
)
ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
vcov(ft2) # model based
vcov(ft2, type="robust") # robust variance, based on dfbeta residuals
stderror(ft2, type="robust") # robust variance, based on dfbeta residuals
confint(ft2, type="robust") # robust variance, based on dfbeta residuals
nobs(ft2) # id argument yields correct value of number of independent observations
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2
NOTE THE FOLLOWING IS INCORRECT because the id
keyword is omitted
ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
vcov(ft2w) # model based (CORRECT)
vcov(ft2w, type="robust") # robust variance (INCORRECT)
nobs(ft2w)
ft2w
Implementation details and further help
- Likelihood functions for time-to-event observations subject to left-truncation and right censoring
- Non-parametric survival/risk estimation: Kaplan-Meier and Aalen-Johansen
- Cox models
- Parametric survival/risk estimation with Weibull AFT models