LSurvival

Survival analysis functions in Julia for time-to-event outcomes that can include:

Loss-to-follow-up/right censoring
Late entry/left truncation (not available in Survival.jl)
"Person-period" data structures (not available in Survival.jl)
Observation weights (not available in Survival.jl)
Competing risks (not available in Survival.jl)

Capabilities include estimators for

Kaplan-Meier non-parametric conditional risk functions
Aalen-Johansen non-parametric cause-specific unconditional risk functions
Cox proportional hazards model (Efron's or Breslow's methods for ties)
Parametric survival models

Convenience functions enable:

Non-parametric bootstrapping, cluster-bootstrapping, jackknife
Estimating baseline hazards from a Cox Model
Estimating cause-specific risk from an exhaustive set of Cox models for competing risk outcomes
Simple simulation of competing and non-competing survival events
Martingale, score, Schoenfeld, and dfbeta residuals
Cluster robust variance estimation (Cox models)

Plans to include:

Parametric survival models: more distributions
Stratification in Cox models
Parametric survival model diagnostics

The package has been tuned to follow the "survival" package from R in terms of specific estimators/results.

Report issues here

Installation

using Pkg; Pkg.add(url = "https://github.com/alexpkeil1/LSurvival.jl")

Quick examples

Single event type: Cox model and Kaplan-Meier curve

# generate some data under a discrete hazards model
using Random
id, int, out, data = LSurvival.dgm(MersenneTwister(1212), 100, 20)

data[:, 1] = round.(data[:, 1], digits = 3)
d, X = data[:, 4], data[:, 1:3]

tab = ( in = int, out = out, d=d, x=X[:,1], z1=X[:,2], z2=X[:,3]) 
wt = ones(length(d)) # weights of 1.0 just to demonstrate usage


coxph(@formula(Surv(in, out, d)~x+z1+z2), tab, ties = "efron", wts = wt)

Output:

Maximum partial likelihood estimates (alpha=0.05):
───────────────────────────────────────────────────────────
      ln(HR)    StdErr         LCI       UCI     Z  P(>|Z|)
───────────────────────────────────────────────────────────
x   1.12624   0.392651   0.356659   1.89582   2.87   0.0041
z1  0.434587  0.27122   -0.0969944  0.966168  1.60   0.1091
z2  1.70434   0.222878   1.26751    2.14117   7.65   <1e-13
───────────────────────────────────────────────────────────
Partial log-likelihood (null): -361.948
Partial log-likelihood (fitted): -332.063
LRT p-value (χ²=59.77, df=3): 6.5858e-13
Newton-Raphson iterations: 5

# can also be done if there is no late entry
coxph(@formula(Surv(out, d)~x+z1+z2), tab, ties = "efron", wts = wt)
# can also be done if there is no late entry and no right censoring (i.e. all times are failure times)
coxph(@formula(Surv(out)~x+z1+z2), tab, ties = "efron", wts = wt)

# Kaplan-Meier estimator of the cumulative risk/survival
kaplan_meier(int, outt, d)

Competing event analysis: Aalen-Johansen and Cox-model-based estimators of the cumulative risk/survival

# Aalen-Johansen estimator: marginal cause-specific risks
res_aj = aalen_johansen(enter, t, event; wts = wt);
res_aj

# Cox-model estimator: cause-specific risks at given levels of covariates
fit1 = fit(PHModel, X, enter, t, (event .== 1), ties = "efron",  wts = wt)
n2idx = findall(event .!= 1)
fit2 = fit(PHModel, X[n2idx,:], enter[n2idx], t[n2idx], (event[n2idx] .== 2), ties = "efron",  wts = wt[n2idx])

# risk at average levels of `x` and `z`
res_cph = risk_from_coxphmodels([fit1,fit2], coef_vectors=[coef(fit1), coef(fit2)], pred_profile=mean(X, dims=1))
# compare to Aalen-Johansen fit
res_aj


# this approach operates on left censored outcomes (which operate in the background in model fitting)
LSurvivalResp(enter, t, d, origintime=0)
LSurvivalCompResp(enter, t, event) # automatically infers origin


# can use the ID type to refer to units with multiple observations
id, int, outt, data = dgm(MersenneTwister(), 1000, 10; regimefun = int_0)
LSurvivalResp(int, outt, data[:,4], ID.(id))

Index of functions

LSurvival.AbstractLSurvivalParms
LSurvival.AbstractLSurvivalResp
LSurvival.AbstractNPSurv
LSurvival.AbstractPH
LSurvival.AbstractPSModel
LSurvival.AbstractSurvDist
LSurvival.ID
LSurvival.LSurvivalCompResp
LSurvival.LSurvivalResp
LSurvival.PHModel
LSurvival.PHSurv
LSurvival.Strata
LSurvival._update_PHParms!
LSurvival.aalen_johansen
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.bootstrap
LSurvival.calcp
LSurvival.cdfchisq
LSurvival.cdfnorm
LSurvival.coxph
LSurvival.cumhaz
LSurvival.ddloglik!
LSurvival.dgm
LSurvival.dgm_comprisk
LSurvival.dgm_phmodel
LSurvival.dloglik!
LSurvival.dlpdf_gengamma
LSurvival.dlsurv_gengamma
LSurvival.jackknife
LSurvival.jackknife
LSurvival.jackknife
LSurvival.kaplan_meier
LSurvival.lgh!
LSurvival.lgh_breslow!
LSurvival.lgh_efron!
LSurvival.loglik
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_gradient
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_hessian
LSurvival.lpdf_weibull
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_gradient
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.lsurv_hessian
LSurvival.qstdnorm
LSurvival.qweibull
LSurvival.randweibull
LSurvival.risk_from_coxphmodels
LSurvival.survivaldata
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
RecipesBase.apply_recipe
StatsAPI.confint
StatsAPI.confint
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit
StatsAPI.fit!
StatsAPI.loglikelihood
StatsAPI.loglikelihood
StatsAPI.nullloglikelihood
StatsAPI.nullloglikelihood
StatsAPI.predict
StatsAPI.residuals
StatsAPI.stderror
StatsAPI.stderror
StatsAPI.vcov

Function help

LSurvival.AbstractLSurvivalParms — Type

AbstractLsurvParms

Abstract type representing a model predictors and coefficient parameters

source

LSurvival.AbstractLSurvivalResp — Type

AbstractLsurvResp

Abstract type representing a model response vector

source

LSurvival.AbstractNPSurv — Type

Abstract type for non-parametric survival models, including Kaplan-Meier, Aalen Johansen, and Cox-model based estimates of survival using an Aalen-Johansen-like estimator

source

LSurvival.AbstractPH — Type

Abstract type for proportional hazards models

source

LSurvival.AbstractPSModel — Type

AbstractPS

Abstract type for parametric survival models

source

LSurvival.AbstractSurvDist — Type

AbstractSurvDist

Abstract type for parametric survival distributions

source

LSurvival.ID — Type

Type for identifying individuals in survival outcomes.

Used for the id argument in

Outcome types: LSurvivalResp, LSurvivalCompResp
Model types: PHModel, KMRisk, AJRisk

Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.

[ID(i) for i in 1:10]

source

LSurvival.LSurvivalCompResp — Type

Outcome type for competing risk survival outcomes subject to left truncation and right censoring (not generally needed for users)

Parameters

enter Time at observation start
exit Time at observation end
y event occurrence in observation
wts observation weights
eventtimes unique event times
origin origin on the time scale
id person level identifier (must be wrapped in ID() function)
eventtypes vector of unique event types
eventmatrix matrix of indicators on the observation level
Signatures:

 struct LSurvivalCompResp{
 E<:AbstractVector,
 X<:AbstractVector,
 Y<:AbstractVector,
 W<:AbstractVector,
 T<:Real,
 I<:AbstractLSurvivalID,
 V<:AbstractVector,
 M<:AbstractMatrix,
 } <: AbstractLSurvivalResp
 enter::E
 exit::X
 y::Y
 wts::W
 eventtimes::X
 origin::T
 id::Vector{I}
 eventtypes::V
 eventmatrix::M
 end

 LSurvivalCompResp(
 enter::E,
 exit::X,
 y::Y,
 wts::W,
 id::Vector{I}
 )

 LSurvivalCompResp(
 enter::E,
 exit::X,
 y::Y,
 id::Vector{I}
 )

 LSurvivalCompResp(
 enter::E,
 exit::X,
 y::Y,
 wts::W,
 )

 LSurvivalCompResp(
 enter::E,
 exit::X,
 y::Y,
 )

 LSurvivalCompResp(
  exit::X,
  y::Y,
  ) where {X<:Vector,Y<:Union{Vector{<:Real},BitVector}}

source

LSurvival.LSurvivalResp — Type

Outcome type for survival outcome subject to left truncation and right censoring.

Will not generally be needed by users

Parameters

enter: Time at observation start
exit: Time at observation end
y: event occurrence in observation
wts: observation weights
eventtimes: unique event times
origin: origin on the time scale
id: person level identifier (must be wrapped in ID() function)

 struct LSurvivalResp{
 E<:AbstractVector,
 X<:AbstractVector,
 Y<:AbstractVector,
 W<:AbstractVector,
 T<:Real,
 I<:AbstractLSurvivalID,
 } <: AbstractLSurvivalResp
 enter::E
 exit::X
 y::Y
 wts::W
 eventtimes::E
 origin::T
 id::Vector{I}
 end

 LSurvivalResp(
    enter::E,
    exit::X,
    y::Y,
    wts::W,
    id::Vector{I},
  ) where {
    E<:Vector,
    X<:Vector,
    Y<:Union{Vector{<:Real},BitVector},
    W<:Vector,
    I<:AbstractLSurvivalID,
}

 LSurvivalResp(
 enter::E,
 exit::X,
 y::Y,
 id::Vector{I},
 )

 LSurvivalResp(
  y::Vector{Y},
  wts::W,
  id::Vector{I},
  ) where {Y<:AbstractSurvTime,W<:Vector,I<:AbstractLSurvivalID}

 LSurvivalResp(
  enter::E,
  exit::X,
  y::Y,
  ) where {E<:Vector,X<:Vector,Y<:Union{Vector{<:Real},BitVector}}

 LSurvivalResp(exit::X, y::Y) where {X<:Vector,Y<:Vector}

Examples

  # no late entry
  LSurvivalResp([.5, .6], [1,0])

source

LSurvival.PHModel — Type

PHModel: Mutable object type for proportional hazards regression (not generally needed for users)

Parameters

R Survival response
P # parameters
ties String: "efron" or "breslow"
fit Bool: logical for whether the model has been fitted
bh AbstractMatrix: baseline hazard estimates
Signatures

 mutable struct PHModel{G<:LSurvivalResp,L<:AbstractLSurvivalParms} <: AbstractPH
 R::G        # Survival response
 P::L        # parameters
 ties::String #"efron" or"breslow"
 fit::Bool
 bh::AbstractMatrix
 end

 PHModel(
 R::G,
 P::L,
 ties::String,
 fit::Bool,
 ) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
 PHModel(R::G, P::L, ties::String) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}
 PHModel(R::G, P::L) where {G<:LSurvivalResp,L<:AbstractLSurvivalParms}

Methods: fit, coef, confint, std_err, show

Example

 using LSurvival
 using Random
 import LSurvival: _stepcox!, dgm_comprisk

 z,x,t,d, event,wt = dgm_comprisk(MersenneTwister(1212), 100);
 enter = zeros(length(t));
 X = hcat(x,z);
 R = LSurvivalResp(enter, t, Int.(d), wt)
 P = PHParms(X)
 mf = PHModel(R,P)
  LSurvival._fit!(mf)

source

LSurvival.PHSurv — Type

Mutable type for proportional hazards models (not generally needed by users)

PHSsurv: Object type for proportional hazards regression

surv::Vector{Float64} risk::Matrix{Float64} basehaz::Vector{Float64} event::Vector{Float64}

fitlist: vector of PHSurv objects (Cox model fits)
eventtypes: vector of unique event types
times: unique event times
surv: Overall survival at each time
risk: Cause-specific risk at each time (1 for each outcome type)
basehaz: baseline hazard for a specific event type
event: value of event type that occurred at each time

Methods: fit, show

mutable struct PHSurv{G<:Array{T} where {T<:PHModel}} <: AbstractNPSurv
fitlist::G        
eventtypes::AbstractVector
times::AbstractVector
surv::Vector{Float64}
risk::Matrix{Float64}
basehaz::Vector{Float64}
event::Vector{Float64}
end

PHSurv(fitlist::Array{T}, eventtypes) where {T<:PHModel}
PHSurv(fitlist::Array{T}) where {T<:PHModel}

source

LSurvival.Strata — Type

Type for identifying individuals in survival outcomes. Used for the strata argument in PHModel (not yet implemented)

Accepts any Number or String. There is no significance to having this particular struct, but it enables easier use of multiple dispatch.

[Strata(i) for i in 1:10]

source

LSurvival._update_PHParms! — Method

Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).

Uses Breslow's or Efron's partial likelihood.

Updates over all observations

Signature

_update_PHParms!(
 m::M,
 # big indexes
 ne::I,
 caseidxs::Vector{Vector{T}},
 risksetidxs::Vector{Vector{T}},
 ) where {M<:AbstractPH,I<:Int,T<:Int}

updatePHParms!(m, risksetidxs, caseidxs, ne, den)

source

LSurvival.aalen_johansen — Method

Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)

Signatures

 StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

 aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
   # event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)

or, equivalently:

aalen_johansen(enter, t, event, wts=wt)

source

LSurvival.bootstrap — Method

Bootstrapping coefficients of a proportional hazards model

Signatures

# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)

Returns:

If using bootstrap(m): a single bootstrap draw
If using bootstrap(m, 10) (e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up

using LSurvival, Random

id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)

d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))

# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id))    # specification with ID only
P = PHParms(X)

Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)


# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)

# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing

Bootstrap Cox model coefficients

LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])

using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)

mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)

function stddev_finite(x)
 n = length(x)
 mnx = sum(x)/n
 ret = sum((x .- mnx) .^ 2)
 ret /= n-1
 sqrt(ret)
end

# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)

source

LSurvival.bootstrap — Method

Bootstrap methods for Aalen-Johansen cumulative risk estimator

Signatures

 # single bootstrap draw, keeping the entire object
 bootstrap(rng::MersenneTwister, m::AJSurv)
 bootstrap(m::AJSurv)

 # muliple bootstrap draws, keeping only coefficient estimates
 bootstrap(rng::MersenneTwister, m::AJSurv, iter::Int; kwargs...)
 bootstrap(m::AJSurv, iter::Int; kwargs...)

Returns:

If using bootstrap(m): a single bootstrap draw
If using bootstrap(m, 10) (e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up

using LSurvival
using Random

z, x, t, d, event, wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 100)
id = 1:length(x)
enter = zeros(length(t))

aj1 = aalen_johansen(enter, t, event, id=ID.(id), wts=wt)
aj2 = bootstrap(aj1, keepy=false);
ajboot = bootstrap(aj1, 10, keepy=false);
aj1


aj1.R
aj2.R

source

LSurvival.bootstrap — Method

Bootstrap methods for Kaplan-Meier survival curve estimator

Signatures

 # single bootstrap draw, keeping the entire object
 bootstrap(rng::MersenneTwister, m::KMSurv)
 bootstrap(m::KMSurv)

 # muliple bootstrap draws, keeping only coefficient estimates
 bootstrap(rng::MersenneTwister, m::KMSurv, iter::Int; kwargs...)
 bootstrap(m::KMSurv, iter::Int; kwargs...)

Returns:

If using bootstrap(m): a single bootstrap draw
If using bootstrap(m, 10) (e.g.): 10 bootstrap draws of the survival probability at the end of follow up

using LSurvival
using Random

id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)

d, X = data[:, 4], data[:, 1:3]
wts = rand(length(d))

km1 = kaplan_meier(int, outt, d, id=ID.(id), wts=wts)
km2 = bootstrap(km1, keepy=false)
km3 = bootstrap(km1, 10, keepy=false)
km1

km1.R
km2.R

source

LSurvival.bootstrap — Method

Bootstrapping coefficients of a proportional hazards model

Signatures

# single bootstrap draw, keeping the entire object
bootstrap(rng::MersenneTwister, m::PHModel)
bootstrap(m::PHModel)
# muliple bootstrap draws, keeping only coefficient estimates
bootstrap(rng::MersenneTwister, m::PHModel, iter::Int; kwargs...)
bootstrap(m::PHModel, iter::Int; kwargs...)

Returns:

If using bootstrap(m): a single bootstrap draw
If using bootstrap(m, 10) (e.g.): 10 bootstrap draws of the cumulative cause-specific risks at the end of follow up

using LSurvival, Random

id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 500, 5; afun = LSurvival.int_0)

d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))

# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id))    # specification with ID only
P = PHParms(X)

Mod = PHModel(R, P)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)


# careful propogation of bootstrap sampling
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)
Modb = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B, keepx=true, keepy=true)

# convenience function for bootstrapping a model
Modc = bootstrap(Mod)
LSurvival._fit!(Modc, start=Modc.P._B);
Modc
Modc.P.X == nothing
Modc.R == nothing

Bootstrap Cox model coefficients

LSurvival._fit!(mb, keepx=true, keepy=true, start=[0.0, 0.0])

using LSurvival, Random
res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 200)
int = zeros(length(d)) # no late entry
X = hcat(z, x)

mainfit = fit(PHModel, X, int, outt, d .* (event .== 1), keepx=true, keepy=true)

function stddev_finite(x)
 n = length(x)
 mnx = sum(x)/n
 ret = sum((x .- mnx) .^ 2)
 ret /= n-1
 sqrt(ret)
end

# bootstrap standard error versus asymptotic
mb = bootstrap(MersenneTwister(123123), mainfit, 200)
## bootstrap standard error
[stddev_finite(mb[:,i]) for i in 1:2]
## asymptotic standard error
stderror(mainfit)

source

LSurvival.bootstrap — Method

Bootstrap sampling of a proportional hazards predictor object

using LSurvival, Random

id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)

d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))

# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id))    # specification with ID only
P = PHParms(X)
idx, R2 = bootstrap(R)
P2 = bootstrap(idx, P)

Mod = PHModel(R2, P2)
LSurvival._fit!(Mod, start=Mod.P._B)

source

LSurvival.bootstrap — Method

Bootstrapping sampling of a competing risk survival response

Signatures

bootstrap(rng::MersenneTwister, R::T) where {T<:LSurvivalCompResp}
bootstrap(R::T) where {T<:LSurvivalCompResp}

z,x,t,d,event,weights =
LSurvival.dgm_comprisk(MersenneTwister(1212), 300)
enter = zeros(length(event))

# survival outcome:
R = LSurvivalCompResp(enter, t, event, weights, ID.(collect(1:length(t))))    # specification with ID only
bootstrap(R) # note that entire observations/clusters identified by id are kept

source

LSurvival.bootstrap — Method

Bootstrapping sampling of a survival response

id, int, outt, data =
LSurvival.dgm(MersenneTwister(1212), 20, 5; afun = LSurvival.int_0)

d, X = data[:, 4], data[:, 1:3]
weights = rand(length(d))

# survival outcome:
R = LSurvivalResp(int, outt, d, ID.(id))    # specification with ID only

source

LSurvival.calcp — Method

Two-tailed p-value for a (null) standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution

    calcp(z) = 1.0 - SpecialFunctions.erf(abs(z) / sqrt(2))

    calcp(1.96)

source

LSurvival.cdfchisq — Method

quantile function for a chi-squared distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Chi-squared_distribution

Source code, example:

    cdfchisq(df, x) = SpecialFunctions.gamma_inc(df / 2, x / 2, 0)[1]

    cdfchisq(3, 3.45)

source

LSurvival.cdfnorm — Method

quantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution

Source code, example:

    cdfnorm(z) = 0.5 * (1 + SpecialFunctions.erf(z / sqrt(2)))
    
    cdfnorm(1.96)

source

LSurvival.coxph — Method

Fit method for AbstractPH objects (Cox models)

Keyword arguments (used here, and passed on to internal structs)

ties "breslow" or "efron" (default)
wts observation weights
ties "breslow" or "efron" (default)
offset not currently used at all
fitargs arguments passed to other structs, which include
- id cluster or individual level ID (defaults to a unique value for each row of data) see note below on ID
- contrasts StatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
- eps (default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is < eps .
- getbasehaz (default: true): estimate baseline hazard
- start (default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default with coxph.
- keepx (default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).
- keepy (default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).
- bootstrap_sample (default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
- bootstrap_rng (default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures

  fit(::Type{M},
  X::AbstractMatrix,#{<:FP},
  enter::AbstractVector{<:Real},
  exit::AbstractVector{<:Real},
  y::Union{AbstractVector{<:Real},BitVector}
  ;
  ties ="breslow",
  wts::AbstractVector{<:Real}      = similar(y, 0),
  offset::AbstractVector{<:Real}   = similar(y, 0),
  fitargs...) where {M<:AbstractPH}

 coxph(f::FormulaTerm, data; kwargs...)

  coxph(X, enter, exit, y, args...; kwargs...)

   using LSurvival, Random
   z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
   enter = zeros(length(t));
   X = hcat(x,rand(length(x)));
    m = fit(PHModel, X, enter, t, d, ties="efron")
   m2 = fit(PHModel, X, enter, t, d, ties="breslow")
   coeftable(m)

Note on use of id keyword

id is not needed in person-period structure data for standard estimates or confidence intervals

  using Random, LSurvival
     id, int, outt, dat =
         LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
     data = (
             int = int,
             outt = outt,
             d = dat[:,4] .== 1,
             x = dat[:,1],
             z = dat[:,2]
     )

     f = @formula(Surv(int, outt,d)~x+z)
     coxph(f, data)

BUT, you must specify id to get appropriate robust variance and some other statistics.

Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.

 dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )
  ft = coxph(@formula(Surv(time,status)~x),dat1)
  bic(ft)
  nobs(ft)
  dof_residual(ft)
  # lrtest is another one

  stderror(ft)                     # model based
  stderror(ft, type="robust")   # robust standard error, based on dfbeta residuals
  ft

  # now using "clustered" data with multiple observations per individual
 dat1clust= (
     id = [1,2,3,3,4,4,5,5,6,6],
     enter = [0,0,0,1,0,1,0,1,0,1],
     exit = [1,1,1,6,1,6,1,8,1,9],
     status = [1,0,0,1,0,1,0,0,0,1],
     x = [1,1,1,1,0,0,0,0,0,0]
 )
 
 # use the `id` parameter with the ID struct
 ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
 bic(ft2)                       # CORRECT        
 nobs(ft2)                      # CORRECT
 dof_residual(ft2)              # CORRECT
  
 stderror(ft2)                  # model based (CORRECT)
 stderror(ft2, type="robust")   # robust standard error, based on `id` level dfbeta residuals (CORRECT)
 # once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
 ft2   # CORRECT (compare to `ft` object)

NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted

 ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
 bic(ft2w)                          # INCORRECT 
 nobs(ft2w)                         # INCORRECT
 dof_residual(ft2w)                 # INCORRECT

 stderror(ft2w)                     # model based (CORRECT)
 stderror(ft2w, type="robust")      # robust variance (INCORRECT)
 
 ft2w # the coefficient table now shows incorrect confidence intervals and test statistics

source

LSurvival.cumhaz — Method

Cumulative hazard functions for sets of covariate values.

i.e. for a cox model λ(t|X) = λ_0(t)exp(Xβ)
estimate of  Λ(t | X = newX)
 = Λ(t | X = newX)
 = ∑ λ(t | X = newX)
 = ∑ λ_0(t)exp(newXβ)

  where λ_0(t) is an estimate of the baseline hazard

This function does not accomodate late entry

    using LSurvival
    dt = (
       id = [1,2,3,3,4,4,5,5,6,6],
       enter = [0,0,0,1,0,3,0,3,0,3],
       exit = [1,1,1,6,1,6,1,8,1,9],
       status = [1,1,0,1,0,1,0,1,0,0],
       x = [1,1,1,1,0,0,0,0,0,0]
    )
   
    # use the `id` parameter with the ID struct
    ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dt, id=ID.(dt.id))
    ft3 = coxph(@formula(Surv(enter, exit, status) ~ x),dt)
    newids = ID.([1,2,3,4,5])
    newX = [1,1,1,0,0][:,:]
    entertime = [0,0,0,0,0]
    exittime = [8,8,9,10,9]

    try 
        cumhaz(ft2) # throws an error due to late entry in initial fit
    catch e
        println(e)
    end
    cumhaz(ft2,newX,entertime,exittime,newids) # warns about carrying forward covariate values
    cumhaz(ft2,newX) # no warning

source

LSurvival.ddloglik! — Method

Hessian contribution for an observation in a parametric survival model

    d = m.d
    i = 1
    enter = m.R.enter[i]
    exit = m.R.exit[i]
    y = m.R.y[i]
    wts = m.R.wts[i]
    x = m.P.X[i,:]
    θ=[1,0,.4]

source

LSurvival.dgm — Method

Generating discrete survival data without competing risks

Signatures

dgm(rng::MersenneTwister, n::Int, maxT:Int; afun = int_0, yfun = yprob, lfun = lprob)

dgm(n::Int, maxT::Int; kwargs...)

Usage: dgm(rng, n, maxT;afun=int0, yfun=yprob, lfun=lprob) dgm(n, maxT;afun=int0, yfun=yprob, lfun=lprob)

Where afun, yfun, and lfun are all functions that take arguments v,l,a and output time-specific values of a, y, and l respectively Example:


expit(mu) =  inv(1.0+exp(-mu))

function aprob(v,l,a)
expit(-1.0 + 3*v + 2*l)
end
  
function lprob(v,l,a)
expit(-3 + 2*v + 0*l + 0*a)
end
  
function yprob(v,l,a)
expit(-3 + 2*v + 0*l + 2*a)
end
  # 10 individuals followed for up to 5 times
LSurvival.dgm(10, 5;afun=aprob, yfun=yprob, lfun=lprob)

source

LSurvival.dgm_comprisk — Method

Generating continuous survival data with competing risks

Signatures

dgm_comprisk(rng::MersenneTwister, n::Int)

dgm_comprisk(n::Int)

    - rng = random number generator    
    - n = sample size

Example:

using LSurvival
# 100 individuals with two competing events
z,x,t,d,event,weights = LSurvival.dgm_comprisk(100)

source

LSurvival.dgm_phmodel — Method

Proportional hazards model from a Weibull distribution with scale parameter λ

dgm_phmodel(rng::MersenneTwister, n::Int; 
    λ=1.25,
    β=[0.0, 0.0]
    )

keyword parameters:

λ: Weibull scale parameter
β: vector of regression coefficients

rng = MersenneTwister()
X, t, d, _ = dgm_phmodel(2000; λ=1.25,β=[1.0, -0.5])
coxph(@formula(Surv(t0,t,d)~x+z), (t=t,t0=t.*0,d=d,x=X[:,1],z=X[:,2]))

source

LSurvival.dloglik! — Method

Gradient contribution for an observation in a parametric survival model

    d = m.d
    i = 1
    enter = m.R.enter[i]
    exit = m.R.exit[i]
    y = m.R.y[i]
    wts = m.R.wts[i]
    x = m.P.X[i,:]
    θ=[1,0,.4]

source

LSurvival.dlpdf_gengamma — Method

α=0.1 ρ =-1.2 κ=1.9 t = 2.0

exp((log(t) - α)exp(-ρ) - ρ) - exp(κ - ρ) (α - log(t))exp(κ - ρ) + (log(t) - α)exp((log(t) - α)exp(-ρ) - ρ) - 1 (log(t) - α)exp(κ - ρ) - exp(κ)SpecialFunctions.digamma(exp(κ))

source

LSurvival.dlsurv_gengamma — Method

α=0.1 ρ =-1.2 κ=1.9 t = 2.0 dlsurv_gengamma(α, ρ, κ, t; fd = 1e-14)

source

LSurvival.jackknife — Method

Obtain jackknife (leave-one-out) estimates from a Aalen-Johansen risk curve (risk at end of follow-up) by refitting the model n times

Signatures

jackknife(m::M;kwargs...) where {M<:AJSurv}

source

LSurvival.jackknife — Method

Obtain jackknife (leave-one-out) estimates from a Kaplan-Meier survival curve (survival at end of follow-up) by refitting the model n times

Signatures

jackknife(m::M;kwargs...) where {M<:KMSurv}

using LSurvival, Random, StatsBase

dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])

dat1clust = (
  id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
  enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
  exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
  status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
  x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)

m = kaplan_meier(dat1.time, dat1.status)
a = aalen_johansen(dat1.time, dat1.status)
mc = kaplan_meier(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
ac = aalen_johansen(dat1clust.enter, dat1clust.exit, dat1clust.status, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
jka = jackknife(a);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkc

source

LSurvival.jackknife — Method

Obtain jackknife (leave-one-out) estimates from a Cox model by refitting the model n times

using LSurvival, Random, StatsBase
id, int, outt, data =
LSurvival.dgm(MersenneTwister(112), 100, 10; afun = LSurvival.int_0)
data[:, 1] = round.(data[:, 1], digits = 3)
d, X = data[:, 4], data[:, 1:3]
wt = rand(length(d))
wt ./= (sum(wt) / length(wt))
m = coxph(X,int, outt,d, wts=wt, id=ID.(id))

jk = jackknife(m);
bs = bootstrap(MersenneTwister(12321), m, 1000);
N = nobs(m)
#comparing estimate with jackknife estimate with bootstrap mean
hcat(coef(m), mean(jk, dims=1)[1,:], mean(bs, dims=1)[1,:])
semb = stderror(m)
sebs = std(bs, dims=1)
sero = stderror(m, type="robust")
sejk = stderror(m, type="jackknife")
sejk_manual = std(jk, dims=1, corrected=false) .* sqrt(N-1)

sqrt.(diag(LSurvival.jackknife_vcov(m)))

hcat(semb, sebs[1,:], sejk, sejk_manual[1,:], sero)

dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0])
dat1clust = (
  id = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6],
  enter = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
  exit = [1, 1, 1, 6, 1, 6, 1, 8, 1, 9],
  status = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
  x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)

m = coxph(@formula(Surv(time, status)~x),dat1)
mc = coxph(@formula(Surv(enter, exit, status)~x),dat1clust, id=ID.(dat1clust.id))
jk = jackknife(m);
jkc = jackknife(mc);
bs = bootstrap(mc, 100);
std(bs[:,1])
stderror(m, type="jackknife")
stderror(mc, type="jackknife")
@assert jk == jkc

source

LSurvival.kaplan_meier — Method

Kaplan-Meier estimator for cumulative conditional risk

Signatures

StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
censval = 0; value of the outcome to be considered a censored event
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)

or, equivalently:

kaplan_meier(enter, t, d, wts=wt)

source

LSurvival.lgh! — Method

dat1 = (time = [1, 1, 6, 6, 8, 9], status = [1, 0, 1, 1, 0, 1], x = [1, 1, 1, 0, 0, 0]) enter = zeros(length(dat1.time)) t = dat1.time d = dat1.status X = hcat(ones(length(dat1.x)), dat1.x) wt = ones(length(t))

dist = Exponential() P = PSParms(X[:,1:1], extraparms=length(dist)-1) P = PSParms(X, extraparms=length(dist)-1) P.B P.grad R = LSurvivalResp(dat1.time, dat1.status) # specification with ID only m = PSModel(R,P,dist)

λ=1 θ = rand(2) lgh!(m, θ) θ .+= inv(-m.P.hess) * m.P.grad * λ

lgh!(m, θ .+ [-.00, 0, 0]) lgh!(m, θ .+ [-.01, 0.05, 0.05])

m.P._LL

source

LSurvival.lgh_breslow! — Method

Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).

Uses Breslow's partial likelihood.

Updates over all observations

Signature

lgh_breslow!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}

source

LSurvival.lgh_efron! — Method

Update the partial likelihood, gradient and Hessian values from a Cox model fit (used during fitting, not generally useful for users).

Uses Efron's partial likelihood.

Updates over all observations

Signature

lgh_efron!(m::M, j, caseidx, risksetidx) where {M<:AbstractPH}

source

LSurvival.loglik — Method

Log likelihood contribution for an observation in a parametric survival model

    d = m.d
    i = 1
    enter = m.R.enter[i]
    exit = m.R.exit[i]
    y = m.R.y[i]
    wts = m.R.wts[i]
    x = m.P.X[i,:]
    θ=[1,0,.4]
    
    m.P._B

source

LSurvival.lpdf — Method

Log-likelihood calculation for Exponential regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)

source

LSurvival.lpdf — Method

Log probability distribution function: Exponential distribution

    β = [-2, 1.2]
    x = [2,.1]
    ρ = -0.5
    t = 3.0
    α = dot(β,x)
    d = Exponential()
    lpdf(d, t)

source

LSurvival.lpdf — Method

log probability distribution for generalized gamma regression

source

LSurvival.lpdf — Method

log probability distribution function, generalized gamma distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lpdf — Method

log probability distribution for Gamma regression

source

LSurvival.lpdf — Method

log probability distribution function, Gamma distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lpdf — Method

Log likelihood calculation for Lognormal regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf — Method

log probability distribution function: Weibull distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lpdf — Method

Log-likelihood calculation for weibull regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf — Method

Log probability distribution function: Weibull distribution

location scale representation (Klein Moeschberger ch 12)

α=0.1   # location
ρ=-1.2  # log(scale)
time=2
lpdf(Weibull(α, ρ), time)

source

LSurvival.lpdf_gradient — Method

Gradient calculation for Exponential regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_gradient(d, θ, t, x)

source

LSurvival.lpdf_gradient — Method

log probability distribution gradient for generalized gamma regression analytic gradient

source

LSurvival.lpdf_gradient — Method

log probability distribution gradient for Gamma regression analytic gradient

source

LSurvival.lpdf_gradient — Method

Gradient calculation for Lognormal regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_gradient(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf_gradient — Method

Gradient calculation for weibull regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
#α = dot(β,x)
d = Weibull()
lpdf_gradient(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Exponential regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lpdf_hessian(d, θ, t, x)

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Weibull distribution: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lpdf_hessian(d, t)

source

LSurvival.lpdf_hessian — Method

Hessian calculation for generalized gamma regression: PDF

placeholder function: returns nothing

source

LSurvival.lpdf_hessian — Method

Hessian calculation for generalized gamma distribution: PDF

placeholder function: returns nothing

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Gamma regression: PDF

placeholder function: returns nothing

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Gamma distribution: PDF

placeholder function: returns nothing

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Log-normal regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lpdf_hessian(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf_hessian — Method

Hessian calculation for Log-normal distribution: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lpdf_hessian(d, t)

source

LSurvival.lpdf_hessian — Method

Hessian calculation for weibull regression: PDF

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lpdf_hessian(d, vcat(θ,ρ), t, x)

source

LSurvival.lpdf_weibull — Method

α = -1.2 ρ = 1.8 t = 4.3 z = (log(t) - α) * exp(-ρ) z - exp(z) - ρ - log(t)

source

LSurvival.lsurv — Method

Log-likelihood calculation for Exponential regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv(d, θ, t, x)

source

LSurvival.lsurv — Method

Log survival function: Exponential distribution

    β = [-2, 1.2]
    x = [2,.1]
    ρ = -0.5
    t = 3.0
    α = dot(β,x)
    d = Exponential()
    lsurv(d, t)

source

LSurvival.lsurv — Method

log survival distribution for generalized gamma regression

source

LSurvival.lsurv — Method

log probability distribution function, generalized gamma distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lsurv — Method

log survival distribution for Gamma regression

source

LSurvival.lsurv — Method

log probability distribution function, Gamma distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lsurv — Method

Log likelihood calculation for Log-normal regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv — Method

log probability distribution function: Weibull distribution

Location scale representation (Klein Moeschberger ch 12)

source

LSurvival.lsurv — Method

Log-likelihood calculation for weibull regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv — Method

Log survival distribution function: Weibull distribution

location, log(scale) representation (Klein Moeschberger ch 12)

α=0.1   # location
ρ=-1.2  # log(scale)
time=2
lsurv(Weibull(α, ρ), time)

source

LSurvival.lsurv_gradient — Method

Gradient calculation for Exponential regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_gradient(d, θ, t, x)

source

LSurvival.lsurv_gradient — Method

log survival distribution gradient for generalized gamma regression uses finite differences

source

LSurvival.lsurv_gradient — Method

log survival distribution gradient for Gamma regression uses finite differences

source

LSurvival.lsurv_gradient — Method

Gradient calculation for Log-normal regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_gradient(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv_gradient — Method

Gradient calculation for weibull regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_gradient(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Exponential regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential()
lsurv_hessian(d, θ, t, x)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Exponential distribution: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Exponential(α)
lsurv_hessian(d, t)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for generalized gamma regression: Survival

placeholder function: returns nothing

source

LSurvival.lsurv_hessian — Method

Hessian calculation for generalized gamma distribution: Survival

placeholder function: returns nothing

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Gamma regression: Survival

placeholder function: returns nothing

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Gamma distribution: Survival

placeholder function: returns nothing

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Log-normal regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal()
lsurv_hessian(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Log-normal distribution: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Lognormal(α, ρ)
lsurv_hessian(d, t)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for weibull regression: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull()
lsurv_hessian(d, vcat(θ,ρ), t, x)

source

LSurvival.lsurv_hessian — Method

Hessian calculation for Weibull distribution: Survival

β = [-2, 1.2]
x = [2,.1]
ρ = -0.5
t = 3.0
α = dot(β,x)
d = Weibull(α, ρ)
lsurv_hessian(d, t)

source

LSurvival.qstdnorm — Method

quantile function for a standard normal distribution depends on SpecialFunctions https://en.wikipedia.org/wiki/Normal_distribution

Source code, example:

    qstdnorm(p) = sqrt(2) * SpecialFunctions.erfinv(2.0 * p - 1.0)
    
    qstdnorm(.975)

source

LSurvival.qweibull — Method

Quantile function for the Weibull distribution

\[F(t) = egin{cases} 1 - e^{{-t/ρ}^{α}}&, t ≥ 0\ 0&, t < 0 nd{cases} Q(p) = ρ * (log(1/(1-p))^{1/α})\]

lightweight function used for simulation

Note that there is no checking that parameters α,ρ are positively bound, and p ∈ (0,1), and errors will be given if this is not the case

Signature:

qweibull(p::Real,α::Real,ρ::Real)

Source code, example:

qweibull(p, α, ρ) = ρ * ((-log1p(-p))^(1 / α))

# cross reference the approach in the Distributions package
quantile(Distributions.Weibull(.75, 1.1), .3)
LSurvival.qweibull(0.3, .75, 1.1)

source

LSurvival.randweibull — Method

Random draw from Weibull distribution

lightweight function used for simulation

Note that there is no checking that parameters α,ρ are positively bound, and errors will be given if this is not the case

Signatures:

randweibull(rng::MersenneTwister,α::Real,ρ::Real)
randweibull(α::Real,ρ::Real)

Source code, example:

randweibull(rng, α, ρ) = qweibull(rand(rng), α, ρ)
randweibull(α, ρ) = randweibull(MersenneTwister(), α, ρ)

# cross reference the approach in the Distributions package
rand(Distributions.Weibull(.75, 1.1))
randweibull(.75, 1.1)

source

LSurvival.risk_from_coxphmodels — Method

Survival curve estimation using multiple cox models

Signatures

  risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}

  fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}

Optional keywords

coef_vectors = nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]
pred_profile = nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]

 using LSurvival
 using Random
 # event variable is coded 0[referent],1,2
 z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
 enter = zeros(length(t));

 ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
 nidx = findall(event .!= 1)
 ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))

 # risk at referent levels of `x` and `z`
 risk_from_coxphmodels([ft1,ft2])

 # risk at average levels of `x` and `z`
 mnx = sum(x)/length(x)
 mnz = sum(z)/length(z)
 # equivalent
 fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
 risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])

source

LSurvival.survivaldata — Method

Loading example survival analysis datasets

using LSurvival, Plots # note Plots does not install by default
heartdata, heartmeta = survivaldata("heart")
ft = coxph(@formula(Surv(start, stop, event)~surgery), heartdata);
# plot baseline cumulative hazard
basehazplot(ft)
# plot Schoenfeld residuals
coxdx(ft)

source

RecipesBase.apply_recipe — Method

Recipe for aalen-johansen risk curve

    using Plots, LSurvival
    res = z, x, outt, d, event, weights = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
    int = zeros(length(d)) # no late entry
    
        c = fit(AJSurv, int, outt, event)
        #risk2 = aalen_johansen(int, outt, event)
        plot(c)

source

RecipesBase.apply_recipe — Method

Plotting a kaplan meier curve

    using Plots, LSurvival
dat4 = (
    id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
    enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
    exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
    status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
    z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
    k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)
    plot(k)

source

RecipesBase.apply_recipe — Method

function name(::Type{T}) where {T}
    #https://stackoverflow.com/questions/70043313/get-simple-name-of-type-in-julia
    isempty(T.parameters) ? T : T.name.wrapper
end

using Plots, LSurvival
dat2 = (
    enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
    exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
    status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = survreg(@formula(Surv(enter, exit, status)~x), dat2)

# density function
aftdist(fte, label="X=0")
aftdist!(fte, covlevels=[1.0, 2.0], color="red", label="X=1")

# Survival function
aftdist(fte, type="surv", label="X=0")
aftdist!(fte, type="surv", covlevels=[1.0, 2.0], color="red", label="X=1")

# hazard function
aftdist(fte, type="haz", label="X=0")
aftdist!(fte, type="haz", covlevels=[1.0, 2.0], color="red", label="X=1")

# Cumulative incidence/risk function
aftdist(fte, type="risk", label="X=0")
aftdist!(fte, type="risk", covlevels=[1.0, 2.0], color="red", label="X=1")

source

RecipesBase.apply_recipe — Method

Plotting baseline hazard for a Cox model

using Plots, LSurvival
dat2 = (
    enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
    exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
    status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2, maxiter=0)
ftb = coxph(@formula(Surv(enter, exit, status)~x), dat2, ties="breslow", maxiter=0)

plot(fte, label="Efron")
plot!(ftb, label="Breslow")

source

RecipesBase.apply_recipe — Method


using Plots, LSurvival
dat2 = (
    enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
    exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
    status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)

coxdx(fte)

source

RecipesBase.apply_recipe — Method

using Plots, LSurvival
dat2 = (
    enter = [1, 2, 5, 2, 1, 7, 3, 4, 8, 8],
    exit = [2, 3, 6, 7, 8, 9, 9, 9, 14, 17],
    status = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    x = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
)
fte = coxph(@formula(Surv(enter, exit, status)~x), dat2)

coxinfluence(fte, type="jackknife", par=1)
coxinfluence!(fte, type="dfbeta", color=:red, par=1)

source

RecipesBase.apply_recipe — Method

Plotting baseline hazard for a Cox model

using Plots, LSurvival
dat4 = (
    id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
    enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
    exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
    status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
    z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)

k = kaplan_meier(dat4.enter, dat4.exit, dat4.status)

lognlogplot(k)

source

RecipesBase.apply_recipe — Method

Plotting LSurvivalResp objects (outcomes in cox models, kaplan meier curves, parametric survival models)

Recipe for plotting time-to-event outcomes

using Plots, LSurvival

dat4 = (
    id = [1, 1, 2, 2, 2, 3, 4, 5, 5, 6],
    enter = [1, 2, 5, 4, 6, 7, 3, 6, 8, 0],
    exit = [2, 5, 6, 7, 8, 9, 6, 8, 14, 9],
    status = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    x = [0.1, 0.1, 1.5, 1.5, 1.5, 0, 0, 0, 0, 3],
    z = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    w = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
)
R = LSurvivalResp(dat4.enter, dat4.exit, dat4.status)
plot([[R.enter[i], R.exit[i]] for i in eachindex(R.enter)], [[i, i] for i in values(R.id)])

source

RecipesBase.apply_recipe — Method

Recipe for cox-model based risk curves

    using Plots, LSurvival, Random, StatsBase
    res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
    X = hcat(z, x)
    int = zeros(length(d)) # no late entry
    ft1 = fit(PHModel, X, int, outt, d .* (event .== 1), wts=wts)
    ft2 = fit(PHModel, X, int, outt, d .* (event .== 2), wts=wts)
    c = risk_from_coxphmodels([ft1, ft2], pred_profile = mean(X, dims=1))
    
    plot(c)

source

StatsAPI.confint — Method

Greenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function

Signatures:

StatsBase.stderror(m::AJSurv)

StatsBase.confint(m:AJSurv; level=0.95, method="normal")

Keyword arguments

method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals

res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)

source

StatsAPI.confint — Method

Greenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve

Signatures:

StatsBase.stderror(m::KMSurv)

StatsBase.confint(m:KMSurv; level=0.95, method="normal")

Keyword arguments

method:

"normal" normality-based confidence intervals
"lognlog" log(-log(S(t))) based confidence intervals

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformation

source

StatsAPI.fit! — Method

Kaplan-Meier estimator for cumulative conditional risk

Signatures

StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
censval = 0; value of the outcome to be considered a censored event
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)

or, equivalently:

kaplan_meier(enter, t, d, wts=wt)

Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)

Signatures

 StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

 aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
   # event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)

or, equivalently:

aalen_johansen(enter, t, event, wts=wt)

Survival curve estimation using multiple cox models

Signatures

  risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}

  fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}

Optional keywords

coef_vectors = nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]
pred_profile = nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]

 using LSurvival
 using Random
 # event variable is coded 0[referent],1,2
 z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
 enter = zeros(length(t));

 ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
 nidx = findall(event .!= 1)
 ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))

 # risk at referent levels of `x` and `z`
 risk_from_coxphmodels([ft1,ft2])

 # risk at average levels of `x` and `z`
 mnx = sum(x)/length(x)
 mnz = sum(z)/length(z)
 # equivalent
 fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
 risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])

source

StatsAPI.fit — Method

Survival curve estimation using multiple cox models

Signatures

  risk_from_coxphmodels(fitlist::Vector{T}, args...; kwargs...) where {T<:PHModel}

  fit(::Type{M}, fitlist::Vector{T}, ; fitargs...) where {M<:PHSurv,T<:PHModel}

Optional keywords

coef_vectors = nothing(default) or vector of coefficient vectors from the cox models [will default to the coefficients from fitlist models]
pred_profile = nothing(default) or vector of specific predictor values of the same length as the coef_vectors[1]

 using LSurvival
 using Random
 # event variable is coded 0[referent],1,2
 z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
 enter = zeros(length(t));

 ft1 = coxph(hcat(x,z), enter, t, (event .== 1))
 nidx = findall(event .!= 1)
 ft2 = coxph(hcat(x,z)[nidx,:], enter[nidx], t[nidx], (event[nidx] .== 2))

 # risk at referent levels of `x` and `z`
 risk_from_coxphmodels([ft1,ft2])

 # risk at average levels of `x` and `z`
 mnx = sum(x)/length(x)
 mnz = sum(z)/length(z)
 # equivalent
 fit(PHSurv, [ft1,ft2], pred_profile=[mnx,mnz])
 risk_from_coxphmodels([ft1,ft2], pred_profile=[mnx,mnz])

source

StatsAPI.fit — Method

Fit method for AbstractPH objects (Cox models)

Keyword arguments (used here, and passed on to internal structs)

ties "breslow" or "efron" (default)
wts observation weights
ties "breslow" or "efron" (default)
offset not currently used at all
fitargs arguments passed to other structs, which include
- id cluster or individual level ID (defaults to a unique value for each row of data) see note below on ID
- contrasts StatsModel style contrasts (dicts) that can be used for variable transformations/indicator variable creation (e.g. https://juliastats.org/StatsModels.jl/stable/contrasts/)
Arguments passed onto fitting routine:
- eps (default: Float64 = 1e-9) tolerance for declaring convergence. Model is determined to be converged when relative change in log-partial likelihood is < eps .
- getbasehaz (default: true): estimate baseline hazard
- start (default: nothing) nothing, or vector of floats corresponding to initial values for parameters. Note that this defaults to a vector of zeros when set to nothing, and setting to other values invalidates some of the test statistics reported by default with coxph.
- keepx (default: true) logical. Keep design matrix in AbstractPH object output (set to false for slight computational gain).
- keepy (default: true)logical. Keep outcome in AbstractPH object output (set to false for slight computational gain).
- bootstrap_sample (default: false) Fit the model to a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
- bootstrap_rng (default: Random.MersenneTwister()) Random number seed used when drawing a bootstrap sample of the data (not generally used by end-users, but provides some convenience toward bootstrap variance estimation).
Signatures

  fit(::Type{M},
  X::AbstractMatrix,#{<:FP},
  enter::AbstractVector{<:Real},
  exit::AbstractVector{<:Real},
  y::Union{AbstractVector{<:Real},BitVector}
  ;
  ties ="breslow",
  wts::AbstractVector{<:Real}      = similar(y, 0),
  offset::AbstractVector{<:Real}   = similar(y, 0),
  fitargs...) where {M<:AbstractPH}

 coxph(f::FormulaTerm, data; kwargs...)

  coxph(X, enter, exit, y, args...; kwargs...)

   using LSurvival, Random
   z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
   enter = zeros(length(t));
   X = hcat(x,rand(length(x)));
    m = fit(PHModel, X, enter, t, d, ties="efron")
   m2 = fit(PHModel, X, enter, t, d, ties="breslow")
   coeftable(m)

Note on use of id keyword

id is not needed in person-period structure data for standard estimates or confidence intervals

  using Random, LSurvival
     id, int, outt, dat =
         LSurvival.dgm(MersenneTwister(123123), 100, 100; afun = LSurvival.int_0)
     data = (
             int = int,
             outt = outt,
             d = dat[:,4] .== 1,
             x = dat[:,1],
             z = dat[:,2]
     )

     f = @formula(Surv(int, outt,d)~x+z)
     coxph(f, data)

BUT, you must specify id to get appropriate robust variance and some other statistics.

Here is an example where the same data are presented in two different ways, which should yield identical statistics when used in Cox model.

 dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )
  ft = coxph(@formula(Surv(time,status)~x),dat1)
  bic(ft)
  nobs(ft)
  dof_residual(ft)
  # lrtest is another one

  stderror(ft)                     # model based
  stderror(ft, type="robust")   # robust standard error, based on dfbeta residuals
  ft

  # now using "clustered" data with multiple observations per individual
 dat1clust= (
     id = [1,2,3,3,4,4,5,5,6,6],
     enter = [0,0,0,1,0,1,0,1,0,1],
     exit = [1,1,1,6,1,6,1,8,1,9],
     status = [1,0,0,1,0,1,0,0,0,1],
     x = [1,1,1,1,0,0,0,0,0,0]
 )
 
 # use the `id` parameter with the ID struct
 ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
 bic(ft2)                       # CORRECT        
 nobs(ft2)                      # CORRECT
 dof_residual(ft2)              # CORRECT
  
 stderror(ft2)                  # model based (CORRECT)
 stderror(ft2, type="robust")   # robust standard error, based on `id` level dfbeta residuals (CORRECT)
 # once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
 ft2   # CORRECT (compare to `ft` object)

NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted

 ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)
 bic(ft2w)                          # INCORRECT 
 nobs(ft2w)                         # INCORRECT
 dof_residual(ft2w)                 # INCORRECT

 stderror(ft2w)                     # model based (CORRECT)
 stderror(ft2w, type="robust")      # robust variance (INCORRECT)
 
 ft2w # the coefficient table now shows incorrect confidence intervals and test statistics

source

StatsAPI.fit — Method

Aalen-Johansen estimator for cumulative cause-specific risk (in the presence of competing events)

Signatures

 StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

 aalen_johansen(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
   # event variable is coded 0[referent],1,2
m = fit(AJSurv, enter, t, event)
mw = fit(AJSurv, enter, t, event, wts=wt)

or, equivalently:

aalen_johansen(enter, t, event, wts=wt)

source

StatsAPI.fit — Method

Kaplan-Meier estimator for cumulative conditional risk

Signatures

StatsBase.fit!(m::T; kwargs...) where {T<:AbstractNPSurv}

kaplan_meier(enter::AbstractVector, exit::AbstractVector, y::AbstractVector,
   ; kwargs...)

Keyword arguments

wts::Vector{<:Real} = similar(enter, 0); vector of case weights (or zero length vector) for each observation
id::Vector{<:AbstractLSurvivalID} = [ID(i) for i in eachindex(y)]; Vector of AbstractSurvID objects denoting observations that form a single unit (used in bootstrap and jackknife methods)
atol = 0.00000001; absolute tolerance for defining tied event times
censval = 0; value of the outcome to be considered a censored event
keepy = true; keep the outcome vector after fitting (may save memory with large datasets)
eps = 0.00000001; deprecated (replaced by atol)

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)

or, equivalently:

kaplan_meier(enter, t, d, wts=wt)

source

StatsAPI.loglikelihood — Method

Maximum log partial likelihood for a fitted AbstractPH model Efron or Breslow (depending on the ties` parameter)

source

StatsAPI.loglikelihood — Method

Maximum log likelihood for a fitted PSModel model

source

StatsAPI.nullloglikelihood — Method

Null log-partial likelihood for a fitted AbstractPH model Efron or Breslow (depending on the ties` parameter)

Note: this is just the log partial likelihood at the initial values of the model, which default to 0. If initial values are non-null, then this function no longer validly returns the null log-partial likelihood.

source

StatsAPI.nullloglikelihood — Method

Null log-partial likelihood for a fitted PSModel model

source

StatsAPI.predict — Method

Cumulative probability of an outcome by some discrete time t

i.e. for a cox model λ(t|X) = λ0(t)exp(Xβ) estimate of F(t | X=newX) = 1 - exp(-Λ(t | X = newX)) = 1- exp(-∑ λ(t | X = newX)) = 1- exp(-∑ λ0(t)exp(newXβ)) where λ_0(t) is an estimate of the baseline hazard

    using LSurvival
    dat1clust= (
       id = [1,2,3,3,4,4,5,5,6,6],
       enter = [0,0,0,1,0,1,0,1,0,1],
       exit = [1,1,1,6,1,6,1,8,1,9],
       status = [1,0,0,1,0,1,0,0,0,1],
       x = [1,1,1,1,0,0,0,0,0,0]
    )
   
    # use the `id` parameter with the ID struct
    ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))
    # use the `id` parameter with the ID struct
    newids = ID.([1,2,3,4,5])
    newX = [1,1,1,0,0][:,:]
    entertime = [0,0,0,0,0]
    exittime = [8,8,9,10,9]

    try 
        predict(ft2) # throws an error due to late entry in initial fit
    catch e
        println(e)
    end
    predict(ft2,newX,entertime,exittime,newids)[1] # warns about carrying forward covariate values
    preds, times = predict(ft2,newX) # no warning
    hcat(times, preds)

    # id specific cumulative hazard 

    # contrast with competing-risk methods from risk_from_coxphmodels,
    #  which don't perform as well in small samples (overall survival is identical for cheng method)
    risk_from_coxphmodels([ft2], pred_profile=[1][:,:], method="cheng")
    risk_from_coxphmodels([ft2], pred_profile=[1][:,:], method="aalen")

source

StatsAPI.residuals — Method

####################################################################

Cox proportional hazards model residuals:

Signature

  residuals(m::M; type = "martingale") where {M<:PHModel}

where type is one of

martingale
schoenfeld
score
dfbeta
jackknife
dfbetas (scaled dfbeta)
scaled_schoenfeld or schoenfelds (scaled Schoenfeld)
Residuals from the residuals function are designed to exactly emulate those from the survival package in R. Currently, they are validated for single observation data (e.g. one data row per individual).
####################################################################
Martingale residuals: Observed versus expected

  # example from https://cran.r-project.org/web/packages/survival/vignettes/validate.pdf
  # by Terry Therneau

  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )

  # Nelson-Aalen type estimator for Breslow partial likelihood
  ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
  residuals(ft, type="martingale")

  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )

  # Fleming-Harrington type estimator for Efron partial likelihood
  ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron")
  residuals(ft, type="martingale")

####################################################################

Score residuals: Per observation contribution to score function

  using LSurvival
  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )
  ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow")
  S = residuals(ft, type="score")[:]
  ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="efron", maxiter=0)
  S = residuals(ft, type="score")[:]

####################################################################

Schoenfeld residuals: Per time contribution to score function

  using LSurvival
  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )
  ft = coxph(@formula(Surv(time,status)~x),dat1, keepx=true, keepy=true, ties="breslow", maxiter=0)


  X = ft.P.X
  M = residuals(ft, type="martingale")
  S = residuals(ft, type="schoenfeld")[:]
  Ss = residuals(ft, type="scaled_schoenfeld")[:]

####################################################################

dfbeta residuals: influence of individual observations on each parameter

  using LSurvival
  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )

  ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
  residuals(ft, type="dfbeta")

  # can also calculate from score residuals and Hessian matrix
  L = residuals(ft, type="score") # n X p
  H = ft.P._hess   # p X p
  dfbeta = L*inv(H)
  robVar = dfbeta'dfbeta
  sqrt(robVar)

using the id keyword argument

see help for LSurvival.vcov for what happens when id keyword is not used

  dat1clust= (
    id = [1,2,3,3,4,4,5,5,6,6],
    enter = [0,0,0,1,0,1,0,1,0,1],
    exit = [1,1,1,6,1,6,1,8,1,9],
    status = [1,0,0,1,0,1,0,0,0,1],
    x = [1,1,1,1,0,0,0,0,0,0]
  )

  ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id), ties="breslow")

  # note these are still on the observation level (not the id level)! 
  residuals(ft2, type="dfbeta")

  # getting id level dfbeta residuals
  dfbeta = residuals(ft2, type="dfbeta")
  id = values(ft2.R.id)
  D = reduce(vcat, [sum(dfbeta[findall(id .== i),:], dims=1) for i in unique(id)])
  D'D
  vcov(ft, type="robust")
  vcov(ft2, type="robust")

####################################################################

jackknife residuals: influence of individual observations on each parameter using leave-one-out estimates

note there are other definitions of jackknife residuals See Chapter 7.1 of "Extending the Cox Model" by Therneau and Grambsch for an example of the type of jackknife residuals used here

Jackknife residuals $r_i$ for $i \in 1:n$ are given as the difference between the maximum partial likelihood estimate and the jackknife estimates for each observation

\[r_i = \hat\beta - \hat\beta_{(-i)}\]

where $\beta_{(-i)}$ is the maximum partial likelihood estimate of the log-hazard ratio vector obtained from a dataset in which observations belonging to individual i are removed

  using LSurvival
  dat1 = (
    time = [1,1,6,6,8,9],
    status = [1,0,1,1,0,1],
    x = [1,1,1,0,0,0]
  )

  ft = coxph(@formula(Surv(time,status)~x),dat1, ties="breslow")
  #jackknife(ft)
  residuals(ft, type="jackknife")

source

StatsAPI.stderror — Method

Greenwood's formula for variance and confidence intervals of a Aalen-Johansen risk function

Signatures:

StatsBase.stderror(m::AJSurv)

StatsBase.confint(m:AJSurv; level=0.95, method="normal")

Keyword arguments

method
- "normal" normality-based confidence intervals
- "lognlog" log(-log(S(t))) based confidence intervals

res = z, x, outt, d, event, wts = LSurvival.dgm_comprisk(MersenneTwister(123123), 100)
int = zeros(length(d)) # no late entry
m = fit(AJSurv, int, outt, event)
stderror(m)
confint(m, level=0.95)

source

StatsAPI.stderror — Method

Greenwood's formula for variance and confidence intervals of a Kaplan-Meier survival curve

Signatures:

StatsBase.stderror(m::KMSurv)

StatsBase.confint(m:KMSurv; level=0.95, method="normal")

Keyword arguments

method:

"normal" normality-based confidence intervals
"lognlog" log(-log(S(t))) based confidence intervals

using LSurvival
using Random
z,x,t,d, event,wt = LSurvival.dgm_comprisk(MersenneTwister(1212), 1000);
enter = zeros(length(t));
m = fit(KMSurv, enter, t, d)
mw = fit(KMSurv, enter, t, d, wts=wt)
stderror(m)
confint(m, method="normal")
confint(m, method="lognlog") # log-log transformation

source

StatsAPI.vcov — Method

Covariance matrix for Cox proportional hazards models

Keyword arguments

type nothing or "robust": determines whether model based or robust (dfbeta based) variance is returned.
See ?residuals for info on dfbeta residuals

using LSurvival
dat1 = (
  time = [1,1,6,6,8,9],
  status = [1,0,1,1,0,1],
  x = [1,1,1,0,0,0]
)
ft = coxph(@formula(Surv(time,status)~x),dat1, id=ID.(collect(1:6)))

vcov(ft)                   # model based
vcov(ft, type="robust")    # robust variance, based on dfbeta residuals
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft

cluster robust standard errors using the id keyword argument

dat1clust= (
  id = [1,2,3,3,4,4,5,5,6,6],
  enter = [0,0,0,1,0,1,0,1,0,1],
  exit = [1,1,1,6,1,6,1,8,1,9],
  status = [1,0,0,1,0,1,0,0,0,1],
  x = [1,1,1,1,0,0,0,0,0,0]
)

ft2 = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust, id=ID.(dat1clust.id))

vcov(ft2)                     # model based
vcov(ft2, type="robust")       # robust variance, based on dfbeta residuals
stderror(ft2, type="robust")   # robust variance, based on dfbeta residuals
confint(ft2, type="robust")    # robust variance, based on dfbeta residuals
nobs(ft2)                     # id argument yields correct value of number of independent observations
# once robust SE is calculated, coefficient table uses the robust SE for confidence intervals and test statistics
ft2

NOTE THE FOLLOWING IS INCORRECT because the id keyword is omitted

ft2w = coxph(@formula(Surv(enter, exit, status) ~ x),dat1clust)

vcov(ft2w)                   # model based (CORRECT)
vcov(ft2w, type="robust")    # robust variance (INCORRECT)
nobs(ft2w)

ft2w

source

Implementation details and further help