Event History (Survival) Analysis
* data type
* censored cases
* "survival" function
* "hazard" function
* log-rank test for differences in survival functions
* Cox's regression
# Load libraries.
library(survival)
# Load the data. stanford2 is lazy-loaded with the survival package;
# this makes a local copy that we can modify.
stanford2 <- survival::stanford2
# We'll create two groups by splitting the cases at the median age:
# 1 = older than the median, 0 = at or below the median ("younger").
stanford2$group <- ifelse(stanford2$age > median(stanford2$age), 1, 0)
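# We can plot the estimated (Kaplan-Meier) survival function for both groups.
plot(survfit(Surv(time, status) ~ group, data=stanford2),
     main="Stanford Heart Transplant Data",
     ylab="Survival probability", xlab="Survival time", lty=1:2)
# Curves are drawn in stratum order: group=0 (younger) is solid, group=1 (older) is dashed.
legend("topright", legend=c("younger","older"), lty=1:2)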
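The plot can be backed up with numbers. The following is a minimal sketch, not part of the original exercise: the names d and fit_km are illustrative, and the time points assume that time is recorded in days.

```r
library(survival)

# Local copy of the data with the same median-age split as above.
d <- survival::stanford2
d$group <- ifelse(d$age > median(d$age), 1, 0)

# Fit the Kaplan-Meier estimator separately for each group.
fit_km <- survfit(Surv(time, status) ~ group, data = d)

# Printing the fit shows n, number of events, and median survival
# (with a confidence interval) for each group.
print(fit_km)

# Estimated survival probabilities at selected times
# (365 and 1000 are arbitrary choices, assuming time is in days).
summary(fit_km, times = c(365, 1000))
```

A group's median survival is reported as NA when its curve never drops to 0.5.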
# We can run the log-rank test for a difference in survival between the two age groups.
survdiff(Surv(time, status) ~ group, data=stanford2)
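The printed output of survdiff reports a chi-square statistic and a p-value. If the p-value is needed as a number, it can be recomputed from the stored statistic. This is a sketch with illustrative names (d, lr, p_value):

```r
library(survival)

# Local copy of the data with the same median-age split as above.
d <- survival::stanford2
d$group <- ifelse(d$age > median(d$age), 1, 0)

# Log-rank test comparing the two groups.
lr <- survdiff(Surv(time, status) ~ group, data = d)

# The statistic is compared with a chi-square distribution with
# (number of groups - 1) degrees of freedom.
df <- length(lr$n) - 1
p_value <- 1 - pchisq(lr$chisq, df = df)
p_value
```

A small p-value suggests that the survival functions of the two groups differ. The test itself does not say which group does better, so read it together with the plot.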
# We can use Cox's regression (proportional hazards) to test the effect of covariates
# on the hazard of the event. Here, we'll test for the effect of age.
coxph(Surv(time, status) ~ age, data=stanford2)
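To interpret the Cox model, it helps to look at the hazard ratio and to check the proportional hazards assumption. The following sketch uses standard functions from the survival package; the name fit_cox is illustrative.

```r
library(survival)

# Fit the Cox model with age as the only covariate.
fit_cox <- coxph(Surv(time, status) ~ age, data = survival::stanford2)

# Full summary: coefficient, hazard ratio (exp(coef)), and overall tests.
summary(fit_cox)

# Hazard ratio per one-unit increase in age, with a 95% confidence interval.
exp(coef(fit_cox))
exp(confint(fit_cox))

# Test of the proportional hazards assumption based on scaled
# Schoenfeld residuals; a small p-value signals a possible violation.
cox.zph(fit_cox)
```

A hazard ratio above 1 means the estimated hazard of the event rises with age, and a ratio below 1 means it falls.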
Group exercise:
Select one of the other data frames in the survival package and generate a plot of the survival function, perform a log-rank test, and, if possible, calculate Cox's regression model and interpret the results.
Use the following command to open the package index, which lists the data frames:
library(help="survival")
You can use the summary function to see what variables are in a data frame:
summary(dataframe) # where dataframe is the unquoted name of the data frame
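As a starting point for the exercise, here is a sketch of how a candidate data frame might be inspected. The lung data frame is used purely as an illustration; any other data frame in the package works the same way.

```r
library(survival)

# List only the data sets in the package (a narrower view than the full index).
data(package = "survival")

# Inspect one data frame: variable names and types, then per-variable summaries.
str(lung)
summary(lung)

# The help page explains each variable, including how the event/censoring
# indicator is coded, which matters when building Surv().
?lung
```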