Statistics – Survival analysis
After a question this summer, I think it is time to explain how survival analysis works in SPSS Statistics. A typical situation for survival analysis is when you want to measure the time until an event occurs; see the data file below. The red lines indicate which variables I need for the analysis:
In this example I would like to measure the time to the event “dead”. Time is measured in days, from the start of the study to the end of the study for each person.
Status is the variable that indicates two things: the event and the censored cases. A case is censored when a person drops out of the study, or ends the study, for a reason other than the event “dead”. You need this variable in the analysis because these persons have also been alive for a number of days, so that you don’t underestimate the survival time.
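To make the role of the censored cases concrete, here is a minimal sketch in plain Python of a Kaplan-Meier style product-limit calculation. It assumes the coding used in this post (1 = dead, 0 = censored). The function name `kaplan_meier` and the six records are made up for illustration; this is not SPSS's own code or output.

```python
def kaplan_meier(times, status, event_code=1):
    """Return rows of (time, at_risk, deaths, cumulative_survival), one per event time."""
    data = sorted(zip(times, status))
    at_risk = len(data)
    survival = 1.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Everyone who leaves the study at time t, whether dead or censored.
        tied = [s for (u, s) in data if u == t]
        deaths = sum(1 for s in tied if s == event_code)
        if deaths > 0:
            # Censored persons who are still in the study count in the denominator.
            survival *= 1 - deaths / at_risk
            rows.append((t, at_risk, deaths, survival))
        # Dead and censored persons both leave the risk set after time t.
        at_risk -= len(tied)
        i += len(tied)
    return rows

# Hypothetical data: days in study and status (1 = dead, 0 = censored).
days = [30, 45, 60, 80, 100, 120]
status = [1, 0, 1, 0, 0, 1]
table = kaplan_meier(days, status)
for row in table:
    print(row)
```

The person censored at day 45 still counts as "at risk" at day 30, which is how the censored cases contribute their days alive to the estimate.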
If you don’t have a status variable, you can as an alternative run a linear regression on the time variable, but that does not really measure the same thing.
Here is the command you choose from the “Analyze” menu (the add-on module Advanced Statistics is needed):
Then you fill in all of these fields. As you can see, you have to define the code for the event; if you don’t remember the codings, you can right-click on your “Status” variable to look them up. I will use code 1 (“dead”):
Then there are some subcommands. The first is “Compare Factor”, which you can use to test for differences in survival time between the groups of your factor (here the smoker groups):
The second subcommand is “Save”, which you can use to save extra risk scores for each person into your data file.
The third subcommand is “Options”, where you can add descriptive statistics, such as the mean time to “Death” (= time to event), and plots.
Here is the first descriptive result. You can see that the highest percentage of censored cases (people still alive) is in the group “non-smoker”, with 67% alive, compared to 46% in the ex smoker group:
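The percentages in a summary like this are plain counts per group. Here is a small Python sketch that shows the arithmetic. The records are hypothetical (not the data from this post), the label "smoker" for the third group is a guess, and `censoring_summary` is an illustrative helper rather than an SPSS function.

```python
from collections import defaultdict

def censoring_summary(groups, status, event_code=1):
    """Count total cases, events and censored cases for each group."""
    counts = defaultdict(lambda: {"n": 0, "events": 0})
    for g, s in zip(groups, status):
        counts[g]["n"] += 1
        if s == event_code:
            counts[g]["events"] += 1
    summary = {}
    for g, c in counts.items():
        censored = c["n"] - c["events"]
        summary[g] = {
            "n": c["n"],
            "events": c["events"],
            "censored": censored,
            "pct_censored": round(100 * censored / c["n"], 1),
        }
    return summary

# Hypothetical records; status 1 = dead, 0 = censored.
groups = ["non-smoker", "non-smoker", "non-smoker",
          "ex smoker", "ex smoker",
          "smoker", "smoker", "smoker", "smoker"]
status = [0, 0, 1, 1, 0, 1, 1, 0, 1]
summary = censoring_summary(groups, status)
for name, row in summary.items():
    print(name, row)
```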
You also get a very detailed list of the cumulative proportion surviving at each person’s time. So if you have been alive for 73 days, there is a 96.7% chance that you survive 34 more days, to 107 days from the start date.
Here are the descriptive statistics, such as mean and median. You can see the differences in time to event for these 3 groups. The highest mean survival time is for the non-smokers, but be aware that this group is small, only 21 people, and most of them are censored. That is probably why the median estimate in the right-hand part of the table is missing for this group.
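A median survival time is usually read off the survival curve as the first time at which the cumulative survival drops to 50% or below. If the curve never gets that low, for example because most cases are censored, there is no median to report. The Python sketch below illustrates this with two hypothetical groups; `km_curve` and `median_survival` are illustrative names, and the exact rule SPSS applies is assumed here, not confirmed.

```python
def km_curve(times, status, event_code=1):
    """Return [(time, cumulative_survival)] at each event time."""
    data = sorted(zip(times, status))
    at_risk = len(data)
    survival = 1.0
    curve = []
    for t in sorted({u for u, _ in data}):
        leaving = [s for (u, s) in data if u == t]
        deaths = sum(1 for s in leaving if s == event_code)
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(leaving)
    return curve

def median_survival(curve):
    """First time the curve is at or below 0.5, or None if it never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical group with many censored cases (status 0): no median.
heavy_censoring = km_curve([50, 70, 90, 110, 130, 150], [1, 0, 0, 1, 0, 0])
# Hypothetical group with mostly events (status 1): the median exists.
mostly_events = km_curve([20, 40, 60, 80, 100], [1, 1, 1, 0, 1])

median_a = median_survival(heavy_censoring)
median_b = median_survival(mostly_events)
print(median_a, median_b)
```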
I would like to know whether there are any significant differences in time to event between these 3 groups.
Because the non-smoker group is small, only 21 people, and the test formula takes the sample size into account, it will be harder to find a significant result for that group. The test between non-smokers and ex smokers gives a significance value of 0.071, which is very close to a significant difference between the 2 groups. But as the significance value is NOT lower than 0.05, we cannot show a significant difference in survival time between non-smokers and ex smokers. The other groups don’t differ from each other in survival time either.
Here is also a graph that shows survival for the 3 groups. The upper line (non-smoker) has the highest survival.
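The post does not say which test statistic was chosen. As an illustration only, here is a sketch of a two-group log-rank test, one common choice for comparing survival curves, in plain Python. The data are hypothetical, `logrank_test` is an illustrative name, and the p-value uses a chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(times_a, status_a, times_b, status_b, event_code=1):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    a = list(zip(times_a, status_a))
    b = list(zip(times_b, status_b))
    event_times = sorted({t for t, s in a + b if s == event_code})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Persons still in the study at time t (censored ones included).
        n_a = sum(1 for u, _ in a if u >= t)
        n_b = sum(1 for u, _ in b if u >= t)
        d_a = sum(1 for u, s in a if u == t and s == event_code)
        d_b = sum(1 for u, s in b if u == t and s == event_code)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    # Assumes variance > 0, i.e. both groups are at risk at some event time.
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical groups; status 1 = dead, 0 = censored.
group_a = ([50, 70, 90, 110, 130, 150], [1, 0, 0, 1, 0, 0])
group_b = ([20, 40, 60, 80, 100], [1, 1, 1, 0, 1])
chi2, p_value = logrank_test(*group_a, *group_b)
print(chi2, p_value)
```

With few persons per group the variance term is small, so the same gap between observed and expected events has to be larger before the p-value falls below 0.05.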