Risk factor analysis

To determine if each risk factor was associated with increased risk of a first fall, risk factors were assessed using bivariate logistic regression models stratified by place of residence (as risk factors may differ for those living in the community and those in residential aged care).

Risk factors found to have a significant association with an increased risk of falls were also assessed and reported in multivariable logistic regression models, again stratified by place of residence. These models produce adjusted odds ratios, which estimate the association of each risk factor with falls while accounting for other risk factors that may have confounding effects.

Further analysis was undertaken to determine if there were particular groups of risk factors which resulted in severe outcomes following a first hospitalised fall.

Logistic regression

All logistic regression was performed in SAS Enterprise Guide 8.3. The study population for this analysis was the total cohort outlined in Defining dementia and fall cohort: those with a dementia record but no fall record prior to 2019. The modelling outcome was a first hospitalised fall in 2019.

Bivariate logistic regression analysis

All risk factors identified from the literature and converted to binary (present/absent) variables were first assessed using bivariate logistic regression for a statistically significant association (that is, p < 0.05) with an increased likelihood of experiencing a fall in the study cohort. This analysis was run separately for individuals in residential aged care and in the community, with risk factors associated with a significantly increased likelihood of falls retained for further analysis.
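The screening step can be illustrated with a minimal Python sketch. It is not the SAS code used in the study. It relies on the fact that, for a single binary risk factor, the odds ratio from a logistic regression equals the cross-product ratio of the 2x2 table, and the Wald standard error of the log odds ratio is the square root of the summed reciprocal cell counts. The function name, the risk factor names and all counts are hypothetical, and SAS may report a different test statistic by default.

```python
import math


def bivariate_odds_ratio(a, b, c, d):
    """Odds ratio and two-sided Wald p-value for a 2x2 table.

    a: falls with the risk factor      b: no fall with the risk factor
    c: falls without the risk factor   d: no fall without the risk factor
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, p_value


# Hypothetical counts for illustration only (not study data)
tables = {
    "risk_factor_a": (40, 60, 20, 80),
    "risk_factor_b": (30, 70, 20, 80),
}

# Retain factors with a significantly increased likelihood of falls
retained = []
for name, counts in tables.items():
    odds_ratio, p_value = bivariate_odds_ratio(*counts)
    if odds_ratio > 1 and p_value < 0.05:
        retained.append(name)

print(retained)
```

In this sketch a factor is retained only if its odds ratio is above 1 and its p-value is below 0.05, mirroring the retention rule described above; the same screening would be run once per place of residence.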

Stratified multivariable logistic regression

Risk factors retained from the bivariate logistic regression were analysed using multivariable logistic regression models, which examined the association between each risk factor and fall likelihood while controlling for the other risk factors, as well as for age. These regression models were stratified by sex and by place of residence at reference date (residential aged care or community).
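To show how adjustment can change an odds ratio, the following Python sketch fits a logistic regression with two binary risk factors by Newton's method and compares the crude and adjusted odds ratios for the first factor. It is an illustration under stated assumptions, not the SAS procedure used in the study: the grouped counts are hypothetical, age is omitted for brevity, and the helper names `solve` and `fit_logistic` are invented for this example.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(groups, n_iter=25):
    """Maximum likelihood fit on grouped data: (covariates, events, total).

    Returns the coefficients, intercept first.
    """
    k = len(groups[0][0]) + 1
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, events, total in groups:
            xi = (1.0,) + tuple(x)
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for i in range(k):
                grad[i] += xi[i] * (events - total * p)
                for j in range(k):
                    hess[i][j] += total * p * (1 - p) * xi[i] * xi[j]
        # Newton step: beta <- beta + H^{-1} gradient
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Hypothetical grouped counts: ((factor_1, factor_2), falls, people)
groups = [
    ((0, 0), 20, 220),
    ((1, 0), 10, 60),
    ((0, 1), 15, 65),
    ((1, 1), 120, 320),
]

# Crude odds ratio for factor_1, ignoring factor_2
ev1 = sum(e for x, e, n in groups if x[0] == 1)
non1 = sum(n - e for x, e, n in groups if x[0] == 1)
ev0 = sum(e for x, e, n in groups if x[0] == 0)
non0 = sum(n - e for x, e, n in groups if x[0] == 0)
crude_or = (ev1 * non0) / (non1 * ev0)

# Adjusted odds ratio for factor_1, controlling for factor_2
beta = fit_logistic(groups)
adjusted_or = math.exp(beta[1])

print(round(crude_or, 2), round(adjusted_or, 2))
```

Because the two hypothetical factors tend to occur together, the crude odds ratio for the first factor overstates its association with falls; the adjusted estimate removes the part attributable to the second factor.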
 

Cluster analysis

What is cluster analysis?

Cluster analysis aims to uncover structure in data, such as underlying subpopulations, by assessing similarity across a set of characteristics. This similarity is measurable and can be used to group the data in such a way that those in a group are most similar to each other and least similar to those in other groups. In this study, people with dementia were grouped based on risk factors that were found to increase their likelihood of falling. Each cluster is formed by grouping together people with similar risk factor profiles.

Using the risk factors identified from bivariate logistic regression modelling, a probabilistic clustering method was applied to the aged care and community sub-cohorts separately, to identify groups of individuals with different risk factor profiles. All cluster analyses were performed in R (Version 4.1.3) using the FlexMix package (Leisch 2004). 

All risk factors for assessment were first coerced to a matrix as binary variables (1=risk factor present, 0=risk factor absent), with the de-identified personal identifier retained as the row name.
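A minimal Python sketch of this preparation step follows. The study performed it in R; the identifiers, risk factor names and record layout here are hypothetical and chosen only to show the shape of the resulting matrix.

```python
# Hypothetical column order for the binary matrix
risk_factor_names = ["factor_a", "factor_b", "factor_c"]

# Hypothetical input: one record per person, listing the risk factors present
records = [
    {"person_id": "P001", "risk_factors": {"factor_a", "factor_c"}},
    {"person_id": "P002", "risk_factors": set()},
    {"person_id": "P003", "risk_factors": {"factor_b"}},
]


def to_binary_matrix(records, names):
    """Map each de-identified ID to a row of 1 (present) / 0 (absent) flags."""
    return {
        rec["person_id"]: [1 if name in rec["risk_factors"] else 0 for name in names]
        for rec in records
    }


matrix = to_binary_matrix(records, risk_factor_names)
print(matrix["P001"])
```

Keying each row by the de-identified identifier plays the role of the row name, so cluster labels can later be matched back to the source records.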

This matrix was then subjected to probabilistic clustering (using Bernoulli mixture modelling methods) to allocate each individual to a cluster based on their combination of risk factors present. The cluster allocation was run 10 times, with the maximum likelihood solution retained. This number of repetitions was chosen to provide enough runs for stable output without exceeding computational capabilities.
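The general idea of Bernoulli mixture clustering with repeated runs can be sketched in Python as below. This is a simplified expectation-maximisation implementation for illustration, not the FlexMix code used in the study. The data, the random initialisation scheme, the fixed iteration count and the function names are all assumptions of this sketch.

```python
import math
import random


def e_step(rows, weights, theta):
    """Return cluster responsibilities per row and the total log-likelihood."""
    resp, loglik = [], 0.0
    for row in rows:
        log_joint = []
        for w, probs in zip(weights, theta):
            lp = math.log(max(w, 1e-12))
            for x, p in zip(row, probs):
                lp += math.log(p if x else 1 - p)
            log_joint.append(lp)
        top = max(log_joint)
        log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
        loglik += log_norm
        resp.append([math.exp(v - log_norm) for v in log_joint])
    return resp, loglik


def fit_bernoulli_mixture(rows, k, seed, n_iter=100):
    """One EM run from a random start; returns (log-likelihood, labels)."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    weights = [1 / k] * k
    theta = [[rng.uniform(0.25, 0.75) for _ in range(n_features)] for _ in range(k)]
    for _ in range(n_iter):
        resp, _ = e_step(rows, weights, theta)
        for c in range(k):
            total = sum(r[c] for r in resp)
            weights[c] = total / len(rows)
            for j in range(n_features):
                p = sum(r[c] * row[j] for r, row in zip(resp, rows)) / max(total, 1e-12)
                # Keep probabilities away from 0 and 1 so logs stay finite
                theta[c][j] = min(max(p, 1e-6), 1 - 1e-6)
    resp, loglik = e_step(rows, weights, theta)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return loglik, labels


def best_of_restarts(rows, k, n_restarts=10):
    """Run EM n_restarts times and keep the maximum likelihood solution."""
    fits = (fit_bernoulli_mixture(rows, k, seed) for seed in range(n_restarts))
    return max(fits, key=lambda fit: fit[0])


# Hypothetical binary risk factor rows forming two broad profiles
rows = (
    [[1, 1, 1, 0, 0, 0]] * 20 + [[1, 1, 0, 0, 0, 0]] * 5 + [[1, 1, 1, 1, 0, 0]] * 5
    + [[0, 0, 0, 1, 1, 1]] * 20 + [[0, 0, 0, 1, 1, 0]] * 5 + [[0, 1, 0, 1, 1, 1]] * 5
)

loglik, labels = best_of_restarts(rows, k=2)
print(labels[0] != labels[30])
```

Each run can settle on a different local optimum depending on its starting values, which is why several runs are compared and only the one with the highest log-likelihood is kept. Each individual is then assigned to the cluster with the highest posterior probability.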

The analyst specified the maximum number of clusters to be formed and then identified the optimum number of clusters. The cluster model was run with a maximum of 5, 6, 7 and 8 clusters, and the optimum number of clusters was found to be 5 for both the residential aged care and community populations. Details on selecting the optimum number are outlined in ‘How is the ‘right’ number of clusters chosen?’ below.

As de-identified personal identifiers were retained during the clustering process, the cluster group number assigned to each individual (1, 2, 3, 4 or 5) was then rejoined to the original dataset for further analysis of features and outcomes by cluster group.
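The rejoin can be pictured as a simple key-based merge. The Python sketch below is illustrative only: the identifiers, field names and cluster assignments are hypothetical, and the study performed this step in its own analysis environment.

```python
from collections import Counter

# Hypothetical cluster assignments keyed by de-identified ID
cluster_by_id = {"P001": 3, "P002": 1, "P003": 3}

# Hypothetical rows from the original dataset
original = [
    {"person_id": "P001", "residence": "community"},
    {"person_id": "P002", "residence": "residential aged care"},
    {"person_id": "P003", "residence": "community"},
]

# Attach the cluster number to each record; a missing ID raises KeyError
# rather than silently leaving a person unassigned
joined = [dict(rec, cluster=cluster_by_id[rec["person_id"]]) for rec in original]

# Example of a by-cluster summary: number of people per cluster
cluster_sizes = Counter(rec["cluster"] for rec in joined)
print(cluster_sizes[3])
```

Once the cluster number sits alongside the original variables, features and outcomes can be summarised by cluster group.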