Key messages

  • Routine surveillance of communicable disease involves the systematic collection, analysis and interpretation of data; however, this process is less efficient, less timely and less complete than data linkage. Although complex, data linkage for communicable disease monitoring is of high value as it provides a rich ‘person level’ source of information.

    For example, linked data can tell us if a person has been diagnosed with a communicable disease, been vaccinated, had a hospital admission, been provided a prescription for medication, or died.

  • Understanding the full impact of COVID-19 on individuals and the health system in Australia was difficult during the height of the pandemic due to the complex data sharing arrangements between the national and state and territory jurisdictions.
  • To assist with the COVID-19 vaccination rollout, data from the Australian Immunisation Register (AIR) were added to the Person Level Integrated Data Asset (PLIDA), which enabled surveillance of vaccine uptake among various population groups.
  • The AIHW has developed the COVID-19 Register – a linked data asset – which can be used to develop a deeper understanding of the health outcomes and health service use of COVID-19 cases, inform patient care decisions, and monitor health system needs over time. The Register demonstrates the data sharing pathways and technical processes for communicable disease monitoring that can be applied to future communicable diseases of interest.

Introduction

Australia has a long history of population-based data linkage. 

  • Efforts from as early as the 1970s in Western Australia led to the creation of the Western Australian Data Linkage System in 1995 which linked up to 40 years of data from over 30 collections for a historical population of 3.7 million (Holman et al. 2008). 
  • In the mid-1980s, the AIHW commenced the national linkage of cancer registry and death data from cancer registries in Australian states and territories; this is one of the first examples of cross-jurisdiction data linkage in Australia (Smith and Flack 2021). 
  • The mid-2000s saw the establishment of the Centre for Data Linkage as well as state and territory data linkage units, such as the SA NT DataLink Consortium, which are supported through a variety of university and non-government entities (Boyd et al. 2012; Boyle and Emery 2017; Smith and Flack 2021).

Over the years, these linkage systems have helped to build national and cross-jurisdictional linkage of de-identified data from a diverse and extensive range of health data sets to support population-level research. 

Data responses to COVID-19

COVID-19 emerged as a new disease in late 2019 and the World Health Organization declared it a pandemic in March 2020. As at May 2024, over 775 million cases and 7 million deaths have been confirmed globally (WHO 2024). In Australia, there have been almost 12 million confirmed or probable COVID-19 cases, and more than 22,000 people have died from or with COVID-19 since the start of the pandemic to 29 February 2024 (ABS 2024; Department of Health and Aged Care 2024).

In response to the pandemic, state and territory health departments linked their communicable disease registers with other data holdings to inform jurisdictional decisions about public health action including contact tracing and the location of testing clinics (PHRN 2021). This data linkage at the state and territory-level promoted awareness of patient flow and health system demands, and continues to be used to understand the long-term impacts of COVID-19.

The COVID-19 pandemic demonstrated the need for strengthened national data infrastructure in Australia, such as linkable communicable disease registers, to support evidence-based public health policy and service planning decisions. Through a disease register and data linkage, the health service use and health outcomes of individuals affected by a particular disease can be monitored to assist with current and future planning and decision-making. 

This article describes the development by the AIHW of the COVID-19 Register to understand health outcomes and health service use of people diagnosed with COVID-19. The discussion progresses through the following topics:

  • the current status of communicable disease data sources in Australia
  • Australia’s response to the COVID-19 pandemic
  • an introduction to linked data, including the COVID-19 Register
  • the current landscape on data sharing
  • preliminary data findings from the COVID-19 Register
  • future directions for communicable disease monitoring using linked data. 

Communicable disease data in Australia

Communicable diseases are illnesses that spread between people. Some are fairly mild in impact, such as the common cold; others are of particular concern because they can cause serious illness (Department of Health and Aged Care 2023c).

National Notifiable Diseases Surveillance System data

In Australia, healthcare professionals are legally required to report certain diseases, including COVID-19, to state or territory authorities. Every day, states and territories provide de-identified notification data about new cases of notifiable diseases to the Department of Health and Aged Care for inclusion in the National Notifiable Diseases Surveillance System (NNDSS) (Department of Health and Aged Care 2023d). The NNDSS is set up under the National Health Security Act 2007 (Cwlth) for the purposes of national communicable disease surveillance. Notifications, such as polymerase chain reaction (PCR) positive COVID-19 cases, come from various sources, including clinicians, hospitals and laboratories.

The NNDSS data are used to:

  • identify national disease trends and outbreaks
  • respond to potential outbreaks
  • support quarantine activities
  • allocate resources
  • meet international reporting requirements
  • track progress towards eradicating these diseases over time (Department of Health and Aged Care 2023d).

Other data for surveillance

States and territories use population-level communicable diseases data for surveillance purposes, including:

  • for rapid, real-time decision-making to manage outbreaks
  • in surveillance reports that outline the activity and severity of the communicable disease in the community.

Each state and territory has its own legislation, data management systems and outbreak responses. This diversity was reflected at the height of the COVID-19 pandemic in the often different rules among jurisdictions for isolation, quarantine and public health measures (such as mask mandates, and the testing and reporting requirements for rapid antigen tests (RATs) and PCR tests).

Additional complementary data, used together with communicable disease case information as part of national and jurisdictional surveillance activities, include:

  • waste-water surveillance of viral genetic material shed by infected asymptomatic and symptomatic individuals, which can be detected before clinical cases are identified (Department of Health, Western Australia 2023)
  • mortality data, and data from the AIR for vaccination uptake, the Pharmaceutical Benefits Scheme (PBS) (for subsidised medications dispensed), hospital admissions records, and the National Medical Stockpile.

Surveillance systems

A range of other agencies already collect and collate communicable disease surveillance data through systems that could be readily adapted to include surveillance of COVID-19 cases. Surveillance systems have been in place in hospitals and in primary health care services to monitor case severity and symptoms. Examples include:

  • FluCAN – a hospital-based system that collects influenza and COVID-19 hospital admission data
  • Australian Sentinel Practices Research Network – a sentinel general practice surveillance system that collects the number of influenza-like illness presentations seen in participating practices each week
  • FluTracking – an online syndromic surveillance system collecting community-level information on influenza-like illness that has since expanded to include COVID-19-like illness (Department of Health and Aged Care 2023a).

As well, new systems have been established where there were gaps:

  • AusTrakka – developed to provide a national genomics surveillance platform for SARS-CoV-2 (the virus that causes COVID-19), with all state and territory public health laboratories uploading genomic sequences for nationally aggregated genomics analysis (CDGN 2022)
  • Short Period Incidence Study of Severe Acute Respiratory Infection (SPRINT-SARI) – a hospital-based surveillance database that collects COVID-19 data from the majority of adult and paediatric Australian intensive care units (ANZIC-RC 2024)
  • seroprevalence surveys – provide estimates on the total number of people who have been infected with SARS-CoV-2, including those infections that might have been missed (APPRISE 2022). For more information see the glossary.

Response to the COVID-19 pandemic

Systematically collected, analysed and monitored data on COVID-19 were not readily available at the national level during the pandemic (Basseal et al. 2022).

Managing cross-jurisdictional outbreaks and painting a national picture on the impacts of COVID-19 on individuals and the health system was difficult during the height of the pandemic. The Australian Constitution does not authorise the Australian Government to legislate for public health responses and so any necessary action is at the state and territory government level. This situation presents challenges to data sharing, timely information exchange, and real-time analyses and evaluation (Basseal et al. 2022). Nonetheless, facing and resolving these challenges is vital for informed and effective decision making in a rapidly evolving public health emergency and beyond.

Responding to the need for national data, states and territories collaborated to provide daily aggregate COVID-19 case data to the National Incident Room (NIR) for the first 3.5 years of the pandemic. These efforts were the precursor to the addition of COVID-19 case data to the NNDSS. Data from the NIR and the NNDSS were useful for reporting national daily statistics on COVID-19 cases to the Australian public and for monitoring the impact of public health interventions and health service use.

Enhancing data sharing arrangements, standardised definitions and interoperability between data sources are imperative to improving Australia’s response to future pandemics (Basseal et al. 2022; Shergold et al. 2022).

Introducing linked data

Data linkage is the process of identifying, matching, and merging records that correspond to the same person or entity from several data sets, to create a new combined data set. Linked data provide a valuable person-level source of information for health monitoring, beyond that available through routine disease surveillance and single data sources.

  • For example, linked data can efficiently confirm if a person diagnosed with COVID-19 has had a hospital admission, been provided with a prescription for medication outside of hospital, been vaccinated, or died.

Some data are linked through the personal information from individuals within data sources (such as full name, date of birth, and address), which enables records to be matched and merged. This personal information is used only to link the data; it is not shared for any purpose. Other data can be linked through unique identifiers, with personal information not required.
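The matching and merging described above can be sketched in a few lines of Python. This is a minimal illustration only, not the AIHW's linkage method: it assumes exact matching on a hashed key built from name, date of birth and postcode, whereas real linkage often has to tolerate spelling variations and missing values. The function names, field names and sample records are all hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical set of fields treated as personal information.
PERSONAL_FIELDS = ("name", "dob", "postcode")


def linkage_key(full_name: str, date_of_birth: str, postcode: str) -> str:
    """Build a match key from normalised personal details (assumed scheme)."""
    normalised = "|".join(
        part.strip().lower() for part in (full_name, date_of_birth, postcode)
    )
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def link_records(*data_sets: list) -> list:
    """Merge records from several data sets that share the same match key."""
    merged = defaultdict(dict)
    for data_set in data_sets:
        for record in data_set:
            key = linkage_key(record["name"], record["dob"], record["postcode"])
            # Personal details are used only for matching; only content
            # fields are carried into the combined record.
            content = {
                field: value
                for field, value in record.items()
                if field not in PERSONAL_FIELDS
            }
            merged[key].update(content)
    return list(merged.values())


# Invented sample records, for illustration only.
notifications = [
    {"name": "Alex Citizen", "dob": "1980-01-31", "postcode": "2600",
     "diagnosis_date": "2022-01-10"},
]
admissions = [
    {"name": " alex citizen ", "dob": "1980-01-31", "postcode": "2600",
     "admitted": True},
    {"name": "Sam Example", "dob": "1975-06-02", "postcode": "3000",
     "admitted": True},
]

linked = link_records(notifications, admissions)
```

In this sketch the two records for the same person collapse into one combined record containing both the diagnosis date and the admission flag, while the unmatched admission remains a separate record. No personal details appear in the output.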

Described below are 3 examples of linked data assets for COVID-19 monitoring.

AIR-PLIDA

To examine COVID-19 vaccine uptake and the real-world effectiveness of COVID-19 vaccines, data from the Australian Immunisation Register (AIR) were added to the Australian Bureau of Statistics’ (ABS) PLIDA (formerly known as the Multi-Agency Data Integration Project (MADIP)). PLIDA combines information on health, education, government payments, income and taxation, employment, and population demographics over time (ABS n.d.).

This linked data asset, known as AIR-PLIDA, provided a source of surveillance during the vaccine rollout including vaccination rates by population characteristics such as occupation, cultural diversity, disability, and chronic health conditions (Department of Health and Aged Care 2023b; Welsh et al. 2023).

AIR-PLIDA data have shown that between July 2021 and January 2022, there were groups in the population with lower levels of first dose vaccine uptake than the national average (Biddle et al. 2022). These groups included:

  • Aboriginal and Torres Strait Islander (First Nations) people
  • people who speak a language other than English
  • non-citizens
  • people with a core activity need for assistance
  • people who were not employed
  • certain occupation groups (technicians, labourers and trade workers).

These data provide helpful guidance on which population groups to target for further vaccination programs, as well as insight into socioeconomic variables that predicted vaccine uptake, which were previously unavailable from the unlinked data.

AIR-PLIDA continues to be used to provide insights into the effectiveness of vaccination and to guide recommendations on COVID-19 vaccination boosters (Liu et al. 2023).

Aged care

From March 2022, the AIHW worked with the ABS to link aged care data with AIR-PLIDA. This linkage, together with provisional mortality data from the ABS, provided information on rates of vaccination, population characteristics and deaths among residential aged care residents. The vaccination data assisted with efforts to improve low vaccination rates. 

The AIHW also linked aged care data with Medicare data for the Department of Health and Aged Care, which improved responses to the needs of aged care services and recipients during the COVID-19 pandemic.

AIHW’s COVID-19 Register

In April 2022, the Medical Research Future Fund funded the AIHW to establish a national linked data platform that integrated relevant existing health data sets. This initiative was taken to strengthen evidence-based public health and health system planning and management for current and future pandemics. 

This linked data set, known as the COVID-19 Register, contains de-identified COVID-19 specific content data from the NNDSS linked to person level COVID-19 data received from state and territory disease registers. These data have then been linked to a range of administrative health data sets including:

  • medication dispensing through the PBS
  • health service use through the Medicare Benefits Schedule (MBS)
  • hospital admission (including intensive care)
  • aged care
  • deaths
  • vaccination through the AIR
  • data from the National Disability Insurance Scheme (AIHW 2023).

The COVID-19 Register is a proof-of-concept initiative to provide government and approved researchers with a unique and more complete picture of the issues and experiences of Australians who have been diagnosed with COVID-19. It includes people who have not been diagnosed with COVID-19 as a control group.

The COVID-19 Register has been developed to include COVID-19 case data up to December 2022, with updated health data linked as it becomes available. For more information on the linkage method and high-level linkage results, see COVID-19 Register: Linkage results.

Research outcomes of using the COVID-19 Register

Outcomes of research using the COVID-19 Register have broad application. For example, understanding the:

  • health outcomes after a COVID-19 diagnosis and the effectiveness of treatment options via linked data enables better targeting of treatments for newly diagnosed COVID-19 cases – including for specific population groups at higher risk of adverse health outcomes
  • health service use of people diagnosed with COVID-19 helps to guide resource planning to meet the current and future needs of not only people diagnosed with COVID-19 but also future communicable diseases of interest.

Linkage protocols

As with other data it holds, the AIHW adheres to strong data governance arrangements, including the internationally recognised Five Safes approach, which guides the assessment and management of risks associated with data sharing and release. The AIHW applies the Separation principle, which includes the physically separate storage of identifying and content information, and the use of virtual secure access environments to ensure users see only the identifying or the content information they are approved to see.

The AIHW data linkage protocols mean that, for the duration of a project, AIHW linkage staff do not have access to personal identifiers and analytical data at the same time. Strict data output vetting is also conducted to check that the data are confidentialised and suitable for release before they leave the secure access environment. For more information see: Five Safes Framework.
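The idea behind separating identifying and content information can be illustrated with a short Python sketch. It is an assumption-laden simplification rather than a description of AIHW systems: each record is split into two parts that share only a random project key, and the function name, field names and sample records are hypothetical.

```python
import secrets


def separate(records: list, identifying_fields: set) -> tuple:
    """Split each record into an identifier part and a content part.

    The two parts share only a random project key, so neither store on its
    own connects a named person to their health information.
    """
    identifier_store, content_store = [], []
    for record in records:
        # Random key that carries no personal meaning (hypothetical scheme).
        project_key = secrets.token_hex(8)
        identifiers = {f: v for f, v in record.items() if f in identifying_fields}
        content = {f: v for f, v in record.items() if f not in identifying_fields}
        identifier_store.append({"project_key": project_key, **identifiers})
        content_store.append({"project_key": project_key, **content})
    return identifier_store, content_store


# Invented sample records, for illustration only.
cases = [
    {"name": "Alex Citizen", "dob": "1980-01-31", "diagnosis_method": "PCR"},
    {"name": "Sam Example", "dob": "1975-06-02", "diagnosis_method": "RAT"},
]
id_store, content_store = separate(cases, identifying_fields={"name", "dob"})
```

In practice the two stores would be held in physically separate locations with different access permissions; the sketch shows only the split itself.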

Further development

During 2024, the COVID-19 case data from the COVID-19 Register will be integrated with the AIHW’s National Health Data Hub (NHDH), which combines several data sets into a single enduring linked data system. The interoperability of the COVID-19 Register also allows for its future integration with PLIDA.

Current landscape on data sharing

The process of sharing data for national surveillance and monitoring is currently complex.

The COVID-19 Register has been developed within this data sharing landscape. 

Where identified data or potentially re-identifiable data are involved (that is, personal information), the current arrangements for data sharing are subject to the Privacy Act 1988 (Cwlth), including the Australian Privacy Principles contained in the Act. One of the following conditions must be met for the receipt, use and release of these data:

  • consent is provided by each individual in the data set, or
  • an authorised by law exception applies, or
  • a waiver of consent pursuant to s. 95 of the Privacy Act is issued.

For state and territory data, additional processes and approvals are required to navigate their respective jurisdiction’s public health and privacy legislation.

The COVID-19 Register has been developed primarily through obtaining authorised by law exceptions, and waivers of consent through ethics approvals, noting that consent from individuals was not feasible in this instance, given the scale and routine nature of data collection. 

Oversight of, and access to, the COVID-19 Register

An Advisory Board, consisting of senior executives from the AIHW and the Department of Health and Aged Care, oversees the COVID-19 Register project. Members of the Communicable Diseases Network of Australia (CDNA) also provide advice.

The Register can be accessed by approved researchers whose project proposal has been approved by the Register’s Advisory Group (comprising state and territory representatives). Researchers who wish to use the data need to ensure their research question/s fall under the approved uses, which include:

  • epidemiological and statistical research
  • service use and medication dispensing and patient journeys
  • identifying groups or cohorts of interest
  • monitoring, evaluation and data quality improvement.

COVID-19 case data and personal identifiers

While the NNDSS contains the COVID-19 content data for people diagnosed with COVID-19, such as date of notification and diagnosis method, it does not include identifiable personal information of people diagnosed with COVID-19 (such as name and address). These identifying data are needed to link the COVID-19 case data with other health data sets. For this reason, the AIHW sought these personal identifier data from the respective state and territory disease registers.

The AIHW received a waiver of consent through the overarching AIHW ethics approval for the COVID-19 Register, which allowed it to receive the COVID-19 personal identifier data from each of the participating jurisdictions. However, before this could happen, a separate approvals process was required to meet the different public health legislative requirements of the states and territories. For some jurisdictions, this was effected through a waiver of consent via an additional ethics committee approval; while for others, separate processes were required.

Health data sets

The health data sets in the COVID-19 Register were approved for data sharing through either a waiver of consent or authorised by law exception. The approach taken depended on current arrangements in place and the legal structures under which the data sets and collections were established.

Importantly, the Register does not contain any identifying information. The AIHW protects the privacy of an individual through a process of de-identification. This process involves removing identifying information (for example, a person’s name, address or Medicare number) so that researchers are unable to tell to whom the information belongs. Only aggregate data can be released from the secure environment in which the data are held, and these data are checked by AIHW against rules for suppression and confidentiality to ensure that individuals cannot be re-identified. For more information see the arrangements discussed in Data governance.
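One common form of suppression rule for aggregate outputs is small-cell suppression, sketched below in Python. The article does not state the AIHW's actual rules, so the threshold of 5, the function name and the sample data are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical threshold; not a rule stated in this article.
SUPPRESSION_THRESHOLD = 5


def confidentialised_counts(records, group_field, threshold=SUPPRESSION_THRESHOLD):
    """Count records per group, replacing counts below the threshold with None."""
    counts = Counter(record[group_field] for record in records)
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


# Invented sample data, for illustration only.
records = (
    [{"age_group": "18-64"}] * 12
    + [{"age_group": "65+"}] * 7
    + [{"age_group": "0-17"}] * 3
)
table = confidentialised_counts(records, "age_group")
```

Here the group with only 3 records is suppressed (shown as None) while the larger groups are released as counts. Real output vetting typically involves further checks, such as ensuring suppressed cells cannot be recovered from totals.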

The design of the COVID-19 Register is illustrated in Figure CD 1. The AIHW’s data linkage team links COVID-19 case data from the state and territory communicable disease registers and the NNDSS with a range of health data. These linked data are presented as the COVID-19 Register, which is available in a secure environment to approved researchers. Linked deaths data are also provided back to the state and territory jurisdictions for incorporation into their notifications databases.

Figure CD 1: COVID-19 Register design

The figure shows the flow of data linked in the COVID-19 Register. Two boxes, one from the states and territories and the other from the Department of Health and Aged Care, point to the AIHW, showing the flow of COVID-19 case information and content information from the NNDSS, respectively. Subsequent boxes show how each of the data sets is added to create a de-identified linked data set, stored in a secure access environment.

Current and future projects

COVID-19 Register – a key role in disease monitoring

The AIHW has published initial findings using data from the first version of the COVID-19 Register. The report Demonstrating the utility of the COVID-19 Register explores potential analyses that could be conducted on the data using a subset of linked cases. The report explored deaths among people diagnosed with COVID-19, and health service use and prescriptions dispensed before and after a COVID‑19 diagnosis. 

Deaths

Existing surveillance systems can monitor how many people die from the acute effects of COVID‑19; however, there has been limited information on long-term mortality patterns among people diagnosed with this disease. A program of work using linked data from the COVID‑19 Register is underway to provide information on the extent to which deaths are associated with a prior COVID‑19 diagnosis.

Health service use and prescriptions dispensed

The COVID‑19 Register is also being used to examine health service use and prescriptions dispensed before and after a COVID‑19 diagnosis, using claims data from the MBS and the PBS. The Register makes it possible to examine patient pathways after a COVID‑19 diagnosis to determine if there are changes in the use of these 2 types of health service.
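A before-and-after comparison of this kind can be sketched as counting claims in equal-length windows on either side of the diagnosis date. The Python below is a minimal illustration under stated assumptions: the 90-day window, the function name and the sample dates are hypothetical and are not taken from the AIHW analysis.

```python
from datetime import date, timedelta


def claims_before_and_after(diagnosis_date, claim_dates, window_days=90):
    """Count claims in equal-length windows either side of a diagnosis date.

    Claims on the diagnosis date itself are counted in neither window.
    """
    window = timedelta(days=window_days)
    before = sum(
        1 for d in claim_dates if diagnosis_date - window <= d < diagnosis_date
    )
    after = sum(
        1 for d in claim_dates if diagnosis_date < d <= diagnosis_date + window
    )
    return before, after


# Invented claim dates for one person, for illustration only.
diagnosis = date(2022, 1, 10)
claims = [
    date(2021, 9, 1),    # outside the 90-day window before diagnosis
    date(2021, 12, 1),   # before
    date(2022, 1, 10),   # on the diagnosis date: excluded
    date(2022, 1, 20),   # after
    date(2022, 3, 15),   # after
    date(2022, 6, 1),    # outside the 90-day window after diagnosis
]
before, after = claims_before_and_after(diagnosis, claims)
```

For this invented person the sketch gives 1 claim before and 2 claims after diagnosis. Using windows of equal length keeps the two counts comparable; a real analysis would also need to account for factors such as seasonal patterns and a comparison group.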

Future projects

Future research using the COVID-19 Register could further explore:

  • the real-world effectiveness of vaccines
  • the effect of different variants on individual health outcomes and the health system
  • use of antivirals
  • the impact of multiple COVID‑19 notifications.

As well, the Register could explore how these outcomes vary across priority populations.

Several approved COVID-19 Register research projects are already underway. They focus, respectively, on:

  • health and mortality outcomes for people living with dementia during the pandemic
  • health outcomes following a COVID-19 diagnosis by population groups and vaccination status
  • estimating COVID-19 expenditure and service use related to health and aged care.

Australian Centre for Disease Control

Establishing an Australian Centre for Disease Control (CDC) was an election commitment of the Australian Government. The centre’s work would include ensuring ongoing pandemic preparedness, leadership through a federal response to future infectious disease outbreaks, and a focus on preventing communicable and non-communicable diseases.

Ongoing work to establish this stand-alone Australian CDC is centred on delivering 5 key objectives, namely to:

  • increase independence and strengthen evidence-based and transparent decision-making to maintain trust
  • improve national coordination of effort and efficiencies, with stronger partnerships, including across Australian Government agencies and between jurisdictions
  • support national action through enhanced national capabilities, underpinned by the distinct and complementary roles and responsibilities of jurisdictions and the Australian Government
  • enhance international connections
  • increase and productively use resources to support preparedness and response across all jurisdictions, including nationally (Australian Centre for Disease Control 2024).

Progress to date

A joint Statement of Intent was reached during November 2023, receiving commitment from Commonwealth, state and territory governments to work together on the Australian CDC. Current work is progressing on the formal agreement (Australian Centre for Disease Control 2024). 

On 1 January 2024, the first major establishment milestone was reached with the launch of the Interim Australian CDC within the Department of Health and Aged Care. Its key priorities continue to focus on leading a national response to prepare for future pandemics and to work to prevent communicable and non-communicable diseases. 

The Interim Australian CDC is developing a data strategy to define the centre’s role in developing and delivering a strategic approach to collecting, sharing, and using data in the Australian public health system. The strategy will describe approaches and processes for:

  • establishing a new National Public Health Surveillance System
  • optimising the use of linked data for communicable disease monitoring, including ways to streamline the current complex processes for data sharing.

The Department of Health and Aged Care continues to engage with counterparts in other agencies and international partners to leverage expertise and incorporate lessons into the final design of the Australian CDC. Alongside this, other consultation is ongoing and will support a phased approach to its establishment.

Conclusion

The COVID-19 pandemic in Australia has promoted discussion and planning on how to better respond to the current pandemic and assist with future planning. This includes investigating what is needed to ensure such responses are informed by data and evidence. Linking communicable disease data with a range of health and non-health data sets will provide a better understanding of the impact of the disease on individuals and communities than relying on a single data source.

The challenge in creating linked data, so that responses to public health issues and pandemics are informed by the best data and evidence, is that current data sharing arrangements are complex to navigate. This includes navigating the requirements of the Privacy Act 1988 and the different public health, privacy and data sharing legislation of the states and territories.

The COVID-19 Register provides a precedent for how these processes can be navigated, and insights into the improvements that can be made to facilitate future monitoring of communicable diseases, in particular as part of the new stand-alone Australian CDC. The Register provides a foundation to improve medium- and longer-term health outcomes for COVID-19. Findings from research will provide information on vaccine effectiveness to help guide booster recommendations, better targeting of treatment options for people diagnosed with COVID-19, including for population groups at higher risk of adverse health outcomes, and resource planning to meet the current and future needs of people diagnosed with COVID-19 and future communicable diseases of interest.

The development of the COVID-19 Register aligns with the identified need for a system that will enhance Australia’s monitoring of COVID-19 treatment and outcomes and response to future pandemics (Bennett 2023; Phelan et al. 2023). Integrating the COVID-19 Register with other national data assets will support future research and further Australia’s commitment to pandemic preparedness. 
