Data harmonisation is a sorting, matching and grouping process. In the ANZLEAD project, datasets from multiple sources have been brought together and analysed to ascertain how the data they contain can be modelled. Example datasets from Australia and New Zealand are:
- Australian Election Study (1987- )
- Australian Election Results (2001-2019)
- ANUPoll (2008- )
- New Zealand Election Study (1990-2020)
The datasets contain political science data in common, but that data is conceptually structured, captured and arranged in different ways.
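As a rough illustration of the sorting, matching and grouping described above, the sketch below maps two differently structured sources onto one shared set of variable names. The source labels, column names and records are all hypothetical, invented for this example; they are not taken from the datasets listed above or from the ANZLEAD project's own tooling.

```python
from collections import defaultdict

# Hypothetical crosswalk: for each source, source column -> harmonised column.
# None of these names come from the real datasets.
CROSSWALK = {
    "source_a": {"resp_id": "respondent_id", "vote_party": "party_voted", "yr": "year"},
    "source_b": {"id": "respondent_id", "party_choice": "party_voted", "survey_year": "year"},
}


def harmonise(record, source):
    """Match one source's columns to the shared names and tag the origin."""
    mapping = CROSSWALK[source]
    harmonised = {target: record[col] for col, target in mapping.items() if col in record}
    harmonised["source"] = source
    return harmonised


def group_by(records, key):
    """Group harmonised records by the value of one shared variable."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Made-up records that capture the same concepts under different names.
records_a = [{"resp_id": 2, "vote_party": "Party X", "yr": 2019}]
records_b = [{"id": 7, "party_choice": "Party X", "survey_year": 2017}]

combined = [harmonise(r, "source_a") for r in records_a]
combined += [harmonise(r, "source_b") for r in records_b]

# Sort on shared variables, then group by a shared concept.
combined.sort(key=lambda r: (r["year"], r["respondent_id"]))
by_party = group_by(combined, "party_voted")
```

Keeping the crosswalk as data rather than code means each mapping decision is visible in one place, which suits a process that needs to be documented and repeated.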
A Data Harmonisation Framework is in development to capture the context and methodology for the data integration (in accordance with the FAIR data principles). The framework ensures that the process of organising this political science data is well documented and that data harmonisation can be undertaken systematically on an ongoing basis. A high-level view has been established of the available data collections that relate to the five key populations that political scientists investigate: the voting public, political elites, electorates, elections and parliaments.
For data harmonisation, a concept map has been developed to understand how data from those different collections can be brought together. Capturing the social relationships between the entities represented in the data (e.g., a political candidate can be part of a political party) and identifying legislative structures (e.g., the Governor-General represents the Crown) are critical steps in understanding how to bring these data collections together meaningfully and practically.
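One simple way to picture such a map is as a list of subject, relationship, object statements. The sketch below encodes only the two examples given above; the `Relation` type and the `related_to` helper are hypothetical names chosen for illustration, not the project's actual model.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """One edge of a concept map: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# Only the two relationships named in the text are encoded here.
CONCEPT_MAP = [
    Relation("Political candidate", "can be part of", "Political party"),
    Relation("Governor-General", "represents", "Crown"),
]


def related_to(entity, relations):
    """Return every relation in which the entity appears on either side."""
    return [r for r in relations if entity in (r.subject, r.obj)]


for relation in related_to("Political party", CONCEPT_MAP):
    print(f"{relation.subject} {relation.predicate} {relation.obj}")
```

Storing relationships as explicit statements makes it straightforward to ask which entities connect two data collections before any records are merged.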
Click the image to view the concept map in full.
A concept map is an important building block for the next phase of the project, in which the project team is investigating how to create and integrate controlled vocabularies. The Data Harmonisation Framework serves as a support structure for building a national data asset for political science research in Australia and New Zealand.
Image credit: sputnik, Connect the Dots, CC BY 2.0