How to Get the Best Quality Data in Public Health
In public health, accurate and reliable data plays a pivotal role in guiding decision-making, shaping policies, and optimizing resource allocation. With the increasing complexity of challenges faced by public health organizations, particularly the rapidly evolving health threats posed by syndemics (two or more synergistic conditions affecting a specific population or geographic area), ensuring data quality is critically important. In this blog post, we will explore six crucial dimensions of data quality – accuracy, completeness, consistency, timeliness, uniqueness, and validity – and discuss their relevance to public health. In the process, we will also offer examples of how public health organizations can enhance their data quality to more effectively utilize their data for the benefit of their clients, communities, and the common good of public health.
When we refer to accuracy in public health data, we’re referencing the degree to which data reflects the actual state of disease prevalence, vaccination rates, healthcare disparities, and more. In dealing with this dimension of data quality, the question we’re asking is, “Is this data correct?”
Data accuracy, which can be impacted by the practices and tools used for data collection, is essential for making informed decisions and implementing effective public health strategies. Inaccurate data can lead to misguided strategies and inefficient use of resources. To promote data accuracy, public health organizations can employ sound collection methods, train the staff members who use those methods thoroughly, and conduct regular audits of their collected data. When staff members are entering data into a data management system by hand, features that protect against data entry fatigue, such as simple, clear user interfaces and consistent, predictable design, provide additional safeguards; after all, data entry fatigue is one of the ways errors get introduced into datasets!
Completeness in data quality means having all the necessary information in a dataset.
Complete data allows for comprehensive analysis and a holistic understanding of health trends and patterns. Incomplete data, on the other hand, can result in inaccurate assessments and hinder the identification of disparities, as well as compromise the effectiveness of intervention strategies. To ensure data completeness, public health organizations can establish clear protocols for data collection, define minimum data requirements, and take measures to address any gaps or missing data revealed by alerts or data quality reports available in their data management system. By ensuring data completeness, organizations can enhance the reliability and comprehensiveness of their epidemiological insights.
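To make the idea of a data quality report for completeness concrete, here is a minimal sketch that counts missing values per required field in a batch of records. The field names and record format are hypothetical illustrations, not drawn from any particular data system.

```python
# Sketch of a minimal completeness report. The required fields listed
# here are hypothetical examples, not a prescribed standard.
REQUIRED_FIELDS = ["client_id", "date_of_service", "zip_code", "service_type"]

def completeness_report(records):
    """Count missing (None or empty) values per required field."""
    missing = {field: 0 for field in REQUIRED_FIELDS}
    for record in records:
        for field in REQUIRED_FIELDS:
            if record.get(field) in (None, ""):
                missing[field] += 1
    return missing

batch = [
    {"client_id": "A1", "date_of_service": "2024-03-01", "zip_code": "",
     "service_type": "testing"},
    {"client_id": "A2", "date_of_service": None, "zip_code": "48104",
     "service_type": "counseling"},
]
# Fields with nonzero counts flag gaps to investigate before analysis.
print(completeness_report(batch))
```

A report like this, run routinely, turns "address any gaps or missing data" from a vague aspiration into a repeatable check.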
When we refer to consistency as a dimension of data quality, we mean uniformity and coherence across different datasets, sources, and time periods.
Consistent data in public health enables accurate comparisons, trend analyses, and the identification of disparities. Inconsistent data can lead to compromised bases for comparison, misleading analyses, and challenges in resource allocation and quality monitoring. This inconsistency can arise from variations in an organization’s data collection methods, coding systems, or reporting standards. To prevent problems related to inconsistent data, public health organizations can establish standardized data collection protocols, promote data quality assurance practices, and invest in data integration and interoperability strategies. Using a data system with tools that help ensure data consistency, such as validation checks and alerts about missing or incongruent data, can also be helpful.
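One lightweight way to enforce consistency at intake is to map each source's codes onto a single canonical coding scheme and flag anything unrecognized for review. The sketch below assumes, purely for illustration, that different sources code client sex differently; the codes and mapping are hypothetical.

```python
# Hypothetical canonical mapping: different sources may code the same
# concept differently ("F", "female", "2"), so we normalize at intake.
CANONICAL_SEX = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
    "u": "unknown", "unknown": "unknown",
}

def normalize_sex(raw):
    """Return the canonical code, or raise so the record can be flagged."""
    key = str(raw).strip().lower()
    if key not in CANONICAL_SEX:
        raise ValueError(f"Unrecognized sex code: {raw!r}")
    return CANONICAL_SEX[key]
```

Normalizing at the point of integration, rather than at analysis time, keeps every downstream comparison working from the same vocabulary.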
Timeliness in public health data quality refers to how promptly relevant data can be accessed and disseminated.
Ready access to real-time data can be crucial in potential public health emergencies, as it enables swift detection of and response to emerging threats. Delayed or outdated data can allow problems to escalate, perhaps worsening in their impact before being identified and addressed. To promote data timeliness, public health organizations can employ tools and practices for both data collection and reporting that minimize delays and ensure the timely availability of critical health information. Staff members might use mobile devices in the field, for example, enabling them to access, populate, and get reports from their data management system even when they are not in the office. By prioritizing timeliness as an important dimension of data quality, organizations set themselves up for being better able to take quick action in response to emerging threats before these threats become dire.

Uniqueness in data quality means avoiding duplicate or redundant records in a dataset.
Duplicate data entries in public health can distort epidemiological analyses, misrepresenting disease prevalence and undermining the reliability of findings. To mitigate these problems, public health organizations can implement data deduplication strategies, use unique identifiers or record-matching algorithms to avoid duplications in the first place, and run data quality reports on a regular basis to ensure uniqueness. By prioritizing data uniqueness as an important dimension of data quality, organizations significantly enhance the accuracy and reliability of their reporting, enabling more informed decisions regarding responses, plans, and resource allocation, and ultimately improving public health outcomes.
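In practice, a deduplication pass might key each record on a unique identifier when one exists, or on a composite of matching fields when it doesn't. The sketch below uses a hypothetical composite key and exact matching; production systems often layer probabilistic record-linkage on top of an approach like this.

```python
# Sketch of exact-match deduplication on a composite key.
# Field names are hypothetical; real record matching is usually fuzzier.
def deduplicate(records):
    """Keep the first record seen for each (last name, first name, DOB) key."""
    seen = set()
    unique = []
    for record in records:
        key = (
            record["last_name"].strip().lower(),
            record["first_name"].strip().lower(),
            record["date_of_birth"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

clients = [
    {"last_name": "Smith", "first_name": "Pat", "date_of_birth": "1980-05-01"},
    {"last_name": " smith ", "first_name": "PAT", "date_of_birth": "1980-05-01"},
]
print(len(deduplicate(clients)))  # the second entry is a duplicate
```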
Whereas data accuracy asks whether data is correct, data validity asks whether data represents what it is intended to measure or describe, which depends on adherence to established standards and guidelines for collecting the type of data in question.
Data validity ensures the relevance and reliability of health information and enables meaningful analysis and interpretation. Invalid data can arise from data entry errors and inconsistencies in data collection procedures. To ensure data validity, public health organizations can establish consistent rules and guidelines for data collection, use automated processes called validation checks in their data system to verify the integrity and accuracy of incoming data, and conduct regular audits and data cleaning. In a data system, skip logic can also help ensure validity by making sure users never see questions that aren’t relevant or to which answers would contradict the data they’ve just entered. By upholding data validity, public health organizations can reduce the risk of misleading analyses, build confidence in their findings, support evidence-based decision-making, and drive effective public health interventions for their clients and communities.
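As a concrete illustration of a validation check, the sketch below tests an incoming record's service date against two hypothetical rules; the rules and field names are examples for illustration, not requirements of any particular system.

```python
from datetime import date

# Hypothetical validation check for an incoming record: the service
# date can't be in the future and can't precede the client's date of birth.
def validate_service_date(service_date, date_of_birth):
    """Return a list of validation errors (an empty list means the record passes)."""
    errors = []
    if service_date > date.today():
        errors.append("Service date is in the future.")
    if service_date < date_of_birth:
        errors.append("Service date precedes date of birth.")
    return errors

print(validate_service_date(date(2015, 6, 1), date(2020, 1, 1)))
```

Running checks like this as data arrives, rather than during periodic cleaning, catches invalid entries while the person who entered them can still correct them.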
In conclusion, these six aspects of data quality – accuracy, completeness, consistency, timeliness, uniqueness, and validity – are essential for effective public health data management. By prioritizing data quality initiatives, investing in appropriate technology, and promoting data stewardship practices that support these dimensions, public health organizations can fully utilize their data to improve outcomes, address disparities, and promote equity.
Public health organizations may have unique data quality requirements beyond what we’ve discussed in this blog post. At Luther Consulting, we’re always available to consult with new and existing customers to help them assess their specific needs and tailor their data management practices accordingly. Our client-centered public health data system, Aphirm®, can be customized to meet those unique requirements. To schedule a conversation with us, you can use the contact form below.
For more information about accuracy, completeness, consistency, uniqueness, validity, and timeliness as dimensions of data quality, see The Six Primary Dimensions for Data Quality Assessment: Defining Data Quality Dimensions (DAMA UK).