Legacy data warehousing’s dirty little secret
If organizations want to make a dent in the fight against rising health care costs, they need quality data.
Driving better benefits strategies and health care decisions requires high-quality data, yet with legacy data warehouse solutions, benefits and HR leaders struggle to achieve it. Health care data is messy and complex — but if organizations want to make a dent in the fight against declining employee health and rising health care costs, they must face this problem head-on.
Benefits and HR leaders need efficient, automated data quality processes that deliver the accuracy and flexibility required to make better decisions and drive better benefits outcomes. Without moving beyond the slow, outdated practices of legacy data warehousing, organizations will never get there.
Today’s top challenges
Technology learning company O’Reilly’s recent report, “The State of Data Quality in 2020,” identifies the top data quality challenges organizations face:
- Too many data sources and inconsistent data (more than 60%)
- Disorganized data stores and lack of metadata (nearly 50%)
- Poor data quality controls at data entry (nearly 47%)
- Too few resources available to address data quality issues (nearly 44%)
These issues ring especially true for employer health benefits. Some organizations must ingest and normalize disparate data from hundreds of sources, including medical claims, pharmacy (Rx) claims, lab results, and more. Most still rely on legacy solutions that lag behind current industry standards for data quality controls and automation. By the time organizations actually receive usable data, the insights are often stale and no longer actionable.
On top of this lack of speed, legacy data warehouses can’t deliver the accuracy and confidence benefits and HR leaders need. Once innovative and ahead of their time, these solutions still depend on manual processes that leave ample room for human error. Their on-premises technology means fixes and process improvements arrive slowly or not at all, so organizations run into the same data problems year after year because the solution doesn’t scale or iterate quickly.
Whether they know it or not, benefits and HR leaders are making highly strategic decisions, decisions that affect the business’s bottom line and their employees’ lives, based on unreliable, opaque health care data.
3 critical phases of data quality
The first phase of a robust data quality process involves securely ingesting the data, mapping it to a common format, and normalizing it so members and claims resolve into coherent records. Mapping and normalizing the data files into a common format is complicated: the files frequently arrive with missing fields, invalid codes, inconsistent values, or a host of other issues.
A proactive QA approach runs robust quality-control rules that inspect the content and assess the quality of each raw data file. When this process is automated, it takes hours rather than weeks. Once all the data is mapped, it must be normalized and matched to the organization’s members and claims. This step is critical: it ensures benefits and HR leaders can understand the story their data tells and know which population segments and health issues to address first.
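To make this concrete, here is a minimal sketch, in Python with pandas, of the kind of automated quality-control rules that can inspect a raw claims file for missing fields, invalid codes, and inconsistent values. The column names, code sets, and rules are illustrative assumptions, not a description of any particular vendor’s pipeline.

```python
import pandas as pd

# Illustrative schema and reference values (assumptions for this sketch only)
REQUIRED_FIELDS = ["member_id", "service_date", "procedure_code", "paid_amount"]
VALID_GENDERS = {"M", "F", "U"}

def run_quality_checks(raw: pd.DataFrame) -> pd.DataFrame:
    """Inspect a raw claims file and return one row per failed rule."""
    issues = []

    # Rule 1: required fields must exist and be populated
    for field in REQUIRED_FIELDS:
        if field not in raw.columns:
            issues.append({"rule": "missing_column", "detail": field, "rows": len(raw)})
        elif raw[field].isna().any():
            issues.append({"rule": "missing_values", "detail": field,
                           "rows": int(raw[field].isna().sum())})

    # Rule 2: invalid codes, e.g. gender values outside the allowed set
    if "gender" in raw.columns:
        bad = ~raw["gender"].isin(VALID_GENDERS)
        if bad.any():
            issues.append({"rule": "invalid_code", "detail": "gender",
                           "rows": int(bad.sum())})

    # Rule 3: inconsistent values, e.g. negative paid amounts
    if "paid_amount" in raw.columns:
        negative = pd.to_numeric(raw["paid_amount"], errors="coerce") < 0
        if negative.any():
            issues.append({"rule": "inconsistent_value", "detail": "paid_amount < 0",
                           "rows": int(negative.sum())})

    return pd.DataFrame(issues, columns=["rule", "detail", "rows"])
```

Run automatically against every inbound file, checks like these surface problems in hours instead of weeks and produce an audit trail that can be shared back with the data source.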
The next phase is enrichment, which incorporates industry-standard episode grouping, risk grouping, and evidence-based medicine. This phase requires health care and data science expertise to group the data into episodes and risk levels and to surface insights such as predicted diagnoses.
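As a rough illustration of what enrichment produces, the sketch below buckets members into risk bands based on a chronic-condition count. Real pipelines rely on industry-standard episode and risk groupers; the input column, thresholds, and band names here are assumptions made up for this example.

```python
import pandas as pd

def assign_risk_band(member_year: pd.DataFrame) -> pd.DataFrame:
    """Toy enrichment step: bucket members into low/moderate/high risk bands.

    Real pipelines use industry-standard episode and risk groupers; the
    'chronic_condition_count' input and the thresholds are illustrative only.
    """
    df = member_year.copy()
    df["risk_band"] = pd.cut(
        df["chronic_condition_count"],
        bins=[-1, 0, 2, float("inf")],
        labels=["low", "moderate", "high"],
    )
    return df

# Example: members with their chronic-condition counts for the plan year
members = pd.DataFrame({
    "member_id": ["A1", "A2", "A3"],
    "chronic_condition_count": [0, 1, 4],
})
print(assign_risk_band(members))
```

The point is not the specific thresholds but that enrichment attaches clinically meaningful groupings to every member and claim, which is what lets the data answer questions about risk and cost drivers.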
The final phase in an effective data quality process is anomaly detection. Before the normalized and enriched data is promoted to production, it must pass an anomaly detection process that uncovers unusual patterns, such as significant increases or decreases in spend or enrollment, or data that goes missing between loads. This is also where to look for metrics that contradict one another, such as enrollment and spend moving in opposite directions. The best anomaly detection practices close the feedback loop: any issues detected are immediately incorporated back into the data mapping and normalization phase so they are prevented the next time.
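Here is a minimal sketch, again in Python with pandas, of what automated anomaly checks on monthly spend and enrollment might look like; the 30% threshold and column names are assumptions for illustration, not tuned production values.

```python
import pandas as pd

MAX_PCT_CHANGE = 0.30  # illustrative threshold: flag swings greater than 30% month over month

def detect_anomalies(monthly: pd.DataFrame) -> pd.DataFrame:
    """Flag suspicious month-over-month movements before data is promoted.

    Expects one row per month with 'month', 'total_spend', and 'enrollment'.
    """
    df = monthly.sort_values("month").copy()
    df["spend_change"] = df["total_spend"].pct_change()
    df["enrollment_change"] = df["enrollment"].pct_change()

    # Large jumps or drops in either metric
    df["large_swing"] = (
        df["spend_change"].abs().gt(MAX_PCT_CHANGE)
        | df["enrollment_change"].abs().gt(MAX_PCT_CHANGE)
    )

    # Metrics moving in opposite directions, e.g. enrollment up while spend falls sharply
    df["direction_mismatch"] = (df["spend_change"] * df["enrollment_change"]) < 0

    return df[df["large_swing"] | df["direction_mismatch"]]
```

Months flagged by checks like these would be held back from production and, in a closed feedback loop, fed back into the mapping and normalization rules so the same issue is caught or prevented on the next load.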
Why it matters
When it comes to data, integrity is everything. Timely access matters just as much, so teams can act quickly on the most up-to-date information. What’s the point of using data to build health care plans, adjust strategy, or design interventions if it isn’t trustworthy and timely? In 2020, benefits and HR leaders deserve scalable, automated, and flexible processes that ensure the highest level of data quality. The health of their people, and of their organization, depends on it.
Roger Deetz is vice president of technology at Springbuk and brings almost 20 years of experience at software companies, 15 of those in the Indianapolis tech community. Before joining the Springbuk team, Roger served as the Vice President of Engineering at Angie’s List, where he oversaw their software engineering modernization project.