Price transparency has long been sought after in the American healthcare industry.

The goal is for a consumer to have easy access to the price of basic healthcare services, like a colonoscopy, a vaccine shot, or a routine office visit. However, between the contracts of payers and doctors, what payers will or will not cover, and what counts as ‘in-network’, there are endless challenges to bringing healthcare cost transparency to the consumer.

My latest project as an InfoWorks data consultant is focused on overcoming the challenges in interpreting and parsing the data to achieve healthcare pricing transparency. With the initial data files that will get us to full transparency now rolling out, I’ll discuss the challenges we’ve experienced in parsing and understanding this data. Most of the problems lie in the current price transparency files themselves. Size, format, and practicality of the data all raise technical issues.

Challenges Encountered in Interpreting and Parsing Healthcare Pricing Data

Three primary challenges we have come across in our price transparency analytics work are:

  • Invalid file formats and links
  • Size of the data
  • Hidden information

Let’s take a look at each of these barriers.

Invalid Healthcare Pricing File Formats and Links

In order for automation to be successful, the structure of the data files must meet specific criteria. We experienced several roadblocks of this type when trying to parse healthcare pricing information. For instance:

Unreadable JSON

The current expectation is for all price transparency files to be machine readable in JSON. If the JSON is not properly formatted, it cannot be read by machines, making it impossible to parse rates from the file.
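As a minimal sketch of the kind of check that catches this early, the snippet below tries to parse a file's text with Python's standard `json` module and reports where parsing failed. The function name `check_json_readable` and the sample strings are made up for illustration; they are not taken from any payer's files.

```python
import json


def check_json_readable(text):
    """Return (True, None) if text parses as JSON, else (False, reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno, and msg are attributes of json.JSONDecodeError.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# Hypothetical sample inputs: one well-formed, one with a stray bracket.
ok, reason = check_json_readable('{"in_network": []}')
bad, bad_reason = check_json_readable('{"in_network": [}')
print(ok, reason)
print(bad, bad_reason)
```

Note that `json.loads` reads the whole document into memory, so this form only suits small files or samples; the very large files discussed later would need a streaming parser instead.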

Broken Links

In the same way, if the links on a payer’s page point to price transparency files in open cloud storage and those links are broken, no one outside of the company will be able to access the prices within those files.
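A link check can be automated in a few lines. The sketch below, using only Python's standard library, sends a HEAD request to each URL and collects the ones that fail or return an error status. The helper names `http_status` and `find_broken_links` and the `example.com` URLs are placeholders I chose for illustration.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def find_broken_links(urls, probe=http_status):
    """Return (url, status) pairs for links that fail or return >= 400."""
    broken = []
    for url in urls:
        status = probe(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken


# Usage with a stand-in probe so the example runs without network access.
# The URLs are hypothetical placeholders.
fake_statuses = {
    "https://example.com/rates-a.json": 200,
    "https://example.com/rates-b.json": 404,
}
print(find_broken_links(fake_statuses, probe=fake_statuses.get))
```

Some servers do not answer HEAD requests properly, so a production check might need to fall back to a small GET request; that is left out here to keep the sketch short.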

Non-compliant Schema

The Centers for Medicare & Medicaid Services (CMS) has released a standard schema for healthcare organizations to follow. Files must comply with the schema because parsers expect it when reading files. A non-compliant schema can lead to delays in claims processing and billing errors, at a minimum.
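To illustrate what a basic compliance check might look like, here is a rough sketch that verifies a parsed file has a handful of top-level fields with the expected types. The field names below reflect my understanding of the public CMS in-network schema and should be treated as assumptions to verify against the current published version; a real validator would check the full schema rather than this short list.

```python
# ASSUMPTION: these field names and types are an illustrative subset based
# on my reading of the CMS in-network schema, not an authoritative list.
REQUIRED_TOP_LEVEL_FIELDS = {
    "reporting_entity_name": str,
    "reporting_entity_type": str,
    "last_updated_on": str,
    "version": str,
    "in_network": list,
}


def schema_problems(document):
    """Return a list of human-readable problems; empty means none found."""
    if not isinstance(document, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    for field, expected_type in REQUIRED_TOP_LEVEL_FIELDS.items():
        if field not in document:
            problems.append(f"missing required field: {field}")
        elif not isinstance(document[field], expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")
    return problems


# Hypothetical document that is missing most of the fields checked above.
print(schema_problems({"reporting_entity_name": "Example Payer", "in_network": {}}))
```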

The solutions for format-related price transparency issues are obvious but not necessarily simple to achieve. To identify and correct any instances where JSON is unreadable, links are broken, or a non-compliant schema is being used, agencies need to audit all related code and links. Periodic audits should then be performed to catch any issues that have arisen since. And, going forward, JSON, links, and schema should be checked prior to publishing to ensure they work as intended.

Inconsistent File Structuring

In-network price transparency files are structured differently from out-of-network price transparency files. In fact, even within in-network files, the data can follow different structures. This complicates development, requiring duplication of effort to write queries to grab data from all the various structures.

Developing a standard structure that all healthcare organizations follow for in- and out-of-network healthcare pricing would make parsing healthcare pricing data considerably easier.
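Until such a standard exists, one way to limit the duplicated effort is to normalize the different shapes into a single form before querying. The sketch below assumes, based on my understanding of the CMS in-network schema, that a negotiated rate may list its provider groups inline under `provider_groups` or point to shared groups by ID under `provider_references`. Treat those field names, and the helper names, as assumptions rather than a description of any particular payer's file.

```python
def build_reference_index(top_level_references):
    """Map each provider_group_id to its list of provider groups."""
    return {
        entry["provider_group_id"]: entry.get("provider_groups", [])
        for entry in top_level_references
    }


def resolve_provider_groups(negotiated_rate, reference_index):
    """Return provider groups whether they are inline or referenced by ID."""
    groups = list(negotiated_rate.get("provider_groups", []))
    for ref_id in negotiated_rate.get("provider_references", []):
        # Unknown IDs resolve to nothing instead of raising an error.
        groups.extend(reference_index.get(ref_id, []))
    return groups


# Hypothetical data showing both shapes producing the same result.
index = build_reference_index(
    [{"provider_group_id": 1, "provider_groups": [{"npi": [1111111111]}]}]
)
inline_rate = {"provider_groups": [{"npi": [1111111111]}]}
referenced_rate = {"provider_references": [1]}
print(resolve_provider_groups(inline_rate, index))
print(resolve_provider_groups(referenced_rate, index))
```

With a step like this in front, downstream queries only need to be written once against the normalized list.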

Price Transparency Data Size

Price transparency in healthcare comes with an immense amount of data. When you have thousands of providers, thousands of different services, and hundreds of health insurance companies with a wide range of health plans, the data grows quickly. The largest healthcare payers’ data can contain millions of files and billions of rows of rates. The sheer volume makes the data difficult to summarize. Getting a price for just one service from a provider is an arduous task if looking across all payers. Finding a few rates in a sea of 100 billion rates is a hard challenge even for the biggest processors.

While it isn’t surprising that the file sizes are incredibly large, there are a couple of inherent problems that are exacerbating the issue.

Lack of One True Rate

There can be multiple rates in the files for the same payer, provider, and service, down to the exact modifiers. This duplication creates confusion about the true rate. For example, we’ve found instances where a payer file has two rates for a CT scan at the same provider for their PPO plan.

Optimally, what we should find is that each payer, provider, health insurance plan, and service combo has only one unique rate, so we know exactly what the price is. However, the current structure of the files is missing crucial information, making it hard to find the true rate.
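A simple way to surface these conflicts is to group rows by the combination that should be unique and flag any group with more than one distinct rate. The sketch below assumes the rates have already been flattened into rows; the column names, the billing code, and the dollar amounts are invented for illustration.

```python
from collections import defaultdict


def find_conflicting_rates(rows):
    """Return {key: sorted distinct rates} for keys with more than one rate."""
    rates_by_key = defaultdict(set)
    for row in rows:
        key = (
            row["payer"],
            row["provider"],
            row["plan"],
            row["billing_code"],
            # Sort modifiers so their order does not create separate keys.
            tuple(sorted(row.get("modifiers", []))),
        )
        rates_by_key[key].add(row["rate"])
    return {
        key: sorted(rates)
        for key, rates in rates_by_key.items()
        if len(rates) > 1
    }


# Hypothetical rows: the same payer, provider, plan, and code, two rates.
rows = [
    {"payer": "Payer A", "provider": "Provider 1", "plan": "PPO",
     "billing_code": "70450", "modifiers": [], "rate": 450.0},
    {"payer": "Payer A", "provider": "Provider 1", "plan": "PPO",
     "billing_code": "70450", "modifiers": [], "rate": 612.5},
    {"payer": "Payer A", "provider": "Provider 2", "plan": "PPO",
     "billing_code": "70450", "modifiers": [], "rate": 450.0},
]
print(find_conflicting_rates(rows))
```

This only tells you that a conflict exists. Deciding which of the rates is the true one still depends on information the files do not currently carry.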

Impractical Rates

There are many rates in the price transparency files that do not make sense practically and would never be used.

For example, you would expect to find a rate for a colonoscopy being performed at an ambulatory surgery center. In contrast, you would not expect to find a rate for cataract surgery being performed at an OB-GYN clinic. This, obviously, makes no sense. However, because the price transparency files list every possible service against every provider covered, most rates in the files are actually nonsense, since most clinics provide a small range of specialized services.
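One possible way to trim this noise, offered here as an illustration and not as something the files or CMS provide, is to keep a lookup of which services each provider type plausibly performs and set aside everything else. The provider types, service names, and the lookup itself are hypothetical; in practice such a mapping might be built from claims history or specialty taxonomies.

```python
# ASSUMPTION: a hand-made, illustrative mapping of provider type to services.
PLAUSIBLE_SERVICES = {
    "ambulatory_surgery_center": {"colonoscopy", "cataract_surgery"},
    "ob_gyn_clinic": {"prenatal_visit", "pelvic_ultrasound"},
}


def split_by_plausibility(rows, plausible=PLAUSIBLE_SERVICES):
    """Split rows into (kept, dropped) using the provider-type lookup.

    Rows whose provider type is not in the lookup are kept, since there is
    no basis for judging them.
    """
    kept, dropped = [], []
    for row in rows:
        allowed = plausible.get(row["provider_type"])
        if allowed is None or row["service"] in allowed:
            kept.append(row)
        else:
            dropped.append(row)
    return kept, dropped


# Hypothetical rows mirroring the examples in the text above.
rows = [
    {"provider_type": "ambulatory_surgery_center", "service": "colonoscopy"},
    {"provider_type": "ob_gyn_clinic", "service": "cataract_surgery"},
]
kept, dropped = split_by_plausibility(rows)
print(len(kept), len(dropped))
```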

Hidden Information

Another problem we ran into is that some rates have important information hidden in free text columns. This makes it difficult for a machine to decipher the most accurate code for the type of service performed.

Take revenue code 0490, for example. This is the code for generic ambulatory surgical care. It does not reveal the type of ambulatory surgical care that was performed. That information is a billing code that lives in a completely different column with a lot of other text. In this scenario, it is likely that the wrong code would be used for billing.

To eliminate this problem, all information related to healthcare pricing should be included in the machine-readable fields and follow standard formatting and schema best practices and guidelines for healthcare pricing transparency.
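Until files carry this information in structured fields, a parser has to dig it out of the free text. As a rough sketch, the snippet below uses a regular expression to pull anything shaped like a five-character billing code (five digits, or a letter followed by four digits) out of a description. The sample text and the code in it are invented for illustration, and the pattern is an assumption about how such codes tend to look, not a rule from the schema.

```python
import re

# Five digits, or one capital letter followed by four digits, as a whole word.
CODE_PATTERN = re.compile(r"\b(?:\d{5}|[A-Z]\d{4})\b")


def extract_billing_codes(free_text):
    """Return every code-shaped token found in the text, in order."""
    return CODE_PATTERN.findall(free_text)


# Hypothetical free-text column; the four-digit revenue code is not matched.
description = "Rev 0490 ASC facility fee; CPT 45378 diagnostic colonoscopy"
print(extract_billing_codes(description))
```

A pattern like this will also pick up unrelated five-digit numbers, such as ZIP codes, which is exactly why relying on free text is fragile.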

The Quest for Healthcare Pricing Data Transparency

As we’ve worked our way through billions of data points in the quest for pricing transparency in healthcare, we’ve uncovered multiple, complicated obstacles that must be overcome. They are largely technical in nature, with the price files themselves driving the bulk of the issues. Collaboration between teams has been a key factor in eliminating faulty code and broken links, optimizing systems, and reducing data size.

Ensuring that the system structure properly captures every piece of information we could possibly need to evaluate a rate, so that healthcare pricing can be accessed transparently in a consumer-friendly format, is an immense undertaking. But it is a challenge that I and the rest of the team at InfoWorks are excited to be a part of solving. Reach out to our healthcare data analytics team if you are looking for support with your healthcare price transparency tools.

About Meghan Norris

Ms. Norris is a data scientist with experience in machine learning, predictive modeling, database design, and data engineering. She also has over seven years of experience in full stack development. Ms. Norris has a bachelor’s degree in computer science and a master’s degree in computer science with a specialization in Interactive Intelligence. She can extract patterns and important features in data to best describe the stories and insights found in it, using a wide range of skills spanning data visualization, data transformation, and machine learning.
