Mapping Markets
May 7, 2025

Patient Data Tokenization Market Map: Leveraging Real World Evidence While Protecting Patient Data

Colin DuRant
Director of Research, Elion

This is part of Elion's weekly market map series, where we break down critical vendor categories and the key players in them. For more, become a member and sign up for our email here.

Real-world data is gaining momentum in healthcare—not just because there’s more of it, but because it offers a clearer view of what’s actually happening across the patient journey. From EHRs and medical claims to imaging and genomics, these data sources are helping providers, health systems, and researchers spot patterns, close care gaps, and make more informed decisions at scale.

However, the sensitive and highly regulated nature of health data creates barriers to compliantly using it: to develop AI/ML models (whether internally or externally), to analyze health system performance against external benchmarking data, or to engage in novel partnerships that draw on external data sources.

There are many cases where patient-identifiable data sharing is not feasible, but analyses on linked datasets still need to be executed at the patient-level. Patient data tokenization emerges as a critical solution, opening up options for sharing and linking while maintaining security and compliance.

What Is Patient Data Tokenization (and Why Is It Necessary)?

Patient data tokenization is a “...sophisticated solution for safeguarding sensitive healthcare information by replacing identifiable data elements with unique tokens. This process ensures the confidentiality of patient data while maintaining data integrity and complying with regulatory requirements” (Selvaraj, 2022). Once a dataset has been tokenized, that dataset can then be shared externally with trusted partners for use and linkage with other tokenized datasets.

HIPAA & Patient Data De-identification

While the HIPAA Privacy Rule has always allowed for de-identification, as part of the HITECH Act in 2009, Congress required HHS to provide clearer guidance on the two approved methods: Safe Harbor and Expert Determination. Data de-identified through Safe Harbor—removal of names, most geographic information, dates tied to an individual, and more—loses significant research utility, so most patient data tokenization will de-identify data to meet Expert Determination requirements. Expert Determination does not designate specific data elements for removal; instead, it requires an expert to certify that a dataset has a “very small” risk of allowing patient re-identification.

How Does Patient Data Tokenization Work?

Let’s use an example of a health system that wants to combine an external data partner’s utilization data with its own records to perform a leakage analysis. Using a patient data tokenization service allows the health system and data partner to combine datasets without exposing PHI.

  1. Patient data tokenization services take a patient-identifiable dataset as an input.

  2. Then, a tokenization algorithm will be run using some set of patient-identifiable fields: name, gender, date of birth, other identifiers like member ID or SSN, or even geographic details.

  3. From these variables, a patient token (really an identifier) will be created per patient and appended to each associated record.

  4. Identifiable details will then be removed or obfuscated to the level required to meet the expert determination standard listed above.

  5. The output datasets can now be combined and linked via the token, but each individual token cannot be linked back to specific, identifiable patient information.

  6. In our example, analysts can use the merged and linked dataset to perform their health system leakage analysis.
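The steps above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual algorithm: it assumes a keyed hash (HMAC-SHA256) over normalized identifying fields, and the field names and `SITE_KEY` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; real tokenization vendors manage their own
# proprietary keyed algorithms and key custody.
SITE_KEY = b"example-site-key"

def make_token(name: str, gender: str, dob: str, member_id: str) -> str:
    """Derive a deterministic patient token from identifying fields (steps 2-3)."""
    # Normalize inputs so trivial formatting differences don't break linkage.
    material = "|".join(f.strip().lower() for f in (name, gender, dob, member_id))
    return hmac.new(SITE_KEY, material.encode(), hashlib.sha256).hexdigest()

def tokenize_record(record: dict) -> dict:
    """Append the token and strip identifiable fields (step 4)."""
    token = make_token(record["name"], record["gender"],
                       record["dob"], record["member_id"])
    phi_fields = {"name", "gender", "dob", "member_id"}
    out = {k: v for k, v in record.items() if k not in phi_fields}
    out["token"] = token
    return out

# Step 5: two independently tokenized datasets link on the shared token.
hospital_row = tokenize_record({"name": "Jane Doe", "gender": "F",
                                "dob": "1980-01-02", "member_id": "M123",
                                "encounter": "ED visit"})
claims_row = tokenize_record({"name": "jane doe ", "gender": "f",
                              "dob": "1980-01-02", "member_id": "M123",
                              "claim_amount": 420.00})
assert hospital_row["token"] == claims_row["token"]
```

Note that the same patient yields the same token in both datasets despite formatting differences, while neither output record retains name, gender, date of birth, or member ID.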

Prior to export, most patient data tokenization partners will require a second hashing or encryption of tokens to close the loop of de-identification and prevent reverse linking of tokens back to identifiable data.
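Continuing the sketch above, this second pass could look like the following: re-hashing each site token under a per-exchange key (the function name and key are illustrative assumptions), so the exported "transit" token cannot be traced back to the original token or its inputs without that key.

```python
import hashlib
import hmac

def rehash_token(site_token: str, exchange_key: bytes) -> str:
    """Transform a site-level token into a transit token before export.
    Without exchange_key, the recipient cannot reverse the transit token
    back to the site token or the identifiers behind it."""
    return hmac.new(exchange_key, site_token.encode(), hashlib.sha256).hexdigest()

site_token = "3f1a...example-site-token"
transit = rehash_token(site_token, b"per-exchange-key")
# Different exchanges (keys) yield unlinkable tokens for the same patient.
assert transit != rehash_token(site_token, b"another-exchange-key")
```

The design point is that linkability is scoped: two parties sharing the same exchange key can still join records, while tokens exported under different keys cannot be cross-linked.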

Vendor Landscape

The attributes that differentiate patient data tokenization vendors are: whether they are a closed ecosystem, semi-open, or fully open-source; whether they handle expert determination in addition to tokenization; and whether they are part of a larger health data platform.

  • IQVIA Privacy Analytics and LexisNexis Gravitas represent closed ecosystem options using tokenization schemes designed to work only within the vendor ecosystem and platform. Generally speaking, these options work when you want to connect with data and services owned and operated by IQVIA or LexisNexis.

  • Datavant Connect and HealthVerity Identity Manager represent the semi-open model where Datavant or HealthVerity acts as an intermediary for different parties looking to exchange and link de-identified data across sites and use-cases. The vendor owns the tokenization algorithm and token management, but parties can use the data in their own environment.

  • Relative newcomer Spindle Health represents the fully open-source option for tokenization but also offers additional products and services for risk management and compliance.

Other products in the category, like Glendor PHI Sanitizer, focus on de-identification and tokenization of multi-modal data like images or video or act more as data vaults, like Protecto Data Privacy Vault or Skyflow’s Health Care Data Privacy Vault.

What’s Next

As data privacy and cybersecurity concerns grow in healthcare IT, patient data tokenization vendors must stay atop a changing regulatory landscape. Additionally, as synthetic data becomes viable for healthcare research and analytics, we expect vendors to expand beyond traditional de-identification offerings into “lookalike” synthetic data generation.