All healthcare data that an organization shares, whether internally or externally, must adhere to the HIPAA Privacy Rule as defined by the U.S. Department of Health and Human Services. The HIPAA Privacy Rule restricts the unauthorized disclosure and release of all protected health information (PHI). PHI identifiers, as defined by the HIPAA Privacy Rule, comprise eighteen types of individual identifiers. These identifiers must be removed or otherwise de-identified before the data is shared. Once the data is de-identified, it is no longer subject to the HIPAA Privacy Rule and can be freely released and shared by an organization.
The Privacy Rule outlines two distinct de-identification methodologies: Safe Harbor and Expert Determination.
The Safe Harbor method requires removing all eighteen types of HIPAA individual identifiers described above from the data set. The organization must also have no actual knowledge that the remaining data could be used to identify an individual. The data set can then be shared within and between organizations.
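The field-removal step of Safe Harbor can be illustrated with a minimal Python sketch. The field names below are hypothetical and cover only a few of the eighteen identifier types; a real data set needs a reviewed mapping of every column to the identifier list, plus handling for the remaining Safe Harbor conditions (for example, ages over 89 and sparsely populated ZIP code areas). The sketch also assumes dates of birth arrive as ISO `YYYY-MM-DD` strings.

```python
# Hypothetical column names; not a complete list of the eighteen identifier types.
DIRECT_IDENTIFIER_FIELDS = {
    "first_name", "last_name", "street_address", "phone",
    "email", "ssn", "medical_record_number",
}


def safe_harbor_strip(record: dict) -> dict:
    """Drop direct identifier fields and reduce the date of birth to a year."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}
    dob = cleaned.pop("date_of_birth", None)
    if dob:
        # Assumes an ISO date string; only the year is retained.
        cleaned["birth_year"] = dob[:4]
    return cleaned
```

Calling `safe_harbor_strip` on a record returns a new dictionary and leaves the input untouched, which makes it easy to compare the raw and de-identified versions during review.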
The Expert Determination method requires that a statistician analyze and certify the data set. The statistician applies statistical methodologies to determine the risk of re-identifying individuals within the data set, then provides a certification document detailing the results of the analysis. The document also gives guidance on which data fields require de-identification or additional transformation to minimize the re-identification risk. This field-level de-identification and transformation guidance must be implemented before the data set is shared.
There are multiple approaches to the de-identification of PHI data. In practice, the approaches may be used alone or in combination. The selection of a de-identification approach should be based on the projected use of the data.
Through partnerships with leading data providers, the MSA Healthcare Data Management team provides disparate data, including Laboratory Diagnostic data, Medical Claims data, Electronic Medical Record (EMR) clinical data, and Digital Behavioral data, giving customers up to a 360-degree view of the patient journey.
The format of the data drives the configuration of the MSA De-Identification Engine. It also helps the MSA Healthcare Data Management team determine if the data must be pre-processed prior to de-identification and patient-matching processing.
The most common data formats we receive are fixed-width or delimited flat files. We can also process EDI, HL7, EMR, and other formats; these formats typically require pre-processing.
The frequency with which data is sent to MSA Healthcare Data Management varies by data type and may be daily, weekly, monthly, or quarterly. The volume of data also varies by data type.
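To show why the file format drives configuration, here is a minimal Python sketch that reads the same kind of record from a fixed-width file and from a delimited file. The layout, column names, and pipe delimiter are assumptions made for illustration; they are not the MSA De-Identification Engine's actual configuration.

```python
import csv
import io

# Hypothetical fixed-width layout: (field name, start offset, end offset).
FIXED_LAYOUT = [
    ("first_name", 0, 10),
    ("last_name", 10, 25),
    ("date_of_birth", 25, 33),
]


def parse_fixed_width(line: str, layout=FIXED_LAYOUT) -> dict:
    """Slice one fixed-width line into named, whitespace-trimmed fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


def parse_delimited(text: str, delimiter: str = "|") -> list:
    """Parse delimited text whose first row is a header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

A fixed-width file needs an explicit layout because the field boundaries are not in the data, whereas a delimited file with a header row carries its own field names.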
The MSA De-Identification Engine creates Patient Tokens from the raw PHI fields in the data record. The Patient Tokens are used in the MSA Data@Factory patient matching process to assign an MSA Patient ID to each record.
Raw PHI fields such as first name, last name, and date of birth are very commonly used in token creation. It is important that these fields are available across all of the data types to ensure consistent token creation across all data sources.
The MSA Healthcare Data Management team can de-identify, integrate, and aggregate any data set containing the key patient token elements, such as Patient First Name, Last Name, and Date of Birth. Data sets without these token data elements can be integrated and aggregated by linking on data elements common to the data sets, such as Plan ID, Product ID, or Claim ID.
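The idea of deriving a consistent token from raw PHI fields can be sketched in Python as follows. This is not the MSA De-Identification Engine's algorithm, which the text does not describe; it is a generic keyed-hash approach (HMAC-SHA256) shown under the assumption that a secret key is shared across all data sources. The function names and the normalization rule are hypothetical.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Upper-case the value and keep only letters and digits."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def make_patient_token(first_name: str, last_name: str,
                       date_of_birth: str, secret_key: bytes) -> str:
    """Return a hex token derived from normalized name and date of birth."""
    message = "|".join(
        [normalize(first_name), normalize(last_name), normalize(date_of_birth)]
    )
    # A keyed hash: without the key, tokens cannot be recomputed from guessed PHI.
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()
```

Normalizing before hashing is what makes the token consistent across sources: " ann" with "O'Brien" in one file and "ANN" with "OBRIEN" in another produce the same token, provided both sources supply the same fields and use the same key.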
Data sets without token data may still require processing through MSA’s De-Identification Engine to consistently encrypt/obfuscate non-token data used to align data sets.
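As a rough illustration of aligning data sets on consistently obfuscated non-token data, the Python sketch below replaces a Claim ID with a keyed hash in two data sets and joins them on the result. The record shapes, field names, and the use of HMAC-SHA256 are assumptions for the example, not a description of the MSA De-Identification Engine.

```python
import hashlib
import hmac


def obfuscate_key(value: str, secret_key: bytes) -> str:
    """Map a raw identifier to a stable, non-reversible hex string."""
    cleaned = value.strip().upper()
    return hmac.new(secret_key, cleaned.encode("utf-8"), hashlib.sha256).hexdigest()


def link_on_claim_id(claims: list, remits: list, secret_key: bytes) -> list:
    """Join claim and remit records on the obfuscated Claim ID."""
    remit_by_key = {obfuscate_key(r["claim_id"], secret_key): r for r in remits}
    linked = []
    for claim in claims:
        key = obfuscate_key(claim["claim_id"], secret_key)
        if key in remit_by_key:
            linked.append({
                "claim_key": key,  # the raw Claim ID is not carried forward
                "amount_billed": claim["amount_billed"],
                "amount_paid": remit_by_key[key]["amount_paid"],
            })
    return linked
```

Because both data sets pass through the same function with the same key, matching Claim IDs still line up after obfuscation even though the raw values never appear in the output.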
The list of data types that MSA de-identifies includes:
Rx Claims Data
Medical Claims Data
Rx and Medical Claims Remit Data
EMR Data
Digital Behavioral Data
Consumer Data
SP Hub Data
Patient Assistance Plan Data
Yes; our process is as follows:
Each data source is de-identified using the MSA De-Identification Engine to create one or more patient tokens used by the MSA Data@Factory patient matching process. The patient matching process assigns a Patient ID based on customer-defined patient matching business rules. The MSA Data@Factory creates and maintains an anonymous patient database to enable the longitudinal alignment of the disparate data from multiple data sources to provide a complete view of the patient journey.
The MSA Data@Factory can be configured to support multiple complex patient matching strategies including multi-pass patient matching to adjust for variations in the data provider data.
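Multi-pass matching can be pictured with a small Python sketch: each record carries several tokens, and the matcher tries them in order, from strictest to most relaxed, before creating a new Patient ID. The token names, the pass order, and the in-memory index are hypothetical; the actual matching rules in MSA Data@Factory are customer-defined and not described here.

```python
from itertools import count


def assign_patient_ids(records: list, passes: list) -> list:
    """Assign an integer Patient ID to each record.

    records: dicts mapping a token name to a token value (or None if missing).
    passes:  token names in the order they should be tried.
    """
    index = {name: {} for name in passes}  # token value -> Patient ID, per pass
    next_id = count(1)
    assigned = []
    for record in records:
        patient_id = None
        for name in passes:
            token = record.get(name)
            if token is not None and token in index[name]:
                patient_id = index[name][token]
                break  # first pass that matches wins
        if patient_id is None:
            patient_id = next(next_id)  # no pass matched: new patient
        for name in passes:
            token = record.get(name)
            if token is not None:
                index[name].setdefault(token, patient_id)
        assigned.append(patient_id)
    return assigned
```

For example, with passes `["strict", "relaxed"]`, a record whose strict token is new but whose relaxed token was seen before is assigned the earlier record's Patient ID, which is how a later pass compensates for variation in one provider's data.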
The size of the patient population is critical to the HIPAA certification of the de-identified output. Very small patient populations may require data elements to be aggregated to prevent the potential identification of patients in the de-identified data set through probabilistic techniques. For example, in a very small de-identified data set, it may be necessary to use two-digit ZIP codes in place of three-digit ZIP codes in the de-identified output file.
HIPAA certification of the MSA De-Identification Engine requires the certification of each de-identified output format. The certification determines whether any additional aggregation is required for the data set.
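The ZIP code example can be sketched in Python as a simple small-group check. The threshold of 10 records per group is a placeholder assumption; in practice the required level of aggregation comes out of the certification analysis, not from a fixed rule like this one.

```python
from collections import Counter


def generalize_zip(zip_codes: list, min_group_size: int = 10) -> list:
    """Return three-digit ZIP prefixes, or two-digit ones if any group is too small."""
    three_digit = [z[:3] for z in zip_codes]
    group_sizes = Counter(three_digit)
    if all(size >= min_group_size for size in group_sizes.values()):
        return three_digit
    # At least one three-digit group is too small, so coarsen every record.
    return [z[:2] for z in zip_codes]
```

Coarsening every record rather than only the small groups keeps the output column at a single level of precision.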
The MSA De-Identification Engine is typically installed in a Red Hat Linux or Microsoft Windows environment.
A set of PHI data elements must be available for processing by the patented MSA De-Identification Engine. The PHI elements are used to create patient tokens, which the patient matching process uses to assign a common Patient ID to the data. A Patient ID cannot be assigned if any of the token fields is unavailable. Data without a Patient ID cannot be longitudinally aligned in the anonymous patient database.
Typical patient token data requirements include the patient’s First Name, Last Name, Date of Birth, Gender, and Zip Code.
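Because a Patient ID cannot be assigned when a token field is unavailable, it can be useful to check records before de-identification. The Python sketch below is one possible pre-check; the field names are hypothetical stand-ins for the typical requirements listed above.

```python
# Hypothetical column names for the typical token fields.
REQUIRED_TOKEN_FIELDS = (
    "first_name", "last_name", "date_of_birth", "gender", "zip_code",
)


def missing_token_fields(record: dict, required=REQUIRED_TOKEN_FIELDS) -> list:
    """List the required token fields that are absent, None, or blank."""
    return [
        field for field in required
        if not str(record.get(field) or "").strip()
    ]
```

An empty list means the record has every field needed for token creation; a non-empty list names the fields to raise with the data provider.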
The MSA De-Identification Engine can be deployed either within a data provider’s environment or at MSA. The operating system and hardware specification requirements listed below are for data providers choosing to have the MSA De-Identification Engine deployed within their environment. A Business Associate Agreement (BAA) is required for all data providers opting to send the raw PHI to MSA for de-identification within the MSA HITRUST-certified environment.
MSA De-Identification Engine Operating System Specifications
MSA De-Identification Hardware Specifications