FOCUS on qualitative data: Text mining through natural language processing
The promise of EHRs transforming the practice of medicine reads like a story of unrequited love. Unstructured data mining has been more problematic than initially thought. Natural language processing software may be the technology solution that is key to big data pattern recognition.
With increasing demand for including natural language processing (NLP) of text data for use in health economics and outcomes research (HEOR), there is a need to document and identify best practices.
In HEOR, qualitative data provide detail beyond the existing fields captured in electronic medical or health records (EMR/EHR) and have helped bolster patient-related information used by physician practices and in research. However, the use of NLP is fraught with challenges stemming from incomplete, missing, or inaccurately entered data. Here, we present one case study with the process, issues, and potential recommendations for using text information in HEOR. As the search for the full potential of NLP in electronic data (the analogous “gold mine”) continues, we must respect the caveats and rigor behind the process (the analogous “miner”).
Overview of EMR/EHR Data
EMR/EHR data can be useful for surveillance, healthcare quality measurement, and research. One limitation is that data extraction can be difficult.1 Censored data might not capture all points in time, testing and treatment documentation may be biased, coding and standards are used inconsistently, and documentation styles vary from person to person.2 EMR data may also be specific to the site of service, so unless the data are captured through an integrated system, they may represent only a single setting such as a community clinic. For example, we may only see the outpatient or office visits for a patient. Other resource and site usage may have to be gleaned from NLP-based text mining, such as identifying diagnostic tests taken at other facilities, radiation therapies, emergency room visits, or inpatient visits and hospital stays.
In EMR data, structured data may capture fields such as diagnosis, demographic information, clinical information, treatment orders, and laboratory tests. Unstructured data is captured through text mining to search for specific mentions of a term, phrase, or value. Unstructured data may include comment fields, physician- or other healthcare provider-dictated notes, or supplemental data not otherwise in the EMR. NLP uses algorithms to apply rules, such as sentence boundaries; terms associated with one another; and variations, subgroups, and the mention of a token or term.3 We may be able to search for qualitative measures not otherwise captured.
There is a growing body of evidence supporting the use of NLP to enrich data captured at the point of care to improve quality of care and outcomes, such as noting side effects of medications to help reduce adverse events and increase treatment efficacy.4 (One published study used EMR data to follow patients with heart failure and identify risk for readmission or death.5) Moreover, there is an opportunity to help practices and providers improve documentation by assessing the data captured across text fields. In a study validating NLP techniques, a consecutive series of pathology reports from prostatic biopsies were reviewed, and the overall ability of an NLP application to accurately extract variables from the pathology reports was 97.6%.6
These findings support the extended value of data collected through NLP and help us to set the gold standard for the processes to capture valid and meaningful information.
Issues and Suggestions
Using NLP to gather additional data from EMR data sources is becoming common practice in HEOR studies, and the following are examples noted in our work at Cardinal Health Specialty Solutions (CHSS).
At the onset of a project using such data, the research team develops search terms; each mention found is returned along with any associated result term and the date of the record. For instance, we may search for “mutation BRCA1” with the result of “positive or negative” and the date of record. The results will then be aggregated with structured data to provide a more complete representation of the patient population of interest.
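The search step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the process used at CHSS: the note records, the `PATTERN` expression, the 40-character gap, and the `extract_hits` helper are all hypothetical, and a production NLP tool would handle sentence boundaries and negation far more carefully.

```python
import re
from datetime import date

# Hypothetical note records (patient_id, date of record, free text); not real data.
NOTES = [
    ("P001", date(2015, 3, 2), "Mutation BRCA1 positive per outside lab."),
    ("P002", date(2015, 4, 9), "BRCA1 mutation: negative. Continue surveillance."),
    ("P003", date(2015, 5, 1), "Family history reviewed; no genetic testing ordered."),
]

# Search term followed, within the same sentence and at most 40 characters later,
# by one of the expected result terms. The 40-character limit is an assumption.
PATTERN = re.compile(r"\bBRCA1\b[^.]{0,40}?\b(positive|negative)\b", re.IGNORECASE)

def extract_hits(notes):
    """Return one row per match: patient, date of record, search term, result term."""
    rows = []
    for patient_id, note_date, text in notes:
        for match in PATTERN.finditer(text):
            rows.append({
                "patient_id": patient_id,
                "date": note_date,
                "term": "BRCA1",
                "result": match.group(1).lower(),
            })
    return rows

hits = extract_hits(NOTES)
for row in hits:
    print(row["patient_id"], row["date"].isoformat(), row["term"], row["result"])
```

Each returned row carries the patient identifier and the date of record, so it could later be joined to structured fields for the same patient.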
Issue: The NLP search returns a term multiple times on the same date. For example, a patient with chronic myelogenous leukemia (CML) will have more than one disease state noted. This presents a problem to researchers because the different disease phases in CML would not occur simultaneously and carry different implications for the patient.
Suggestion: Use treatment information also found in the EMR to make sense of which disease state might be the accurate one at the time, because there are very specific prescribing orders and dosages for patients in chronic phase versus blast crisis.
Issue: Frequency of returned search terms and representativeness of data. In a cohort of 152 CML patients identified through EMR data, 90% of the patients had some description of adverse events in the study time period. Of the entire cohort, 64% specifically had mention of the word “pain,” 43% specifically had mention of “anemia,” and 3% specifically had mention of “febrile neutropenia.”7
Suggestion: Though this information is useful in understanding the frequency of adverse events in a chemotherapy-treated population, the frequency of these text phrases may be underreported. Further comparison against rates in published literature and presenting caveats in analyses is warranted.
Issue: The mention of a word alone does not establish the presence or absence of a finding. For example, if a mutation is listed on the record, one could interpret that a mutation test was performed and a mutation was detected, but it could also mean that the mutation was simply discussed.
Suggestion: Identifying subgroups of information that may be associated with mutations or the field of interest may help decipher the meaning. For example, if additional terms appear with a known mutation (“positive,” “negative,” “undetected”), this will help strengthen the finding.
Issue: If “NRAS” or “KRAS” biomarker is mentioned, the research team does not know if that means the patient was tested or needs testing. The issue is with context.
Suggestion: Identifying and searching for more text surrounding the phrase of interest may help uncover the specific context. For example, “test results” would be more meaningful for analyses than “test discussed” or “test performed.”
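One way to picture the two suggestions above is a small context-window check around each biomarker mention. The sketch below is illustrative only: the cue lists, the five-token window, and the function names are assumptions made for this example, and any real cue list would need review by clinical experts.

```python
import re

# Hypothetical cue lists. Result cues follow the terms named in the text;
# the non-result cues are assumptions for illustration.
RESULT_CUES = ("positive", "negative", "undetected", "test results")
NON_RESULT_CUES = ("discussed", "performed", "needs")

def context_window(text, term, width=5):
    """Yield the words within `width` tokens on either side of each mention of `term`."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", text)
    for i, tok in enumerate(tokens):
        if tok.lower() == term.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            yield " ".join(left + right).lower()

def classify_mention(window):
    """Label a context window; result cues take precedence over non-result cues."""
    if any(cue in window for cue in RESULT_CUES):
        return "result_reported"
    if any(cue in window for cue in NON_RESULT_CUES):
        return "discussed_only"
    return "unclear"

def label_mentions(text, term):
    """Return one label per mention of `term` found in `text`."""
    return [classify_mention(w) for w in context_window(text, term)]

print(label_mentions("KRAS mutation negative on tissue sample.", "KRAS"))
print(label_mentions("Discussed KRAS testing with patient today.", "KRAS"))
print(label_mentions("NRAS noted in chart.", "NRAS"))
```

Mentions labeled "unclear" would be candidates for manual chart review rather than automatic inclusion in an analysis.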
Issue: The frequency of events may be widespread throughout the EMR. For example, the mention of an adverse event may occur as many as 1,000 times per patient.7
Suggestion: Understanding whether an event may occur once or multiple times will help in interpreting its mentions. While adverse events or side effects may be more commonly reported, an event such as surgery may only occur once or twice.
Suggestion: Moreover, capturing when a search term was reported will have different implications in evaluating outcomes. Adverse events or mentions of hospitalizations or emergency room visits during treatment should be evaluated separately from those that occur outside of the treatment window.
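The frequency and timing suggestions above can be combined in one small tabulation. The following is a sketch under assumptions: the treatment dates, the mention rows, and the choice to count each patient, date, and term combination once are hypothetical and chosen only to show the idea.

```python
from collections import Counter
from datetime import date

# Hypothetical treatment window per patient (start, end) and extracted mentions.
TREATMENT = {"P001": (date(2015, 1, 10), date(2015, 6, 30))}
MENTIONS = [
    ("P001", date(2014, 12, 20), "pain"),
    ("P001", date(2015, 2, 1), "pain"),
    ("P001", date(2015, 2, 1), "pain"),    # repeated mention on the same date
    ("P001", date(2015, 3, 15), "anemia"),
    ("P001", date(2015, 8, 2), "anemia"),
]

def summarize(mentions, treatment):
    """Count distinct mention dates per patient and term, split by treatment window."""
    counts = Counter()
    # set() collapses repeated mentions of the same term on the same date.
    for patient_id, when, term in set(mentions):
        start, end = treatment[patient_id]
        period = "on_treatment" if start <= when <= end else "off_treatment"
        counts[(patient_id, term, period)] += 1
    return counts

summary = summarize(MENTIONS, TREATMENT)
for key in sorted(summary):
    print(key, summary[key])
```

Counting distinct dates rather than raw mentions is one possible design choice; for an event expected to occur only once or twice, such as surgery, a different rule (for example, the earliest mention date) may be more appropriate.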
Through the experiences noted at CHSS above, our research team has been able to identify and offer potential suggestions for the best use of NLP applied to electronic data. Our recommendations include the following:
Parsing out phrases that occur before and after the target term will help enrich the context through the use of modifiers and additional information.
Learning how to integrate NLP data with known outcomes (eg, breaks in treatment or treatment duration) may help us to understand if the search term was closely linked to that outcome.
Defining the time windows for which search terms will be captured is important to the research question (eg, pre- or post-treatment).
For example, adverse event information layered upon treatment duration or treatment holiday would allow us to further understand if a break in treatment occurred due to the presence of the adverse event.
Another example would be applying treatment guidelines. As mentioned, CML patients in chronic phase would have a different expected dosage for treatment with imatinib compared with patients in accelerated phase or blast crisis.8 Comparing the search terms for disease phase along with treatment order data would help provide insight into expected versus actual dosing.
Validate testing. If this can be performed internally, it will help determine the rate at which NLP is properly identifying a search term.
For example, associating known data points from structured data against unstructured data mentions of a word may give us an idea of the completeness of NLP on the data. If “metastatic” shows up through text mining and “stage IV” is indicated on a structured field, an association can be made for analyses.
Lastly, applying clinical expertise may help us think about how data is captured on a chart and how it might be represented.3 Having the review of search terms performed by a clinical nurse or oncologist has allowed us to further refine the terms and variations of terms to search.
Common abbreviations or shorthand used in medical dictation may be applied to the search term criteria. If “NSCLC” and “non-small cell” and “non-small cell lung cancer” are each searched, there is a larger net to catch the phrase in question.
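Two of the recommendations above, searching several variants of a term and checking text mentions against a structured field, can be sketched together. Everything here is illustrative: the records, the `stage` field name, and the `build_pattern` and `agreement_rate` helpers are hypothetical, and the rate computed is a simple share, not a formal validation statistic.

```python
import re

# Variants of one concept, as in the NSCLC example; the list is illustrative.
NSCLC_VARIANTS = ["NSCLC", "non-small cell", "non-small cell lung cancer"]

def build_pattern(variants):
    """Compile one case-insensitive pattern, trying the longest variant first."""
    ordered = sorted(variants, key=len, reverse=True)
    body = "|".join(re.escape(v) for v in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)

def agreement_rate(records, pattern, structured_value):
    """Among records whose structured stage equals `structured_value`,
    return the share whose note text also matches `pattern`."""
    flagged = [r for r in records if r["stage"] == structured_value]
    if not flagged:
        return None
    found = sum(1 for r in flagged if pattern.search(r["note"]))
    return found / len(flagged)

# Hypothetical records pairing a structured stage field with note text.
RECORDS = [
    {"stage": "IV", "note": "Metastatic NSCLC, started chemotherapy."},
    {"stage": "IV", "note": "Non-small cell lung cancer; bone lesions noted."},
    {"stage": "II", "note": "Non small cell carcinoma, resected."},
]

nsclc = build_pattern(NSCLC_VARIANTS)
matched = [bool(nsclc.search(r["note"])) for r in RECORDS]
rate = agreement_rate(RECORDS, build_pattern(["metastatic"]), "IV")
print(matched, rate)
```

In this toy data the third note is missed because it omits the hyphen, which is the kind of gap a clinical review of search terms is meant to catch. The agreement rate shows how often “metastatic” appears in text when the structured field indicates stage IV.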
The use of NLP applied to data to supplement public health practices or clinical research involving outcomes is becoming more common and may help extend the data found in secondary sources such as EHRs/EMRs. Understanding how to search for key associations with known data points will strengthen the utility of the data found through NLP. Setting caveats in the final presentation of those analyses is important.
Also, identifying strings of terms rather than search terms alone may help us to use modifiers for context (or the presence or absence of an event or outcome). There is growing potential for the use of NLP data in clinical research, and further documenting the potential problems, degree of completeness, and limitations will help future mining techniques and, thus, the quality of data uncovered. At CHSS, we are also learning from the potential obstacles of using data extracted through NLP to update the process and become more efficient.
At its core, NLP is a field of computer science and artificial intelligence that is seeking to advance associations between computers and human language. We can educate ourselves on the techniques used, software, and future directions. With the growing body of literature using unstructured data run through NLP, perhaps we may find a common methodology that can be applied in our own HEOR studies. Respecting the “miner” is as valuable as the “gold.”
1 Schulman KL, Berenson K, Shih Y-C, et al. A checklist for ascertaining study cohorts in oncology health services research using secondary data: report of the ISPOR Oncology Good Outcomes Research Practices Working Group. Value Health 2013; 16:655-669.
2 Weiner M. Evidence generation using data-centric, prospective, outcomes research methodologies. Presented at the 2011 American Medical Informatics Association Summit on Clinical Research Informatics; San Francisco, CA; March 2011.
3 Nadkarni PM, Ohno-Machado L, Chapman WW. Natural language processing: an introduction. J Am Med Inform Assoc 2011; 18:544-551.
4 Roop ES. A powerful combination. For the Record 2012; 24:18.
5 Amarasingham R, Moore BJ, Tabak YP, et al. An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data. Medical Care 2010; 48:981-988.
6 Thomas AA, Zheng C, Jung H, et al. Extracting data from electronic medical records: validation of a natural language processing program to assess prostate biopsy results. World J Urol 2014; 32:99-103.
7 Comparative effectiveness of third-line treatments in chronic myeloid leukemia using real-world (EMR) data. 2015; CHSS data on file.
8 Gleevec [package insert]. East Hanover, NJ: Novartis Pharmaceuticals; 2015.