Overview
We are in the midst of a fundamental shift in the way that payers, physicians, hospital systems, and suppliers exchange and share information. Consumers are becoming more active in their diagnosis and treatment choices. Providers need to capture and move data more efficiently to meet new regulatory and reporting requirements. Payers want to ensure that best practices and evidence-based medicine are being used to minimize costs. Drug makers and researchers are keen to leverage experience to improve research. And everyone in the process wants data to be mined to achieve the best outcomes. As a result, all stakeholders in healthcare must learn how best to use the new technologies and networks being deployed to meet these needs, and potentially adopt and integrate them into their business and technology strategies.
Knowledge becomes Guidance
Traditionally, science gives knowledge to clinicians, who in turn use this knowledge to inform and treat patients. In a modern healthcare setting, however, providers and patients need assistance in acquiring knowledge relevant to their conditions and treatment, along with the opportunity to discuss this knowledge with their peers, friends, family, advocates and other clinicians.
The problem is not a lack of will or incentive. Cutting through this Gordian knot involves getting enough information from physicians, nurses and patients to ensure the proper outcome. Clinical databases are scarce in healthcare today, and most analytical tools have trouble extracting meaningful information from free text, claims and pharmacy data. Most EMRs have not been designed as clinical systems and have trouble sending or accepting third-party clinical data. Every day the US generates millions of dictated and typed medical reports, each requiring a laborious and time-consuming process of data entry and manipulation in which highly trained and expensive experts manually cull the needed information from each report. Many HIT vendors cannot support the granular clinical data culled from these reports to meet meaningful use and other regulatory requirements put in place by the federal government.
With this guidance in mind, clinicians now use a combination of data capture methods and devices along with Natural Language Processing (NLP), Semantic Search and knowledge-driven software to address the many logistical, technological, financial and administrative challenges inherent in the clinical documentation and caregiving process. The Health Information Technology for Economic and Clinical Health (HITECH) Act brings together, for the first time, a focus on clinical documentation standards (CDA), interoperability of healthcare data (SNOMED, ICD-9, ICD-10), EHR adoption, portability of data (HIE, RHIO), and physician incentives designed to match disparate patient information quickly, accurately and securely, when and where it is needed. NLP allows clinicians to document care any way they choose (pen, keyboard or dictation) by identifying, encoding and extracting a meaningful set of healthcare data into a certified EMR without changing their workflow. This solution combines front-end data capture, NLP and Semantic Search, supported by a Semantic Knowledge Base and the XML-based Common Application Framework required to achieve this goal.
The information within a clinical note is not solely contained within the fixed, highly structured fields we normally associate with a database, such as name, address and phone number. The chronic problems, current medications, History of Present Illness (HPI), Past Medical History, Review of Systems, Vitals, Labs, Social/Family History, Assessment, and Plan are often expressed in free-text narratives that require a skilled clinician or abstractor to interpret. This facet of clinical documentation makes up the “black art” of deciphering the active diagnosis against which to match a patient’s condition or status.
The description, storage and classification of patient information are fundamental components of the clinical documentation process. Patient information must be gathered from multiple sources (the patient interview, the medical history, the admission report and radiology reports) and placed in a normalized or “canonical” format so that it can be matched against meaningful use criteria and passed into an EMR.
Natural Language Processing is a multi-step process. It starts with the lexical analysis of free text into its grammatical components (nouns, verbs, adjectives, etc.) and its syntactic architecture of phrases, sentences, tables, etc. To move beyond the language-based elements of vocabulary, grammar and syntax, one needs to “understand” meaning. Meaning is expressed in terms of concepts and relationships.
To extract meaning from “free text”, one needs to associate subjects, verbs and objects with previously defined concepts. For example, by storing the concept “amoxicillin is an antibiotic” (that is, “what amoxicillin is”) in a computer, one can “interpret” text containing these words (amoxicillin, antibiotic…). The most common relationship is an “is a” relationship: pharyngitis “is a” disease, tamoxifen “is an” estrogen antagonist, and so on. Similarly, one can build a list of diseases such as adenosarcoma, small cell carcinoma, etc., or a hierarchy or taxonomy of diseases.
By parsing the text for words in the taxonomy, the computer can deduce that a word is a disease and then, by utilizing the syntax, decide whether a relationship exists. By classifying the word as either the subject or the object of a phrase, the computer can extract a concept. Finally, adding synonyms to the taxonomy greatly enhances the extraction of meaning from free text, because the computer recognizes more words that are similar to those for which it “knows” a concept.
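The taxonomy and synonym lookup described above can be illustrated with a minimal sketch. The tiny taxonomy, the synonym table and the function name extract_concepts are hypothetical, chosen only for illustration. The sketch covers the dictionary-matching step alone, not the syntactic analysis that decides subject, object or relationship.

```python
import re

# Hypothetical miniature taxonomy: canonical term -> its "is a" category.
TAXONOMY = {
    "pharyngitis": "disease",
    "adenosarcoma": "disease",
    "small cell carcinoma": "disease",
    "amoxicillin": "antibiotic",
    "tamoxifen": "estrogen antagonist",
}

# Hypothetical synonyms, each mapped to a canonical taxonomy term.
SYNONYMS = {
    "sore throat": "pharyngitis",
    "amoxil": "amoxicillin",
}


def extract_concepts(text):
    """Return (matched phrase, canonical term, category) in order of appearance."""
    lookup = {term: term for term in TAXONOMY}
    lookup.update(SYNONYMS)
    # Try longer phrases first so multi-word terms win over shorter overlaps.
    phrases = sorted(lookup, key=len, reverse=True)
    regex = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    results = []
    for match in regex.finditer(text):
        phrase = match.group(0)
        canonical = lookup[phrase.lower()]
        results.append((phrase, canonical, TAXONOMY[canonical]))
    return results


note = "Patient complains of sore throat; started Amoxil 500 mg."
for phrase, canonical, category in extract_concepts(note):
    print(f"{phrase!r} -> {canonical} (is a {category})")
```

Without the synonym table, neither phrase in this sample note would be recognized, which is the point the paragraph makes about synonyms widening what the computer can interpret.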
Once all the clinically relevant data has been normalized, tagged and housed in a Common Application Framework, the process of interpreting and analyzing the data can begin. Once encoded, the information is readily accessible for further clinical processes such as billing, reimbursement, quality assurance analytics, pharmacovigilance, clinical trial identification, registry population, adverse event identification, detection of suspicious findings, and data mining. The whole process takes place with little or no change to existing workflows, supporting front-end data capture, near-real-time decision support and retrospective analysis, which is one of the expected results of the PCAST report. A natural language query based on a semantic output enables patients, physicians or administrators to ask natural language questions that identify a select body of data. (For patients, medical information is used to screen and/or match them to trials; the same data supports Meaningful Use, Core Measures, PQRI, Infection Control and Registry Population.) Finally, by extracting meaning from the clinical narrative data, the VA will be able to categorize the clinical data parameters into a canonical form that may be used for matching or screening patients with appropriate conditions. The power, specificity and flexibility of semantic search allow for the discovery and dissemination of clinical information for optimal decision making.
Semantic Search is helping users:
• Unlock unstructured clinical data for real time coding and extraction of information
• Uncover trends and patterns to reach organizational goals
• Anticipate business changes and needs
• Reduce operational risk by anticipating clinical events
• Identify candidates for clinical trials
• Develop early warning systems for Infection Control across all facilities or geographic regions
• Federate searches across all systems, file directories, journals and web sites, across all data types
Semantic Knowledge Base
An adult human rarely has to think hard in order to realize that a dog is an animal, which is a living thing, which is a material biological entity that therefore can eat, move, and procreate. In order to “understand” a document, one first has to map out concepts into a knowledge base or ontology. Quite simply, an ontology is the description of knowledge about a certain domain (e.g. medicine, genomics, or clinical trials) as expressed in precisely defined terms that are interconnected through their relationships. The implications of such knowledge are easily available to the human mind, but computers, as fast and powerful as they are, have no knowledge of this sort and can make no such inferences. However, we must rely on computers to do much of our searching and thinking, and we must therefore know how to get computers to understand what we want. This in part means getting computers to understand and act on what we know about our world.
In the field of medicine, these efforts have been sponsored by the NIH for over 20 years, and medical ontologies and taxonomies are many and varied. There are many tools for managing ontological databases, both commercial and open source.
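The dog example can be made concrete with a small sketch of how stored “is a” links let a program draw the inference a person makes without effort. The dictionaries IS_A and CAPABILITIES and the functions ancestors and can are hypothetical names for illustration; real medical ontologies are far larger and carry many more relationship types.

```python
# Hypothetical "is a" links, taken from the dog example in the text.
IS_A = {
    "dog": "animal",
    "animal": "living thing",
    "living thing": "material biological entity",
}

# Hypothetical properties attached to one concept in the hierarchy.
CAPABILITIES = {
    "material biological entity": {"eat", "move", "procreate"},
}


def ancestors(concept):
    """Follow "is a" links upward and return every broader concept in order."""
    chain = []
    seen = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        if concept in seen:  # guard against accidental cycles in the data
            break
        seen.add(concept)
        chain.append(concept)
    return chain


def can(concept, ability):
    """A concept inherits the abilities of everything it "is a" kind of."""
    lineage = [concept] + ancestors(concept)
    return any(ability in CAPABILITIES.get(c, set()) for c in lineage)


print(ancestors("dog"))
print(can("dog", "eat"))
```

The ability to eat is stored once, at the most general concept, and every narrower concept inherits it through the chain of relationships.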
Common Application Framework
A Common Application Framework (CAF) is a platform that provides a set of services used to build applications and support business processes based on content. A CAF's native data format is XML. XML content is accepted “as is”; content in other formats is converted to an XML representation when loaded into the server. A CAF manages its own content repository and is accessed using the W3C-standard XQuery language or a Semantic Search engine. (By analogy, a relational database is a specialized server that manages its own repository and is accessed through SQL.)
The next-generation EMR will have a CAF that does much more than just store documents. It must be a secure, highly extensible platform for building content-driven applications. It needs to be flexible, granular, portable and modular. To support secondary data use, it should be able to redact identifiable information while creating a longitudinal record. It should foster a semi-open architecture that allows easy collaboration with other solution vendors that can add value at little or no integration cost to the end user. Applications built on it should include smart problem lists driven by NLP and dictation; adverse event identification and notification; medication reconciliation; identification of best practices based on clinical documentation; and identification and notification when a physician’s patient meets the criteria for a clinical trial. This information should be available to caregivers, providing a 360-degree view to all stakeholders.
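As a rough illustration of the loading and querying behaviour described above, the sketch below keeps an in-memory XML repository, accepts XML “as is”, and converts a non-XML record into XML on load. It is only a stand-in: it uses Python's standard xml.etree.ElementTree module rather than an XQuery engine, and the element names (repository, note, concept) and the functions load_xml, load_record and patients_with are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for the content repository (a real CAF would persist it).
repository = ET.Element("repository")


def load_xml(xml_text):
    """XML content is accepted as is."""
    repository.append(ET.fromstring(xml_text))


def load_record(record):
    """Content in another format (here a dict) is converted to XML on load."""
    note = ET.SubElement(repository, "note", patient=record["patient"])
    for text, category in record["concepts"]:
        concept = ET.SubElement(note, "concept", category=category)
        concept.text = text


def patients_with(concept_text):
    """Return the patient ids of all notes that contain the given concept."""
    return [
        note.get("patient")
        for note in repository.findall("note")
        if any(c.text == concept_text for c in note.findall("concept"))
    ]


load_xml('<note patient="p1"><concept category="disease">pharyngitis</concept></note>')
load_record({
    "patient": "p2",
    "concepts": [("amoxicillin", "antibiotic"), ("pharyngitis", "disease")],
})

print(patients_with("pharyngitis"))
```

In an XQuery-capable server the same question would be expressed as a query over the stored documents instead of a Python loop; the sketch shows only the shape of the data and of the lookup.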
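One possible way to redact identifiable information while still building a longitudinal record is to drop the identifying fields and replace the patient identifier with a keyed pseudonym that stays stable across visits. The sketch below assumes that approach; it is not something the text prescribes. The field names, the SECRET_KEY value and the functions pseudonym, redact and longitudinal are hypothetical, and identifiers buried in free-text narrative would need the NLP steps described earlier.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical set of structured fields treated as identifying.
IDENTIFYING_FIELDS = {"patient_id", "name", "address", "phone"}


def pseudonym(patient_id):
    """Derive a stable, keyed pseudonym so visits can be linked without the id."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def redact(record):
    """Drop identifying fields and attach the pseudonym in their place."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["pseudonym"] = pseudonym(record["patient_id"])
    return cleaned


def longitudinal(records):
    """Group redacted records by pseudonym, ordered by visit date."""
    timeline = {}
    for record in sorted(records, key=lambda r: r["date"]):
        cleaned = redact(record)
        timeline.setdefault(cleaned["pseudonym"], []).append(cleaned)
    return timeline


visits = [
    {"patient_id": "123", "name": "Jane Doe", "date": "2011-03-02", "finding": "pharyngitis"},
    {"patient_id": "123", "name": "Jane Doe", "date": "2011-01-15", "finding": "fever"},
    {"patient_id": "456", "name": "John Roe", "date": "2011-02-01", "finding": "cough"},
]

for key, history in longitudinal(visits).items():
    print(key, [v["finding"] for v in history])
```

Because the pseudonym is derived from the original identifier with a secret key, the same patient maps to the same pseudonym on every visit, while the name and identifier themselves never reach the secondary data set.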