Smart Information Extraction from EHR Documents

Eliminated 15-25% of manual effort.

Achieved OCR accuracy of ~92% and ASR accuracy above 90%.

SITUATION

The client aimed to extract:

  • Text from documents in different formats (searchable PDFs, scanned PDFs, image files, etc.).
  • Major structural components of each document (e.g., prescription, medical history).
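Handling several input formats usually starts with deciding which extraction route a file needs. The following is a minimal sketch of that routing step, not the client's implementation: the route names (`direct_text`, `ocr`, `manual_review`) and the `has_text_layer` flag are hypothetical, and checking a PDF for an embedded text layer is assumed to happen elsewhere with a PDF library.

```python
from enum import Enum


class DocKind(Enum):
    PDF = "pdf"
    IMAGE = "image"
    UNKNOWN = "unknown"


# Leading "magic" bytes for a few common image formats.
_IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",        # JPEG
    b"II*\x00",             # TIFF, little-endian
    b"MM\x00*",             # TIFF, big-endian
)


def detect_kind(data: bytes) -> DocKind:
    """Classify a file by its leading bytes rather than its extension."""
    if data.startswith(b"%PDF-"):
        return DocKind.PDF
    if data.startswith(_IMAGE_SIGNATURES):
        return DocKind.IMAGE
    return DocKind.UNKNOWN


def choose_route(kind: DocKind, has_text_layer: bool = False) -> str:
    """Pick an extraction route (route names are hypothetical)."""
    if kind is DocKind.PDF and has_text_layer:
        return "direct_text"  # searchable PDF: read the embedded text
    if kind in (DocKind.PDF, DocKind.IMAGE):
        return "ocr"  # scanned PDF or image file: needs OCR
    return "manual_review"  # unrecognized format
```

Checking the leading bytes instead of the file extension guards against mislabeled uploads, which are common in archives of scanned records.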

ADDITIONAL REQUIREMENTS

The platform had to integrate search functionality using Elasticsearch and use Automatic Speech Recognition (ASR) to support voice navigation of the EMR application.
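As an illustration of the search requirement, a request body in the Elasticsearch Query DSL might combine a full-text match with a filter on the extracted section. This is only a sketch: the field names `content` and `section` are assumptions about how the extracted documents could be indexed, not details taken from the project.

```python
import json
from typing import Optional


def build_search_body(text: str, section: Optional[str] = None, size: int = 10) -> dict:
    """Build an Elasticsearch Query DSL body for a full-text search.

    `content` and `section` are hypothetical field names.
    """
    bool_query = {"must": [{"match": {"content": text}}]}
    if section is not None:
        # A term filter narrows results without affecting relevance scores.
        bool_query["filter"] = [{"term": {"section": section}}]
    return {
        "size": size,
        "query": {"bool": bool_query},
        "highlight": {"fields": {"content": {}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("metformin", section="prescription"), indent=2))
```

Putting the section constraint in `filter` rather than `must` lets Elasticsearch cache it and keeps scoring driven by the text match alone.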

CHALLENGES

Conventional methods were time- and cost-intensive. The data was heterogeneous, ranging from fully structured records to completely unstructured plain text, which reduced the accuracy of automated data analysis.

HIGHLIGHTS

  • Performed OCR on complex (skewed, rotated), poor-quality historical scanned documents.
  • Modeled the documents to extract the structure required by the client.
  • Optimized the architecture and pipeline to return instantaneous search results using the Elastic Stack.
  • Trained and tuned open-source ASR engines for this specific use case.
  • Applied two-step ASR processing to reduce the data transfer load for voice navigation.
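The case study does not describe what the two ASR steps were. One common way to reduce transfer load, shown here purely as an assumed illustration, is to gate audio locally by energy and send only frames that are likely to contain speech to the recognizer. The function names, the threshold value, and the `transcribe` callback are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def frame_energy(frame: Sequence[int]) -> float:
    """Mean squared amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def select_speech_frames(
    frames: Sequence[Sequence[int]], threshold: float
) -> List[List[int]]:
    """Step 1 (local): keep only frames loud enough to contain speech."""
    return [list(f) for f in frames if frame_energy(f) >= threshold]


def navigate(
    frames: Sequence[Sequence[int]],
    transcribe: Callable[[List[List[int]]], str],
    threshold: float = 1_000_000.0,  # assumed value; would need tuning
) -> Optional[str]:
    """Step 2 (remote): send only the kept frames to the recognizer."""
    speech = select_speech_frames(frames, threshold)
    if not speech:
        return None  # nothing worth sending
    return transcribe(speech)
```

In this sketch, silent frames never leave the client, so the amount of audio transferred scales with how much the user actually speaks rather than with how long the microphone is open.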

BENEFITS

Achieved OCR accuracy of ~92% and ASR accuracy above 90%. The search solution helped end users (doctors) eliminate 15-25% of manual effort. Voice-based navigation also helped improve the quality and efficiency of patient-doctor interactions.