A SCALABLE, LOW-COST FRAMEWORK FOR MULTILINGUAL INTELLIGENT DOCUMENT PROCESSING FOR CONTINUITY OF CARE

Kale, Apoorwa; Khandelwal, Yash; Pandhare, Vibhor; Ghosh, Atreyee; Pathak, Nidhi; Lad, Bhupesh Kumar

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/17803

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kale, Apoorwa	en_US
dc.contributor.author	Khandelwal, Yash	en_US
dc.contributor.author	Pandhare, Vibhor	en_US
dc.contributor.author	Ghosh, Atreyee	en_US
dc.contributor.author	Pathak, Nidhi	en_US
dc.contributor.author	Lad, Bhupesh Kumar	en_US
dc.date.accessioned	2026-02-10T15:50:11Z	-
dc.date.available	2026-02-10T15:50:11Z	-
dc.date.issued	2025	-
dc.identifier.citation	Kale, A., Khandelwal, Y., Pandhare, V., Ghosh, A., Pathak, N., Saitya, B. S., & Lad, B. K. (2025). A SCALABLE, LOW-COST FRAMEWORK FOR MULTILINGUAL INTELLIGENT DOCUMENT PROCESSING FOR CONTINUITY OF CARE. IET Conference Proceedings, 2025(28), 161–166. https://doi.org/10.1049/icp.2025.3682	en_US
dc.identifier.isbn	9781807050351	-
dc.identifier.isbn	9781807050207	-
dc.identifier.isbn	9781837247257	-
dc.identifier.isbn	9781837249916	-
dc.identifier.isbn	9781807050375	-
dc.identifier.isbn	9781837245277	-
dc.identifier.isbn	9781837247295	-
dc.identifier.isbn	9781837247264	-
dc.identifier.isbn	9781837247325	-
dc.identifier.isbn	9781839537776	-
dc.identifier.other	EID(2-s2.0-105027169575)	-
dc.identifier.uri	https://dx.doi.org/10.1049/icp.2025.3682	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/17803	-
dc.description.abstract	Paper-based prescriptions and reports constitute a major part of medical health records across the globe. Accordingly, manual paper-based data recording is common practice in the Community Health Centers (CHCs) in India. Such records are prone to poor handling and fragmented information, and they make the retrieval, sharing, and storage of clinical data inefficient. To address this gap, we present the Intelligent Document Processing Application (IDPA), a low-cost, scalable data digitization pipeline combining Optical Character Recognition (OCR) and Vision-Language Models (VLMs) for converting bilingual (Hindi-English), handwritten, and numerical medical records into structured digital formats. IDPA comprises a two-stage pipeline, where Stage 1 employs table cell segmentation using OpenCV and Stage 2 uses OCR extraction with the PaliGemma VLM. As a proof of concept, the application was tested on a dataset of 150 patient records from the Indian population, which exhibited prevalent data entry issues including overwritten text, obscured columns, and the use of whitener (correction fluid). PaliGemma, refined using over 650 labelled table cell images, attained 74% accuracy and a 13% Character Error Rate (CER), outperforming other open-source VLMs in extracting the medical records. The extracted data is organized into structured dataframes, served through a FastAPI endpoint, and accessible through a Progressive Web App (PWA). The interface supports secure user authentication via the Clerk API and enables real-time image upload, editable tabular outputs, and data export in CSV/PDF formats. Together, these digital tools offer an affordable, user-centric approach to improving healthcare data management in low-resource settings. They hold strong potential for integration with national health systems, improved continuity of care, longitudinal monitoring, and expansion into predictive analytics for clinical decision support.
© This is an open access article published by the IET under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/)	en_US
dc.language.iso	en	en_US
dc.publisher	Institution of Engineering and Technology	en_US
dc.source	IET Conference Proceedings	en_US
dc.title	A SCALABLE, LOW-COST FRAMEWORK FOR MULTILINGUAL INTELLIGENT DOCUMENT PROCESSING FOR CONTINUITY OF CARE	en_US
dc.type	Conference Paper	en_US
Appears in Collections:	Department of Mechanical Engineering; IITI DRISHTI CPS Foundation; Mehta Family School of Biosciences and Biomedical Engineering
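
The abstract reports a 13% Character Error Rate (CER) without stating how it is computed. As an illustration only, the sketch below uses the common definition: the Levenshtein edit distance between the model output and the reference text, divided by the length of the reference. Whether the authors used exactly this definition is an assumption, as is counting characters by Unicode code point (for Devanagari text, combining vowel signs are then counted as separate characters). The function names `levenshtein` and `cer` are hypothetical and do not come from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete ch_a
                curr[j - 1] + 1,                  # insert ch_b
                prev[j - 1] + substitution_cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length.
    Assumed definition; the paper does not spell out its formula."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # A blood-pressure cell read as "128/80" instead of "120/80":
    # one substitution over six reference characters.
    print(round(cer("120/80", "128/80"), 3))
```

Under this definition CER can exceed 1.0 when the hypothesis is much longer than the reference, so it is not strictly a percentage of wrong characters.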
