Personal scientific curriculum vitae - University of Da Nang

Improvement of Machine Learning Method by Combining Flow Text and Layout Text in Extracting Information from Scanned Healthcare Documents

Author(s): Van-Minh Le, Thi Thanh Ha Hoang

Published in: Frontiers in Intelligent Computing: Theory and Applications; No.: 1; Pages: 265-274; Year: 2020

Field: Information technology; Type: Scientific article; Category: International

SUMMARY

Electronic Health Record (EHR) systems play a very important role in hospitals and clinical units. A basic problem for such systems is how to migrate data from a traditional, document-based system to the new one. This paper addresses that problem by extracting data from scanned healthcare documents (.pdf files) and inserting them into a database. Normally, a .pdf file can be converted to text in two formats: flow text and keep-layout text. Flow text presents the recognized words one by one, line by line, whereas keep-layout text presents them column by column. Some columns contain data and others contain descriptions of the data. We propose a method that takes advantage of these descriptions to improve the accuracy of data extraction. The method combines the flow text and keep-layout text formats to build word features for a machine learning approach to data extraction.
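
The idea of combining the two text formats can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not specify the feature set, so the function names (`layout_rows`, `description_lookup`, `word_features`), the rule that columns are separated by two or more spaces, and the assumption that the first column of each row holds the description are all hypothetical choices made for illustration. Each word from the flow text gets its neighbouring words as context, plus the description found in the same row of the keep-layout text.

```python
import re


def layout_rows(layout_text):
    """Split keep-layout text into rows of cells.

    Assumption: columns are separated by runs of two or more spaces.
    """
    rows = []
    for line in layout_text.splitlines():
        cells = [c for c in re.split(r"\s{2,}", line.strip()) if c]
        if cells:
            rows.append(cells)
    return rows


def description_lookup(rows):
    """Map each data word to the description cell of its row.

    Assumption: the first cell of a row is the description and the
    remaining cells are data. The first occurrence of a word wins.
    """
    lookup = {}
    for cells in rows:
        if len(cells) < 2:
            continue  # no data column in this row
        description = cells[0]
        for cell in cells[1:]:
            for word in cell.split():
                lookup.setdefault(word, description)
    return lookup


def word_features(flow_text, layout_text):
    """Build one feature dict per flow-text word, enriched with layout info."""
    words = flow_text.split()
    lookup = description_lookup(layout_rows(layout_text))
    features = []
    for i, word in enumerate(words):
        features.append({
            "word": word,
            "prev": words[i - 1] if i > 0 else "<s>",
            "next": words[i + 1] if i + 1 < len(words) else "</s>",
            "is_digit": word.isdigit(),
            # Feature taken from the keep-layout text, not the flow text.
            "description": lookup.get(word, "<none>"),
        })
    return features


# Toy input, invented for this example.
flow = "Patient name John Smith Age 45"
layout = "Patient name    John Smith\nAge             45"
features = word_features(flow, layout)
```

In this toy input, the word `John` receives the description `Patient name` and `45` receives `Age`, while the description words themselves get `<none>`. Matching words by their string alone is ambiguous when a word occurs more than once in a document; a fuller version would align the two formats by position. The resulting feature dicts could then be passed to any sequence-labelling or classification model.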
