 Data Warehouse Designing for Vietnamese Textual Document-based Plagiarism Detection System
Author(s): Phan Hieu Ho, Trung Hung Vo, Ngoc Anh Thi Nguyen
Published in: IEEE International Conference on System Science and Engineering; Issue: ICSSE 2017; Pages: 254-258; Year: 2017
Field: Information Technology; Type: Scientific paper; Category: International
This paper investigates the role of data warehouse design in a textual anti-plagiarism system. It covers the central issues of data warehouse modeling: (1) formulating the data representation, (2) establishing the foundations of the storage structure, and (3) proposing a corresponding architecture for storing, updating, and managing data. These research axes are addressed at two levels. First, at the theoretical level, the objective is to introduce novel and practical contributions to textual document-based plagiarism detection systems; the proposed approach collects, analyzes, and stores textual datasets. Second, at the implementation level, the paper focuses on the data processing platform, whose modeling exhibits promising capabilities such as support for real-time processing, new data sources, and self-service. The approach is applied to Vietnamese text documents, including final reports and assignments, master's and Ph.D. dissertations, and scientific research papers from the University of Danang. The paper not only provides value to researchers, educators, and students across the University of Danang system, but also serves as groundwork for plagiarism detection in our further investigation of building a big data warehouse serving an automatic duplicate detection system.
