 A Domain indicating method for Ede terminology in building a Vietnamese-Ede bilingual corpus
Author(s): Hoàng Thị Mỹ Lệ; Phan Huy Khánh
Published in: The Proceedings The third Asian Conference on Information Systems - ACIS 2014; Issue: The Proceedings; Pages: 434-439; Year: 2014
Field: Information technology; Type: Scientific paper; Category: International
ABSTRACT
In natural language processing, monolingual and multilingual corpora for different domains and specialties are an indispensable resource. The quality of a multilingual corpus plays a decisive role in the output quality of the systems built on it, such as text analysis and synthesis or machine translation. In particular, a statistical machine translation system will not produce reasonable output if the corpus used in training is of poor quality. Currently, no Vietnamese-Ede multilingual corpus using Unicode fonts has been officially announced, while demand for such a shared resource is growing in research activities from theory to practice. The paper therefore proposes a solution for indicating the domain of Ede terminology, which is applied to build a Vietnamese-Ede bilingual corpus. The corpus contains terminology in the domain of education about animal husbandry, cultivation, forest protection, health care, etc. for the ethnic minority in Vietnam. The solution has also partially solved ambiguity problems in machine translation from Vietnamese into Ede in a restricted context.