Personal scientific curriculum vitae - University of Da Nang







An Investigation on Vietnamese Credit Scoring based on Big Data Platform and Ensemble Learning

Author(s): Quang-Linh Tran, Binh Van Duong, Gia-Huy Lam, Dat Vuong,
and Trong-Hop Do

Published in: The First International Conference on Intelligence of Things; No.: ISBN 978-3-031-15062-3; Pages: 289–298; Year: 2022

Field: Information technology; Type: Scientific paper; Category: International

ABSTRACT

The credit score is a vital indicator that can affect many
aspects of people’s lives. However, credit scores are evaluated manually, which costs a large amount of money and time. This paper learns
from the shortcomings of previous research and offers insights and
empirical experiments on the advantages of distributed solutions for
credit scoring in the future. The research compares several
feature engineering techniques using a big data platform and ensemble
learning methods to find the best solution for predicting the credit score.
Since data related to customers’ financial activities grows enormously, a
big data platform is necessary to handle this amount of data. In this paper,
Spark, a distributed data processing framework, is used to store
and process the data. Experiments are carried out to compare the effectiveness of the feature engineering techniques on this problem. Moreover, a comparative
study of the performance of ensemble learning models is also given
in this paper. A real-world Vietnamese credit scoring data set is used
to develop and evaluate the models. Four metrics are used to evaluate the
performance of the credit scoring models, namely F1-score, recall, precision,
and accuracy. The results are promising, with the highest accuracy of
72.9% achieved by combining the Gradient-boosted Tree model with the cleaned data set
from which categorical features were removed. This paper is a foundation for using
big data platforms to handle financial data, and much future research can
be carried out to improve on the performance reported in this paper.
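
The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `classification_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration, and it assumes a binary labelling of credit outcomes, which the abstract does not state.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1-score from
    binary confusion-matrix counts (hypothetical helper)."""
    total = tp + fp + fn + tn
    # Share of all predictions that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Share of true positives that the model recovers.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only; these are NOT results from the paper.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all four metrics together matters when classes are imbalanced, since accuracy alone can look acceptable while recall on the minority class stays low.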


Address: 41 Lê Duẩn, Đà Nẵng City

Phone: (84) 0236 3822 041; Email: dhdn@ac.udn.vn