Gradient Deep Learning Boosting and its application on the imbalanced datasets containing noises in manufacturing
Authors: Duc-Khanh Nguyen, Chien-Lung Chan, Dinh-Van Phan
Published in: Springer; No.: 314; Pages: 1-10; Year: 2022
Field: Science and technology; Type: Report; Category: International
ABSTRACT
Imbalanced datasets are a common challenge in classification tasks, especially in the manufacturing industry. Skewed class distributions cause traditional machine learning algorithms to perform poorly. In addition, most collected datasets contain noise, such as missing data or irrelevant variables, which makes the analysis even harder, so handling noisy data remains an important step in data analysis. For these two reasons, we propose a Gradient Deep Learning Boosting (GDLB) model for classification on imbalanced datasets that contain noise. To deal with noise, we use an imputation transformer to handle missing data and a random forest method for feature selection. Two benchmark datasets, SECOM and DAIWM, are used to evaluate the proposed method; both are imbalanced and contain noise. On the SECOM dataset, the proposed method achieved an accuracy, recall, Matthews correlation coefficient, and area under the curve of 0.87, 0.70, 0.32, and 0.79, respectively. On the DAIWM dataset, it achieved 0.91, 0.83, 0.56, and 0.87, respectively. We found that combining the proposed Gradient Deep Learning Boosting with noise handling is a promising approach for imbalanced datasets.
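The abstract reports recall and the Matthews correlation coefficient (MCC) alongside accuracy. The following minimal sketch illustrates why that matters on skewed classes. It uses invented toy labels (95 negatives, 5 positives), not the SECOM or DAIWM data, and the function names are illustrative rather than taken from the paper. Area under the curve is left out because it needs predicted scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fp, tn, fn):
    # Fraction of true positives that were found; defined as 0 if there are none.
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, fp, tn, fn):
    # Matthews correlation coefficient; conventionally 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy imbalanced labels (an assumption for illustration): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# Predictor A always outputs the majority class.
pred_majority = [0] * 100

# Predictor B finds 4 of the 5 positives at the cost of 10 false alarms.
pred_detector = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

majority_counts = confusion_counts(y_true, pred_majority)
detector_counts = confusion_counts(y_true, pred_detector)

for name, counts in [("majority", majority_counts), ("detector", detector_counts)]:
    print(name, accuracy(*counts), recall(*counts), round(mcc(*counts), 3))
```

On this toy data the always-negative predictor has the higher accuracy (0.95 against 0.89) while its recall and MCC are both 0, which is why accuracy alone says little about performance on the minority class.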