Dec 25, 2021

XGBoost Evaluation

XGBoost is a popular machine learning algorithm. It builds multiple trees as weak learners and combines them into a strong learner. Unlike a random forest, which averages the trees' outputs (or takes a majority vote), XGBoost sums the outputs of all the trees to compute the final result. To make this "additive" method work, each tree is trained to predict the residual of the previous output (i.e., the ground truth minus the sum of the previous trees' predictions). Ideally, the absolute value of the target (the residual) gradually decreases as the number of trees grows, converging to 0 or falling below some predefined threshold. In this project, the XGBoost algorithm is tested on four datasets: "US-Accident-Oct2021" and "fashion mnist" for classification, and "synchronous machine" and "Beijing PM2.5" for regression.
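The additive, residual-fitting idea can be illustrated with a small sketch. This is not XGBoost's actual implementation (which also uses second-order gradient information and regularization) and it is not the code from this project. It is a plain-Python toy under stated assumptions: squared-error loss, a single numeric feature, one-split "stumps" as the weak learners, and a made-up dataset. The names `fit_stump`, `predict_stump`, `boost`, and `predict` are hypothetical helpers introduced only for this illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump on a 1-D feature by minimizing squared error."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct feature values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    _, thr, left_mean, right_mean = best
    return thr, left_mean, right_mean


def predict_stump(stump, x):
    """Return the leaf value of the stump for a single input x."""
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Train stumps sequentially, each one on the residual left by the previous ones."""
    preds = [0.0] * len(xs)
    stumps, history = [], []
    for _ in range(n_trees):
        # Residual = ground truth minus the sum of the previous trees' outputs.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Record the mean squared residual before adding the next stump.
        history.append(sum(r * r for r in residuals) / len(residuals))
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return stumps, history


def predict(stumps, x, learning_rate=0.5):
    """Final prediction is the (scaled) sum of all stump outputs, not an average."""
    return sum(learning_rate * predict_stump(s, x) for s in stumps)


# Toy data (an assumption for illustration): y = x^2 on eight points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [x * x for x in xs]
stumps, history = boost(xs, ys)
print("mean squared residual, first vs. last round:", history[0], history[-1])
```

In this sketch the recorded mean squared residual never increases from one round to the next, which mirrors the expectation that the residual shrinks as more trees are added. The `learning_rate` shrinkage factor is a common addition in boosting and is assumed here rather than taken from the text above.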

The source code and a report can be found via the following links: