Comparison of Accuracy of Linear Regression and Random Forest Models in Predicting Bitcoin Prices
Abstract
Bitcoin is a digital asset whose value has grown substantially since its launch in 2009. Its high price volatility, however, makes predicting price movements difficult for investors and financial analysts, so a data-driven approach that can capture patterns in historical price data is needed to support better-informed investment decisions. This study evaluates and compares two prediction algorithms, Linear Regression and Random Forest, for predicting Bitcoin prices from daily historical data covering 2018 to 2025. The dataset was obtained from the Kaggle platform and prepared through pre-processing, predictive feature construction, and data normalization. Two validation schemes were used: a 70:30 data split and 10-fold cross-validation (K-Fold Cross Validation). Model performance was evaluated with three metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). The results show that Linear Regression outperforms Random Forest under both the 70:30 split and cross-validation, even though the Random Forest model was tuned with GridSearchCV. The lowest RMSE, 1314.47, was obtained by Linear Regression under the K-Fold scheme. These findings indicate that a simple model such as Linear Regression can still be effective for Bitcoin price prediction when the data are properly processed. This research is expected to serve as a reference for developers of digital asset price prediction systems and for stakeholders in data-driven decision-making.
Keywords: Bitcoin, Price Prediction, Linear Regression, Random Forest, Model Evaluation, Machine Learning, K-Fold Cross Validation
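As a rough illustration of the evaluation described in the abstract, the sketch below computes MAE, MSE, and RMSE and generates the index sets for a 10-fold split, using only the Python standard library. The function names (`mae`, `mse`, `rmse`, `k_fold_indices`) and the sample prices are hypothetical and chosen for this example; they are not taken from the study. The folds here are contiguous and unshuffled, which is an assumption, since the abstract does not say whether the data were shuffled.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds.

    Assumption: no shuffling. The first n_samples % k folds get one extra
    sample so that every sample is used for testing exactly once.
    """
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Made-up values for illustration only; not data from the study.
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 126.0]

print("MAE :", mae(actual, predicted))
print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))

# Each sample lands in exactly one test fold across the 10 folds.
folds = list(k_fold_indices(25, k=10))
print("number of folds:", len(folds))
```

Because RMSE is the square root of MSE, the two metrics always rank models identically. RMSE is the more readable of the two here because it is expressed in the same unit as the price.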