Original version
2022 IEEE 17th International Conference on Control & Automation (ICCA). 2022, 303-308, DOI: https://doi.org/10.1109/ICCA54724.2022.9831854
Abstract
This paper proposes an efficient machine learning framework for maritime big data and uses it to train a random forest model that estimates ships’ propulsion power from ship operation data. The data cover dynamic operations, ship characteristics and environmental conditions. The paper describes the data processing, model configuration, training and performance benchmarking. Both scikit-learn and Spark MLlib were used to search for the best hyperparameter configuration. With this combination, the search and training are much more efficient and can be executed on the latest cloud-based solutions. The results show that random forest is a feasible and robust method for ship propulsion power prediction on large datasets. The best-performing model achieved an R² score of 0.9238.
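To illustrate the kind of workflow the abstract describes, the following is a minimal sketch of a random forest hyperparameter search scored by R², using scikit-learn only. It is not the paper's pipeline: the features (speed, draft, wind), the toy power relation, the synthetic data and the small parameter grid are all assumptions made for illustration, and the Spark MLlib part of the framework is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for ship operation data (hypothetical features, not the paper's dataset).
rng = np.random.default_rng(0)
n = 600
speed = rng.uniform(5.0, 20.0, n)   # speed, knots
draft = rng.uniform(6.0, 12.0, n)   # draft, m
wind = rng.uniform(0.0, 15.0, n)    # wind speed, m/s
X = np.column_stack([speed, draft, wind])

# Assumed toy relation: power grows roughly with speed cubed, plus noise.
power = 2.0 * speed**3 + 50.0 * draft + 10.0 * wind**2 + rng.normal(0.0, 200.0, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, power, test_size=0.25, random_state=0
)

# Small illustrative grid; a real search would cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,            # 3-fold cross-validation on the training split
    scoring="r2",    # select the configuration with the best mean R²
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
best_model = search.best_estimator_
test_r2 = r2_score(y_test, best_model.predict(X_test))
print(search.best_params_, round(test_r2, 4))
```

The R² reported here is computed on a held-out split, so it measures generalization rather than fit to the training data. The value obtained on this synthetic data has no bearing on the 0.9238 reported in the paper.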