The prediction of random phenomena long seemed out of reach, at least with classical mathematical models. Machine learning techniques and the growth of computing power have made the problem tractable; early applications include predicting defective parts in manufacturing and road accidents in developed nations. In rail transport, nothing comparable has been implemented to date, apart from some work on detecting faults in rails and in machines. In this paper, we use machine learning to monitor and predict railway vehicle derailments with the k-Nearest Neighbors (k-NN) method, based only on the main causes pertaining to the rolling stock. The k-NN model achieved a prediction accuracy of 83.61% when evaluated with the confusion matrix and a score of 87% when the ROC curve is used as the metric. The proposed prediction works on the following principle: each train lined up for departure is submitted to the classifier, which predicts that the vehicles numbered XXXXXX, YYYYYY, … and ZZZZZZ, located in positions T1, T2, … and Tn of the train, will derail for the specific causes C1, C2, … and Cn, whereas the vehicles numbered AAAAAA, BBBBBB, … and KKKKKK, located in positions S1, S2, … and Sm, will not derail. Vehicles predicted to derail can then be removed from the train, repaired, and resubmitted to the predictor. The train is allowed to depart only when the number of vehicles predicted to derail drops to zero. Although promising, this solution does not yet provide full safety at its current accuracy of about 84%: roughly 16% of predictions may still be incorrect.
Because human lives are at stake, it is important to find a classifier capable of reducing this risk. We therefore discuss this margin of error and the possibilities for improvement. One option is to combine classifiers, in particular neuro-fuzzy classifiers, since the data to be handled are fuzzy in nature, in order to attain a much more accurate prediction.
Keywords: Machine Learning; derailment; prediction; k-NN; combination of classifiers; fuzzy data; Neuro-fuzzy classifier; risk.
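
The departure-screening principle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the toy training data, the vehicle numbers, the choice of k = 3, and the idea of encoding each derailment cause as a class label alongside a "safe" class are all assumptions made for illustration. The k-NN vote is written in plain Python (Euclidean distance, majority vote) so that the example is self-contained.

```python
from collections import Counter
from dataclasses import dataclass
from math import dist


@dataclass
class Vehicle:
    number: str      # vehicle identifier (hypothetical)
    position: int    # position of the vehicle in the train
    features: tuple  # hypothetical normalized indicators: (wheel wear, axle temperature)


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def screen_train(train, train_X, train_y, k=3):
    """List (number, position, predicted cause) for vehicles predicted to derail."""
    flagged = []
    for vehicle in train:
        label = knn_predict(train_X, train_y, vehicle.features, k)
        if label != "safe":
            flagged.append((vehicle.number, vehicle.position, label))
    return flagged


def may_depart(train, train_X, train_y, k=3):
    """The train may depart only if no vehicle is predicted to derail."""
    return not screen_train(train, train_X, train_y, k)


# Toy labeled history (invented for illustration): each label is either
# "safe" or an assumed derailment cause.
train_X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),   # safe
    (0.90, 0.20), (0.85, 0.10), (0.80, 0.25),   # wheel_wear
    (0.20, 0.90), (0.10, 0.85), (0.25, 0.80),   # axle_overheat
]
train_y = ["safe"] * 3 + ["wheel_wear"] * 3 + ["axle_overheat"] * 3

# A train lined up for departure (hypothetical vehicles).
train = [
    Vehicle("V001", 1, (0.12, 0.18)),
    Vehicle("V002", 2, (0.88, 0.15)),
    Vehicle("V003", 3, (0.15, 0.88)),
]

flagged = screen_train(train, train_X, train_y)
print(flagged)

# Remove the flagged vehicles (in practice: repair and resubmit them),
# then check again whether the remaining train may depart.
flagged_numbers = {number for number, _, _ in flagged}
remaining = [v for v in train if v.number not in flagged_numbers]
print(may_depart(remaining, train_X, train_y))
```

In this sketch a repaired vehicle would be resubmitted with its updated features, and the train is released only once `screen_train` returns an empty list. With an accuracy near 84%, such a gate would still let some misclassified vehicles through, which is the residual risk the abstract discusses.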