Mora-López, L., J. Mora, R. Morales-Bueno and M. Sidrach-de-Cardona
Environmental Modelling and Software – 20.6 (2005), 753-760
Keywords: machine-learning, climatic data, Kolmogorov-Smirnov test
Abstract: A model for characterizing and predicting continuous time series with machine-learning techniques is proposed. The model has three steps: dynamic discretization of continuous values, construction of probabilistic finite automata, and prediction of new series with randomness. Most machine-learning models are developed for discrete values, whereas most natural phenomena are continuous, so a dynamic discretization method is used to convert the continuous values into discrete ones. From the resulting discrete series, probabilistic finite automata are built that capture all the representative information the series contain. The learning algorithm that builds these automata is polynomial in the sample size. An algorithm for predicting new series, which incorporates the randomness found in nature, is also proposed. After the three steps, the similarity between the predicted series and the real ones is checked with a new adaptable test based on the classical Kolmogorov–Smirnov two-sample test: the cumulative distribution functions of the observed and generated series are compared using the concept of indistinguishable values. Finally, the proposed model is applied to several practical cases of time series of climatic parameters.
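Illustrative sketch (not the authors' implementation): the three steps named in the abstract, followed by the similarity check, might be outlined roughly as below. Everything here is an assumption made for illustration. Equal-frequency binning stands in for the dynamic discretization, which the abstract does not specify. A first-order transition table stands in for the probabilistic finite automata. The check is the classical two-sample Kolmogorov–Smirnov statistic, without the paper's adaptation based on indistinguishable values. All function names are hypothetical.

```python
import bisect
import random
from collections import Counter, defaultdict


def quantile_edges(values, n_bins):
    """Interior bin edges at equal-frequency quantiles.
    Assumption: a stand-in for the paper's dynamic discretization."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]


def discretize(values, edges):
    """Map each continuous value to the index of the bin it falls in."""
    return [bisect.bisect_right(edges, v) for v in values]


def fit_transitions(symbols):
    """Count first-order transitions between consecutive symbols.
    Assumption: a fixed-memory simplification of a probabilistic
    finite automaton, not the learning algorithm from the paper."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(symbols, symbols[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Generate a symbol series by drawing each next symbol at random,
    weighted by the observed transition counts."""
    pooled = Counter()  # fallback for a state with no recorded successor
    for successors in counts.values():
        pooled.update(successors)
    series = [start]
    while len(series) < length:
        successors = counts.get(series[-1]) or pooled
        symbols = list(successors)
        weights = [successors[s] for s in symbols]
        series.append(rng.choices(symbols, weights=weights)[0])
    return series


def ks_statistic(a, b):
    """Classical two-sample Kolmogorov-Smirnov statistic: the largest
    absolute gap between the two empirical distribution functions."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a)
            - bisect.bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for a climatic series (no real data is used here).
    observed = [rng.gauss(15.0, 5.0) for _ in range(500)]
    edges = quantile_edges(observed, n_bins=5)
    symbols = discretize(observed, edges)
    model = fit_transitions(symbols)
    simulated = generate(model, symbols[0], len(symbols), rng)
    print("KS statistic on symbol series:", ks_statistic(symbols, simulated))
```

The comparison is made on the discretized symbols rather than on reconstructed continuous values, which keeps the sketch short. A small statistic suggests that the generated series has a marginal distribution close to the observed one, but it says nothing about whether the temporal structure is reproduced.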