Multi-proxy palaeoecological and ecological research typically produces large, heterogeneous databases whose components are related through time. Significant cross-correlations between indicators, and their repeated co-evolutions through time, are difficult to characterize empirically. MOBI-PALEO was created to address this difficulty.

The overall objective is to extract frequent closed gradual patterns, or FCGP (Di-Jorio et al., 2008), which capture order correlations of the form “the more/less X, the more/less Y…”, from large databases in an automated way and therefore with a reduced runtime. This automatic pattern extraction is a form of data-driven modelling, and in that respect data mining methods complement multivariate statistics, which support user-driven modelling of data. The gradual pattern mining algorithms currently reported in the literature assume no temporal constraint on the data, yet numerical palaeoecological databases contain temporal relationships between objects (time-scaled data). Applying data mining in palaeoecology therefore requires running the mining process under a temporal constraint. This need motivated the design of a new, dedicated algorithm that automatically extracts co-evolutions between palaeoecological indicators. Its basic principles and the methodology used to obtain it are detailed in Lonlac et al. (2017, 2018).

Briefly, the initial database is a table in which a set of objects (the different depths, or the equivalent estimated radiocarbon dates) is described by a set of attributes. The table gives the abundance (in percent) of each attribute for each object. In this database, a gradual item is an expression such as (attribute 1=+), while a set such as {attribute 1=+, attribute 2=+, …} is a gradual pattern; this one indicates that the attributes involved are positively correlated (in terms of covariation). First, an algorithm inspired by the approach of Berzal et al. (2007) transforms the original numerical palaeoecological or ecological database into a categorical database. Second, the APRIORI algorithm (Agrawal and Srikant, 1994) is applied to this categorical database to extract frequent closed itemsets, which correspond to the frequent closed gradual patterns (FCGP) of the original numerical database, whose objects are temporally ordered. Finally, the extracted gradual patterns are post-processed according to the user's preferences and research objectives, in order to reduce their number and focus on the most interesting ones.
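
The first two steps of this pipeline can be illustrated with a minimal Python sketch. It is not the MOBI-PALEO implementation: it assumes that the categorical database is obtained by encoding the variation of each attribute between consecutive objects, and it enumerates itemsets level by level with a plain anti-monotone stop rule rather than a full APRIORI candidate generation. The attribute names (`A`, `B`, `C`), the abundance values, and the function names are hypothetical.

```python
from itertools import combinations


def to_variation_transactions(rows, attributes):
    """Encode each pair of consecutive objects as a set of gradual items.

    `rows` must already be sorted by depth or age (assumption: the temporal
    constraint is expressed by comparing consecutive objects only).
    """
    transactions = []
    for prev, curr in zip(rows, rows[1:]):
        items = set()
        for name in attributes:
            if curr[name] > prev[name]:
                items.add((name, "+"))   # attribute increases
            elif curr[name] < prev[name]:
                items.add((name, "-"))   # attribute decreases
            # an unchanged value contributes no gradual item
        transactions.append(frozenset(items))
    return transactions


def frequent_closed_itemsets(transactions, min_support):
    """Return {itemset: support} for the frequent closed itemsets."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    counts = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support * n:
                counts[candidate] = count
                found = True
        if not found:
            break  # no frequent itemset of this size, so none is larger
    # A frequent itemset is closed if no proper superset has the same count.
    return {
        pattern: count / n
        for pattern, count in counts.items()
        if not any(pattern < other and count == other_count
                   for other, other_count in counts.items())
    }


# Hypothetical abundances (%) for five temporally ordered objects.
rows = [
    {"A": 10, "B": 5, "C": 30},
    {"A": 12, "B": 7, "C": 28},
    {"A": 15, "B": 9, "C": 25},
    {"A": 11, "B": 6, "C": 27},
    {"A": 14, "B": 8, "C": 27},
]
transactions = to_variation_transactions(rows, ["A", "B", "C"])
closed = frequent_closed_itemsets(transactions, min_support=0.5)
```

With these toy values, `closed` holds two patterns: {A=+, B=+} with a support of 0.75 and {A=+, B=+, C=-} with a support of 0.5. The single items A=+ and B=+ are frequent but not closed, because {A=+, B=+} occurs exactly as often as each of them.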

FCGP are the most concise representation of the patterns without any loss of information (Pasquier et al., 1999). Accordingly, the FCGP that are positively correlated and reach a low minimum support of 10% have been retained. The support measures how often a FCGP recurs in the database, and a low support threshold ensures that no information is lost. The retained FCGP correspond to the most significant and repeated co-evolutions of indicators.
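
The selection rule can be sketched as a simple filter. This is only an illustration under two assumptions: a pattern is represented as a set of (attribute, direction) pairs with its support as a fraction, and "positively correlated" is read as "all attributes of the pattern vary in the same direction". The function name `retain_patterns` and the example patterns are hypothetical.

```python
def retain_patterns(patterns, min_support=0.10):
    """Keep patterns that reach `min_support` and covary positively.

    Assumption: a pattern is positively correlated when all of its items
    share the same direction (all "+" or all "-").
    """
    kept = {}
    for pattern, supp in patterns.items():
        directions = {direction for _, direction in pattern}
        if supp >= min_support and len(directions) == 1:
            kept[pattern] = supp
    return kept


# Hypothetical patterns with their supports.
patterns = {
    frozenset({("A", "+"), ("B", "+")}): 0.75,  # kept
    frozenset({("A", "+"), ("C", "-")}): 0.50,  # opposite directions
    frozenset({("A", "-"), ("B", "-")}): 0.05,  # below the 10% threshold
}
retained = retain_patterns(patterns)
```

Only {A=+, B=+} survives the filter in this example.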