Theodosios Theodosiou, Stavros Valsamidis, Georgios Hatziliadis and Michael Nikolaidis
Abstract
Purpose
A huge amount of data is produced in the agriculture sector. Because these datasets are so numerous, data analysis techniques are needed to comprehend the data and extract useful information. The purpose of this paper is to measure, mine and apply archetypal analysis to Olea europaea (olive) production data.
Design/methodology/approach
This work applies three different data mining techniques to data about Olea europaea var. media oblonga from the island of Thassos, in the northern part of Greece. The data come from 1,063 farmers in three municipalities of Thassos, namely Kallirachi, Limenaria and Prinos, and concern the year 2010. They were analysed using the classification algorithm OneR, the clustering algorithm k‐means and the association rule mining algorithm Apriori, all from the WEKA data mining package. In addition, new measures that quantify the performance of olive and olive oil production are applied. Finally, archetypal analysis is applied in order to distinguish the most typical/stereotype farms for each region and describe their specific characteristics.
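As an illustration of the simplest of the three techniques, the following is a minimal sketch of the OneR idea: for each attribute, build a one-level rule that maps every attribute value to its most frequent class, then keep the attribute whose rule makes the fewest training errors. It is written in plain Python rather than WEKA, and the function name `one_r`, the attribute names and the six toy records are all hypothetical; they are not taken from the Thassos dataset.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (attribute, rule, training_errors) for the best one-attribute rule."""
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        # Count how often each class occurs for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this one-attribute rule.
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy records, invented for illustration only.
farms = [
    {"irrigation": "yes", "municipality": "Prinos", "yield": "high"},
    {"irrigation": "yes", "municipality": "Limenaria", "yield": "high"},
    {"irrigation": "no", "municipality": "Prinos", "yield": "low"},
    {"irrigation": "no", "municipality": "Kallirachi", "yield": "low"},
    {"irrigation": "yes", "municipality": "Kallirachi", "yield": "high"},
    {"irrigation": "no", "municipality": "Limenaria", "yield": "high"},
]

attr, rule, errors = one_r(farms, "yield")
print(attr, rule, errors)
```

WEKA's OneR additionally discretises numeric attributes before building rules; this sketch assumes all attributes are already nominal.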
Findings
The results indicate that organic cultivation could improve the production of olives and olive oil. Furthermore, the climate differences among the three municipalities seem to be a factor in production efficacy.
Originality/value
This is the first time that data from the island of Thassos have been analysed systematically using a variety of data mining methods. The measures proposed in the paper for analysing the data are also new. Furthermore, archetypal analysis is proposed as a method to extract stereotypes/representative farms from the dataset.