At the centre of the IoT ecosystem, with its billions of connected devices, is the wealth of information that becomes available by fusing data produced in real time with data stored in permanent repositories. This information can enable innovative and unconventional applications and value-added services, and will serve as an immense source for trend analysis and strategic business opportunities. A comprehensive framework for managing the data and information generated and stored by the objects within the IoT is thus required to achieve this goal [1] [2] [3] [4] [5] [6] [7] [8] [9].
Data management is a broad concept referring to the architectures, practices, and procedures for properly managing the data lifecycle requirements of a particular IT system. In the IoT, data management should act as a layer between the physical sensing objects and devices that generate the data, on the one hand, and the applications that access the data for analysis and services, on the other.
IoT data has distinctive characteristics that make traditional relational database management an obsolete solution. A massive volume of heterogeneous, streaming, and geographically dispersed real-time data will be created by millions of diverse devices that periodically send observations about monitored phenomena or report the occurrence of particular or abnormal events of interest. Periodic observations are the most demanding in terms of communication overhead and storage because of their streaming, continuous nature, whereas events are time-constrained, with the required end-to-end response time depending on the urgency of the event. In addition to the data generated by IoT entities, there is also metadata that describes these entities (i.e. “things”), such as object identification, location, and the processes and services provided. IoT data will both reside statically in fixed- or flexible-schema databases and roam the network, moving from dynamic and mobile objects to concentration storage points and onward until it reaches centralised data stores. Communication, storage, and processing will thus be defining factors in the design of data management solutions for the IoT.
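The three kinds of IoT data distinguished above (periodic observations, events, and metadata about "things") can be illustrated with a minimal Python sketch. All class names, field names, and the routing policy are hypothetical, introduced here for illustration only. The sketch assumes that observations may be buffered and stored in batches to limit communication and storage overhead, while events are ordered by the time at which a response is due.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class ThingMetadata:
    """Metadata describing a 'thing' (illustrative fields only)."""
    object_id: str
    location: str
    services: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Observation:
    """A periodic reading about a monitored phenomenon."""
    object_id: str
    timestamp: float
    value: float


@dataclass(frozen=True)
class Event:
    """A report of an event of interest with a response deadline."""
    object_id: str
    timestamp: float
    kind: str
    deadline_s: float  # assumed: maximum tolerated end-to-end response time


class Ingest:
    """Routes incoming messages by type (hypothetical policy)."""

    def __init__(self, batch_size: int = 3) -> None:
        self.batch_size = batch_size
        self._buffer: List[Observation] = []
        self.flushed_batches: List[List[Observation]] = []
        self._urgent: List[Tuple[float, int, Event]] = []
        self._seq = 0  # tie-breaker so Event objects are never compared

    def submit(self, msg: Union[Observation, Event]) -> None:
        if isinstance(msg, Event):
            # Events are ordered by the absolute time a response is due.
            due = msg.timestamp + msg.deadline_s
            heapq.heappush(self._urgent, (due, self._seq, msg))
            self._seq += 1
        elif isinstance(msg, Observation):
            # Observations are buffered and handed to storage in batches.
            self._buffer.append(msg)
            if len(self._buffer) >= self.batch_size:
                self.flushed_batches.append(self._buffer)
                self._buffer = []
        else:
            raise TypeError(f"unsupported message type: {type(msg).__name__}")

    def next_event(self) -> Optional[Event]:
        """Return the event whose response is due soonest, if any."""
        if not self._urgent:
            return None
        return heapq.heappop(self._urgent)[2]
```

In this sketch an event with a short deadline overtakes an earlier event with a longer one, while observations never delay event handling. A real deployment would also need to choose where along the path from device to centralised store such routing takes place.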
Traditional data management systems handle the storage, retrieval, and update of elementary data items, records, and files. In the context of the IoT, data management systems must summarise data online while providing storage, logging, and auditing facilities for offline analysis. This expands the concept of data management from offline storage, query processing, and transaction management into dual online-offline communication and storage operations. We first define the data lifecycle within the context of the IoT and then discuss some of its phases to give a better understanding of IoT data management.
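Before turning to the lifecycle, the dual online-offline operation described above can be made concrete with a minimal Python sketch. The names `RunningSummary` and `DualPathStore`, the JSON-lines log format, and the choice of summary statistics are assumptions made for illustration, not part of any particular IoT platform. Each reading is appended to a log for later offline analysis and, in the same step, folded into an incrementally maintained summary that can be queried online.

```python
import io
import json
from typing import Dict, TextIO


class RunningSummary:
    """Count, mean, minimum, and maximum maintained incrementally."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def update(self, value: float) -> None:
        self.count += 1
        # Incremental mean: no need to keep past readings in memory.
        self.mean += (value - self.mean) / self.count
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


class DualPathStore:
    """Hypothetical store with an offline log and an online summary."""

    def __init__(self, log: TextIO) -> None:
        self._log = log
        self.summaries: Dict[str, RunningSummary] = {}

    def ingest(self, object_id: str, timestamp: float, value: float) -> None:
        # Offline path: append the raw reading as one JSON line.
        record = {"id": object_id, "ts": timestamp, "value": value}
        self._log.write(json.dumps(record) + "\n")
        # Online path: update the per-object summary in constant time.
        self.summaries.setdefault(object_id, RunningSummary()).update(value)


if __name__ == "__main__":
    log = io.StringIO()  # stands in for a file or message log
    store = DualPathStore(log)
    for ts, value in enumerate([10.0, 20.0, 30.0]):
        store.ingest("sensor-1", float(ts), value)
    print(store.summaries["sensor-1"].mean)
```

The summary uses constant memory per object regardless of how many readings arrive, which is the property that makes it suitable for the online path; the log keeps the full history for auditing and offline queries.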