This shows you the differences between two versions of the page.
| en:iot-reloaded:iot_data_analysis [2024/07/19 16:05] – agrisnik | en:iot-reloaded:iot_data_analysis [2025/05/17 08:56] (current) – agrisnik | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== IoT Data Analysis ====== | ====== IoT Data Analysis ====== | ||
| - | {{: | + | IoT systems are built to provide |
| - | + | ||
| - | IoT systems, in their essence, | + | |
| Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, | Today, IoT systems produce a vast amount of data, which is very hard to use manually. Thanks to modern hardware and software developments, | ||
| - | As various resources have stated, IoT in most cases, complies with the so-called big 5Vs of Big Data, where just one correspondence is needed to solve a Big Data problem. As has been explained by Jain et al. ((Jain, A., Mittal, S., Bhagat, A., Sharma, D.K. (2023). Big Data Analytics and Security Over the Cloud: Characteristics, | + | As various resources have stated, IoT, in most cases, complies with the so-called big 5Vs of Big Data, where just one correspondence is needed to solve a Big Data problem. As has been explained by Jain et al. ((Jain, A., Mittal, S., Bhagat, A., Sharma, D.K. (2023). Big Data Analytics and Security Over the Cloud: Characteristics, |
| === Volume === | === Volume === | ||
| Line 13: | Line 11: | ||
| === Variety === | === Variety === | ||
| - | As Jain explained, big data is highly heterogeneous | + | Jain explained |
| === Veracity === | === Veracity === | ||
| Line 21: | Line 19: | ||
| === Velocity === | === Velocity === | ||
| - | Data velocity characterises the data bound to the time and its importance during a specific period or at a particular time instant. A good example might be any real-time system like an industrial process control system, where reactions or decisions must be made during a fixed period | + | Data velocity characterises the data bound to the time and its importance during a specific period or at a particular time instant. A good example might be any real-time system like an industrial process control system, where reactions or decisions must be made during a fixed period, requiring data at particular time instants. In this case, data has a flow nature of a specific |
| === Value === | === Value === | ||
| - | Since the IoT systems and their data analysis subsystems are built to add value to their owner, the costs of the development and ownership should exceed the returned value. | + | Since IoT systems and their data analysis subsystems are built to add value to their owners, the development and ownership |
| ====== ====== | ====== ====== | ||
| - | Dealing with big data requires specific hardware and software infrastructure. While there is a certain number of typical solutions and a lot more customise, some of the most popular are explained here: | + | Dealing with Big Data requires specific hardware and software infrastructure. While there are a number of typical solutions and many more customised ones, some of the most popular are explained here: | ||
| === Relational DB-based systems === | === Relational DB-based systems === | ||
| These systems are based on the well-known relational data model and corresponding database management systems like MS SQL Server, Oracle Server, MySQL, etc. They offer several advantages, for instance: | These systems are based on the well-known relational data model and corresponding database management systems like MS SQL Server, Oracle Server, MySQL, etc. They offer several advantages, for instance: | ||
| - | * Advantages of SQL (Structured Query Language): enabling easy manipulation | + | * Advantages of SQL (Structured Query Language): enabling easy data manipulation while maintaining a relatively good expressiveness of the data model. | ||
| - | * A well-designed set of software tools and interfaces enabling integration with a large number of different systems; | + | * A well-designed set of software tools and interfaces enabling integration with many different systems. |
| * A lot of built-in data processing routines (stored procedures) provide higher development productivity. | * A lot of built-in data processing routines (stored procedures) provide higher development productivity. | ||
| * Enables asynchronous reactions to events by triggering internal events. | * Enables asynchronous reactions to events by triggering internal events. | ||
| * Data reading might be scaled out using multiple entities, while writing might be scaled up using more powerful servers. | * Data reading might be scaled out using multiple entities, while writing might be scaled up using more powerful servers. | ||
| - | Unfortunately, | + | Unfortunately, |
| <figure RelationalDBMS> | <figure RelationalDBMS> | ||
| - | {{ : | + | {{ : |
| - | < | + | < |
| </ | </ | ||
| === Complex Event Processing (CEP) systems === | === Complex Event Processing (CEP) systems === | ||
| - | CEP systems are very application-tailored, | + | CEP systems are very application-tailored, |
| Some of the most common drawbacks to be considered are: | Some of the most common drawbacks to be considered are: | ||
| * It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied. | * It might be scaled up only by introducing higher productivity hardware, which is limited by the application-specific design. To some extent, the design might be more flexible if microservices and containerisation are applied. | ||
| - | * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design. | + | * Due to the factors mentioned above and the complexity, the maintenance costs are usually higher than a universal design |
| + | |||
| + | <figure CEP_systems> | ||
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| === NoSQL systems === | === NoSQL systems === | ||
| - | As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models. NoSQL systems are usually distributed, | + | As the name suggests, the main characteristic is higher flexibility in data models, which overcomes the limitations of highly structured relational data models |
| - | It also provides a means for scaling out and up, enabling high fault tolerance and resilience. A typical approach | + | It also provides a means for scaling out and up, enabling high fault tolerance and resilience. A typical approach | ||
| - | Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance. | + | Some other designs might extend the SQL data models by others – object models, graph models, or the mentioned key-value models, providing highly purpose-driven and, therefore, productive designs. However, the complexity of the design raises problems of data integrity as well as the complexity of maintenance |
| + | |||
| + | <figure NoSQL_systems> | ||
| + | {{ : | ||
| + | < | ||
| + | </ | ||
| === In-memory data grids === | === In-memory data grids === | ||
| This is probably the most productive type of system, providing high flexibility, | This is probably the most productive type of system, providing high flexibility, | ||
| - | * Hazelcast | + | * Hazelcast |
| - | * JBOSS Infinispan | + | * JBOSS Infinispan |
| - | * IBM eXtreme Scale ibm.com/software/products/en/websphere-extreme-scale | + | * IBM eXtreme Scale ((https:// |
| - | * Gigaspace XAP Elastic caching edition www.gigaspaces.com/ | + | * Gigaspace XAP Elastic caching edition |
| - | * Oracle Coherence | + | * Oracle Coherence |
| - | * Terracotta enterprise suite www.terracotta.org/ | + | * Terracotta enterprise suite ((www.terracotta.org/ |
| - | * Pivotal Gemfire | + | * Pivotal Gemfire |
| - | ====== | ||
| - | This chapter is devoted to the main groups of algorithms for numerical data analysis and interpretation, | + | <WRAP excludefrompdf> |
| + | This chapter is devoted to the main groups of algorithms for numerical data analysis and interpretation, | ||
| * [[en: | * [[en: | ||
| Line 80: | Line 88: | ||
| * [[en: | * [[en: | ||
| * [[en: | * [[en: | ||
| + | </ | ||
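
The relational DB-based systems section above lists stored routines and event-driven reactions through internal triggers among the advantages. The following is a minimal sketch of that idea, not a reference design. It uses Python's built-in `sqlite3` module instead of the server products named in the text, and the table names (`readings`, `alerts`), the trigger name, and the 80.0 threshold are hypothetical, chosen only for illustration. Note that an SQLite trigger runs inside the inserting statement, so this shows the reaction mechanism rather than true asynchronous processing.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a trigger reacting to new readings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE readings (
            id INTEGER PRIMARY KEY,
            sensor TEXT NOT NULL,
            temperature REAL NOT NULL
        );
        CREATE TABLE alerts (
            reading_id INTEGER NOT NULL,
            message TEXT NOT NULL
        );
        -- Hypothetical rule: raise an alert for readings above 80.0
        CREATE TRIGGER high_temperature
        AFTER INSERT ON readings
        WHEN NEW.temperature > 80.0
        BEGIN
            INSERT INTO alerts (reading_id, message)
            VALUES (NEW.id, 'High temperature on ' || NEW.sensor);
        END;
        """
    )
    return conn


def store_reading(conn, sensor, temperature):
    """Insert one sensor reading; the database itself decides whether to alert."""
    conn.execute(
        "INSERT INTO readings (sensor, temperature) VALUES (?, ?)",
        (sensor, temperature),
    )
    conn.commit()


def fetch_alerts(conn):
    """Return alert messages in the order the readings were stored."""
    rows = conn.execute("SELECT message FROM alerts ORDER BY reading_id")
    return [row[0] for row in rows]
```

The application code only stores readings; the alerting rule lives in the database, which is the productivity argument the list above makes for built-in routines.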