REACH Solutions' Blog Page

  • Aug 10th, 2018

IoT Gateways

In our earlier posts we talked about how to store, process and analyze data, but we skipped a crucial step: how to collect it. An outstanding challenge for the IoT lies in connecting sensors, devices and endpoints in a cost-effective and secure way, so that massive amounts of data can be captured, analyzed and turned into insight. The IoT gateway is the key element in this process, and below we describe why.

The definition of an IoT gateway has changed over time as the market developed. Just like traditional gateways in networks do, IoT gateways function like bridges – and they bridge a lot, positioned between edge systems and our REACH solution.

IoT Gateway market

IoT gateways fulfil several roles in IoT projects. They are built on chipsets that feature low-power connectivity, and they may be ruggedized for harsh conditions. Some gateways also focus on fog computing applications, in which customers need critical data close to the machines so that split-second decisions can be made. Based on this, IoT gateway vendors can be divided into three groups: vendors who supply only hardware (Dell), companies who focus on software and analytics (Kura, Kepware), and end-to-end providers (Eurotech).

Our IoT Gateway solution belongs to the software and analytics group. It is an OPC client that communicates with an OPC server. OPC is a software interface standard that allows secure and reliable exchange of data with industrial hardware devices.

What are IoT Gateways?

Gateways are emerging as a key element in bringing legacy and next-gen devices to the Internet of Things (IoT). Modern IoT gateways also play an increasingly important role in providing analytics, so that only the most important information and alerts are sent up to REACH to be acted upon. They integrate protocols for networking, help manage storage and analytics on the data, and facilitate secure data flow between edge devices and REACH.
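The filtering role described above can be sketched in a few lines of Python. This is only an illustration and not how REACH implements it: the reading format, the `LIMITS` table with its values and the `filter_readings` function are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical alert limits per sensor type: (low, high). Illustrative values only.
LIMITS: Dict[str, Tuple[float, float]] = {
    "temperature_c": (5.0, 80.0),
    "vibration_mm_s": (0.0, 7.1),
}


def filter_readings(readings: List[dict]) -> List[dict]:
    """Return only the readings that fall outside their configured limits.

    Readings of an unknown sensor type are forwarded as well, so that
    nothing is silently dropped at the gateway.
    """
    forwarded = []
    for reading in readings:
        limits = LIMITS.get(reading["sensor"])
        if limits is None:
            forwarded.append({**reading, "reason": "unknown sensor"})
            continue
        low, high = limits
        if not low <= reading["value"] <= high:
            forwarded.append({**reading, "reason": "out of range"})
    return forwarded


batch = [
    {"sensor": "temperature_c", "value": 42.0},
    {"sensor": "temperature_c", "value": 93.5},
    {"sensor": "vibration_mm_s", "value": 2.3},
]
alerts = filter_readings(batch)
print(alerts)
```

Of the three readings in the batch, only the out-of-range temperature would travel upstream; the other two stay at the gateway.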

Especially in Industrial IoT, as in many other technologies, there is a growing movement towards the fog.

Intelligent IoT gateways

With fog computing (and the movement to the edge overall) we enter the space of what is now known as an intelligent IoT gateway. In the initial, simpler picture an IoT gateway sat between the sensors and devices on one hand and the cloud on the other. Now a lot of analytics and filtering of information is increasingly done closer to the sensors, in fog nodes, for the many reasons explained in our article on fog computing. The illustration below shows where the intelligent IoT gateway (and soon they will all be intelligent) sits in an IoT architecture.


(img source: https://www.postscapes.com/iot-gateways/ )

  • July 10th, 2018

Machine Learning

Machine learning is the process by which we make software algorithms learn from huge amounts of data. The term was originally used by Arthur L. Samuel, who described it as the “programming of a digital computer to behave in a way which, if done by human beings or animals, would be described as involving the process of learning”. ML is an alternative way to build AI: it uses statistics to find patterns in data rather than explicitly hard-coded routines with millions of lines of code. It comprises a group of algorithms that make it possible to build applications which receive input data and predict an output depending on that input. ML can also be understood as a process in which you “show” tons of data (text, pictures, sensor data) to the machine together with the required output, which is the training part, and then you “show” it a new picture without the required output and ask the machine to guess the result.
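To make the two phases concrete, here is a minimal sketch of training and prediction in plain Python. It uses a nearest-centroid classifier, chosen only because it is short; the sensor values, the labels and the function names `train` and `predict` are made up for the illustration.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, required output)


def train(samples: List[Sample]) -> Dict[str, List[float]]:
    """Training: compute the mean feature vector (centroid) of every label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(model: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Prediction: return the label whose centroid is closest to the input."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical (temperature, vibration) readings "shown" with the required output.
training_data = [
    ((40.0, 1.0), "ok"),
    ((42.0, 1.2), "ok"),
    ((75.0, 6.5), "failing"),
    ((80.0, 7.0), "failing"),
]
model = train(training_data)

# A new reading without the required output: the machine has to guess.
guess = predict(model, (78.0, 6.0))
print(guess)
```

Real projects would use a library model instead, but the shape stays the same: fit on labelled data first, then ask for a guess on unseen input.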

Use cases

Machine learning has grown into a very powerful tool for problems from many different areas. Examples include text processing, for categorizing documents or building chatbots and speech recognizers, and image processing, where we train the algorithm with hundreds of thousands of tagged pictures so that it can recognize persons, objects, etc.

However, from our point of view, the more important use cases are those that involve factory machines and processes. Here we have to collect, sort and store many different sensor readings from different production robots in order to train our machine learning solutions for different purposes, such as predicting the failures of these machines.

Several tasks have to be carried out in sequence. Data preparation comes before training and includes, for example, filling in or dropping empty cells, standardizing the data and selecting the important features. Training involves feeding the cleaned data to the algorithm so that it can adjust itself. Finally, we have to be able to measure and improve our solution with new data and parameters.

For these purposes a data pipeline should be built that applies the same processing to newly arrived data as was applied during training. With REACH we are able to do all of these tasks (data preparation, training and building the data pipeline) easily and in a user-friendly way through the UI, producing useful solutions that also reduce downtime and cost.
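The point about treating new data exactly like training data can be sketched as follows. This is a simplified illustration under our own assumptions, not the REACH pipeline: the `Standardizer` class is hypothetical, it drops rows with empty cells and scales each column with the mean and standard deviation learned from the training rows.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


class Standardizer:
    """Learns per-column mean and standard deviation, then reuses them."""

    def fit(self, rows: List[Row]) -> "Standardizer":
        # Data preparation: drop rows that contain empty cells.
        complete = [row for row in rows if None not in row]
        columns = list(zip(*complete))
        self.means = [mean(column) for column in columns]
        # Guard against constant columns, which would divide by zero.
        self.stds = [pstdev(column) or 1.0 for column in columns]
        return self

    def transform(self, rows: List[Row]) -> List[List[float]]:
        # The parameters learned in fit() are applied unchanged.
        return [
            [(value - m) / s for value, m, s in zip(row, self.means, self.stds)]
            for row in rows
            if None not in row
        ]


training_rows = [(10.0, 100.0), (20.0, None), (30.0, 300.0)]
scaler = Standardizer().fit(training_rows)
train_scaled = scaler.transform(training_rows)

# Newly arrived data goes through the very same transformation.
new_scaled = scaler.transform([(25.0, 250.0)])
print(train_scaled, new_scaled)
```

Calling `fit` again on the new data would shift the scale and make the model's inputs inconsistent, which is why the fitted object is reused.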

Machine Learning within REACH

We provide solutions for many of the problems that occur during the lifetime of a machine learning project, with different tools for different roles. We provide a simple graphical UI with pre-configured models, which lets you focus only on the data and the pattern behind it.

Of course, with this approach you are also able to tune the model parameters and compare the results to find out which parameters are more suitable for your application. REACH is also well suited to developers who want to build their own solution: with an embedded Jupyter notebook, users are able to build any model with different technologies such as scikit-learn, Spark MLlib, TensorFlow, etc. These models can be deployed to a single machine or distributed across the cluster for better performance and scalability.

Towards the future

Machine learning as a term is pervasive today, but many people use it in the wrong way, or mix it up with AI or deep learning, which are the topics of our following blog posts. With this introduction you can get a little insight into this technology and a general picture of how to build applications that are able to improve their performance without any human interaction by analysing data and using feedback on their performance.

  • June 10th, 2018

Hybrid Architecture

As we discussed in our previous blog post, data lakes and Big Data are required technologies to cover all the needs of Industry 4.0. But what about my classic data? Should I transfer it to a data lake? Do I have to redesign all my applications and processes to use the new storage layer?

The answer is definitely not. You do not need to throw out your classic databases or the systems and processes that use them. Our approach and best practices say that a classical database engine can live in smooth symbiosis with a modern Big Data data lake. It only takes a well-designed architecture, which is not easy to create. That is why our experts designed REACH to handle such situations. We call this a hybrid architecture, where all data goes to its proper place: some data should be stored in the data lake, and some should go to the classical storage layer.

The question is where to combine these datasets. REACH is designed to combine these different data sources at the process level and at the analytical level as well, so at the end of an analysis you cannot tell whether a piece of data came from the classical layer or from the Big Data data lake.
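As a rough illustration of combining the two layers at the analytical level, the sketch below enriches raw events with master data from a relational table. Everything in it is a stand-in chosen for the example: an in-memory SQLite database plays the classical layer, a list of JSON strings plays the data lake, and the table, field and function names are hypothetical.

```python
import json
import sqlite3

# Classical layer: a relational table with machine master data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE machines (machine_id TEXT PRIMARY KEY, line TEXT)")
db.executemany(
    "INSERT INTO machines VALUES (?, ?)",
    [("m1", "assembly"), ("m2", "packaging")],
)

# Data lake stand-in: raw events kept in their original JSON form.
raw_events = [
    '{"machine_id": "m1", "temperature_c": 71.5}',
    '{"machine_id": "m2", "temperature_c": 38.0}',
    '{"machine_id": "m1", "temperature_c": 74.0}',
]


def combine(connection: sqlite3.Connection, events: list) -> list:
    """Enrich every raw event with the production line stored relationally."""
    lines = dict(connection.execute("SELECT machine_id, line FROM machines"))
    combined = []
    for raw in events:
        event = json.loads(raw)
        combined.append({**event, "line": lines.get(event["machine_id"])})
    return combined


rows = combine(db, raw_events)
print(rows)
```

In the combined rows nothing marks which field came from which layer, which is the effect described above.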

We believe in proper storage: every piece of data must go to the storage layer that suits it, and no single kind of storage should be used for all your data. Going further, our experts designed REACH to handle multiple storage layers. It is not only a matter of deciding whether data goes to the Big Data layer or to the classical layer; different storage techniques should also be used inside the data lake and inside the classical storage layer. A good example is a setup where Kudu, HBase and HDFS live next to each other in the data lake, while in the classical layer relational database storage is mixed with standard file storage. That is why we cannot say that one database engine is good for everything. REACH is designed to support this multi-storage approach to get the maximum out of your data.

  • May 10th, 2018

Data lake

In the world of Big Data, traditional data warehouses are no longer sufficient to support the requirements of Industry 4.0 or to become the foundation of truly real-time solutions. In contrast to the structured data storage concept of traditional data warehouses, a Data Lake keeps the original format and state of the data and provides real-time access to it. The greatest advantage of a Data Lake is that it is capable of storing a tremendous amount of data while preserving its raw format in a distributed, scalable storage system. It is therefore possible to store data coming from various data sources in a way that remains adaptable to future requirements, resulting in a flexibility that current data warehouses cannot provide.
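Keeping data in its raw format means that interpretation is postponed until somebody asks a question. A tiny sketch of this idea, with a Python list standing in for the lake and hypothetical payloads and field names:

```python
import csv
import io
import json

# The "lake": payloads are stored exactly as they arrived, tagged with their format.
lake = [
    {"format": "json", "payload": '{"machine": "m1", "temp": 71.5}'},
    {"format": "csv", "payload": "machine,temp\nm2,38.0\n"},
]


def read_temperatures(entries):
    """Interpret the raw payloads only when a question is asked of them."""
    result = {}
    for entry in entries:
        if entry["format"] == "json":
            record = json.loads(entry["payload"])
            result[record["machine"]] = float(record["temp"])
        elif entry["format"] == "csv":
            for record in csv.DictReader(io.StringIO(entry["payload"])):
                result[record["machine"]] = float(record["temp"])
    return result


temperatures = read_temperatures(lake)
print(temperatures)
```

If a new requirement appears later, only a new reader function is needed; the stored payloads stay untouched.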

What can a Data Lake offer?

The concept of a Data Lake enables factories to fulfill the requirements of Industry 4.0: data generated during production becomes accessible to other participants in the production line in the swiftest, smoothest way with the help of Complex Event Processing (introduced in our previous blog post), because the data is stored in its original raw format and no data transformation slows the process down. For this reason, REACH puts the Data Lake at the heart of its architecture so that we can contribute to the competitiveness of our partners. Without modernizing the storage process, real-time analysis and automation are not possible. Companies that seek to utilize machine learning methods need a wide range of data sources to provide a sufficient amount of data for the algorithms. Cost is also an important element: for a Hadoop-based Data Lake utilizing well-known big data techniques, storage costs are minimal compared to a standard data warehouse solution, because Hadoop consists of open source technologies. Furthermore, its hardware requirements are also lower due to its distributed setup, so it can be built even on commodity hardware.

Should data warehouses be replaced?

One of the main aspects during the design of the REACH architecture was integrability, which is why it offers interfaces to connect to various data sources and applications. See our upcoming blog post on hybrid architectures!

  • April 10th, 2018

Complex event processing

Complex event processing (CEP) is a fundamental paradigm that lets a software system adapt itself to environmental changes. It was introduced to follow, analyze and react to incoming events that require near real-time responses, through early detection of and reaction to emerging scenarios. A CEP architecture has to handle data from multiple, heterogeneous sources, apply complex business rules, and drive outbound actions. It is a technique for tracking, analyzing and processing data as an event happens, and it is useful for Big Data because it is intended to manage data “on the fly”.
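A complex event is typically derived from several simple ones. The sketch below shows one such rule in Python: raise an overheating event when three readings above a threshold arrive within ten seconds. The rule, its default values and the name `detect_overheating` are assumptions for illustration, not a rule taken from REACH.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, temperature)


def detect_overheating(
    events: Iterable[Event],
    threshold: float = 80.0,
    count: int = 3,
    window_s: float = 10.0,
) -> Iterator[float]:
    """Yield the timestamp at which `count` readings above `threshold`
    have been seen within the last `window_s` seconds."""
    hot: Deque[float] = deque()
    for timestamp, temperature in events:
        if temperature > threshold:
            hot.append(timestamp)
        # Forget hot readings that have fallen out of the time window.
        while hot and timestamp - hot[0] > window_s:
            hot.popleft()
        if len(hot) >= count:
            yield timestamp
            hot.clear()  # start over so one incident raises one complex event


stream = [
    (0.0, 70.0), (1.0, 85.0), (2.0, 86.0), (3.0, 79.0),
    (4.0, 90.0), (20.0, 91.0), (35.0, 92.0),
]
incidents = list(detect_overheating(stream))
print(incidents)
```

Because the function is a generator, it handles each event as it arrives and never needs the whole stream in memory.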

The amount and complexity of data is growing

CEP utilizes data that is generated continuously, everywhere in a factory, from different sources such as sensors, PLCs, location tracking, AOI, etc. This data generally differs from earlier data sources in that it is not prepared or cleaned in any way, so it tends to be messy, intermittent, unstructured and dynamic. Data needs to be handled asynchronously, which means that the architecture should allow multiple processes to work simultaneously on a single message or trigger. The number of devices, the volume and variety of sources and the frequency of the data are all growing, and traditional approaches to computing, storing and transporting this data will not scale.
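Asynchronous handling of a single message by several processes can be sketched with Python's `asyncio`. The two handlers here, `store` and `check_rules`, are hypothetical placeholders whose short sleeps stand in for real I/O.

```python
import asyncio


async def store(message: dict) -> str:
    await asyncio.sleep(0.01)  # stands in for a write to storage
    return f"stored {message['id']}"


async def check_rules(message: dict) -> str:
    await asyncio.sleep(0.01)  # stands in for rule evaluation
    return "alert" if message["value"] > 100 else "ok"


async def dispatch(message: dict) -> list:
    """Run every handler on the same message concurrently."""
    return await asyncio.gather(store(message), check_rules(message))


results = asyncio.run(dispatch({"id": 7, "value": 120}))
print(results)
```

Neither handler waits for the other, so adding a third consumer of the same message does not lengthen the path of the first two.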

Complex events in real time

In many use cases, latency matters. The delay between a data event and the reaction to it often must be near real-time. As data grows, throughput suffers and delays become unacceptable. Latency between the time that an event arrives and the time it is processed must be low: typically less than a few milliseconds, but sometimes less than one millisecond. Traditional approaches of centralizing all data and running analytics (even in the cloud) are unsustainable for real-time use cases. Traditional and cloud-based data management and analytics can also pose security challenges, as they are physically outside of the data center’s security perimeter. As machine learning intelligence becomes more commonplace in devices out in the field, those devices become more complex, requiring more CPU and memory and drawing more power. Increasing complexity slows processing down and leads to results being discarded, since by the time the results from the device are gathered, more recent data is desired.

Walking the bridge towards complex event processing

There is a need to bridge the gap between the traditional approach and solutions built on new Big Data technologies such as CEP. Leaders across all industries are looking for ways to extract real-time insight from their massive data resources and to act at the right time. Bridging this gap will enable your company to become a real Industry 4.0-ready factory, so that business outcomes can be maximized by making better-informed, more automated decisions and delivering better service and higher quality to your customers. Our REACH (Real-time Event-Based Analytics and Collaboration Hub) provides such a bridge for you. Let’s build a Proof-of-Concept together to reach your low-hanging fruit and deliver tangible business benefits within a couple of months!


(img source: Steinberg, A., & Bowman, C., Handbook of Multisensor Data Fusion, CRC Press 2001)

  • March 10th, 2018

EDGE vs FOG vs Cloud

In our previous technical blog post we revealed the technology behind the term “fog computing”. As we discussed at a very high level, fog computing is about bringing cloud functionality closer to the data while transporting the data a little away from the edge. Where the data meets the computing functionality, we are talking about fog computing. Now it is time to compare Edge, Fog and Cloud computing.

Edge computing

In edge computing, physical devices, machines and sensors are connected directly to data processing devices. These devices process the data, typically by aggregating, transforming or running algorithms that are not performance-intensive. This technology is usually used when the data is not in a digitally recognisable format. In this case edge nodes transform the non-digital values into digital ones and transport them to the fog layer for further analysis. It is also possible to do lightweight pre-calculations on the edge nodes, but this is not really preferred, because we lose the possibility to run complex algorithms on the raw data in the fog layer.
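What an edge node does can be sketched in a few lines. The example assumes a 12-bit analog-to-digital converter and a sensor range of 0 to 150 units; both figures, as well as the function names and the message layout, are invented for the illustration.

```python
from statistics import mean
from typing import List


def counts_to_value(raw: int, bits: int = 12, low: float = 0.0, high: float = 150.0) -> float:
    """Linearly map a raw ADC count onto the sensor's physical range."""
    full_scale = (1 << bits) - 1
    if not 0 <= raw <= full_scale:
        raise ValueError(f"raw count {raw} outside 0..{full_scale}")
    return low + (high - low) * raw / full_scale


def to_message(node_id: str, raw_samples: List[int]) -> dict:
    """Convert a batch of raw counts and package it for the fog layer.

    The converted values are kept alongside the mean, so the fog layer
    can still run its own algorithms on the full series.
    """
    values = [counts_to_value(raw) for raw in raw_samples]
    return {"node": node_id, "values": values, "mean": mean(values)}


message = to_message("edge-1", [0, 2048, 4095])
print(message)
```

Sending only the mean would save bandwidth, but it would also remove the detail that the fog layer needs, which is the trade-off mentioned above.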

Fog computing

As we discussed in our previous post, fog computing is about running complex, performance-intensive algorithms on the data collected from the edge nodes. These algorithms use more resources and/or require data that is not available on the edge nodes, so it is not possible to run them there. In the fog layer it is also possible to generate control signals and transport them to the edge nodes, which makes the system ready to control any machine or device based on complex, calculated events.
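The following sketch shows why such a decision belongs in the fog layer: it needs readings from several edge nodes at once, which no single edge node has. The limits, the `slow_down` command and the data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def control_signals(
    latest: Dict[str, dict],
    temp_limit: float = 75.0,
    vib_limit: float = 6.0,
) -> List[dict]:
    """Decide on commands using readings from all edge nodes together."""
    average_temp = mean(r["temperature_c"] for r in latest.values())
    shaking = [node for node, r in latest.items() if r["vibration_mm_s"] > vib_limit]
    # Act only when the whole line runs hot AND some node vibrates too much.
    if average_temp > temp_limit and shaking:
        return [{"target": node, "command": "slow_down"} for node in shaking]
    return []


latest_readings = {
    "edge-1": {"temperature_c": 78.0, "vibration_mm_s": 6.8},
    "edge-2": {"temperature_c": 76.0, "vibration_mm_s": 2.1},
}
commands = control_signals(latest_readings)
print(commands)
```

Each returned command names its target, so it can be transported back to the right edge node.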

Cloud computing

Cloud computing is about buying resources from cloud providers and putting the data into the cloud for analysis. In IoT terms, it is unlikely that all raw data will be pushed into a cloud system. In the cloud computing layer we have to distinguish between private and public cloud systems. Public cloud does not mean that your data will be available for public access, but that the resources and services can be ordered by anyone who pays. Private cloud systems are more about connecting several locations into a centralized architecture, such as a data centre, with the company owning the whole architecture. Both private and public cloud systems depend on an (external) network, so in an Industry 4.0 architecture it is advisable to use them only for cross-company aggregated data analysis.

As you can see, we cannot say that Edge, Fog or Cloud computing alone can cover a complex Industry 4.0 system. Our implementation experience shows that the fog computing layer is the most important layer in such an architecture, but it must live in perfect harmony with the Edge and Cloud systems.

  • February 10th, 2018

Fog Computing

In this context, “Fog” means that the “Cloud” moves down, closer to the ground: to the machines, sensors and legacy systems.

Fog and Cloud computing are complementary to each other. As the amount of data increases, transmitting it all to the cloud can lead to challenges such as high latency, unpredictable bandwidth bottlenecks and distributed coordination of systems and clients.

Fog computing brings computing and applications closer to the data, saving bandwidth on billions of devices and enabling real-time processing and analysis on huge datasets and streams.

Our product, REACH, is a Fog computing platform that delivers real-time event stream processing capabilities by using distributed computing, analytics-driven storage and networking functions that reside closer to the data-producing sources.

