REACH Solutions' Blog Page

  • Oct 10th, 2018

Data security

A key factor for applications dealing with large amounts of data – including complex event processing – is security. Now that IoT is more popular than ever, stories about security breaches are increasingly common, as simple internet-connected devices are often poorly secured and thus more vulnerable to different types of attacks.

REACH uses Fog Computing, which means none of the data leaves the factory's territory, ruling out this kind of external attack. Of course, this alone does not make everything secure: if hundreds, thousands, or even tens of thousands of employees and vendors can access all data without restrictions, the chance of a potential disaster is excessive. Just think about what could happen if someone deleted all the data collected over the past years – it doesn't matter whether it was intentional or not.

Most companies only think about security after the Armageddon has already happened – such as a leak or destruction of private data. All of these incidents are avoidable with enough care. Fortunately, there are multiple ways to address these problems, and REACH has these solutions integrated by default.

Kerberos – developed at MIT – plays a key role in authentication: it only lets people (and services) access the data if they can prove their identity. The client authenticates itself at the Kerberos server and receives an encrypted, timestamped ticket-granting ticket (TGT for short); whenever it wants to access a new service within the TGT's lifespan, it asks for a separate ticket for that exact service. The different services are accessible only with these valid tickets, which also have a lifespan, so they become unusable after a short period. All the tickets are encrypted with AES-256, which would take an eternity to brute force even with billions of supercomputers.
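That "eternity" is not a figure of speech. A quick back-of-the-envelope calculation shows why: even under deliberately generous assumptions (our own made-up figures – a billion machines, each testing a trillion keys per second), exhausting the AES-256 keyspace takes a number of years with dozens of digits.

```python
# Back-of-envelope estimate of brute-forcing a 256-bit AES key.
# The machine count and guess rate are illustrative assumptions,
# chosen to be wildly optimistic for the attacker.

KEYSPACE = 2 ** 256          # number of possible AES-256 keys
MACHINES = 10 ** 9           # a billion supercomputers
KEYS_PER_SECOND = 10 ** 12   # a trillion guesses per second, each

SECONDS_PER_YEAR = 60 * 60 * 24 * 365
years = KEYSPACE / (MACHINES * KEYS_PER_SECOND * SECONDS_PER_YEAR)

print(f"{years:.2e} years")  # on the order of 10^48 years
```

For comparison, the universe is roughly 1.4 × 10^10 years old – the attack would outlast it by nearly forty orders of magnitude.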

The next level of security is authorization, where rules specify who can do what. The Lightweight Directory Access Protocol – LDAP for short – is an industry standard for distributed directory access, created at the University of Michigan. Its OpenLDAP implementation is fully open source and integrates well with Kerberos, which makes the two a perfect fit for security. The directory holds all the information about the users and services, and tells which user has permission to access a specific resource.
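To make this concrete, a directory entry in OpenLDAP is typically expressed in LDIF. The sketch below is purely illustrative – the names, the directory tree and the schema choices are our own invented example, not REACH's actual layout – but it shows the shape of the user and group records an authorization check would consult.

```ldif
# Hypothetical operator account; all names and values are invented.
dn: uid=jkovacs,ou=people,dc=factory,dc=example
objectClass: inetOrgPerson
objectClass: posixAccount
uid: jkovacs
cn: Janos Kovacs
uidNumber: 10042
gidNumber: 5000
homeDirectory: /home/jkovacs

# Group membership is what an access rule would typically test.
dn: cn=maintenance,ou=groups,dc=factory,dc=example
objectClass: groupOfNames
cn: maintenance
member: uid=jkovacs,ou=people,dc=factory,dc=example
```

A service can then grant access to a dashboard or data set only to members of the relevant group, instead of managing permissions user by user.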

However, one piece is still missing: what happens when the fog devices communicate with each other? They still have to send data across the local network to collaborate, and someone could sniff those packets. The solution is Transport Layer Security (TLS), a cryptographic protocol that encrypts data over the network so only the intended recipient can read the messages.
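In practice, wrapping a connection in TLS takes only a few lines. The sketch below uses Python's standard `ssl` module; the host name and port are placeholders, and this is a generic illustration rather than REACH's actual transport code.

```python
import socket
import ssl

# create_default_context() applies secure defaults: certificates are
# verified against trusted CAs and the server's host name is checked.
context = ssl.create_default_context()

def open_secure_channel(host: str, port: int) -> ssl.SSLSocket:
    """Wrap a plain TCP connection so traffic is encrypted in transit."""
    raw = socket.create_connection((host, port))
    return context.wrap_socket(raw, server_hostname=host)
```

With this in place, a sniffer on the local network sees only ciphertext; it cannot read the sensor values or control messages travelling between fog nodes.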

Remember: no matter how tall, spiky and strong a fence you have around 95% of your territory's circumference, your fence is only as strong as its weakest part. None of the above technologies would be enough alone, but together they form an all-round security layer to protect your valued data.

reach i4 data security ldap kerberos tls iot industry digital twin
  • Sep 10th, 2018

The role of Industrial IoT in maintenance and manufacturing optimization

Maintenance is carried out in factories on a daily basis to keep machines healthy and the whole manufacturing process efficient. The main goal is to perform maintenance before a particular machine starts producing waste or even suffers a complete failure. It is easy to show that preventing machines from breaking down means lower operating costs and helps keep production smooth and fluent. Still, many factories have a hard time dealing with downtime due to asset failure.

Standard maintenance procedures – Preventive Maintenance (PM)

The purpose of regular care and service done by maintenance personnel is to make sure that the equipment remains productive, without any major breakdowns. For this purpose, maintenance periods are specified conservatively, usually based on data measured by the equipment manufacturer or at the beginning of operation. However, these procedures do not account for the actual condition of the machine resulting from different environmental effects – like ambient temperature and air humidity – raw material quality, load profiles and more.

In order to take the ever-changing operating conditions into account, condition-relevant data needs to be collected and processed. This is condition-based maintenance (CBM). In the case of machining it is essential to measure ambient parameters, machine vibration, sound and motor current, which give a picture of the actual health state of the machine and machining tools. The availability of this data enables the step towards a more sophisticated maintenance mode: predictive maintenance.
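At its simplest, a CBM check compares the latest readings against per-signal limits. The sketch below is a toy illustration – the signal names and thresholds are invented (the vibration figure is loosely inspired by ISO 10816-style limits), and a real deployment would calibrate them per machine.

```python
# Illustrative alert thresholds; values are made up for this sketch.
THRESHOLDS = {
    "vibration_mm_s": 7.1,
    "motor_current_a": 42.0,
    "ambient_temp_c": 55.0,
}

def condition_alerts(readings: dict) -> list:
    """Return the names of signals whose latest value exceeds its limit."""
    return [name for name, limit in THRESHOLDS.items()
            if readings.get(name, 0.0) > limit]

alerts = condition_alerts(
    {"vibration_mm_s": 9.3, "motor_current_a": 35.2, "ambient_temp_c": 48.0})
print(alerts)  # ['vibration_mm_s']
```

Fixed thresholds like these are the entry level; the predictive approach described next replaces them with patterns learned from the data itself.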

Bring maintenance to the next level: Predictive maintenance (PdM)

Maintenance work based on prediction presumes that the following requirements are fulfilled. First of all, the data collected from the asset must contain information showing the signs of an upcoming event. In other words, patterns precisely describing each event can be identified in the measurement signals. If this hypothesis holds true, the next step is to either hard-code the conditions indicating oncoming failure, or use a Machine Learning algorithm to identify and literally learn the particular failure mode patterns.
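The "learning" step can be surprisingly modest in its simplest form: estimate what normal operation looks like from healthy data, then flag readings that drift outside that band. The sketch below is a deliberately minimal stand-in for a real failure-mode model, with invented vibration figures.

```python
import statistics

def learn_normal_band(healthy_samples, k=3.0):
    """Learn a normal operating band (mean ± k standard deviations)."""
    mean = statistics.mean(healthy_samples)
    std = statistics.stdev(healthy_samples)
    return mean - k * std, mean + k * std

def is_anomalous(value, band):
    """Flag a reading that falls outside the learned band."""
    low, high = band
    return not (low <= value <= high)

# "Training" on healthy vibration readings, then scoring new ones.
band = learn_normal_band([2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8, 2.0])
print(is_anomalous(2.05, band), is_anomalous(9.7, band))  # False True
```

Real failure-mode models learn far richer, multi-signal patterns, but the principle is the same: the boundary between normal and abnormal is derived from data rather than hard-coded.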

The second requirement comes into the picture as soon as the patterns have been identified and the model is capable of predicting the unwanted event soon enough to take action. This requirement addresses the architecture of the system making the prediction: it needs to operate in real time. The reason is that in many applications damage can be predicted only shortly before the event (usually a matter of minutes or seconds). Advanced IIoT systems feature real-time operation.

And last but not least, in manufacturing situations where multiple machines and robots are involved – a group of machines together having an impact on product quality or on the operating efficiency of downstream machines – events and data are very complex. Processing this data requires a system with high computing performance and tools to handle the complexity.

The complex data processing and predictive capability of REACH lies in the most advanced Big Data technologies, built-in Machine Learning algorithms and the real-time Fog Computing architecture. The system is capable of learning and distinguishing between failure modes and sending alerts in case of an expected breakdown.
Although Predictive maintenance is a big step for most manufacturers, there is still a next level to go for.

State-of-the-art: Prescriptive maintenance (RxM)

Prescriptive maintenance requires even more detailed data and a checklist of what actions to take when a failure mode pattern is detected. Although technology makes implementing prescriptive systems entirely possible, only a few organizations make it to this point. This step requires very good harmonization between maintenance and production departments, a fair understanding of the problem and efficient cross-department information sharing. These are the key criteria of a successful Industrial IoT implementation anyway.

Manufacturing companies that take the effort to collect data, analyze the problem, identify patterns of inadequate operation, and understand and prepare their data, can reach the level of a Smart Factory in their maintenance operations, too. In the presented case, not only is the checklist displayed on the REACH UI, but emails and SMS messages can also be sent to maintenance personnel and other relevant stakeholders. This minimizes the time required to prepare for the required actions.
Besides email and SMS alerting, REACH can send status messages to engineers and other personnel even via our chatbot called RITA. Using state-of-the-art technology needs to be fun, too!

  • Aug 10th, 2018

IoT Gateways

In our earlier posts we talked about how to store, process and analyze data, but skipped a crucial step: how to collect it. An outstanding challenge for the IoT lies in connecting sensors, devices and endpoints in a cost-effective and secure way to capture, analyze and effectively gain insights from the massive amounts of data. The IoT gateway is the key element in this process; below we describe why.

The definition of an IoT gateway has changed over time as the market developed. Just like traditional network gateways, IoT gateways function as bridges – and they bridge a lot, positioned between edge systems and our REACH solution.

IoT Gateway market

IoT gateways fulfil several roles in IoT projects. They are built on chipsets that feature low-power connectivity and may be ruggedized for harsh conditions. Some gateways also focus on fog computing applications, in which customers need critical data so that machines can make split-second decisions. Based on this, IoT vendors can be divided into three groups: vendors who provide just hardware (Dell), companies who focus on software & analytics (Kura, Kepware), and end-to-end providers (Eurotech).

Our IoT Gateway solution belongs to the software & analytics group: it is an OPC client (communicating with an OPC server). OPC is a software interface standard that allows secure and reliable exchange of data with industrial hardware devices.

What are IoT Gateways?

Gateways are emerging as a key element in bringing legacy and next-gen devices to the Internet of Things (IoT). Modern IoT gateways also play an increasingly important role in providing analytics, so that only the most important information and alerts are sent up to REACH to be acted upon. They integrate networking protocols, help manage storage and analytics on the data, and facilitate secure data flow between edge devices and REACH.
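The "only the most important information" role can be sketched in a few lines: aggregate a window of raw sensor messages locally and forward a summary upstream only when it matters. The field names, limits and logic below are our own illustrative choices, not REACH's actual gateway protocol.

```python
def summarize(window: list) -> dict:
    """Condense a window of raw readings into one summary message."""
    values = [m["value"] for m in window]
    return {"sensor": window[0]["sensor"],
            "min": min(values), "max": max(values),
            "avg": sum(values) / len(values)}

def should_forward(summary: dict, limit: float) -> bool:
    """Only alert upstream when the window peaked above the limit."""
    return summary["max"] > limit

window = [{"sensor": "temp-01", "value": v} for v in (61.0, 64.5, 71.2)]
summary = summarize(window)
print(summary["max"], should_forward(summary, limit=70.0))  # 71.2 True
```

Instead of three raw messages, one compact summary crosses the network – and only because it exceeded the limit. Multiply that by thousands of sensors and the bandwidth saving becomes the gateway's main job.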

In Industrial IoT especially, there is an increasing movement towards the fog, as is the case with many technologies.

Intelligent IoT gateways

With fog computing (and the movement to the edge overall) we really enter the space of what is now known as an intelligent IoT gateway. Whereas in the initial and more simple picture an IoT gateway sat between the sensors, devices and so forth on one hand and the cloud on the other, a lot of analytics and filtering of information is now increasingly done closer to the sensors through fog nodes for myriad possible reasons as explained in our article on fog computing. The illustration below shows where the intelligent IoT gateway (and soon they’ll all be intelligent) sits in an IoT architecture.


(img source: https://www.postscapes.com/iot-gateways/ )

  • July 10th, 2018

Machine Learning

Machine learning is the process of making software algorithms learn from huge amounts of data. The term was originally used by Arthur L. Samuel, who described it as "programming of a digital computer to behave in a way which, if done by human beings, would be described as involving the process of learning". ML is an alternative way to build AI: it uses statistics to find patterns in data rather than explicitly hard-coded routines with millions of lines of code. A family of algorithms makes it possible to build applications that receive input data and predict an output depending on that input. ML can also be understood as a process where you "show" tons of data – text, pictures, sensor data – to the machine together with the required output (this is the training part), and then you "show" it a new picture without the required output and ask the machine to guess the result.

Use cases

Machine learning has grown into a very powerful tool for various problems from different areas, for example text processing – categorizing documents or speech recognition (chatbots) – or image processing, where we train the algorithm with hundreds of thousands of tagged pictures so it can recognize persons, objects, etc.

However, from our point of view, the more important use cases are those where factory machines and processes are involved. In this case we have to collect, sort and store many different sensor readings from different production robots to train our machine learning solutions for different purposes – such as predicting these machines' failures.

Before training, several tasks have to be done: data preparation – which includes, for example, filling in or discarding empty cells, standardizing data and selecting the important features – then training, which involves feeding the cleaned data to the algorithm so it can adjust itself. Finally, we have to be able to measure and improve our solution with new data and parameters.

For these purposes a data pipeline should be built, applying the same processing to newly arriving data as was done in the training stage. With REACH, all of these tasks – data preparation, training and building the data pipeline – can be done easily and in a user-friendly way through the UI, leading to useful solutions that reduce downtime and costs.
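The prepare → train → predict flow above can be condensed into a miniature sketch. To keep it dependency-free we use a nearest-centroid rule instead of a real ML library; the data, labels and helper names are all invented for illustration.

```python
import statistics

def prepare(rows):
    """Drop rows with missing values - the 'empty cells' step."""
    return [r for r in rows if None not in r[0]]

def train(rows):
    """'Learn' one centroid (per-feature mean) for each label."""
    grouped = {}
    for features, label in rows:
        grouped.setdefault(label, []).append(features)
    return {label: [statistics.mean(col) for col in zip(*feats)]
            for label, feats in grouped.items()}

def predict(centroids, features):
    """Assign the label of the closest centroid."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

raw = [((1.0, 1.2), "ok"), ((0.9, None), "ok"), ((1.1, 0.9), "ok"),
       ((5.0, 4.8), "failing"), ((4.7, 5.2), "failing")]
model = train(prepare(raw))
print(predict(model, (4.9, 5.0)))  # failing
```

The crucial point for a production pipeline is that `prepare` is the same function at training time and at prediction time – new data must flow through exactly the same transformations the model was trained on.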

Machine Learning within REACH

We provide solutions for many different problems occurring during the lifetime of a machine learning project, with different tools for different roles. We provide an easy and simple graphical UI with pre-configured models, which helps you focus only on the data and the patterns behind it.

Of course, with this approach you will also be able to tune the model parameters and compare them to find out which parameters are more suitable for your application. For developers who want to build their own solution, REACH is also perfectly suitable: with an embedded Jupyter notebook, users are able to build any model with different technologies – like scikit-learn, Spark MLlib, TensorFlow, etc. These models can be deployed to a single machine or distributed across the cluster for the best performance and better scalability.

Towards the future

Machine learning as a term is pervasive today, but many people use it in the wrong way, or mix it up with AI or deep learning – the topics of our following blog posts. With this introduction you can get a little insight into this technology and a general picture of how to build applications that improve their performance without any human interaction, by analysing data and using performance feedback.

reach i4 data lake big data cep hadoop machine learning ui model learning
  • June 10th, 2018

Hybrid Architecture

As we discussed in our previous blog post, the data lake and Big Data are required technologies to cover all the needs of Industry 4.0. But what about my classic data? Should I transfer it to a data lake? Do I have to redesign all my applications and processes to use the new storage layer?

The answer is definitely not. You do not need to throw out your classic databases, or the systems and processes that use them. Our experience and best practices say that a classical database engine can live in smooth symbiosis with a modern Big Data data lake approach. It is all about a well-designed architecture, which is not easy to create. That's why our experts designed REACH to handle such situations; we call it a hybrid architecture, where all data goes to its proper place. Some data should be stored in the data lake, while some should go to the classical storage layer.

The question is where to combine these datasets. REACH is designed to combine these different data sources at both the process and analytical level, so at the end of an analysis you cannot distinguish whether a piece of data came from the classical layer or from the Big Data data lake.

We believe in the proper-storage principle: all data must go to its proper storage layer, and no single kind of storage should be used for everything. Going further, our experts designed REACH to handle multiple storage layers – not just deciding whether data belongs in the big data layer or the classical layer, but also using different storage techniques inside the data lake and inside the classical storage layer. A good example is Kudu, HBase and HDFS living next to each other, mirroring the classical layer, where relational database storage is likewise mixed with standard file storage techniques. That is why we cannot say that one database engine is good for everything. REACH is designed to support this multi-storage approach to get the maximum out of your data.
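The "cannot distinguish where the data came from" idea can be shown with a toy example: master data in a relational store (SQLite here, standing in for the classical layer), raw events in a lake-style JSON-lines file, and an analysis step that joins them into one uniform result. All names and figures are invented.

```python
import json
import sqlite3

# Classical layer: master data in a relational table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE machines (id TEXT PRIMARY KEY, line TEXT)")
db.execute("INSERT INTO machines VALUES ('m-01', 'assembly-A')")

# Lake-style layer: raw sensor events kept in their original JSON form.
lake_records = [json.loads(line) for line in
                ['{"machine": "m-01", "vibration": 6.4}',
                 '{"machine": "m-01", "vibration": 7.9}']]

# Analysis joins the two layers; the consumer sees one uniform record.
lines = {mid: line for mid, line in db.execute("SELECT id, line FROM machines")}
combined = [{"line": lines[r["machine"]], **r} for r in lake_records]
print(combined[0])  # {'line': 'assembly-A', 'machine': 'm-01', 'vibration': 6.4}
```

Each layer keeps the format it is best at – transactional tables on one side, raw append-only events on the other – and only the analysis layer stitches them together.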

reach i4 data lake big data cep hadoop hybrid architecture
  • May 10th, 2018

Data lake

In the world of Big Data, traditional data warehouses are no longer sufficient to support the requirements of Industry 4.0 and to become the foundation of truly real-time solutions. In contrast to the structured storage concept of traditional data warehouses, a Data Lake keeps the original format and state of the data and provides real-time access to it. The greatest advantage of a Data Lake is that it is capable of storing tremendous amounts of data while preserving the raw format in a distributed, scalable storage system. Therefore, it is possible to store data coming from various data sources so that it remains adaptable to future requirements, resulting in a flexibility that current data warehouses cannot provide.

What can a Data Lake offer?

The concept of a Data Lake enables factories to fulfill the requirements of Industry 4.0: to make data generated during production accessible to other participants in the production line in the swiftest, smoothest way, with the help of the Complex Event Processing method (introduced in our previous blog post), as the data is stored in its original raw format and no data transformation slows this process down. For this reason, REACH puts the Data Lake at the heart of its architecture, so that we can contribute to the competitiveness of our partners. Without modernizing the storage process, real-time analysis and automation are not possible. Companies that seek to utilize machine learning methods need a wide range of data sources to provide a sufficient amount of data for the algorithms. Cost is also an important element: a Hadoop-based Data Lake built on well-known big data techniques has minimal storage costs compared to a standard data warehouse solution, because Hadoop consists of open source technologies. Furthermore, its hardware requirements are also lower due to its distributed setup, so it can be built even on commodity hardware.

Should data warehouses be replaced?

Not necessarily. One of the main aspects in the design of the REACH architecture was integrability, so it offers interfaces to connect to various data sources and applications – existing data warehouses included. See our upcoming blog post on hybrid architectures!

reach i4 data lake big data cep hadoop
  • April 10th, 2018

Complex event processing

Complex event processing (CEP) is a fundamental paradigm for software systems that self-adapt to environmental changes. It was introduced to follow, analyze and react to incoming events that require near real-time responses, through early detection of and reaction to emerging scenarios. A CEP architecture has to handle data from multiple, heterogeneous sources, apply complex business rules, and drive outbound actions. The technique tracks, analyzes, and processes data as each event happens, and it is useful for Big Data because it is intended to manage data „on-the-fly”.
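A minimal CEP rule makes the „on-the-fly” idea concrete: a composite event fires when a pattern of simple events occurs within a time window, evaluated as each reading arrives. The rule below (three over-limit temperatures within five seconds) and all its figures are illustrative, not a REACH rule.

```python
from collections import deque

class OverheatDetector:
    """Fires when `count` over-limit readings arrive within `window_s`."""

    def __init__(self, limit=80.0, count=3, window_s=5.0):
        self.limit, self.count, self.window_s = limit, count, window_s
        self.hits = deque()  # timestamps of recent over-limit readings

    def on_event(self, timestamp: float, value: float) -> bool:
        """Feed one reading; return True when the complex event fires."""
        if value > self.limit:
            self.hits.append(timestamp)
        # Drop hits that have slid out of the time window.
        while self.hits and timestamp - self.hits[0] > self.window_s:
            self.hits.popleft()
        return len(self.hits) >= self.count

det = OverheatDetector()
stream = [(0.0, 82.1), (1.5, 79.0), (2.0, 85.3), (3.1, 81.0)]
print([det.on_event(t, v) for t, v in stream])  # [False, False, False, True]
```

Note that no reading is stored beyond the window and no batch job runs afterwards: the decision is made the moment the qualifying event arrives, which is exactly what distinguishes CEP from store-then-analyze processing.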

The amount and complexity of data is growing

CEP utilizes data generated continuously – everywhere in a factory – from different sources such as sensors, PLCs, location tracking data, AOI, etc. This data generally differs from former data sources: it is not prepared or cleansed in any way, so it tends to be messy, intermittent, unstructured and dynamic. Data also needs to be handled asynchronously, meaning the architecture should facilitate multiple processes running simultaneously on a single message or trigger. The number of devices, the volume and variety of sources, and the frequency of the data are all growing, and will not scale with traditional approaches to computing, storing and transporting them.

Complex events in real time

In many use cases, latency matters. Delays between a data event and a reaction often must be near real-time. Throughput is impacted as data grows, and delays become unacceptable. Latency must be low – typically less than a few milliseconds, and sometimes less than one millisecond, between the time an event arrives and the time it is processed. Traditional approaches of centralizing all data and running analytics (even in the cloud) are unsustainable for real-time use cases. Traditional and cloud-based data management and analytics can also pose security challenges, as they are physically outside the data center's security perimeter. As machine learning intelligence becomes more commonplace in devices out in the field, those devices become more complex, requiring more CPU and memory and drawing more power. Increasing complexity slows processing down and leads to results being discarded, since by the time results from the device are gathered, more recent data is desired.

Walking the bridge towards complex event processing

There is a need to bridge the gap between the traditional approach and solutions built on new Big Data technologies such as CEP. Leaders across all industries are looking for ways to extract real-time insight from their massive data resources and act at the right time. Bridging this gap will enable your company to become a truly Industry 4.0-ready factory, maximizing business outcomes by making better-informed, more automated decisions and delivering better service and higher quality to your customers. Our REACH (Real-time Event-based Analytics and Collaboration Hub) provides such a bridge for you. Let's make a Proof-of-Concept together to reach your low-hanging fruit and deliver tangible business benefits within a couple of months!

reach i4 complex event processing complex event space platform fog computing edge computing distributed iot industry digital twin

(img source: Steinberg, A., & Bowman, C., Handbook of Multisensor Data Fusion, CRC Press 2001)

  • March 10th, 2018

EDGE vs FOG vs Cloud

In our previous technical blog post we revealed the technology behind the term "fog computing". As we discussed at a very high level, fog computing is about bringing cloud functionality closer to the data while transporting the data a bit away from the edge; where the data meets the computing functionality, we are talking about fog computing. Now it is time to compare Edge, Fog and Cloud computing.

Edge computing

In edge computing, physical devices, machines and sensors are directly connected to data-processing devices. These devices process the data, especially by aggregating, transforming or running non-performance-intensive algorithms. This technology is typically used when the data is not yet in a digitally recognisable format: the edge nodes transform these non-digital values into digital ones and transport them to the fog layer for further analysis. It is also possible to do non-performance-intensive pre-calculations on the edge nodes, but this is not really preferred, because we lose the possibility of running complex algorithms on the raw data in the fog layer.
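The digitize-and-condense role of an edge node can be sketched in a few lines. Both the ADC-to-temperature conversion and the record layout below are our own illustrative assumptions, not a real device's specification.

```python
def adc_to_celsius(raw: int) -> float:
    """Hypothetical conversion of a 10-bit ADC reading to a temperature."""
    return raw * 100.0 / 1023.0

def aggregate(raw_samples: list) -> dict:
    """Condense a batch of raw samples into one compact record for the fog."""
    temps = [adc_to_celsius(s) for s in raw_samples]
    return {"n": len(temps),
            "min": round(min(temps), 1),
            "max": round(max(temps), 1),
            "avg": round(sum(temps) / len(temps), 1)}

record = aggregate([512, 530, 498, 605])
print(record)  # one small message upstream instead of four raw samples
```

The trade-off mentioned above is visible here: once only the aggregate leaves the edge node, the fog layer can no longer run its own algorithms on the individual raw samples.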

Fog computing

As we discussed in our previous post, fog computing is about running complex, performance-intensive algorithms on the data collected from the edge nodes. These algorithms use more resources and/or require data that is not available on the edge nodes, so it is not possible to run them there. In the fog layer it is also possible to generate control signals and transport them to the edge nodes, making the system ready to control any machine or device based on complex, calculated events.

Cloud computing

Cloud computing is about buying resources from cloud providers and putting the data into the cloud for analysis. In IoT terms, it is unrealistic to push all raw data into the cloud. Within the cloud computing layer we have to distinguish between private and public cloud systems. Public cloud does not mean that your data will be available for public access, but that the resources and services can be ordered by anyone who pays. Private cloud systems are more about connecting several locations into a centralized architecture, such as a data centre, with the whole architecture owned by the company. Both private and public cloud systems depend on external networks, so in an Industry 4.0 architecture it is advisable to use them only for cross-company, aggregated data analysis.

As you can see, neither Edge, Fog nor Cloud computing can cover a complex Industry 4.0 system alone. Our implementation experience shows that the fog computing layer is the most important layer in such an architecture, but it must live in perfect harmony with the Edge and Cloud systems.

reach i4 platform fog computing edge computing distributed iot industry digital twin
  • February 10th, 2018

Fog Computing

In this context, „Fog” means that „Cloud” moves down, closer to the ground, to the machines, sensors and legacy systems.

Fog and Cloud computing are complementary to each other. As the amount of data increases, transmitting it all to the cloud can lead to challenges such as high latency, unpredictable bandwidth bottlenecks and distributed coordination of systems and clients.

Fog computing brings computing and applications closer to the data, saving bandwidth on billions of devices and enabling real-time processing and analysis on huge datasets and streams.

Our product, REACH, is a Fog computing platform that delivers real-time event stream processing capabilities by using distributed computing, analytics-driven storage and networking functions that reside closer to the data-producing sources.

reach i4 platform fog computing edge computing distributed iot industry digital twin
