Flink in a Nutshell

Javier Ramos
9 min read · Feb 10, 2019

In this post I will try to explain why Flink is gaining so much attention. I will review Flink from the AI and DevOps points of view.

Introduction

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. WOW, okay, what does this mean? Well, basically it can reliably process huge amounts of data in real time, and it does it really fast. The next question is: why do I need huge amounts of data and processing power? Let’s rewind a bit…
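To make that definition a bit more concrete, here is a minimal sketch of a Flink streaming job. This is my own illustrative example, not something from the Flink documentation: it assumes the Flink DataStream API is on the classpath and that lines of text arrive on a local socket. The job keeps a running, per-word count that updates as each event arrives.

```java
// Minimal sketch, assuming flink-streaming-java is on the classpath and a text
// source is listening on localhost:9999 (e.g. started with `nc -lk 9999`).
// Class and source names are illustrative only.
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StreamingWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // An unbounded stream: lines keep arriving and the job never "finishes"
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        lines
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String word : line.toLowerCase().split("\\W+")) {
                        if (!word.isEmpty()) {
                            out.collect(Tuple2.of(word, 1));
                        }
                    }
                }
            })
            .keyBy(t -> t.f0) // partition the stream by word
            .sum(1)           // Flink keeps the running count as managed, fault-tolerant state
            .print();

        env.execute("Streaming word count");
    }
}
```

The interesting part is that the running counts are state Flink manages for you: it is checkpointed and restored on failure, which is what “stateful computations over unbounded streams” means in practice.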

I decided to write this post after reading this great explanation about AI. I already talked about streaming platforms and AI in the past, and both are related. AI is a broad topic, but most business use cases revolve around Machine Learning classification algorithms that help to predict “things” and enhance customer engagement. So AI is just a thing labeler: you present the data to the model and the model adds labels. This is not as fancy as you may have thought, but it is extremely useful.

Common tasks like recognizing someone are extremely difficult to program into a computer. What AI does instead is take a lot of examples that already have labels, train a model on them and then use that model to make predictions. In other words, you let the computer figure out the best way to recognize patterns in the data so that it minimizes the error and makes good predictions. This simple model allows you to enhance your platform and add features like real-time suggestions, chat bots, recommendation systems, etc. It creates an immersive customer experience for your platform; bear in mind that nowadays only the companies that put the customer before their product will thrive, and big data and AI are the main drivers of customer engagement.

You may be asking: what does this have to do with Flink? First, AI requires a lot of data, hence the rise of big data. Second, most AI programs need to react to data feeds in real time (click streams, Internet of Things, etc.). So, on the one hand you have big data and on the other real-time streams. Customers want the info now, not the next day; it is crucial to move from batch to real-time processing to achieve a better user experience, and this is giving rise to new…
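As a sketch of what reacting to such a feed can look like, here is a hypothetical click-stream job (the “userId,page” input format, the socket source and the class name are my own assumptions): instead of waiting for a nightly batch, it emits a count of clicks per page every minute.

```java
// Hypothetical click-stream sketch; a real deployment would read from Kafka,
// Kinesis, etc. instead of a socket. Each incoming line is assumed to be "userId,page".
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ClicksPerPage {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)
            .map(new MapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(String event) {
                    String page = event.split(",")[1]; // naive parsing, fine for a sketch
                    return Tuple2.of(page, 1);
                }
            })
            .keyBy(t -> t.f0)                                          // group by page
            .window(TumblingProcessingTimeWindows.of(Time.minutes(1))) // results every minute, not next day
            .sum(1)
            .print();

        env.execute("Clicks per page, per minute");
    }
}
```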
