Breakthrough innovation of EIT Digital Master School students amazes global community of Apache Flink users

Source: EIT Digital Master School

In their thesis research, two EIT Digital Master School students developed a feature that significantly reduces the time Apache Flink needs to scale up and down. The work proved so relevant that they were selected to present it at two global conferences: Flink Forward, organised by Ververica, the original creators of Apache Flink, and Beam Summit, organised by Google. Their solution can reduce up- and down-scaling time from hours to seconds. This saves companies a lot of money and reduces energy waste, making the computing industry more sustainable.

Apache Flink is an open-source framework and distributed processing engine, designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. The framework is used by big companies such as Alibaba, AWS, Uber, Zalando, eBay, LinkedIn, Spotify, Ericsson and Huawei, to name a few.

Muhammad Haseeb Asif and Sruthi Sree Kumar, who both study Cloud Computing and Services at the EIT Digital Master School (now called Cloud and Network Infrastructures), learned about Apache Flink during a guest lecture by Assistant Professor Paris Carbone in the second year of their double degree master’s at EIT Digital partner university KTH. They spent their first year at TU Berlin. “We were very interested in the topic,” explained the students. They discussed writing their thesis on Apache Flink with Carbone, the lead researcher at the RISE Research Institutes of Sweden.

In January they started working as researchers at RISE for their internship and thesis, which is titled: FlinkNDB – Skyrocketing the stateful capabilities of Apache Flink. During their thesis, they developed FlinkNDB on top of the MySQL Cluster engine. This is a major feature to improve the up- and down-scaling functionality of Apache Flink. Haseeb Asif explains, “Scaling a cluster in and out in no time is one of the active areas of research in distributed data processing. When you process a lot of data, you need a lot of computing resources; vice versa, less data needs fewer resources. But scaling between big and small amounts of resources might take hours. With our application, we reduce those hours for switching to seconds.”

Up- and down-scaling within seconds instead of hours has a huge impact, says Sree Kumar. “A lot of companies are not down-scaling when less data is being processed. That takes too much time and thus money. So, they keep all their CPUs running while only using a fraction of them when the amount of data processing is small. This is not sustainable. With our solution, we only use the resources that are needed. There is no need to waste energy anymore and that saves companies a lot of money.” The solution can also be used for quick system recovery after a crash.

When the students discussed their results, their supervisors within the EIT Digital network suggested applying to a conference. “We applied for Flink Forward, the annual global conference for the Apache Flink community. All big companies gather here to learn from each other on different types of innovations. And we were invited,” says Sree Kumar. Flink Forward was held from Oct. 19 to 22. They were listed as speakers alongside people from Uber, Netflix, Amazon, Bol.com, Yelp, Spotify, Intel, Ververica, Intuit, Microsoft, and Alibaba. “It was an amazing experience,” says Asif. “People were tweeting about our project; someone said that our presentation was the best talk of the day.”

Google itself approached the students to speak at the Beam Summit 2020, a conference for the worldwide community of Apache Beam users and contributors that was held from Aug. 24 to 28. Google spotted the students because they were actively sharing their work within the open-source community. They were using Apache Beam, an open-source project that originated at Google, to test their Apache Flink feature. The topic of the talk was NEXMark-Beam: Your Best Companion For Testing And Benchmarking New Core Stream Processing Libraries.

EIT Digital Master School: Inspiring innovators to take the next steps

The students felt comfortable giving the presentations. After all, they are trained for this at the EIT Digital Master School. “We have learned how to present, how to sell. All those things helped us a lot. All of this would not be possible without the education within EIT Digital Master School and the encouraging EIT Digital network.”

Their contributions to the conferences and the feedback they received motivated the students to move to the next step. “When people say that we have developed something that they were looking for, it gives us inspiration to deliver,” says Sree Kumar.

The project they started working on is open source. The students are now considering turning it into a product and a business. Before that, they want to consult the EIT Digital network on how to get funding, and to finish their thesis, which is set to be finalised by the end of the year. Interested in the students’ work? Read their blog on Medium.