• AI Developers Masterclass –  November 9, 2018 | Cluj-Napoca

    Starting a Big Data project: from collecting the data to machine learning using Kafka, Spark, HDFS and Zeppelin



Intelligence has been, and will always be, one of the most important resources humanity has ever had. And if we had access to a lot more intelligence, there would be no limit to what the human race could do.

The niche of business intelligence is showing a brand-new life. We know what you are thinking… Sophia much? No, it’s not about that. It’s about building, adapting beyond borders, deep learning, reasoning, testing, running correctly and never-ending tasks (like evaluating prototypes and robotic systems). We’re giving the game a brain, constructing a system of action and reaction.

We are working on a new future: we can build AI without losing control over it. How is that? We have to be quite sure that the purpose we put into the machines is the purpose we really desire, because the robot’s only objective is to maximize the realization of human values.

We need to create machines that are altruistic, that want to achieve only our objectives. The robots are uncertain about what those objectives are, so they will watch all of us to learn more about what we really want. Hopefully, in the process, we will learn to be better people.

The only problem in the human–AI relationship is US. So let’s redefine AI.


Starting a Big Data project: from collecting the data to machine learning using Kafka, Spark, HDFS and Zeppelin

This Big Data intro day will focus on three technologies, Apache Kafka, Apache Spark (mainly the streaming and ML parts) and Apache HDFS, plus one powerful notebook, Apache Zeppelin, which will allow us to connect all these technologies in one place and build an end-to-end big data project. We will use open data from meetup.com and try to find out what makes users more likely to RSVP to the different meetups hosted through the meetup.com platform. The day will be split into the following sections:

  • Intro to Big Data architectures
  • Apache Kafka theory intro
  • Apache Spark theory intro
  • Apache HDFS theory intro
  • Spark ML – available libraries, logistic regression explained
  • Building an end-to-end solution – hands-on session
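As a taste of the “logistic regression explained” part of the agenda, here is a minimal sketch of the idea behind logistic regression in plain Python (not Spark ML, and the feature values and labels below are invented purely for illustration; the actual workshop data and features will come from meetup.com):

```python
import math

def sigmoid(z):
    # Map a real-valued score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, lr=0.1, epochs=1000):
    # Per-sample gradient descent on the logistic loss.
    # features: list of feature vectors; labels: list of 0/1 outcomes
    # (e.g. 1 = the user RSVPed to a meetup, 0 = did not).
    n = len(features[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the logistic loss w.r.t. the score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy data: one hypothetical feature (say, number of past events attended).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train(X, y)

# Predicted RSVP probability for a user who attended 4 past events.
print(round(sigmoid(w[0] * 4.0 + b), 2))
```

Spark ML wraps the same idea behind its `LogisticRegression` estimator, distributing the training over the cluster instead of looping in a single process.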

We will use Cloudera CDH during the day, with a setup in the cloud. Participants need a system that can connect to the public cloud (VPNs sometimes interfere with this connection, so it is desirable that they be disabled), with an SSH client and Google Chrome installed.

Participant prerequisites: most of the exercises will be led by the trainers, but SQL knowledge is helpful. An understanding of distributed systems is a plus.


Felix Crișan

Co-founder and CTO @ Netopia, Co-founder @ BTKO.io

Felix is currently the Co-founder and CTO of Netopia Payments


Valentina Crișan

Consultant & Trainer Big Data Technologies @ Densodata

Consultant in Big Data and Cloud domains




BT Arena, Strada Uzinei Electrice, Cluj-Napoca 400375




Bianca TRITEAN, Project Manager 


0040 773 392 398