The end of my internship

Last week I finished my OPW internship.

It was a constructive experience, with both positive and difficult aspects.

The difficult aspects were related to the code review process: firstly because of my clumsy coding style mistakes, and secondly because everyone had to agree on the implementation. To get a patch committed, at least two Mesos committers should give you a “ship it” on the review. This is a good practice because it ensures the high quality of the code.

I had to rewrite the patches several times and this required lots of patience, but I didn’t give up that easily :D. I guess whenever you want to accomplish a goal, you have to fight a little bit.

Here I have to thank my mentor, who helped me get past all the obstacles and answered all my questions.

Even though I didn’t manage to finish everything I wanted, I have learned a lot of new things about Mesos and I have improved my C++ coding skills. I feel extremely satisfied to see my patches upstream and encouraged to keep contributing to more open source projects.

Thank you Apache Mesos for this wonderful experience!

Mesos deep dive

I am extremely happy that my exam session is almost over. It’s been a while since my last blog post, so I take this as a good moment to write a new article about how Mesos works and give some insight into the current research efforts related to this topic.

At a top level, Mesos can be considered a highly available and fault-tolerant operating system kernel. Just as daemons or services run on top of an operating system, data processing frameworks run on top of Mesos. The analogy is even stronger if you think of Mesos as a way of abstracting the CPU, memory, storage and other computational resources of a cluster away from these frameworks.

A good starting point for reading more about Mesos is the article published at the NSDI conference, Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center.

Each framework that runs on top of Mesos can benefit from resource isolation and from what is called resource pooling. Resource isolation is done using container technologies such as Linux cgroups or Docker.

The principle of resource pooling is used not only in distributed systems, but also in networking protocols. For example, Multipath TCP, an extension of TCP, achieves this by transmitting data simultaneously on multiple subflows and shifting traffic from the more congested paths to the less congested ones, using coupled congestion control.

The allocation of resources is done using two policies. One of them, Dominant Resource Fairness (more details in the paper Dominant Resource Fairness: Fair Allocation of Multiple Resource Types), seeks to maximize the minimum dominant share across all frameworks. A framework's dominant share is the largest fraction it holds of any single resource. For example, if framework A is CPU-intensive and framework B is memory-intensive, the DRF algorithm attempts to equalize framework A’s share of CPU with framework B’s share of memory. The other policy is implemented using strict priorities.
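To make the idea of equalizing dominant shares more concrete, here is a minimal sketch of DRF in Python. It is only an illustration and not the actual Mesos allocator code: the function name `drf_allocate`, the dictionary layout and the choice to stop offering to a framework once its next task no longer fits are all my own assumptions. The numbers reproduce the worked example from the DRF paper (a cluster with 9 CPUs and 18 GB of memory).

```python
def drf_allocate(total, demands):
    """Toy Dominant Resource Fairness allocator (illustrative only).

    total:   {"cpu": 9, "mem": 18}          -- cluster capacity
    demands: {"A": {"cpu": 1, "mem": 4}, …} -- per-task demand of each framework
    Returns the number of tasks launched for each framework.
    """
    consumed = {r: 0.0 for r in total}
    allocated = {f: {r: 0.0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}
    active = set(demands)  # frameworks whose next task might still fit

    def dominant_share(f):
        # Largest fraction of any single resource held by framework f.
        return max(allocated[f][r] / total[r] for r in total)

    while active:
        # Always serve the framework with the lowest dominant share
        # (ties are broken by name so the result is deterministic).
        f = min(sorted(active), key=dominant_share)
        demand = demands[f]
        if all(consumed[r] + demand[r] <= total[r] for r in total):
            for r in total:
                consumed[r] += demand[r]
                allocated[f][r] += demand[r]
            tasks[f] += 1
        else:
            # The next task of f does not fit anymore: stop considering it.
            active.discard(f)
    return tasks


cluster = {"cpu": 9, "mem": 18}
per_task = {"A": {"cpu": 1, "mem": 4},   # memory-intensive framework
            "B": {"cpu": 3, "mem": 1}}   # CPU-intensive framework
print(drf_allocate(cluster, per_task))   # {'A': 3, 'B': 2}
```

With this result, A holds 12 of the 18 GB of memory and B holds 6 of the 9 CPUs, so both end up with a dominant share of 2/3.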

Fault tolerance and high availability are achieved by running multiple masters in a standby configuration using ZooKeeper, a coordination service that handles the leader election whenever the active master fails. The communication interface between frameworks and Mesos is composed of a scheduler, which plans the execution of the framework’s tasks, and an executor, which is launched on the slave nodes and runs the tasks. The figure below better summarizes this:

[Figure: Mesos architecture]

I hope you found this interesting and thanks for reading! 😀

My OPW internship: first steps to being part of an open source community

Hi all!

First of all, I have to apologize for being so late with my first blog post! I am extremely happy to finally have the chance to contribute to an open source project, and not just to any project, but to Apache Mesos. I heard about Mesos for the first time two summers ago, from a company that intended to use the framework for big data analytics. At that moment, I started to read more about it (I found a lot of good research papers!!) and I became more interested in what Mesos can do, but I didn’t have the time to “play” with it. In the last few months, I had the chance to get more involved and made my first contributions, which means a lot to me! 😀

So, what is my project about? My task is to implement the functionality that makes Mesos work in an IPv6 environment. This involves finding a convenient way to store IP addresses in the code, writing wrappers to abstract method calls for different address families, performing the routing and isolation of network containers, and writing tests to validate the new functionality.
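To give an idea of what such an address abstraction looks like, here is a small sketch in Python. It is not the Mesos code (which is written in C++): the class name `IP` and the method `sockaddr` are hypothetical names that I made up for this illustration, and only the standard `ipaddress` and `socket` modules are used.

```python
import ipaddress
import socket


class IP:
    """Hypothetical wrapper that hides the address family from its callers."""

    def __init__(self, text):
        # ip_address() returns an IPv4Address or an IPv6Address,
        # and raises ValueError if the text is neither.
        self._addr = ipaddress.ip_address(text)

    @property
    def family(self):
        # Map the parsed address onto the matching socket address family.
        return socket.AF_INET if self._addr.version == 4 else socket.AF_INET6

    def sockaddr(self, port):
        # Python sockets expect (host, port) for IPv4 and
        # (host, port, flowinfo, scope_id) for IPv6.
        if self.family == socket.AF_INET:
            return (str(self._addr), port)
        return (str(self._addr), port, 0, 0)

    def __str__(self):
        return str(self._addr)


for text in ("127.0.0.1", "::1"):
    ip = IP(text)
    print(ip, ip.family.name, ip.sockaddr(5050))
```

The point of the wrapper is that the calling code never has to check the address family itself: it asks the object for whatever the socket API needs.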

So far, I have concentrated on creating the IP address abstraction and the wrappers for the method calls of the different address families. Here are my reviews and submitted patches.

The most important thing that I have learned is how to interact with the community: identify problems, propose solutions and discuss which one is better. From my point of view, these are the most important aspects (besides learning more about an open source project) that can help students and interns gain more experience from programs such as OPW or GSoC.

Even though I am a little bit behind on my timeline because it’s the end of the semester (I am a student) and I have lots of deadlines to meet, I am going to catch up!

Summing up, I am happy to have this opportunity, I am going to do my best, and I promise to update this blog more often from now on!