Luis Gonzalez Blog

I Participated in Google Summer of Code 2026

Motivation

I always wanted to contribute to Open Source projects, but I was too afraid to do it because of the unfamiliar codebases, imposter syndrome and my lack of experience. In late 2025, I decided to get over my fear and started looking for a project to contribute to. I love Machine Learning (ML) and all the frameworks/tools around it, so that is where I started looking. This led me to Metaflow, a project I was vaguely familiar with. Metaflow is an ML framework that lets you build ML projects quickly and easily. After joining the Slack community in early October, I introduced myself in one of the main channels and got a welcoming message from the creator of Metaflow, Ville Tuulos. I also sent direct messages to some of the core maintainers of the project and asked for guidance on how to start learning the codebase. I had also bought a copy of Effective Data Science Infrastructure: How to make data scientists productive in order to learn more about Metaflow at a high level.

How it started

In early January of 2026, I opened LinkedIn and saw a post announcing that Google Summer of Code (GSoC) was going to start in one month (February). I quickly searched on Google to learn what the program was. GSoC is a program that lets early-career and inexperienced developers contribute to Open Source projects. The way it works is that you pick an organization you want to contribute to, and each organization has multiple projects for which you can submit a proposal. The project proposal is a document where you describe how you would implement the project, including technical details, a timeline, system design, etc. I thought this program was a great opportunity to start my Open Source journey.

Shortly after the event started, Google announced all the organizations that were going to participate in the event. One of them was Metaflow. What a coincidence; it was perfect timing.

Luckily, I had already joined the Metaflow Slack community in October 2025, and GSoC was going to start in February 2026.

How to learn a big codebase

The Metaflow codebase is complex: it has about 1,785,235 total lines across all files, of which the core package accounts for about 187,526 lines, about 123,492 of them Python. It also relies heavily on metaprogramming, decorators, dynamic API construction, etc.
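To get a feel for the size of a codebase like this, a rough line count is a good first step. Below is a minimal sketch of one way to do it in Python. The function name count_lines and its suffix parameter are my own for illustration, and this is not necessarily how the numbers above were produced; dedicated tools may count differently (for example, by skipping blank lines or comments).

```python
from pathlib import Path


def count_lines(root, suffix=None):
    """Count lines in every file under `root`.

    If `suffix` is given (for example ".py"), only files with that
    extension are counted. This is a hypothetical helper for illustration.
    """
    total = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if suffix is not None and path.suffix != suffix:
            continue
        try:
            # Read in binary mode so files that are not valid text still count.
            with path.open("rb") as fh:
                total += sum(1 for _ in fh)
        except OSError:
            # Skip files that cannot be read (permissions, broken links, ...).
            continue
    return total
```

Calling count_lines("path/to/repo") gives the total across all files, and count_lines("path/to/repo", suffix=".py") gives the Python-only count, where "path/to/repo" is a placeholder for wherever you cloned the project.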