Data Democratization at Airbnb

By Etienne Gautheron, Reeport Blog

A sneak-peek into data wonderland.

At Reeport we believe that, in the near future, companies will have to be data-driven in order to be successful.

Although every company has tons of data, it’s tough to empower every employee to make data-informed decisions. It’s tough because it takes more than data to be data-driven. It takes culture (the will to make data readily available across the organization), tools (from data collection to data visualization and sharing), and education (knowing how to find and interpret data).

In this widespread struggle, a few companies stand out. Airbnb is one of them.

If you care at all about using data to give your company an edge over the competition, Airbnb is definitely the company you want to draw inspiration from.

That’s why, earlier this summer, we asked Jeff Feng (Product Manager leading data visualization and experimentation at Airbnb) if he could share a few words of wisdom – which he graciously agreed to do!

In the spirit of openness, we wanted to share what we learned.

For a startup like Reeport, Airbnb comes in as an obvious source of inspiration: a suite of home-grown tools (Airpal, Superset, Dataportal), 100+ people on the data science team (!), and an in-house Data University aligned with one mandate: “Every employee should be empowered to make data informed decisions.”

Why is Airbnb investing so much in data democratization?

Jeff: There are a few reasons:

  • Airbnb is actually a very complex product:
    • It is a two-sided marketplace with hosts and travellers, each having various profiles;
    • We have a long conversion funnel: there are several touches across various devices, users are not necessarily logged in, and time passes between the first touch and the conversion, whether you’re planning a family vacation or a business trip;
    • We ran 2,500 experiments last year; that’s an average of roughly 50 experiments launched every week.
  • It’s in our culture:
    • We believe in building our own tools rather than buying;
    • We have a culture of sharing, we love open sourcing what we build.
  • We’ve got sponsorship from our top management:
    • We, as the data science team, were challenged to think of the best way to share our knowledge outside of the data science team.

That’s actually where the Data University came in. Here are the two sides of the story:

  1. The data science and engineering teams were data informed.
  2. Many other parts of the organization struggled to use data effectively.

People were just not trained on the tools we had built or made available to them. They had to go find the right people and ask them for what they were looking for.

As a result, while the data science team should be focusing on inference analyses and machine learning, they ended up doing a lot of ad hoc analyses for their colleagues.

Data Democratization = Toolset + Mindset

We thus came up with the Data University to teach everyone how to fish. The hypothesis was the following: if we teach everyone how to fish, the data science team will get to focus on strategic analyses because non-data science teams will be able to find data-informed answers to their questions on their own.

The Data University is made for anyone to get started on data science and to answer their own questions, no matter their initial data science knowledge (or lack thereof).

What’s the number one thing that had the biggest impact on data democratization?

The Data University is definitely number one! We measure the impact of our initiatives by tracking the number of Weekly Active Users (WAU) on our data tools.

Ideally we’d like our OKRs (Objectives and Key Results) to be measured by the number of good decisions, but we haven’t cracked that yet! So for now the best proxy we’ve found is the number of WAU across all data tools.
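To make the WAU proxy concrete, here is a minimal sketch of how such a metric could be computed. It is our own illustration, not Airbnb’s implementation: the event log, its fields, and the function name `weekly_active_users` are all assumptions, and a user counts as active in a given ISO week if they touched any data tool at least once that week.

```python
from collections import defaultdict
from datetime import date

# Hypothetical usage log: (user_id, tool, day of activity). Illustrative data only.
EVENTS = [
    ("alice", "Superset", date(2017, 6, 5)),
    ("alice", "Airpal", date(2017, 6, 6)),
    ("bob", "Dataportal", date(2017, 6, 7)),
    ("alice", "Superset", date(2017, 6, 12)),
]


def weekly_active_users(events):
    """Return {(iso_year, iso_week): count of distinct users active on any tool}."""
    users_by_week = defaultdict(set)
    for user_id, _tool, day in events:
        iso_year, iso_week, _weekday = day.isocalendar()
        # A set deduplicates users who use several tools in the same week.
        users_by_week[(iso_year, iso_week)].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


wau = weekly_active_users(EVENTS)
for (year, week), count in wau.items():
    print(f"{year}-W{week:02d}: {count} weekly active users")
```

Counting distinct users across all tools, rather than summing per-tool counts, avoids double-counting someone who uses both Superset and Airpal in the same week.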

We’ve seen a big difference after launching the Data University. Now, in terms of tools, the Dataportal was truly instrumental.


As a company scales, different people have different types of data. The question is, how do you make it so that data overcomes silos? Also, how do you build trust around data?

Dataportal is a search engine for data. It pulls together all our tools so that all data is searchable by everyone at Airbnb.

Really? Everyone?

Yes, everyone.

You’ve got great tools that are interconnected, you’re on your way to educating everyone on how to use them… it all sounds like Airbnb is the land of wonders as far as data is concerned. Do you have any obstacles that prevent you from further democratizing data?

The number one obstacle is still to train people on how to use SQL, Superset, etc. and thus to make them autonomous.

Knowing what you know now, if you were to advise legacy organizations which do not have your culture and resources, what would be your top recommendations?

With legacy organizations come legacy infrastructures and thus unstructured datasets spread across data silos. Therefore, step #1 would be to work towards a centralized single source of truth for data and a modern data infrastructure.

Step #2 would be to change the culture with one question in mind: “how do you help knowledge workers frame their questions with data and equip them with the knowledge to use and analyze data at your company?”