Adopt the “10,000 Experiment Rule” like Netflix and Facebook
Will Lam
7/16/2023
If you've ever learned to play a sport or a musical instrument growing up, you've probably heard the phrase, “practice makes perfect.” As kids, we were conditioned to think that the more time we devoted to learning a craft, the closer we'd get to perfecting our skills.
Today, that may no longer be true, especially in rapidly changing fields such as technology and business. Rather than spending 10,000 hours learning how to become a world-class performer, Empact founder Michael Simmons recommends that you follow what he calls the “10,000 experiment rule.”
Simmons' 10,000 experiment rule is pretty simple, and it's a principle that's been practiced by leaders and operators at companies like Facebook, Netflix, Google, and Amazon. It simply states that “deliberate experimentation is more important than deliberate practice in a rapidly changing world.”
Whether you're trying to build the next Facebook or create a mobile travel app, the principles that govern how products look, feel, and behave today will change tomorrow. Since Apple launched its App Store, we've seen endless experimentation applied to mobile applications, and there's more to come, especially in augmented and virtual reality.
This fast-moving world can feel daunting for product and engineering teams looking to build impactful products, but there's a simple rule for navigating it. Avoid making large bets on one or two big projects or features. Instead, build the habit of running smaller experiments at a faster cadence.
Source: Michael Simmons
Rather than betting the company on a major app redesign, start by running smaller tests on your navigation and onboarding flow. These won't necessarily fix your retention or engagement problems immediately. But week over week, small, incremental improvements of 2-5% can translate into a massive increase to your bottom line. That's because small wins compound into big ones over time.
Leading companies today break new ground by experimenting quickly and often, rather than trying to slowly gain expertise around a specific industry. Companies like Google, Amazon, Netflix, and Facebook run thousands of experiments each year.
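To make the compounding claim concrete, here is a minimal sketch of the arithmetic. It assumes, optimistically, that every week produces a win of the same size and that the gains multiply; the function name `compounded_lift` is ours, purely for illustration.

```python
def compounded_lift(weekly_gain: float, weeks: int) -> float:
    """Return the cumulative multiplier after `weeks` of compounding `weekly_gain`."""
    return (1 + weekly_gain) ** weeks


# Assumption: one successful experiment per week for a full year.
for gain in (0.02, 0.05):
    multiplier = compounded_lift(gain, weeks=52)
    print(f"{gain:.0%} per week for 52 weeks -> {multiplier:.1f}x the starting metric")
```

Under those assumptions, a steady 2% weekly gain ends the year at roughly 2.8x the starting metric, and 5% at roughly 12.6x. Real experiment programs will land well below this, since many tests fail, but the shape of the curve is the point.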
Rapid Release and Experimentation at Facebook
If you've ever looked closely at the Facebook app on your phone and compared it to a friend's, you may have noticed a difference between the apps. That's a feature, not a bug.
While startups are often told to “move fast and break things”, many companies—Facebook included—don't have the luxury of breaking things. To scale to billions of users, Facebook has to rapidly experiment and innovate on product. At the same time, with so many people relying on Facebook each day, the company can't run the risk of breaking its app.
As Facebook VP Andrew Bosworth says,
The most obvious approach might be to imagine the future you want and build it. Unfortunately, that doesn't work that well because technology co-evolves with people. It’s a cycle—technology pushes people to move forward and then people move past technology and it has to catch up.
When Facebook runs experiments, it often tests new ideas and features through a staged rollout that reaches only a small fraction of users. In some cases, Facebook will scope this even more specifically by deploying only to people within a specific market or country. This gives Facebook a granular level of control over how it's able to test new features, get feedback, and improve engagement and usage of the app. Feature management solutions like DevCycle enable gradual feature rollouts so that you can release features to a smaller subset of users and validate their functionality before sharing them with your entire user base.
This graph shows the deployment of an experiment over the course of a week. The light green bar shows the total number of users in the experiment, while the dark green bar shows the number of users the experiment impacts. Each day, Facebook rolls the experiment out to a larger number of users.
Source: Facebook
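A gradual rollout like the one described above can be sketched in a few lines. This is not Facebook's or DevCycle's implementation, just one common approach under our own assumptions: hash each user into a stable bucket, then compare the bucket against the current rollout percentage. The names `bucket` and `in_rollout`, the experiment key, and the ramp schedule are all hypothetical.

```python
import hashlib


def bucket(user_id: str, experiment: str) -> float:
    """Map a user to a stable value in [0, 100) for this experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 10_000 / 100


def in_rollout(user_id, country, experiment, percent, countries=None):
    """Return True if the user falls inside the current rollout."""
    # Optional market targeting: skip users outside the chosen countries.
    if countries is not None and country not in countries:
        return False
    # The bucket never changes, so raising `percent` only ever adds users.
    return bucket(user_id, experiment) < percent


# Hypothetical ramp: widen the audience a little each day.
users = [f"user-{i}" for i in range(10_000)]
for day, percent in enumerate([1, 5, 10, 25, 50, 100], start=1):
    exposed = sum(in_rollout(u, "CA", "new-nav", percent, {"CA"}) for u in users)
    print(f"day {day}: {percent}% target, {exposed} of {len(users)} users exposed")
```

Because the bucket depends only on the user and the experiment name, anyone who saw the feature on day one still sees it on day six, which keeps the experience consistent while the audience grows.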
Here are a couple of ways that Facebook has applied this approach to rapid development and experimentation over the years:
- Disappearing messages and Stories: Facebook-owned Instagram famously copied Snapchat's "Stories" feature, growing to 250M daily-active users. What's less well-known is the fact that this was the culmination of a series of failed experiments that Facebook had launched around disappearing messages and quietly killed.
- The “Explore” Feed: Facebook tested a secondary “Explore” Feed alongside its primary News Feed in six countries to surface content from pages users hadn't yet engaged with. Early test results showed that the new feature resulted in a 60-80% drop in engagement. Rather than experience that drop across the app globally, Facebook was able to limit the experiment by testing it on a smaller segment of users first.
The moral of the story is that adopting the 10,000 experiment rule isn't about recklessly throwing ideas at the wall to see what sticks. It's about working within your constraints and giving yourself the ability to safely run a lot of experiments and learn at scale.
Cross-Platform A/B Testing with Netflix
Netflix deals with many of the same issues as Facebook around rapid experimentation and development. Netflix has over 109 million subscribers worldwide, and one study shows that it's single-handedly responsible for 35% of internet traffic in North America.
Building around experimentation is what has helped Netflix build such a sticky product. Netflix researchers estimate that if a typical user doesn't find something to watch in the app within 60-90 seconds, they run the risk of getting bored and moving on to something else. The company fanatically A/B tests everything, from the content that a user sees when they open the app to loading speeds, in order to optimize its UI.
As one company blog post put it: By following an empirical approach, we ensure that product changes are not driven by the most opinionated and vocal Netflix employees, but instead by actual data, allowing our members themselves to guide us toward the experiences they love.
To achieve this, one of the biggest challenges Netflix had to solve for was the fact that its users access the product across platforms, from their laptops, to mobile phones and gaming consoles. To make data-driven decisions, Netflix has to test continuously—but degrading the user experience with playback interruptions and slow buffer speeds was unacceptable.
So the company engineered a cross-platform A/B testing solution called ABlaze.
A test schedule view in ABlaze, the front end to Netflix's A/B testing platform.
It works like this:
When a test runs on, say, an iPhone, the iOS app sends a request to the Netflix API for more information about the user, device, and session. That information is relayed to the A/B testing client, which then queries other services for more context. Then, the client passes this information to the A/B testing server, which retrieves all tests the user is already allocated to and then determines whether the user can be allocated to additional tests.
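The server-side step of that flow can be sketched roughly as follows. This is a simplified illustration, not ABlaze's actual design: every class, field, and test name here (`Context`, `ABTest`, `AllocationServer`, `"artwork"`, `"autoplay"`) is hypothetical, and eligibility is reduced to a device check.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Context:
    """What the client has gathered about the request."""
    user_id: str
    device: str


@dataclass(frozen=True)
class ABTest:
    name: str
    devices: frozenset  # devices on which this test may run
    cells: int = 2      # number of variants, including control


@dataclass
class AllocationServer:
    tests: list
    allocations: dict = field(default_factory=dict)  # user_id -> {test name: cell}

    def allocate(self, ctx: Context) -> dict:
        # Step 1: retrieve the tests this user is already allocated to.
        current = self.allocations.setdefault(ctx.user_id, {})
        # Step 2: see whether the user qualifies for any additional tests.
        for test in self.tests:
            if test.name not in current and ctx.device in test.devices:
                current[test.name] = self._cell(ctx.user_id, test)
        return dict(current)

    @staticmethod
    def _cell(user_id: str, test: ABTest) -> int:
        # Deterministic assignment, so the same user always lands in the same cell.
        digest = hashlib.sha256(f"{test.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % test.cells


server = AllocationServer(tests=[
    ABTest("artwork", frozenset({"ios", "tv"})),
    ABTest("autoplay", frozenset({"web"})),
])
print(server.allocate(Context(user_id="user-42", device="ios")))
```

In this sketch the iOS user is placed in the `artwork` test only, and a repeat request returns the same allocation rather than reshuffling the user into a different cell.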
While this sounds complex—and it is—it allows Netflix to really push the envelope in terms of the experiments it can run.
For example, the data team at Netflix found that users look at the artwork before deciding whether to click for more details about a title. So they decided to run a number of experiments:
- First, they ran a simple A/B test to see whether changing the artwork could increase engagement, measured by click-through rates, play duration, and other metrics.
- Next, they wanted to see if changing the artwork would contribute to increasing total streaming hours across the product. They tested to find the best artwork for each title over a period of days, then served that artwork to other watchers to see if that would result in a higher number of hours streamed.
- Finally, they experimented with finding a more efficient way of running the test by narrowing the number of users and time required to optimally find the winning variant for each test.
The result is that each piece of artwork that you see on Netflix may have been tested against five different variants. The important thing to note here is that Netflix didn't just decide to test all the artwork at once, which would have been virtually impossible. They ran a series of incremental tests to confirm their hypotheses before moving on to bigger and bigger wins.
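For a sense of what evaluating a simple artwork A/B test might involve, here is a minimal sketch that compares click-through rates with a two-proportion z-test. This is a textbook technique we are substituting for illustration, not a description of Netflix's analysis, and the click and view counts are made up.

```python
from math import sqrt
from statistics import NormalDist


def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare two click-through rates; return (lift, z, two-sided p-value)."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts: control artwork (A) vs. new artwork (B).
lift, z, p_value = ctr_z_test(clicks_a=500, views_a=10_000, clicks_b=580, views_b=10_000)
print(f"lift={lift:.3%}, z={z:.2f}, p={p_value:.4f}")
```

With these invented numbers the new artwork's higher click-through rate clears a conventional 5% significance threshold. A small, cheap test like this is the kind of first step that justifies investing in the larger streaming-hours experiments that follow.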
Work Within Constraints to Experiment Faster
While it's unlikely that most products will need the same level of internal tooling and infrastructure as companies like Facebook and Netflix, there are important lessons here for how you should think about experimentation. To experiment faster, you shouldn't start by thinking about all the possible tests you could run. Instead, start by working within your constraints and maximizing impact. Often, that has to do with optimizing the quality of data that's coming in, how you're testing that data, and how your team processes the results.
By reducing the time it takes to deploy a test and get live feedback, you're able to speed up the entire experimentation cycle and learn faster.