Market uncertainty is a practical guarantee for those in the startup world.
One of the most common mistakes founders make is looking at their user base only as a whole. In reality, your user base is diverse, and making decisions without acknowledging that diversity is a sure way to fail.
Cohort Analysis is a method of data analysis that classifies your data into distinct groups, and then compares key metrics across these groups.
An Example Of Its Necessity
Let’s say your Canadian company just launched a major marketing campaign in the United States, and saw the following results across your metrics.
- Signups: +50%
- Revenue: +30%
- Location of User Base: 70% in the US, 30% in Canada
Wow! After starting to market in the US, our signup rate and revenue went up, and the US now has a majority share in our user base. We might make the decision based on this data to focus more heavily on the United States.
The problem is that if we look exclusively at Signups and Revenue for Americans, it might look like this:
- Signups: +80%
- Revenue: +2%
And for Canadians like this:
- Signups: +10%
- Revenue: +40%
It was only after breaking down our user base into distinct cohorts that we could see what was actually happening under the hood: the US campaign brought in plenty of signups but almost no revenue, while Canada accounted for most of the revenue growth. If we hadn’t looked deeper, and had just acted on our instinct to move our operations to the US, we likely would have severely harmed our business.
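To make the arithmetic concrete, here is a minimal sketch in Python. The baseline numbers are hypothetical (the example above only gives percentages); they were picked so that the per-country and overall changes match the figures quoted. The function names are illustrative, not part of any library.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) * 100 / before

# Hypothetical absolute numbers, chosen only to reproduce the percentages above.
before = {
    "US": {"signups": 400, "revenue": 100},
    "Canada": {"signups": 300, "revenue": 280},
}
after = {
    "US": {"signups": 720, "revenue": 102},
    "Canada": {"signups": 330, "revenue": 392},
}

def cohort_growth(before, after):
    """Growth of every metric, computed separately for each cohort."""
    return {
        cohort: {m: pct_change(before[cohort][m], after[cohort][m])
                 for m in before[cohort]}
        for cohort in before
    }

def overall_growth(before, after):
    """Growth of every metric with all cohorts lumped together."""
    metrics = next(iter(before.values())).keys()
    return {
        m: pct_change(sum(c[m] for c in before.values()),
                      sum(c[m] for c in after.values()))
        for m in metrics
    }

overall = overall_growth(before, after)
by_cohort = cohort_growth(before, after)

print(overall)    # {'signups': 50.0, 'revenue': 30.0}
print(by_cohort)  # US: +80% signups, +2% revenue; Canada: +10% signups, +40% revenue
```

The overall figure is a blend of the cohort figures, weighted by how large each cohort was to begin with, which is why a healthy-looking total can hide a cohort that is barely contributing.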
How to Conduct a Cohort Analysis
While there is a whole world of higher-level algorithms to conduct in-depth cohort analyses, a lot of information can be revealed with just some simple spreadsheeting.
The first step to conducting a cohort analysis is identifying the key metrics. What do you want to understand about your user base? Do you want to optimize Revenue? Daily usage? Perhaps there is a specific feature getting used but you are not sure why.
The next step is to decide how you will divide your users into cohorts. If you are doing this correctly, it shouldn’t be an easy process. All sorts of factors affect user decisions, so you will likely need to try dividing your users in a number of different ways to see which divisions make a difference in the metric you are evaluating.
If you are looking to do this algorithmically, you should check out Cluster Analysis.
Once you have defined your cohorts, the next step is to see how these cohorts differ on key metrics. Are users in region X likely to pay more? Are users who signed up more than Y weeks ago more likely to use this feature? This is the part where you can gain interesting insights into how your product is being used by different groups of people.
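If your data lives outside a spreadsheet, the same steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the user records, the field names (`region`, `weeks_since_signup`, `monthly_spend`), and the four-week cutoff are all hypothetical.

```python
from collections import defaultdict

def summarize_by_cohort(users, cohort_of, metric_of):
    """Group users with `cohort_of` and average `metric_of` within each group.

    Returns {cohort: {"users": count, "mean": average metric}}.
    """
    values = defaultdict(list)
    for user in users:
        values[cohort_of(user)].append(metric_of(user))
    return {
        cohort: {"users": len(vals), "mean": sum(vals) / len(vals)}
        for cohort, vals in values.items()
    }

# Hypothetical user records.
users = [
    {"id": 1, "region": "US", "weeks_since_signup": 2, "monthly_spend": 0},
    {"id": 2, "region": "US", "weeks_since_signup": 10, "monthly_spend": 10},
    {"id": 3, "region": "CA", "weeks_since_signup": 6, "monthly_spend": 30},
    {"id": 4, "region": "CA", "weeks_since_signup": 1, "monthly_spend": 20},
    {"id": 5, "region": "US", "weeks_since_signup": 8, "monthly_spend": 5},
    {"id": 6, "region": "CA", "weeks_since_signup": 12, "monthly_spend": 40},
]

def spend(user):
    return user["monthly_spend"]

# Cohorts by region.
by_region = summarize_by_cohort(users, lambda u: u["region"], spend)

# Cohorts by account age, using an arbitrary four-week cutoff.
by_age = summarize_by_cohort(
    users,
    lambda u: "older" if u["weeks_since_signup"] > 4 else "newer",
    spend,
)

print(by_region)
print(by_age)
```

Passing the cohort definition in as a function makes it cheap to try several ways of dividing the same users, which is the trial-and-error step described above.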
The Major Pitfall: Causality and Correlation
This cannot be said enough: basic Cohort Analysis cannot tell you why two things are correlated. It can only tell you the extent to which they are correlated in your current user base.
If we divide our users into gender cohorts and find that men are 50% more likely to spend money, that does not mean men in general will be more likely to spend money. It simply means that, among our current users, the men are more likely to be spending money.
Before you make a decision based on your cohort analysis, you need to understand whether the relationship is causal and not merely a correlation. This is one of the most difficult problems in analytics.
A helpful way to investigate causal relationships is through randomized controlled trials (such as A/B tests), but even then you need to be extremely wary of sampling bias.
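The core of an A/B test is that chance, not the user or the team, decides who sees which variant. Below is a minimal sketch of that assignment step and a naive comparison of conversion rates. The user IDs, the set of converted users, and the function names are hypothetical, and a real test would also need a significance check that is not shown here.

```python
import random

def assign_variants(user_ids, seed=0):
    """Randomly split users into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}

def conversion_rate(group, converted):
    """Share of users in `group` that appear in the `converted` set."""
    if not group:
        return 0.0
    return sum(1 for user in group if user in converted) / len(group)

# Hypothetical data: 100 users, 10 of whom converted.
user_ids = list(range(1, 101))
converted = {3, 8, 15, 16, 23, 42, 57, 64, 71, 99}

groups = assign_variants(user_ids, seed=42)
for name, members in groups.items():
    print(name, len(members), conversion_rate(members, converted))
```

Because the split is random, the two groups should be similar in everything except the variant they see. If the users entering the test are themselves unrepresentative (for example, only users from one campaign), the result still suffers from sampling bias.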
For startups, data analysis refines your questions; it does not answer them. In the example above, the cohort analysis moved us from asking “Who is spending money on our product?” to “Why are men spending more money on our product?” Further cohort analysis can refine these questions until we can reason our way to an answer.
We might next divide our male users by marketing funnel (the way they found our product), which could give us an even more refined question: “Why are men who found us through Twitter more likely to spend money on our platform?”
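This kind of drill-down amounts to filtering to one cohort and then splitting it again on a second attribute. A rough sketch, using made-up records and hypothetical field names (`gender`, `channel`, `paid`):

```python
from collections import Counter

def payer_share(users, cohort_of):
    """Fraction of paying users in each cohort, as {cohort: fraction}."""
    totals, payers = Counter(), Counter()
    for user in users:
        cohort = cohort_of(user)
        totals[cohort] += 1
        if user["paid"]:
            payers[cohort] += 1
    return {cohort: payers[cohort] / totals[cohort] for cohort in totals}

# Hypothetical user records.
users = [
    {"gender": "male", "channel": "twitter", "paid": True},
    {"gender": "male", "channel": "twitter", "paid": True},
    {"gender": "male", "channel": "search", "paid": False},
    {"gender": "male", "channel": "search", "paid": True},
    {"gender": "female", "channel": "twitter", "paid": False},
    {"gender": "female", "channel": "search", "paid": True},
    {"gender": "female", "channel": "search", "paid": False},
    {"gender": "female", "channel": "twitter", "paid": False},
]

# First cut: who is spending money?
by_gender = payer_share(users, lambda u: u["gender"])

# Second cut: within the male cohort, does the acquisition channel matter?
men = [u for u in users if u["gender"] == "male"]
men_by_channel = payer_share(men, lambda u: u["channel"])

print(by_gender)
print(men_by_channel)
```

Each extra cut shrinks the groups, so with a small user base the sub-cohorts can quickly become too small to say anything reliable.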
While it might not be a silver bullet, cohort analysis is essential to understanding your user base, your product, and your business. Ignoring it is one of the fastest ways a product can fail.