Note: No data will go to the server. This is all client-side in your browser.
The format is comma-separated values with quoting, like a CSV file exported from Excel. A guide to the columns:
The first row names the various user states you have. It should be in the format "Cohort group type,Cohort group value,Cohort day,State1 name,State2 name,...".
The dataset may have the same user appear in many rows, as long as those rows belong to different group types. What matters is that, within each group type, a user is counted in only one row with one group value; that is, the tuple (group value, cohort day) should identify a unique row within each cohort group type. The order of the rows does not matter.
Here's a simple example for a fake social network with "Sign-up Referrer", "Favorite feature", and "Total" cohort group types. This only counts a user towards one sign-up referrer row, one favorite feature row, and one total row. In the case of "Total" the only valid grouping value is the empty string, which this tool treats specially.
|Cohort group type||Cohort group value||Cohort day||Born||Updated profile||Sent first message|
|Favorite feature||Reading news||10/25/12||10||5||0|
|Favorite feature||Reading news||10/26/12||5||5||5|
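As a rough illustration of the format described above, here is a minimal Python sketch that writes the two example rows as CSV and checks that each (group type, group value, cohort day) key appears only once. The helper names `to_csv` and `find_duplicate_keys` are hypothetical and not part of this tool. The check treats one positive row and one negative row with the same key as distinct, on the assumption that signed rows are allowed to share a key.

```python
import csv
import io

# Header and rows copied from the example table above.
HEADER = ["Cohort group type", "Cohort group value", "Cohort day",
          "Born", "Updated profile", "Sent first message"]
ROWS = [
    ["Favorite feature", "Reading news", "10/25/12", 10, 5, 0],
    ["Favorite feature", "Reading news", "10/26/12", 5, 5, 5],
]


def to_csv(header, rows):
    """Serialize the header and rows as comma-separated text, quoting only when needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def find_duplicate_keys(csv_text):
    """Return keys that occur more than once with the same sign.

    The key is (group type, group value, cohort day, sign), so a positive
    row and a negative row for the same day do not count as duplicates.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    seen = set()
    duplicates = []
    for row in reader:
        values = [int(v) for v in row[3:]]
        sign = "-" if any(v < 0 for v in values) else "+"
        key = (row[0], row[1], row[2], sign)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


csv_text = to_csv(HEADER, ROWS)
print(csv_text)
print(find_duplicate_keys(csv_text))
```

Because the order of rows does not matter, the check only tracks which keys have been seen rather than relying on any sorting.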
The sign of a column value is treated specially. The cohort data may include two rows with the same (group value, cohort day) tuple, but one with all positive column values, and one with all negative column values. Example usage: When a user joins your service, record a positive value on their birthday (the day they joined). When the same user leaves the service, record the same value as a negative number on their leave date. This lets you see the rate of sign-ups and drop-offs independently. If you view both positive and negative values cumulatively, you will plot your peak active users over time.
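To make the arithmetic behind signed values concrete, here is a small Python sketch. The daily numbers are made up for illustration, and the variable names are hypothetical; the tool itself may compute its cumulative view differently.

```python
from itertools import accumulate

# Hypothetical values for three consecutive cohort days.
joins = [10, 5, 8]     # positive rows: users who joined each day
leaves = [0, -3, -4]   # negative rows: users who left each day

# Viewed separately, the cumulative sums show sign-ups and drop-offs.
cumulative_joins = list(accumulate(joins))
cumulative_leaves = list(accumulate(leaves))

# Viewed together, the running total of both gives users still active.
net_per_day = [j + l for j, l in zip(joins, leaves)]
active_users = list(accumulate(net_per_day))

print(cumulative_joins)   # [10, 15, 23]
print(cumulative_leaves)  # [0, -3, -7]
print(active_users)       # [10, 12, 16]
```

Keeping the positive and negative values in separate rows is what allows both views: the rates can be read independently, and their running sum gives the combined picture.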
A cohort is a group of people who share a common characteristic or experience within a defined period (Wikipedia). It's used a lot in health to track how different groups of patients respond to disease and medication. It can be used in business to track progress in the funnel.
For software and websites, cohorts are useful because they let you measure the impact of your product changes over time. Simple example: Using cohorts you can see the conversion rate of new users from two months ago and compare it to new users of today. Ideally, this would let you judge if your software is getting better over time, users are getting happier, etc. Sometimes you'll see that things are getting worse.
Cohort analysis can apply to more than just users. For example, you could treat a set of articles on a blog as the source dataset; the levels of traffic or reshares could be mutually exclusive states; and you could treat common tags as a group type, or author as a group type. This tool would let you drill down and compare all of those groupings pretty easily.
One of the most interesting things about cohort analysis is that the graphs change over time. If you take a snapshot of your cohort data today and then take another snapshot two months from now, you'll see that the older cohort bars have changed. This happens because users who signed up two months ago remain active and continue to make progress in your funnel over extended periods. It's useful to save your cohort datasets after you collect them, so you can compare them to past snapshots.
Other things to read to understand the motivation and method behind this:
Here's a quick guide to the calculations presented on the right. A "bar" is every cohort for a particular day. A "bar segment" is one part of the bar for a day in a single color, corresponding to a particular "cohort state" (like "Made two posts" above), which is usually some level of progression in the funnel.
Copyright 2012-2015 Brett Slatkin
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.