It all started with AOL’s leaked click through rates in 2006 that revealed to us the percentage of clicks each ranking position received – a mystery previously unsolved. This was the first stepping stone in the attempt to not only estimate current traffic, but estimate potential traffic. In the years since, there have been numerous CTR studies which all report similar – but different – results. With all this variation in research methods and results how are we to choose the proper formula for our own sites? Furthermore, how can you be sure any of the suggested click curves apply to your market sector, search vertical, or brand? Below is a methodology I’ve used to determine a site’s SERP CTR, using data generated from the site itself rather than industry studies.

**First, as a student of the scientific method, I must say that this procedure comes with a number of caveats and is in no way meant to be an exact solution. Instead, it should be used only as a more accurate and applicable estimate for your own website.** Although imperfect, this CTR curve is more relevant than industry data, and certainly better than nothing. That being said, the data set you’ll use for this is from Google Webmaster Tools (gasp!). Yes, I know, but just stick with me, as it is the only source for both clicks and average ranking for ALL your keywords.

To make this analysis as valid as it can possibly be, you need a large amount of data. In fact, the more data the better. Not only do you need many months of data, but you also need a site with a sufficient number of keyword rankings at each position. Realistically, probably only about 5-10% of all websites qualify, because the method relies on averages at each ranking position to determine the average CTR. The fewer keywords you have ranking at each position (1-10), the less confident you’ll be in the data. Additionally, since this method uses search volume as the pool of available clicks, it assumes that each search results in an organic click. **This method will help you estimate what percentage, and volume of clicks, your site can expect for any given keyword at any ranking position.**

## Getting Started

If you’re like me, you pull GWMT data every month and store it somewhere safe. So, first grab as many months of query data as you have available. The three columns you’ll need are Keyword, Clicks, and Average Position. If you’re working with more than 3 months of data, there will be repeats in keywords so be sure to do some data munging to get it all playing nicely and de-duped.
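
If you would rather script the munging than do it by hand, here is a minimal Python sketch of one way to de-dupe. It is not the only way to merge months: it assumes each export row is a `(keyword, clicks, average position)` tuple, sums clicks across months, and takes a plain unweighted average of the positions. The function name `combine_months` is made up for this example, and rows whose clicks read "<10" are skipped here because they cannot be summed.

```python
from collections import defaultdict


def combine_months(monthly_rows):
    """Merge rows from several monthly exports into one row per keyword.

    monthly_rows: iterable of (keyword, clicks, avg_position) tuples.
    Assumption: clicks are summed and positions are averaged (unweighted).
    """
    totals = defaultdict(lambda: {"clicks": 0, "positions": []})
    for keyword, clicks, position in monthly_rows:
        # "<10" rows have no usable click count, so leave them out.
        if isinstance(clicks, str) and clicks.strip().startswith("<"):
            continue
        key = keyword.strip().lower()  # treat "Trail Shoes" and "trail shoes" as one keyword
        totals[key]["clicks"] += int(clicks)
        totals[key]["positions"].append(float(position))

    return {
        key: {
            "clicks": t["clicks"],
            "avg_position": sum(t["positions"]) / len(t["positions"]),
        }
        for key, t in totals.items()
    }


# Fictitious rows from two monthly exports.
rows = [
    ("Trail Shoes", 40, 2.0),
    ("trail shoes", 60, 3.0),
    ("racing flats", "<10", 8.0),
]
combined = combine_months(rows)
```

Averaging positions without weighting by clicks is a simplification; if one month had far more traffic than another, a weighted average may suit you better.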

## Step 1: Hunting & Gathering

- First, sort your spreadsheet by clicks. Delete all the rows with keywords that have “<10” for their clicks. Since you don’t know the actual number, just throw them out.
- Now that you’re working with only the keywords that have clicks, we’ll want to see how many possible clicks there are. Go through about 100 keywords at a time, and grab the monthly exact match search volume for all of them. Create a new column in the sheet for these. Be sure to multiply the search volume data by the number of months of data you’re using.
- Standardize the Average Position data by rounding it. Highlight the entire column and Format Cells to round them all to whole numbers (no decimal point). Once completed, the Average Position column should only contain whole numbers to represent each ranking position.
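
For anyone working outside a spreadsheet, the three steps above could be sketched in Python roughly as follows. Treat it as an illustration under a few assumptions: the row layout (dicts with `keyword`, `clicks`, and `avg_position`) and the `monthly_volume` lookup are hypothetical stand-ins for your own export and search volume source, and keywords with no volume figure are simply skipped.

```python
import math


def round_half_up(value):
    """Round the way a spreadsheet does (2.5 -> 3).

    Python's built-in round() rounds halves to the nearest even number,
    so round(2.5) gives 2, which is not what we want here.
    """
    return math.floor(value + 0.5)


def prepare_rows(rows, monthly_volume, months):
    """Apply the Step 1 clean-up to a list of keyword rows.

    rows: list of dicts with 'keyword', 'clicks', 'avg_position'.
    monthly_volume: dict of keyword -> monthly exact match search volume.
    months: number of months of click data being analyzed.
    """
    prepared = []
    for row in rows:
        clicks = row["clicks"]
        # Throw out "<10" rows: the true click count is unknown.
        if isinstance(clicks, str) and clicks.strip().startswith("<"):
            continue
        volume = monthly_volume.get(row["keyword"])
        if not volume:
            continue  # assumption: no volume figure means no usable denominator
        prepared.append({
            "keyword": row["keyword"],
            "clicks": int(clicks),
            # Scale monthly volume to cover the same period as the clicks.
            "search_volume": volume * months,
            "position": round_half_up(row["avg_position"]),
        })
    return prepared


# Fictitious data covering three months.
rows = [
    {"keyword": "trail shoes", "clicks": "120", "avg_position": 2.5},
    {"keyword": "racing flats", "clicks": "<10", "avg_position": 7.2},
    {"keyword": "marathon shoes", "clicks": 30, "avg_position": 1.4},
]
volumes = {"trail shoes": 1000, "marathon shoes": 500}
prepared = prepare_rows(rows, volumes, months=3)
```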

## Step 2: Crunch Time

- With Auto Filtering enabled for the row of column labels, select one ranking position at a time. For example, select #1 so that you only see the keywords with an average (rounded) rank of 1.
- Create a new column next to Search Volume and label it CTR (Share of Search).
- Divide the clicks by the search volume for each keyword, and store it in this new column. You can copy and paste the formula for speed. Again, you’re only doing this for one ranking position at a time.
- Now you should have a column populated with Share of Search (the percentage of clicks out of the total search volume, at each ranking position).
- Grab the average percentage for the CTR column. This is the average percentage of earned clicks out of the estimated clicks available. Since you’re only looking at one ranking position at a time, you’re finding this average percentage for each individual rank, one at a time.
- Repeat this process for each of the 10 ranking positions. Again, you need a sufficient number of keywords at each ranking position in order for this to have any semblance of validity. For everything on the second page (11+), just examine these all together.
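
The same crunching can be sketched in a few lines of Python. This is only an illustration: it assumes each row already carries `clicks`, `search_volume` (scaled to the months analyzed), and a rounded `position`, and the function name `ctr_curve` is invented for the example. Like the spreadsheet steps, it averages the per-keyword percentages rather than pooling all clicks and volume together, and it lumps everything ranked 11 or lower into one "11+" bucket.

```python
from collections import defaultdict
from statistics import mean


def ctr_curve(prepared):
    """Average Share of Search per ranking position.

    prepared: list of dicts with 'clicks', 'search_volume', 'position'.
    Returns a dict of bucket -> average CTR and keyword count, where the
    bucket is the position (1-10) or the string "11+" for page two onward.
    """
    shares = defaultdict(list)
    for row in prepared:
        bucket = row["position"] if row["position"] <= 10 else "11+"
        # Share of Search for this keyword: clicks out of total searches.
        shares[bucket].append(row["clicks"] / row["search_volume"])

    return {
        bucket: {"avg_ctr": mean(values), "keywords": len(values)}
        for bucket, values in shares.items()
    }


# Fictitious rows; a real analysis needs many keywords per position.
sample = [
    {"position": 1, "clicks": 500, "search_volume": 1000},
    {"position": 1, "clicks": 250, "search_volume": 1000},
    {"position": 14, "clicks": 10, "search_volume": 1000},
    {"position": 27, "clicks": 30, "search_volume": 1000},
]
curve = ctr_curve(sample)
```

Keeping the keyword count next to each average makes it easy to spot positions where you have too few keywords to trust the number.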

**Here is an example of this analysis using fictitious data for a running shoe e-commerce site.**

## Your Own SERP Click-Through Curve In Practice

The strength of this analysis lies in how much data you can muster up, so if you don’t have the habit of collecting your GWMT data, start it now and come back to this in a few months. **As an analytics practitioner, you’re the only one who can determine the confidence of your data – and the more the merrier.**

**Now that you have the average click-through rate for each ranking position, you can make a more accurate judgement call of how your site will perform for keywords you have yet to pursue.** Additionally, you can approach clients with realistic expectations of traffic by keyword and rank, using conservative and aggressive estimates. You can even use this method to formulate specific click curves for branded terms, unbranded terms, mobile SERPs, desktop SERPs, or even search verticals like image search. One of my favorite variations of this is to analyze only the keywords which utilize structured data to compare and contrast with the regular, boring search results.
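
As a rough illustration of turning a curve into a traffic estimate, here is a small Python sketch. Everything in it is an assumption for demonstration purposes: the curve values are fictitious, the function name `estimate_clicks` is invented, and the `spread` used to produce the conservative and aggressive figures is an arbitrary margin you would replace with whatever range you are comfortable defending.

```python
def estimate_clicks(monthly_volume, position, curve, spread=0.25):
    """Estimate monthly clicks for a keyword at a given ranking position.

    monthly_volume: monthly exact match search volume for the keyword.
    position: the ranking position you expect to hold.
    curve: dict of position (1-10) or "11+" -> average CTR from your own data.
    spread: hypothetical margin for the conservative/aggressive range.
    """
    bucket = position if position <= 10 else "11+"
    expected = monthly_volume * curve[bucket]
    return {
        "conservative": expected * (1 - spread),
        "expected": expected,
        "aggressive": expected * (1 + spread),
    }


# Fictitious curve; substitute the averages from your own analysis.
example_curve = {1: 0.5, 3: 0.25, "11+": 0.01}
estimate = estimate_clicks(2000, 3, example_curve, spread=0.5)
```

To build separate curves for branded terms, mobile SERPs, and so on, you would run the same analysis on each subset of keywords and pass the matching curve in here.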

Although this method has various caveats, estimations, and requires a bit of faith, it is derived from your own data. **If you’re comfortable with the amount of data used to formulate the click curve, it can be much more applicable to your site than mere industry data.** In the comments below, I’d love for readers to poke more holes in this analysis, and propose other methods to achieve a customized CTR.

Interesting idea, but you’d want to be able to identify when there’s a major SERP change – e.g. Knowledge Graph shows up, keyword becomes “Caffeinated,” universal SERP, or becomes a 7-SERP.

Excellent point Victor. I didn’t think about the 7-packs, I wonder what kind of effect the non-traditional SERPs would have on the CTR spread.

It would definitely be important to segregate them out when doing this analysis. Thanks for adding this!

I definitely think that building your own data sets is a great way to go. Only one concern with GWT data (which I’ve been digging into a lot lately) – if you use the exported data, aren’t you looking at average ranking? I’m not sure that using average ranking data to estimate CTR for that keyword won’t run into trouble (I’d have to run simulations, I guess).

I do absolutely agree that generalized CTR curves are pretty useless, no matter how good they are. We’ve been looking at ways to incorporate CTR, and the best I’ve come up with so far is that all the studies show a similar shape and drop-off. So, if we use the concept of a curve, without getting hung up on the exact percentages, it can be useful. I do think that looking at rankings as linear is very misleading, and some integration of CTR can help a lot.

So, all of that is to say that even poor data may be better than what we do now (where we imagine ranking is linear). I’m all for experimenting.

Stumbled upon this article when looking for new industry CTR curves. Good read, however I’m thinking this approach, combined with GA position tracking might render more accurate results. Or at the very least be interesting to compare against. There have been many accounts (e.g., http://www.portent.com/blog/analytics/google-webmaster-tools-query-data-is-worthless.htm and http://www.distilled.net/blog/seo/new-google-webmaster-tools-keyphrase-data-is-70-useless/) of GWMT data being grossly inaccurate in certain situations, so I’d be interested to see how the ranking data compares. Thanks for the interesting and informative post!

Hi Zach, you’re absolutely right about the lack of confidence we should have in GWMT data; and I struggled with that while developing this method and writing the post. In fact, I never use it as a source of raw data measurement, but instead to make comparative observations over time. My hope there is that although it is inaccurate, it might be consistently inaccurate!

That being said, since the search volume data (the denominator) comes from Google, I thought it might be best to also use Google as the source for the click data (the numerator) and the ranking data. Obviously, this is not a perfect solution, but like you, I’m also curious to see if other sources of data might help (whether it’s GA or otherwise).