Lies, Darned Lies and Web Stats

Interesting post on TechCrunch today about how Alexa thinks that YouTube is more popular than Google. According to Alexa, which uses its toolbar to measure the popularity of websites around the web, YouTube has more page hits than Google does. However, everyone knows that this just isn’t the case. It’s just that the numbers can be manipulated to make it look that way.

There are a number of flaws in the way Alexa operates. First of all, it’s terribly easy to spoof Alexa with the right requests to make it think your website is more popular than it really is. I’m not saying that other search engines are immune to this; Google can be gamed too. However, Alexa seems like something that was invented 10 years ago and expected to withstand the ravages of time (and hackers).

The other problem with Alexa is the idea of using a toolbar to measure traffic. This picks up a segment of Internet users and uses them to gauge trends across the whole Internet. However, the attributes of this sample will skew the results Alexa shows. First of all, users need to be savvy enough to install the toolbar. They also need to be unconcerned about their privacy and about the fact that their browsing patterns are sent to a third party. And they need to know that the toolbar exists in the first place, which means webmasters and site owners are more likely to form part of the sample. Such a sample is far from random, and unlikely to give a fair representation of site popularity.
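To make the sampling problem concrete, here is a toy sketch in Python. Every number in it is invented for illustration and none of it is Alexa data: it assumes two hypothetical groups of users (webmasters and everyone else), two hypothetical sites, and a much higher toolbar install rate among webmasters. It also includes a simple reweighting step, which only works if you somehow know the real size of each group and if toolbar users behave like the rest of their group. Both are big assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A hypothetical group of users. All figures are made up."""
    name: str
    population: int      # total users in the group
    toolbar_users: int   # how many of them run the toolbar
    visits_site_a: int   # average daily visits per user to site A
    visits_site_b: int   # average daily visits per user to site B


# Assumption: webmasters are few, install the toolbar often, and favour site B.
SEGMENTS = [
    Segment("webmasters", 1_000, 200, 1, 5),
    Segment("everyone else", 99_000, 99, 4, 1),
]


def true_visits(segments):
    """Total visits to (site A, site B) across the whole population."""
    a = sum(s.population * s.visits_site_a for s in segments)
    b = sum(s.population * s.visits_site_b for s in segments)
    return a, b


def toolbar_visits(segments):
    """Visits to (site A, site B) as seen through toolbar users only."""
    a = sum(s.toolbar_users * s.visits_site_a for s in segments)
    b = sum(s.toolbar_users * s.visits_site_b for s in segments)
    return a, b


def reweighted_visits(segments):
    """Scale each group's toolbar traffic up to the group's real size."""
    a = b = 0.0
    for s in segments:
        weight = s.population / s.toolbar_users
        a += weight * s.toolbar_users * s.visits_site_a
        b += weight * s.toolbar_users * s.visits_site_b
    return a, b


if __name__ == "__main__":
    for label, fn in [
        ("true", true_visits),
        ("toolbar", toolbar_visits),
        ("reweighted", reweighted_visits),
    ]:
        a, b = fn(SEGMENTS)
        leader = "A" if a > b else "B"
        print(f"{label:>10}: site A = {a:,.0f}, site B = {b:,.0f}, leader = {leader}")
```

In this made-up population, site A gets more traffic overall, but the toolbar sample puts site B ahead because the over-represented group prefers it. The reweighting recovers the true totals here only because the toy data was built so that toolbar users are perfectly typical of their group, which is exactly what you can't assume in real life.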

The problem remains, however: what is a reliable way of measuring site popularity? Web stats for the same site vary tremendously from one statistics package to another, and statistics packages will rank sites differently depending on what is being measured. Nielsen//NetRatings’ suggestion of adding duration to the mix of metrics has received a mixed response. In the meantime, until something better comes along, measures like Alexa, PR, Compete etc. are here to stay.

3 comments

  1. This is always a hot topic whenever it comes up. At the current time the feeling is “Alexa is bad but it’s better than nothing”. Alexa spoofing exists, sure, but their main drawback is that they’re based on an unrepresentative sample. I and other technology-savvy surfers don’t tend to install toolbars because we’re worried about privacy, spy/adware, etc.

    One solution is for Alexa to have some sort of algorithm which acts on the raw data to adjust for user metrics.
