-
Website
http://www.scobleizer.com/ -
Original page
http://scobleizer.com/2007/05/13/we-need-better-statistics/ -
I personally think a way of measuring blogs (or perhaps most all services) should be how many "repeat" customers you get. Exactly the opposite of uniques then.
That is how all service businesses like restaurants, training centers and even seminars are measured, and I don't see why it shouldn't be a way of measuring blogs.
What is the churn rate of the readers versus the uniques (not uniques by themselves)? What is the average stay length per visit? Most importantly, how many of those people come back consistently over a month and actually participate in creating value for the service?
For blogs, that's with user comments; for Wikipedia and other communities, it's in extensions.
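A rough sketch of the "repeat customer" measures described above, in Python. The visit log format (visitor id, seconds on site, whether the visitor commented) is an assumption made for illustration; a real stats package would expose this differently.

```python
from collections import defaultdict

def repeat_visitor_stats(visits):
    """Summarize repeat readership for one measurement period.

    visits: iterable of (visitor_id, seconds_on_site, left_comment) tuples,
    one per visit. This log format is hypothetical.
    """
    per_visitor = defaultdict(list)
    for visitor_id, seconds, commented in visits:
        per_visitor[visitor_id].append((seconds, commented))

    uniques = len(per_visitor)
    total_visits = sum(len(v) for v in per_visitor.values())
    # A "repeat" reader is anyone with more than one visit in the period.
    repeaters = sum(1 for v in per_visitor.values() if len(v) > 1)
    # A participant is anyone who commented at least once.
    participants = sum(1 for v in per_visitor.values() if any(c for _, c in v))
    total_seconds = sum(s for v in per_visitor.values() for s, _ in v)

    return {
        "uniques": uniques,
        "repeat_rate": repeaters / uniques if uniques else 0.0,
        "avg_seconds_per_visit": total_seconds / total_visits if total_visits else 0.0,
        "participation_rate": participants / uniques if uniques else 0.0,
    }

# Made-up sample: four readers, two of whom came back, one of whom commented.
sample = [
    ("a", 120, False),
    ("a", 300, True),
    ("b", 10, False),
    ("c", 60, False),
    ("c", 90, False),
    ("d", 20, False),
]
stats = repeat_visitor_stats(sample)
print(stats)
```

Churn for a month could then be estimated as one minus the repeat rate, though that depends on how reliably the same reader can be recognized across visits.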
The only solution is ISP-side tracking (which is outrageous), opt-in programs for webmasters (via JS), or OS-level trackers.
The most realistic option is for a company that does hosted webstats (like FeedBurner, Google, Performancing, etc.) to publish its results.
Maybe I should code that app after all.
All it takes is Ethereal and some socket programming. You can even have the bot sign up accounts. I know for a fact some of the sites on the top 100 have homebrew versions of these applications and use them.
A quick Google search reveals there are already many php versions available for Alexa:
http://onlinehoster.com/blog/alexa-rank-cheater/
http://www.dailysofts.com/program/703/37178/Ale...
As for netcraft and the others, that's more complex, because they limit their "hit ratings" to limited user groups. Hard, but if you have money, still not impossible. At any rate, the cheaper ones like Amazon's Alexa and others are quite easy to emulate through a proxy server list or even a subnet you may have access to. Larger sites have large subnets, plus they can use the proxies as well.
I would suggest you cheat, because all the other sites are doing it anyway.
Of course this will not affect e-commerce sites, but it can boost the value of something like YouTube.com from 100 million to 2.5 billion in a matter of a couple months. Large sites use this fact to their advantage, and from what I've heard they mostly cheat.
Even boosting the value of a website by 1% on a billion dollar scale comes out to ten million bucks. That's a LOT of money. And that's why these people are insanely rich. They are the best at making people perceive value in something that may not have very much. Manipulating Alexa and other sites like that is only one tool in the toolkit.
I don't have the means to get an infrastructure to perform something like that. If I did, I would probably know all the dirty tricks they use to push it all the way up.
1. Pageviews per visit - how involved is my reader? Are they just checking the front page or are they looking through the archives or clicking the "Read the rest of this story" link?
2. Average time spent on my site - this is again leading back to involvement.
Comments are a good indicator, as well. In my opinion, and from a marketing standpoint, I think that I'd rather have 500 involved readers who actually "experience" my content rather than 1000 people who pull up my front page, don't even scroll down, and then leave.
Of course I don't make any money to speak of just yet, but from a personal satisfaction standpoint, that's what's important to me.
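The two involvement measures listed above (pageviews per visit and average time on site) can be derived from raw pageview hits. The following Python sketch assumes a hypothetical hit log of (visitor id, Unix timestamp) pairs and assumes that 30 minutes of inactivity ends a visit; neither detail comes from any particular stats product.

```python
SESSION_GAP = 30 * 60  # assumed: 30 minutes of inactivity ends a visit

def sessionize(hits):
    """Group (visitor_id, unix_timestamp) pageview hits into visits."""
    sessions = []   # each session is a list of timestamps
    current = {}    # visitor_id -> index of that visitor's latest session
    for visitor_id, ts in sorted(hits, key=lambda h: (h[0], h[1])):
        idx = current.get(visitor_id)
        if idx is not None and ts - sessions[idx][-1] <= SESSION_GAP:
            sessions[idx].append(ts)
        else:
            sessions.append([ts])
            current[visitor_id] = len(sessions) - 1
    return sessions

def engagement(hits):
    """Compute visit-level involvement measures from raw hits."""
    sessions = sessionize(hits)
    if not sessions:
        return {"visits": 0, "pageviews_per_visit": 0.0,
                "avg_seconds_per_visit": 0.0, "bounce_rate": 0.0}
    pageviews = sum(len(s) for s in sessions)
    # Time on site is last hit minus first hit, so a one-page visit counts as 0.
    seconds = sum(s[-1] - s[0] for s in sessions)
    bounces = sum(1 for s in sessions if len(s) == 1)
    return {
        "visits": len(sessions),
        "pageviews_per_visit": pageviews / len(sessions),
        "avg_seconds_per_visit": seconds / len(sessions),
        "bounce_rate": bounces / len(sessions),
    }

# Made-up hits: visitor "a" reads three pages, returns hours later for one page;
# visitor "b" reads two pages.
hits = [("a", 0), ("a", 60), ("a", 180), ("b", 100), ("a", 10000), ("b", 130)]
result = engagement(hits)
print(result)
```

The bounce rate here is the share of visits with a single pageview, which is one way to put a number on the "pull up my front page and leave" reader.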
In my case, over 50% of my traffic comes from non-IE users (Firefox, Netscape, Safari, etc.), which Alexa doesn't count. So Alexa shows my traffic lower than it was a month ago, while Google Analytics and Sitemeter show it 50% higher. I wish one of those free services would implement an opt-in traffic rating and ranking system.
Having worked at IRI, a retail point-of-sale (POS) data tracking company founded by the CEO of Comscore, I can offer some insight from an industry that tried to crack this same kind of problem. Here are some thoughts for consideration:
When Nielsen and IRI historically posted retail POS data to their clients using statistical sample-projection methods, they had an ace in the hole: state and federally required information on total retail sales. So, invariably, they could push and pull individual sales data by product and store around to ensure the totals matched. They would use panels to overlay user details on retail sales to create a user-sales analysis that was an interesting add-in of 'color commentary' to the sales trends.
Over time, as Coke and Pepsi or WalMart and Target matched their own data against these 3rd party vendors, the truth showed the sample-projection methods to be inaccurate even at a national level, let alone for an item or store. This led the POS data industry to go after store-level data in as many cases as possible to collect all information. As individual retailers' internal data became more available at the store level, and as stores aggregated from regional to national, folks like WalMart found working with outside data vendors less valuable and lost interest in fully providing their data to benefit other retailers. The result was greatly reduced effectiveness of 3rd party POS reporting (and the admission that national analytics for retail POS weren't as accurate as clients had been led to believe).
What does this mean for web traffic sample-projection methods? Choosing an appropriate sample and making it large enough to get proper per-site analytics, down to the week or day, is a massive undertaking. While these methods can report an accurate portrait over longer periods (quarterly or annually), one should be wary of using this information at a level as detailed as weekly reports or smaller individual sites. As a parallel to Comscore's goals, even when combining user information with state/federal retail sales, statisticians in retail POS couldn't easily nail per-item sales per week, which is much the same frustration web analytics seekers face today.
Comscore appears to have a system in place that can provide very insightful information at a high level. Comscore's reporting [over longer time periods] combined with very strong in-house metrics are the makings of a robust analytic approach that looks across the horizon and focuses on a solid return on marketing investment within a specific business.
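To illustrate why panel projections hold up better for large sites than for small ones, here is a Python sketch of a textbook normal-approximation margin of error. It assumes the panel is a simple random sample of the population, which real toolbar or panel data is not, and every number in it is made up.

```python
import math

def projected_visitors(panel_size, panel_visitors, population, z=1.96):
    """Scale a panel observation up to the population.

    Returns (estimate, low, high) using a normal-approximation interval
    (z=1.96 is roughly 95%). Assumes a simple random sample.
    """
    p = panel_visitors / panel_size
    se = math.sqrt(p * (1 - p) / panel_size)
    low = max(0.0, p - z * se)
    high = min(1.0, p + z * se)
    return population * p, population * low, population * high

def relative_margin(panel_size, panel_visitors, z=1.96):
    """Half-width of the interval as a fraction of the estimate itself."""
    p = panel_visitors / panel_size
    se = math.sqrt(p * (1 - p) / panel_size)
    return z * se / p

PANEL = 100_000           # hypothetical panel size
POPULATION = 200_000_000  # hypothetical online population

big_site = relative_margin(PANEL, 20_000)   # seen by 20% of the panel
small_site = relative_margin(PANEL, 10)     # seen by 10 panel members
print(f"big site:   +/- {big_site:.1%}")
print(f"small site: +/- {small_site:.1%}")
print(projected_visitors(PANEL, 20_000, POPULATION))
```

Under these assumptions the large site's estimate is within a couple of percent while the small site's could be off by more than half, before any sampling bias or cheating is taken into account.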
"For ad-supported sites, it seems like the site’s own server stats should suffice. Measure and publish the actual stats, not a bunch of imperfect proxies."
I'm taking some time off coding rereading your comment. Tons of good stuff here.
If people would simply take site owners at their word, Jupiter Research, Alexa, Netcraft, etc. would not make much sense.
Just like a company reporting earnings, people want an audit of some kind by an independent. Consider that the SEC rules are a little more strict than just posting what you believe are accurate server stats.
The imperfection comes from the fact that these companies can not wiretap. There are privacy laws that say that an ISP can not simply record and publish statistics about their users. Jupiter research, Alexa, and others don't EVEN HAVE that level of access.
They typically pull a test case market, either via toolbar or by other means, and do some math to scale the results back up to the general population. Not only is this totally wrong, but as I wrote in #5, most companies that can afford to cheat do so to blow up their value. Especially Cali SF startups living on VC money and a prayer. They don't want to get yelled at in the next board meeting, you see.
"Sort of like in politics: the opinion polls stop mattering once the actual vote is held."
On the internet there is never a real vote held. There are no real metrics outside of stats held by ISPs, which will not be published.
AOL recently released ONLY search data:
http://www.securityfocus.com/brief/277
And it had people completely up in arms. Imagine if they had released their data from all their internet users as well? It would have meant disaster for them. Sadly, they and other ISPs are the only people that could semi-accurately do metrics like this.
BTW: I still have the AOL search data because I had downloaded it before they took it offline.
I've found them to be much more accurate than Alexa/Hitwise.
I still find that their stats are unsatisfying, though, and don't show engagement enough. Look at two sites, for instance, that have lots of traffic, but one has a lot more comments and a lot more outbound link clicky behavior and they won't show that.
Here's how they all fare in Alexa.
That's how I would expect the rankings to be, btw.
Or what about outbound links? Should a blogger be penalized for not doing as many links as another? Kottke just has a running list of links, and mostly of general interest and humor, so I wouldn't be surprised if he was generating more outbound traffic than you.
I don't think we're ever going to find a set of stats that are going to be truly satisfactory for everyone. Maybe we wise up and stop relying on ranking to determine our worth.
Or maybe we should just have a swimsuit competition to settle the whole thing.
Whichever. :)
I think we should expect better analytics in the future - our CTO at StyleDiary has an interesting perspective on this.
These sites only give a representation of the 'hitshare' for the user sample, that is, those users with the required tool installed.
It's not too far from how TV stations say how many people are watching their shows. They use incredibly small samples (in the thousands; see BARB in the UK) using special boxes in panel members' living rooms, where viewers are required to tap in which channel they are watching.
Our whole TV scheduling and commissioning efforts are based on these ridiculously low samples.
It's a similar problem with podcasts, where there are going to be fewer actual listens than downloads.
But then, I bet there's plenty of people who have a TV set on and leave the room. Hmmm.
This is a really interesting discussion for the whole media industry. Whatever the medium and platform.
It is actually fairly useful.
Also, if one 3rd party developer can create a suitable toolbar, others could as well, focusing on providing different data or functions.
I started this InteractionMetrics web site because, as Patricia points out, Ajax and other non-loading technologies hurt stats badly.
If you are interested in joining me, www.interactionmetrics.com - there is nothing there yet but a wiki platform but my hope is that we can create a real standard for metrics just as we have for HTML, for Microformats, for lots of other technologies. I hear people moan that the large agencies won't go with anything new. I say hogwash and believe the agencies will go with whatever is presented.
You have to remember, and it's probably not wise for me to say this, but many want to keep it the way it is. If we change, their ability to generate revenue from the current system may decrease.
Excellent post. We also do metrics for digital media consumption; it's not for websites or blogs, but for digital media (music, movies, books, television shows) consumed by people all around the world. There are no metrics for this, and we provide this information to music labels, movie studios, publishers, and advertisers for effective marketing, syndication and other purposes. It is interesting to see how the metrics demands and requirements are so different from the traditional approaches.
Anyways check out our website at http://divinitymetrics.com
Cheers,
Vishal
Are pageviews important? For advertisers, yes. But why is the public pre-disclosure necessary? If it's pageview-based, why not this:
1) Website posts their own metrics (they're likely to know better than 3rd party services anyway!).
2) Impressed advertiser goes, wow, 2 million pageviews a week! Great, we'll pay you $x for 2 million pageviews/week. If pageviews are reduced by more than 100,000, then we can get out of our contract with no penalties AND you'll owe us $y/CPM for the shortage.
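The contract terms in step 2 can be written out as a small calculation. This Python sketch assumes the refund applies to the whole shortage once the 100,000 tolerance is exceeded (the comment can be read that way, but it is an interpretation), and the refund rate standing in for $y is a made-up figure.

```python
def settle_week(promised, delivered, tolerance, refund_cpm):
    """Return (advertiser_may_exit, refund_owed) for one week.

    promised / delivered: pageviews for the week
    tolerance: shortfall allowed before the penalty clause applies
    refund_cpm: refund per thousand missing pageviews (the "$y" above)
    """
    shortage = max(0, promised - delivered)
    if shortage <= tolerance:
        return False, 0.0
    # CPM is priced per thousand impressions.
    return True, shortage / 1000 * refund_cpm

# 2 million promised, 1.85 million delivered, hypothetical $5 CPM refund.
missed = settle_week(2_000_000, 1_850_000, 100_000, 5.0)
# A 50,000 shortfall stays inside the tolerance.
within = settle_week(2_000_000, 1_950_000, 100_000, 5.0)
print(missed, within)
```

With terms like these the site's self-reported numbers only need to be good enough to sign the deal, since the contract corrects for any overstatement afterwards.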
Otherwise, aren't RESULTS more important? What's the quality of the mail service like? How many sales is the company making? How many new subscribers are they getting to their for-pay newsletter?
With ajax'd pages, the pageview and raw traffic numbers are, IMHO, simply a stupid metric in many cases. We need to get off an obsession with false quantifications ("Gimme numbers, any numbers!!!!!!1") and start caring more about the quality of the user experience, the power of the brand, the conversions, and so on.
Sorry, Hitwise. Sorry Compete. I just don't find your public stats to be all that useful in the overall scheme of things... even if they were 100% "accurate."
1. FeedBurner RSS reader stats, which I hate because they fluctuate significantly from day to day, but they do allow others to compare a variety of blogs displaying FeedBurner numbers.
2. Google Analytics/SiteMeter. I use these to measure my page impressions, visitors, etc., but they are not perfect, as previous commenters have said above.
3. Technorati Authority: My Technorati authority is 526 and Robert yours is 5,731 - this is a good measurement but not everyone displays their authority number.
4. Conversational Index is my personal favourite measurement. Being a blog, it should be a "naked" conversation, and Stowe Boyd recently talked about measuring the number of comments/posts. On my site my Conversational Index is around 4.8.
I guess on this blog it would be nearer 30+. Of course the CI number is self-published, and if it were to become an industry measurement then it would need to be trusted/audited.
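As a minimal sketch, the Conversational Index as described in this comment (comments divided by posts) could be computed like this in Python. Other formulations of the index exist, so the direction of the ratio is an assumption taken from the comment, and the comment counts are made up.

```python
def conversational_index(comment_counts):
    """Average number of comments per post.

    comment_counts: one entry per post published in the period, giving the
    number of comments that post received.
    """
    posts = len(comment_counts)
    return sum(comment_counts) / posts if posts else 0.0

# Five hypothetical posts with 3, 0, 12, 5 and 4 comments.
ci = conversational_index([3, 0, 12, 5, 4])
print(ci)
```

Because the inputs come from the blog's own database, anyone publishing the figure would need some form of outside audit before it could be trusted as an industry measure.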
I guess for most bloggers publishing their Technorati Authority or CI number would be sufficient, but for the few bloggers wanting to monetise their traffic with advertising, right now the only metric advertisers find interesting is page impressions to measure CPM.
I wrote a big post about it here: http://www.particls.com/blog/2007/05/life-after...
The only 'sure way' is to pass a governmental law requiring all ISPs/Networks to give up data, and then some special commission can be appointed to filter it out. Police state spyware, but hey, accurate ratings.
http://techfold.com/2007/04/03/how-google-can-k...
"Adding a “Sharing” option to Google Analytics and surfacing stats in “site:” searches (for those site owners who have elected the sharing option in their Analytics account) would do the job nicely. Let site owners control the degree of information shared, keep everything opt-in, and rock and roll. I know I’d share my high-level views & visits stats in a second. In addition to providing all of the value Alexa does, it would also add a layer of transparency to making ad-buys - something else I would appreciate."
Rex
ISPs hold probably the only data that could count. But what is their incentive?
I'm actually researching this now to justify a corporate initiative. I'm arguing to use a combination of the following:
Authority - based on links
Conversation Index - See Stowe Boyd's blog
Feed subscriptions - much like a conversion rate in sales
Am I nuts?
Regards,
Mark Krupinski
1. General Stats - The Alexas of the web will always be off because their methodologies rely on panel or toolbar data. The only way to shore it up is to create a better system that can prevent cheating...but this isn't going to happen soon.
2. Web Analytics - Obviously in this case there are a host of competitors ranging from free to really expensive and also varying wildly in their offerings. This category is the definitive source to track everything going on at your local server...but it's not going to solve the blogger's dilemma. And it's hard to get two systems to say the same number...but at the end of the day it doesn't matter since you're looking for trending and directional information for the most part.
3. Blog Analytics - I recently did a post on my blog about the complete void within the blog analytics space...since that post I've heard that Google will be releasing Measure Map, a blog-specific tool that seems quite powerful and takes a step beyond the current web analytics tools out there. The Measure Map UI was lifted for the new version of Google Analytics and is quite slick.
http://www.quantcast.com/facebook.com
Google Analytics. It takes a bit of time to tweak it, but it is all worth it in the end.