Questions about the accuracy of Website Stats



Harold Mansfield
06-17-2010, 10:50 AM
I've always had a problem believing the accuracy of traffic stats.
I never get an accurate reading with Google Analytics, and since I use WordPress a lot, the WordPress Stats also seem to be selective.

I figure my best chance for accuracy comes from my own server stats (Awstats) and not a 3rd party application.

The way I understand it:

Uniques: Are the number of unique IP addresses that have visited your site. Since hundreds or even thousands of people can be on the same IP address, it's not an actual measurement of visitors.

Visits: Are the number of actual browser requests to view your website. So, more of an accurate representation of actual people that have visited it.

Page Views: Are the number of different pages, per visit, totaled, that are accessed.

Hits: can be anything from bots, to SE spiders and can count any connection to the site, even if there are no real visitors behind it.

Is this an accurate way of deciphering your actual traffic or has something changed that now muddies the water in the way that you read stats?

Can I take these readings as gospel, or are there other factors that I need to consider that may not be giving an accurate assessment of real visitors?

vangogh
06-17-2010, 11:35 AM
First you're never going to get perfectly accurate stats even in your own server logs. There are different ways to capture the data and none are perfect. The different places you check stats are collecting the data in different ways, which is why they'll never quite match up. Think of your analytics more as relative measurements. It's not important to know you received 100 visits today. What's important is that last month you received 100 visits and then after your Facebook marketing campaign you saw a spike of 1,000 visitors on one day and the next two months showed 100 visits each instead of the 50 prior to the campaign.

Hits - Any requests made from a browser to the server. Your html page registers a hit, as does any image that page links to and any script that page links to. Hits are a completely useless metric.

Unique Visitors are attempts to measure visits by different people over a specific time period. You've already seen one issue in the IP. You're less likely to be dealing with thousands of people all hitting your site from the same IP than with two people living in the same house visiting the site. The time frame aspect is important and different analytics services might use a different time frame. If I visit your site today and then again in 6 months that's likely 2 unique visits. If I visit it now and in an hour it's likely not a unique visit.

Visits are a way to measure sessions. Usually the session is set at a half hour or an hour, but it could be anything. If I visit your site now, spend a few minutes before leaving, and then come back in 3 hours, it's likely 2 visits. If I leave and come back again in a few minutes it's likely seen as 1 visit. It depends on how long the session is considered to last. I think a half hour is the standard, but again it could vary.

Page Views are simply viewed pages. If I visit your home page and click to your about page and then your service pages, 3 page views should be recorded. Similar to hits, except only the request for the html is recorded.
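To make the four definitions above concrete, here is a rough Python sketch that counts each of them from the same handful of requests. The records are made up, the 30 minute session timeout is just the "half hour" mentioned above, and the rule for what counts as a page (a path ending in "/" or ".html") is an assumption for illustration, not how any particular stats package decides.

```python
from datetime import datetime, timedelta

# Assumption: a visit ends after 30 minutes without a request from that IP.
SESSION_TIMEOUT = timedelta(minutes=30)


def is_page(path):
    """Assumed rule: only HTML documents are pages; images and scripts are just hits."""
    return path.endswith("/") or path.endswith(".html")


def summarize(records):
    """records: list of (ip, timestamp, path) tuples, already parsed from a log."""
    hits = len(records)
    page_views = sum(1 for _, _, path in records if is_page(path))
    uniques = len({ip for ip, _, _ in records})

    visits = 0
    last_seen = {}  # ip -> time of that IP's previous request
    for ip, ts, _ in sorted(records, key=lambda r: r[1]):
        previous = last_seen.get(ip)
        if previous is None or ts - previous > SESSION_TIMEOUT:
            visits += 1  # first request, or the old session has timed out
        last_seen[ip] = ts

    return {"hits": hits, "page_views": page_views,
            "visits": visits, "uniques": uniques}


t0 = datetime(2010, 6, 17, 10, 0)
records = [  # made-up traffic: one visitor who returns after 3 hours, plus a second visitor
    ("10.0.0.1", t0, "/"),
    ("10.0.0.1", t0 + timedelta(seconds=1), "/logo.png"),
    ("10.0.0.1", t0 + timedelta(seconds=1), "/site.js"),
    ("10.0.0.1", t0 + timedelta(minutes=2), "/about.html"),
    ("10.0.0.1", t0 + timedelta(hours=3), "/services.html"),
    ("10.0.0.2", t0 + timedelta(minutes=5), "/"),
]
summary = summarize(records)
print(summary)
```

Six requests come out as 6 hits, 4 page views, 3 visits and 2 uniques, which is why the four numbers never line up with each other, and why two tools using a different timeout or a different idea of a "page" won't line up with each other either.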

mattbeck
06-17-2010, 01:10 PM
vangogh explained the different terms perfectly well.

That said, I thought it important to point out that there is a strong case to be made for JavaScript-based stats (like Google Analytics).

Basically it has to do with what information is exposed. Awstats and other log analyzers are accurate, but don't get all of the same information reported to them.

Some data that is really important (like screen resolution) isn't available in that sort of reporting.

Log analyzers also have a hard time distinguishing between multiple links on the same page, so if you have a button in your sidebar and a link in the footer that both go to your featured product page say, the server stats can't tell you which link was actually clicked on.

Some (not all) JavaScript stats applications can do really cool stuff with this information (see crazyegg.com for heatmaps).

Hope that helps?

Cheers,

Matt

Harold Mansfield
06-17-2010, 01:29 PM
Are there any variables that can account for false readings?

The reason that I ask is because I have a brand new site that is picking up traffic, but, there is no rhyme or reason for it and even more so, I can't trace the actual origin of the majority of what the stats say.

It is reporting links from pages where I cannot find any links, or that have no reason to be linking to me.

My inner voice is telling me that I am not getting accurate readings and something is affecting the numbers, but I can't figure out what that could be.

The domain was not previously owned, and I have not picked up any links that can be attributed to any more than 5%-10% of the traffic daily.
There is little to no SEO and it doesn't place well in the SERPs.

There is no reason for this site to have any traffic, yet it seems to be growing.
Could there be some other reason, such as, could someone have hacked it and be using it as some kind of proxy? I don't know, I'm guessing at answers.

Talking to my host, they also confirm that the site is not dormant and is a little busy.

Not looking a gift horse in the mouth, but something is not right.
Any ideas?

amir
06-17-2010, 03:07 PM
If you are comfortable reading raw log files, all the information you want is there. It's possible you're being crawled by unknown crawlers.

That said, if your webserver logging isn't set up correctly, you will not find the information you want there.
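As a rough illustration of what reading a raw log line involves, here is a small Python sketch that pulls the fields out of one line. It assumes the server writes Apache's "combined" log format; both sample lines are made up.

```python
import re

# Assumption: Apache "combined" format, i.e. the "common" fields
# followed by the quoted referrer and the quoted user agent.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it doesn't fit."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# Made-up line in combined format.
sample = ('203.0.113.9 - - [17/Jun/2010:10:50:00 -0700] '
          '"GET /about.html HTTP/1.1" 200 5120 '
          '"http://example.com/links.html" "Mozilla/5.0 (Windows; U)"')

# The same request logged in the shorter "common" format.
common_sample = ('203.0.113.9 - - [17/Jun/2010:10:50:00 -0700] '
                 '"GET /about.html HTTP/1.1" 200 5120')

entry = parse_line(sample)
print(entry["ip"], entry["path"], entry["referrer"], entry["agent"])
print(parse_line(common_sample))  # None: no referrer or user agent was logged
```

The second line is the kind of thing meant by logging not being set up correctly: if the server only writes the shorter format, the referrer and user agent were never recorded, so no analyzer can tell you where a request claims to have come from.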

I'll take a look at one day's worth of log files for you if you want (no charge of course).

Send me a private message with the details if you like.

Amir

Harold Mansfield
06-17-2010, 03:15 PM
Already did that and they seem to match up with Awstats.

There are of course some unknown crawlers but they only visit a few times a week. I traced most of them back to Yahoo, and some smaller search engines.

Bot crawls wouldn't show up in Visits or Page Views, would they? They are logged separately anyway.

vangogh
06-17-2010, 03:59 PM
That said, I thought it important to point out that there is a strong case to be made for JavaScript-based stats (like Google Analytics)

Good point Matt. I think the best analytics use a combination of both JavaScript and server logs since each has its own strengths and weaknesses. That's not always possible though since you'd have to have part of the system on your server to get the logs. If I had to choose one over the other I'd probably lean toward the JavaScript side since it can provide more information. I think the server logs are probably more accurate in what they do show.


Are there any variables that can account for false readings?

There are lots of different things that can account for false readings. Since the site is new I take it there isn't a ton of traffic yet. The less traffic the more likely any data point you look at is off. One or a few people visiting can easily skew everything. Until you have enough data, that data isn't so meaningful, and the best use of analytics isn't the absolute numbers. It's the comparison of the numbers over time.


It is reporting links from pages where I cannot find any links, or that have no reason to be linking to me.

There are lots of spam tactics to make this happen. It's called referrer spam. The person spamming you is hoping either you'll click on the link in your stats to find out who's linking to you or that your stats pages will be publicly visible. The referrer URL is faked in this kind of spam to promote the site that shows up as the referrer. That might be what's accounting for the false readings you're seeing.
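One way to see how much of this you have is to tally the referrer field by site and look at the outside ones by hand. A minimal Python sketch, assuming the referrer strings have already been pulled out of the log; the host names in OWN_HOSTS and the sample referrers are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical: your own site plus any sister blogs you expect links from.
OWN_HOSTS = {"myblog.example", "network.example"}


def external_referrers(referrers):
    """Count requests per referring host, skipping your own hosts and empty referrers."""
    counts = Counter()
    for ref in referrers:
        host = urlparse(ref).hostname  # None for "-" or an empty string
        if host and host not in OWN_HOSTS:
            counts[host] += 1
    return counts


referrers = [  # made-up values of the referrer field from a log
    "http://network.example/post-1",
    "http://cheap-pills.example/",
    "http://cheap-pills.example/buy",
    "-",
    "http://search.example/?q=stats",
    "",
]
counts = external_referrers(referrers)
for host, n in counts.most_common():
    print(host, n)
```

This only produces a list to check. A host that shows up as a referrer but has no page that actually links to you is a good candidate for referrer spam, and its requests can be left out when you read the numbers.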

Harold Mansfield
06-17-2010, 04:12 PM
There are lots of different things that can account for false readings. Since the site is new I take it there isn't a ton of traffic yet. The less traffic the more likely any data point you look at is off. One or a few people visiting can easily skew everything. Until you have enough data, that data isn't so meaningful, and the best use of analytics isn't the absolute numbers. It's the comparison of the numbers over time.
It's actually just the opposite. There is more traffic than I think there should be at this point.



There are lots of spam tactics to make this happen. It's called referrer spam. The person spamming you is hoping either you'll click on the link in your stats to find out who's linking to you or that your stats pages will be publicly visible. The referrer URL is faked in this kind of spam to promote the site that shows up as the referrer. That might be what's accounting for the false readings you're seeing.

How do you get rid of those?

Added: Most of the referrer links are from within, from another blog on the network. There are about 4 that shouldn't be there. I have also considered that maybe I actually do have traffic here, but nothing else reflects that...sign ups, comments, ad clicks. Nothing. That's why I'm suspicious and need to determine if the logs are true and I need to make some changes to the site, or they aren't.

vangogh
06-17-2010, 04:50 PM
There is more traffic than I think there should be at this point

My point is that until you're getting more real traffic small things can skew the results. For one, Awstats doesn't really filter out robotic traffic, only the robots that it knows about or that identify themselves as robots. It's one reason you want to use JavaScript-based analytics.

I think there are some ways to block referrer spam. Not sure off the top of my head, but a search should pull up some solutions. You could also switch to Google Analytics or another JavaScript-based analytics package. Also the issue will go away as you get more real traffic since the spam will make up a smaller % of the stats and not skew the numbers as much.

Harold Mansfield
06-17-2010, 09:49 PM
My point is that until you're getting more real traffic small things can skew the results.

Well that's the conundrum (yay spell check!) I need to be sure that what I am seeing now is either correct or incorrect to even know the difference.




For one, Awstats doesn't really filter out robotic traffic, only the robots that it knows about or that identify themselves as robots. It's one reason you want to use JavaScript-based analytics.


Actually Awstats separates bots and spiders from regular traffic stats.


Robots shown here gave hits or traffic "not viewed" by visitors, so they are not included in other charts. Numbers after + are successful hits on "robots.txt" files.


I can say that over the last few days other areas have grown as well, not just the numbers.
Search Phrases, Keywords, Duration of Visits, Types of Browsers, and so on. So everything is growing, not just the number of visits.

Most people wouldn't complain, but I've started many, many, many sites and none have come out of the gate like this.

I guess I just want to rule out any possible anomalies so that I can actually believe it.

vangogh
06-18-2010, 12:52 AM
Awstats separates bots and spiders from regular traffic stats.

Only the robots they know about or that identify themselves as robots. If I created a robot and sent it to your site, Awstats wouldn't be able to distinguish it from a real visitor. It's the way server logs work. They know to separate bots from Google and Bing and others that have registered themselves. The bots from those referrer spams though are seen as regular visits. One of the reasons for preferring JavaScript stats is that robots don't trigger the JavaScript, so they don't get counted.
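A rough Python sketch of why that is: a log analyzer can only go by the user agent string the client sends. The marker list below is illustrative only and is not the list Awstats actually uses; the user agent strings are typical examples.

```python
# Illustrative markers only; not taken from any real analyzer's robot list.
KNOWN_BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def looks_like_known_bot(user_agent):
    """True if the user agent contains one of the known robot markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in KNOWN_BOT_MARKERS)


google = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
yahoo = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
# A script can send whatever user agent it likes, including a browser's.
spoofed = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/3.6"

for ua in (google, yahoo, spoofed):
    label = "robot" if looks_like_known_bot(ua) else "visitor"
    print(label, "-", ua)
```

The third request would be counted as a visitor even if a script sent it, because nothing in the log line says otherwise.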

Spider
06-18-2010, 09:55 AM
FWIW - why not attach a third party stat code to your webpages to corroborate (or otherwise) your server stats?

I use Statcounter.com -- Stats are provided for -
Summary
Popular Pages
Entry Pages
Exit Pages
Came From
Keyword Analysis
Recent Keyword Activity
Recent Came From
Search Engine Wars
Exit Links
Exit Link Activity
Downloads
Download Activity
Visitor Paths
Visit Length
Returning Visits
Recent Pageload Activity
Recent Visitor Activity
Recent Visitor Map
Country/State/City/ISP
Browsers
System Stats
Lookup IP Address
Download Logs

They have a free version, that I believe includes all the stats stated above but only to the last 500 pageloads. I pay for an expanded version. For temporary checking purposes, the free version would probably be enough for you.

Harold Mansfield
06-18-2010, 11:17 PM
I used Statcounter in the past. Wasn't too crazy about it and I'm past the 500 page views at this point. Average for the month so far (from 6/4 to today) is just over 4700 page views a day on an average of 548 visits. The last 2 days have gone over 1k visits, with 11k and 9k page views respectively.

I have much older blogs on Awstats (at least 4) that are pretty much right on with the traffic stats. No surprises.

Others on Awstats that are sitting in limbo and shouldn't have traffic and it shows that they don't.

All were built, SEO'ed (or lack of) the exact same way.

I have GA, WP Stats (which is worthless) and Awstats on this particular blog.
I went over it again with my host and they assured me that this is actual traffic.

Can't really trust Google Analytics because it shows another blog as having 1 visitor every few days and affiliate clicks and sales tell me that, that is ridiculously off.

Someone on another forum told me that I couldn't trust Awstats to be accurate, but on my other sites, it is pretty dead on.

I can see the navigation, most viewed pages, average time spent on the site per visit..which was lopsided at first, now it's starting to balance out to around 14 minutes per visitor...I'm pretty sure if it were some kind of bot field day, average time would all be in the 0-30 sec range.

I mean from what I can tell, it all looks like a duck.
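On the average time per visit point above, one thing worth checking is how the visit lengths are spread out, since an average on its own can be pulled up by a few very long visits. A small Python sketch with made-up durations; the bucket boundaries are an assumption, loosely modeled on the kind of ranges a stats package shows.

```python
from statistics import mean, median

# Assumed bucket boundaries in seconds.
BUCKETS = [(30, "0-30s"), (120, "30s-2min"), (300, "2-5min"), (900, "5-15min")]


def bucket(seconds):
    """Return the label of the first bucket the duration fits in."""
    for limit, label in BUCKETS:
        if seconds <= limit:
            return label
    return "15min+"


# Made-up visit durations in seconds: mostly very short, plus a few long ones.
durations = [5, 8, 12, 20, 25, 400, 3600, 3900]

histogram = {}
for d in durations:
    label = bucket(d)
    histogram[label] = histogram.get(label, 0) + 1

print(histogram)
print("mean:", mean(durations) / 60, "minutes")
print("median:", median(durations), "seconds")
```

In this made-up set the average is over 16 minutes even though five of the eight visits lasted under 30 seconds, so the per-range breakdown of visit duration says more about whether the traffic is real than the average does.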

SteveC
06-27-2010, 10:43 PM
The only real stat that you should be interested in, is how many sales or how many enquiries your website generates.... the rest really doesn't matter.

Just my opinion of course.

vangogh
06-28-2010, 12:40 AM
Nice to see you again Steve.

Ultimately I agree with you. It's sales that count, however there are stats leading up to sales that I think are also valuable as a way to improve sales. For example if you're measuring what happens inside your checkout process and you notice a high % of people are leaving at the same page prior to finalizing the purchase, it can let you know to take a look at the page and see if you can figure out why people are abandoning the checkout.

I'm also interested in increasing subscribers to my blog so I have things set up to let me know where subscribers are coming from as a way to identify sites that make for good candidates for a guest post or for some other kind of marketing.

You do want to tie your stats into some measurable conversion, ideally a sale though.

nealrm
06-28-2010, 02:36 PM
Steve,
While that is the goal, discounting everything else is not a good idea. Those stats give you hints on how to achieve that goal. They also give you hints on how your advertising is working, and which pages are meeting customer needs and which are not.