Xepher.Net Forums

Xepher.net => Hosting Q&A => Topic started by: tapewolf on January 21, 2009, 04:33:56 PM

Title: Statistics
Post by: tapewolf on January 21, 2009, 04:33:56 PM
Is there any provision for website statistics at the moment?  A search wasn't too helpful as the system seems to have changed a lot in that regard.
While traffic is interesting, I'm particularly interested in tracking 404s, as I've found a lot of bad links that way.
Title: Re: Statistics
Post by: Miluette on January 22, 2009, 10:55:20 AM
I don't think there's stat tracking here at Xepher, but I use Statcounter. :D It's especially nice since it increased its visit log from 100 to 500 for new counters... I actually deleted and recreated some of my old counters just for that. There's a lot of stuff you can track with it, a whole lot, except 404s. There's probably a way to do it with Statcounter, but I wouldn't know. I tend to find most of my broken links myself.
Title: Re: Statistics
Post by: Databits on January 22, 2009, 01:43:32 PM
It'd be up to Xepher, but you could always ask him to set up awstats. It's a pretty widely used stats tracker.

The problem is, it requires a cron job to build the stats every so often from the logs. It also only really works if the logs are set up for it.
Title: Re: Statistics
Post by: tapewolf on January 22, 2009, 01:55:05 PM
Quote from: Databits on January 22, 2009, 01:43:32 PM
It'd be up to Xepher, but you could always ask him to set up awstats. It's a pretty widely used stats tracker.
AWstats is what I'm using at the moment, actually.  I'll ask him about it.
Title: Re: Statistics
Post by: Xepher on January 22, 2009, 03:30:35 PM
Yeah, sorry, my documentation is pretty horrid at the moment, but there is a script already in place to do exactly what you're asking, using analog. The post about it was here http://xepher.net/forum/index.php?topic=437.msg6963#msg6963 but the gist of it is this.

Log in via SSH and run the command "generate-web-statistics" and then visit http://username.xepher.net/webstats/ (using your actual username). It'll take a minute or two for stats to run, and it only has access to the current log files (e.g. this month). I'm afraid the analog output may not be too useful for tracking 404s, though. The log files are publicly readable and located in /var/log/apache2/. The format is a bit strange, as I had to set them up to be parseable for the virtualhosts, but all the data from a normal CLF is there.

Oh, and if anyone is curious, overall server stats are at http://xepher.net/stats/xepher-net/

I'm open to new suggestions for how to handle statistics, as the current system (logging included) was set up many years ago, and all the analyzers I looked at were pretty lame back then. I see better things from AWstats these days, and I've been considering trying it out again, but I'm looking for something that can sort by virtual hosts if possible.
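
For anyone who wants to pull 404s straight out of the raw logs, here is a minimal Python sketch. It assumes plain Combined Log Format, which is NOT what the server's vhost-aware logs use, so the pattern would need adjusting to the real layout. The sample lines, hosts, and paths are made up for illustration.

```python
import re
from collections import Counter

# Assumes standard Combined Log Format. The server's custom vhost format
# would need a different pattern.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def count_404s(lines):
    """Return a Counter mapping (path, referer) to the number of 404 responses."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match.group("status") == "404":
            misses[(match.group("path"), match.group("referer") or "-")] += 1
    return misses

# Hypothetical sample data, not taken from any real log.
SAMPLE = [
    '203.0.113.5 - - [12/Feb/2009:12:00:01 +0000] "GET /old.htm HTTP/1.1" 404 512 "http://example.com/links" "Mozilla/5.0"',
    '203.0.113.5 - - [12/Feb/2009:12:00:02 +0000] "GET /index.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [12/Feb/2009:12:05:00 +0000] "GET /old.htm HTTP/1.1" 404 512 "http://example.com/links" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (path, referer), hits in count_404s(SAMPLE).most_common():
        print(hits, path, referer)
```

Grouping by referer as well as path makes it easier to see which outside page carries the bad link.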
Title: Re: Statistics
Post by: tapewolf on January 22, 2009, 03:33:42 PM
Okay, thanks - I'll look into that later tonight.
Title: Re: Statistics
Post by: Miluette on January 23, 2009, 04:35:24 AM
Quote
bash: generate-web-statistics: command not found
I think I'm doing it wrong.
Title: Re: Statistics
Post by: fesworks on January 23, 2009, 05:30:46 AM
Quote from: Xepher on January 22, 2009, 03:30:35 PM

Oh, and if anyone is curious, overall server stats are at http://xepher.net/stats/xepher-net/


The first referring site is just ... odd... anonib.com ? huh?

As usual, DMFA is a big search topic.
Title: Re: Statistics
Post by: tapewolf on January 23, 2009, 11:31:25 AM
Quote from: Senshuu on January 23, 2009, 04:35:24 AM
Quote
bash: generate-web-statistics: command not found
I think I'm doing it wrong.
I used tab-completion so I don't remember what the command was, but it did the job (though it reminded me why I stopped using Analog).
Title: Re: Statistics
Post by: Xepher on January 23, 2009, 05:22:10 PM
Try the full path for the command if it's not working

/usr/local/bin/generate-web-statistics

Title: Re: Statistics
Post by: Miluette on January 26, 2009, 07:35:49 PM
Quote
Command not supported by this protocol

D: lol I don't know anything about this.
Title: Re: Statistics
Post by: Xepher on January 26, 2009, 08:10:02 PM
I'm becoming more and more convinced that I need to redo the entire statistics system here. I'm thinking about splitting logfiles up on a per-user basis as well, which will make it harder to get overall traffic reports, but will make site-by-site statistics a lot easier.

In the meantime, I have NO idea where you got that error from. You should be using Putty (or another SSH client) and running that command from the bash prompt.
Title: Re: Statistics
Post by: fesworks on January 26, 2009, 08:14:49 PM
If you are using WinSCP, then go to COMMANDS > OPEN TERMINAL (also CTRL+T)... it has a little icon that looks like a mini DOS window with "NOM" or something in it.

Using this and the full path (/usr/local/bin/generate-web-statistics) worked for me... I mean, it seemed like it froze, but it said it would be several minutes....

My thing didn't have Putty.exe to run Putty... I think you told me to download it before, but that was on my old computer.
Title: Re: Statistics
Post by: Miluette on January 27, 2009, 12:26:32 AM
Ohh, I see. The first time I error'd, I was using Cyberduck like I normally do on my mac. The last one I was in Filezilla, what I use on my PC.

I didn't want to download anything new, lol. Oh well.
Title: Re: Statistics
Post by: Databits on January 27, 2009, 12:53:27 PM
You may also look into Google Analytics.
Title: Re: Statistics
Post by: sagebrush on February 05, 2009, 07:10:19 AM
I think AWstats was what I had at Startlogic, and it gave a lot of useful information, like unique IP addresses and links to wherever traffic was coming from.  Lunarpages uses Webalizer, because AWstats "uses more resources."  Webalizer isn't that impressive, though.  I ran the stat program on here, and it's okay but not particularly useful.  My Project Wonderful numbers are HUGE compared to my Google Analytics numbers, so I would like a third "opinion" on how many people are actually hitting my site.
Title: Re: Statistics
Post by: Xepher on February 05, 2009, 03:45:35 PM
I'll try to redo the stats system here in the not too distant future. It looks like AWStats requires a log item that the current logs don't have, though, so I have to change the output format for the server first. I'll work on it some this "weekend" (Monday/Tuesday for my schedule).
Title: Re: Statistics
Post by: fesworks on February 05, 2009, 03:54:38 PM
Quote from: sagebrush on February 05, 2009, 07:10:19 AM
My Project Wonderful numbers are HUGE compared to my Google Analytics numbers, so I would like a third "opinion" on how many people are actually hitting my site.

with bats...
Title: Re: Statistics
Post by: Databits on February 05, 2009, 09:48:55 PM
Quote from: sagebrush on February 05, 2009, 07:10:19 AM
Lunarpages uses Webalizer, because AWstats "uses more resources."

Webalizer is absolutely worthless for anyone who's serious about watching their site's stats.

Apparently, Lunarpages admins don't have a damn clue what they are talking about. Granted, AWStats has more detailed statistics, but both it and Webalizer require a cron job to keep up-to-date listings, and both process the same damn log files. Leave it to stupid admins to make a call like that... LOL
Title: Re: Statistics
Post by: tapewolf on February 06, 2009, 01:56:47 AM
I'd be most grateful if we can have AWstats :P
Title: Re: Statistics
Post by: fesworks on February 06, 2009, 05:29:01 AM
I use Go Stats

http://gostats.com

not sure about accuracy, but I put the code on every single page. (well, I have my site wordpressed, so that part is easy now)
Title: Re: Statistics
Post by: Xepher on February 10, 2009, 11:10:29 AM
Alright, I've set up awstats and separate logfiles for each user now. There's a link to "View Statistics" when logged in to user-services, or you can go directly to http://xepher.net/awstats/awstats.pl?config=username

This took WAY too long to set up and automate, but it's finally done. Stats will be updated nightly, and shouldn't be limited to the current month the way the old ones were. However, as these are starting from scratch, there's only data for the past few hours so far.
Title: Re: Statistics
Post by: sagebrush on February 10, 2009, 12:34:00 PM
Ooh, thank you.
Title: Re: Statistics
Post by: Miluette on February 10, 2009, 02:49:50 PM
Hey, neat! Thanks! %D This is fun.
Title: Re: Statistics
Post by: tapewolf on February 10, 2009, 05:34:07 PM
Excellent!  Thanks!
Title: Re: Statistics
Post by: fesworks on February 10, 2009, 05:47:41 PM
Very Excellent :D

Thanks!
Title: Re: Statistics
Post by: tapewolf on February 12, 2009, 12:42:51 PM
Excellent... AWstats has already shown me one 404 (leftovers from .htm -> .php).
Title: Re: Statistics
Post by: Miluette on February 13, 2009, 12:52:06 AM
Apparently I'm getting a lot of 404s because there're people still linking to the wrong homepage/archive for Millennium now. I made a note of the change on the error page, hehe.

I don't get some of the numbers on this stat thing. Some of them just seem...well, too high lol.
Title: Re: Statistics
Post by: Databits on February 13, 2009, 05:45:10 AM
Generally a better solution for those cases is a 301 permanent move redirect to the proper page. It's also extremely useful when trying to salvage your search indexes after a large-scale change.
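
As a rough illustration of the 301 approach, the Python sketch below prints one Apache mod_alias "Redirect 301" line per renamed page, the kind of line that would go in an .htaccess file. It assumes the site is served by Apache and that .htaccess overrides are allowed for the account, which nobody in this thread has confirmed. The page names and base URL are hypothetical.

```python
def redirect_rules(renamed, base_url):
    """Build one mod_alias 'Redirect 301' line per renamed page.

    renamed maps old URL paths to new URL paths; base_url is the site root
    without a trailing slash. Both are hypothetical example inputs.
    """
    rules = []
    for old_path, new_path in sorted(renamed.items()):
        # A full target URL is used so the line also suits older Apache versions.
        rules.append(f"Redirect 301 {old_path} {base_url}{new_path}")
    return rules

if __name__ == "__main__":
    # Hypothetical pages renamed from .htm to .php.
    moved = {"/archive.htm": "/archive.php", "/index.htm": "/index.php"}
    for rule in redirect_rules(moved, "http://username.xepher.net"):
        print(rule)
```

Unlike a meta-refresh page, a 301 tells browsers and search engines in the response itself that the move is permanent.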
Title: Re: Statistics
Post by: fesworks on February 13, 2009, 06:10:35 AM
I need some clarification on "Visits", "Pages", and "Hits"... as in... what's the difference? I'm going to guess "Sessions", "Uniques", and "Hits"? ... or?
Title: Re: Statistics
Post by: tapewolf on February 13, 2009, 10:48:29 AM
As far as the 404s go, these are rather interesting:

/images/trans.gif   19   http://www.project-future.org/
/java/prototype.js   19   http://www.project-future.org/
/java/scriptaculous.js   19   http://www.project-future.org/

Any idea what this is about?  My guess would be some worm looking for common vulnerabilities.  I've never heard of them before and unless I'm much mistaken there are no references to them in any of the code.
Title: Re: Statistics
Post by: Databits on February 13, 2009, 12:55:53 PM
In those cases I'd just ignore them.
Title: Re: Statistics
Post by: Xepher on February 13, 2009, 04:28:48 PM
Hits are actual requests... every line of the logfile. Pages are only hits that are for a page (html or php) and don't include images, javascript, css, etc. Visits is... poorly defined.

Actually, forget having me explain it... take it straight from the horse's umm... we'll go with "mouth" :-)

http://awstats.sourceforge.net/docs/awstats_glossary.html
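
The hits-versus-pages split described above can be sketched in a few lines of Python. This is a simplification for illustration only: the extension list is an assumption, not AWStats' actual configuration, and the glossary linked above remains the authority on how it really counts.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions treated as non-page resources. An illustrative list only,
# not the one AWStats itself uses.
NOT_PAGE_EXTENSIONS = {".css", ".js", ".gif", ".jpg", ".jpeg", ".png", ".ico"}

def count_hits_and_pages(paths):
    """Return (hits, pages) for an iterable of requested URL paths."""
    hits = 0
    pages = 0
    for path in paths:
        hits += 1  # every request is a hit
        # Drop any query string before looking at the file extension.
        extension = posixpath.splitext(urlsplit(path).path)[1].lower()
        if extension not in NOT_PAGE_EXTENSIONS:
            pages += 1  # only non-resource requests count as pages
    return hits, pages

if __name__ == "__main__":
    # Hypothetical requests: one page view that pulls in a stylesheet and an image.
    sample = ["/comic.php?id=3", "/style.css", "/images/trans.gif"]
    print(count_hits_and_pages(sample))
```

This is why a single page view usually shows up as several hits but only one page.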
Title: Re: Statistics
Post by: Miluette on February 13, 2009, 11:35:09 PM
Quote from: Databits on February 13, 2009, 05:45:10 AM
Generally a better solution for those cases is a 301 permanent move redirect to the proper page. It's also extremely useful when trying to salvage your search indexes after a large-scale change.

Aaand now I've done that, since they're better than the refresh redirect thingy I used to use. Yay!

Apparently I've gotten a few "hits" from someone using IE2. I'm not sure what to think of this.
Title: Re: Statistics
Post by: tapewolf on February 14, 2009, 10:55:42 AM
Quote from: Senshuu on February 13, 2009, 11:35:09 PM
Apparently I've gotten a few "hits" from someone using IE2. I'm not sure what to think of this.
I seem to recall seeing CP/M users browsing mine.  While there might be some people masochistic enough to use IE2, at this point in time I'd guess they're either doing it for fun, or spoofing the user agent tag.
Title: Re: Statistics
Post by: Databits on February 14, 2009, 01:07:32 PM
Or they're just people who are incapable of upgrading and are STILL somehow using i286/i386 systems with Win 3.1. :P

In either case, I wouldn't care at all.
Title: Re: Statistics
Post by: fesworks on February 14, 2009, 03:17:20 PM
Looks like it only counts unique visitors for each month, instead of daily. I hope that it actually counts only uniques throughout the month, and doesn't just keep adding up daily uniques. I've been skeptical about some of the data so far. The stats have been counting for 5 days so far, and I already have 500 uniques? That sounds a tad high... granted I have more than one "site" on my account, but it still seems high.

Oh well. In any case, this is still a really cool and fairly comprehensive 2nd look (or should this now be my FIRST look) at site stats :)
Title: Re: Statistics
Post by: Xepher on February 14, 2009, 04:38:08 PM
I've updated the stats URLs so they work a little more cleanly now. You should be able to visit http://username.xepher.net/awstats/ for any username. Likewise, http://domain.com/awstats/ if you have a domain linked here.
Title: Re: Statistics
Post by: Miluette on February 15, 2009, 03:17:55 AM
Quote from: fesworks on February 14, 2009, 03:17:20 PM
Looks like it only counts unique visitors for each month, instead of daily. I hope that it actually counts only uniques throughout the month, and doesn't just keep adding up daily uniques. I've been skeptical about some of the data so far. The stats have been counting for 5 days so far, and I already have 500 uniques? That sounds a tad high... granted I have more than one "site" on my account, but it still seems high.

Oh well. In any case, this is still a really cool and fairly comprehensive 2nd look (or should this now be my FIRST look) at site stats :)

This. It doesn't seem like I should have gotten 1000 uniques in 5 days, even though I have several different sites and I am getting slightly more traffic overall lately. But I also have a bad sense of math in my head... I hope this is true, anyway. %D

Also, it links to a bunch of my pages incorrectly, like in the top 10 URL section. Files in, say, /lf/, or even lf/pages/, are linked as if they're in the root folder, which is wrong. But it's not an issue, just weird. :B
Title: Re: Statistics
Post by: Xepher on February 15, 2009, 03:44:30 PM
AWstats logs the requests as received from the browser, not after the apache rewrites and vhosts have done their magic to map things to the actual file system layout. As for the unique hits... it is "true" insofar as that goes, but don't think that every unique visitor actually means a human viewing your page. ANY request from some IP that's not a known bot causes that to go up. Keep in mind a lot of search engines put preloading in their search results, meaning the top few results get preloaded by compatible browsers even if the user never clicks or actually views them.

That said, awstats is one of the foremost stats systems out there, and its "truth" is pretty widely accepted as fact in most circles when you talk about such numbers. Sure, it's not real, but it's as real as anyone else's claim to have "X many unique visitors" is.
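
To make the "any IP that's not a known bot" idea concrete, here is a small Python sketch of that counting rule. It is a guess at the general shape based only on the description above, not AWStats' real logic, and the bot marker list is a tiny hypothetical stand-in for a real robots database.

```python
# Substrings that mark a user agent as a known bot. Illustrative and far
# from complete.
KNOWN_BOT_MARKERS = ("googlebot", "slurp", "msnbot")

def unique_visitors(requests):
    """Count distinct IPs among (ip, user_agent) pairs, skipping known bots."""
    seen = set()
    for ip, agent in requests:
        if any(marker in agent.lower() for marker in KNOWN_BOT_MARKERS):
            continue  # known bots do not count as visitors
        seen.add(ip)  # repeat requests from the same IP count once
    return len(seen)

if __name__ == "__main__":
    # Hypothetical requests: one repeat visitor, one bot, one other visitor.
    sample = [
        ("203.0.113.5", "Mozilla/5.0"),
        ("203.0.113.5", "Mozilla/5.0"),
        ("198.51.100.7", "Googlebot/2.1"),
        ("192.0.2.9", "Mozilla/4.0"),
    ]
    print(unique_visitors(sample))
```

Note that a preloading browser or an unlisted crawler still lands in the set, which is one reason the number runs higher than the count of actual readers.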
Title: Re: Statistics
Post by: Miluette on February 15, 2009, 04:37:16 PM
Quote from: Xepher on February 15, 2009, 03:44:30 PM
Keep in mind a lot of search engines put preloading in their search results, meaning the top few results get preloaded by compatible browsers even if the user never clicks or actually views them.

That would explain a lot!
Title: Re: Statistics
Post by: fesworks on February 15, 2009, 05:19:11 PM
Well, it's probably good to look at Uniques and Pages together. The closer they are together, I'd say the more "regulars" you have, or actual visits (visiting 1-2 pages and leaving indicates someone coming to see an update, specific page... or at the very least a click by a stranger, not liking what they see, and leaving :P )

The more pages viewed you have, in relation to uniques, should indicate a visitor looking through your archives, or at least clicking around your site. Regulars will already have clicked on everything, unless you updated a page somewhere.

At least that's my interpretation. Like when I check my GoStats after a big Project Wonderful ad campaign, my uniques go up, but my hits go WAY up.
Title: Re: Statistics
Post by: griever on February 16, 2009, 12:37:57 AM
Thanks!  StatCounter, which I've been using, doesn't show image leechers.  Time to do something about that.