While there are some fine programs to analyze your Apache logs, you can extract some basic information with bash commands. These are some real-life examples I used after I put pictures I took online. Those pictures were taken at the traditional “Narrensuppe” (Jester’s Soup), which in our town kicks off the carnival activities. I put them online because many people asked me if they could have them. So now I want to know if they got them… :)
This number is only roughly the number of ‘people’ visiting the site, because it counts distinct IP addresses. Several people might be watching the photos on the same screen, but they are counted only once. On the other hand, people on dial-up might get counted more than once if they get a different IP address each time they connect. So, while the exact number is hard to measure, this is still interesting:
cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log | cut -d- -f1 | sort | uniq | wc -l
‘cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log’ reads the logfile and starts the pipe.
‘cut -d- -f1’ splits the line at “-” signs (“-d-”) and takes the first part (“field”, “-f1”), which happens to be the IP address of the visitor.
‘sort’ sorts the IP addresses, and ‘uniq’ suppresses repeated lines. uniq only compares neighbouring lines, which is why the sort has to come first.
Finally, ‘wc -l’ counts the number of lines. (Some 100 visitors after 3 days. Not bad for a crowd of 130 Jesters!)
p15156159:~ # cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log | cut -d- -f1 | sort | uniq | wc -l
100
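A slightly more defensive variant, as a sketch only: splitting at “-” would also cut a logged host name that itself contains a hyphen, so this version splits at the first blank instead and lets ‘sort -u’ do the work of ‘sort | uniq’. The function name count_visitors and the three sample log lines are made up for illustration; they are not from the real logfile.

```shell
#!/bin/sh
# count_visitors is a hypothetical helper name, not part of the original commands.
count_visitors() {
    # Field 1 (split at blanks) is the client address in the log format shown in this article.
    # sort -u sorts and drops duplicate lines in one step.
    cut -d' ' -f1 "$1" | sort -u | wc -l
}

# Made-up sample log so the sketch runs without a real server log.
sample_log=$(mktemp)
trap 'rm -f "$sample_log"' EXIT
cat > "$sample_log" <<'EOF'
192.168.141.114 - - [06/Feb/2005:15:57:30 +0100] "GET /NSuppe2005-252.jpg HTTP/1.1" 200 980809 "-" "-"
192.168.141.114 - - [06/Feb/2005:15:58:02 +0100] "GET /NSuppe2005-245.jpg HTTP/1.1" 200 880123 "-" "-"
192.168.141.115 - - [06/Feb/2005:16:01:11 +0100] "GET /NSuppe2005-245.jpg HTTP/1.1" 200 880123 "-" "-"
EOF

# Two different addresses appear in the sample, so two visitors are counted.
count_visitors "$sample_log"
```

On a real system the function would be called with the path of the access log instead of the sample file.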
Ok, apache-mirror, tell me who is the most beautiful Jester here!
cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log | grep "GET /NSuppe" | cut -d] -f2 | cut -d/ -f2 | cut -d' ' -f1 | sort |uniq -c | sort
‘cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log’ reads the logfile and starts the pipe. This is how such a line looks:
192.168.141.114 - - [06/Feb/2005:15:57:30 +0100] "GET /NSuppe2005-252.jpg HTTP/1.1" 200 980809 "http://narrensuppe.krone-neuenburg.de/1.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)"
‘grep "GET /NSuppe"’ filters, so only relevant requests are counted. (Thumbnails are in a different directory, so they don’t match.) Now we have to extract the filename of the JPG. Some sed wiz would probably do this in one command, but I don’t know sed, so I have to use cut. I want to get to the first slash (“/”) after the date. ‘cut -d] -f2’ gets everything after the first “]” (there is no second “]” in this line). ‘cut -d/ -f2’ then gets the text between the first and the second slash that are still left (the slashes in the date are removed by the first cut). Come to think of it, I could have done this in one ‘cut -d/ -f4’, but it’s too late now :)
Now, this is left of the original line:
NSuppe2005-252.jpg HTTP
‘cut -d' ' -f1’ removes everything after the first blank. I had to quote the blank. Now, we are on the home stretch:
‘sort | uniq -c | sort’: The first sort prepares the list for the following uniq. The -c option tells uniq to count the occurrences of each line (here: image) and print the count in front of the line. The second sort takes care of the ranking.
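To see what each cut leaves behind, the stages can be replayed on the sample line from above. This is only a small sketch; the variable names (line, step1, step2, step3, short) are chosen for illustration.

```shell
#!/bin/sh
# The sample log line from this article, kept in a variable for illustration.
line='192.168.141.114 - - [06/Feb/2005:15:57:30 +0100] "GET /NSuppe2005-252.jpg HTTP/1.1" 200 980809 "http://narrensuppe.krone-neuenburg.de/1.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; DigExt)"'

# Stage by stage, in the same order as in the pipe.
step1=$(printf '%s\n' "$line"  | cut -d] -f2)     # text after the first "]"
step2=$(printf '%s\n' "$step1" | cut -d/ -f2)     # text between the next two slashes
step3=$(printf '%s\n' "$step2" | cut -d' ' -f1)   # text before the first blank

# Print the three intermediate results, one per line.
printf '%s\n' "$step1" "$step2" "$step3"

# The single-cut idea (-f4 on the whole line), followed by the same blank split.
short=$(printf '%s\n' "$line" | cut -d/ -f4 | cut -d' ' -f1)
printf '%s\n' "$short"
```

Both routes end at the bare filename, so the two-cut and the one-cut versions are interchangeable for lines shaped like this one.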
p15156159:~ # cat /var/log/apache2/narrensuppe.krone-neuenburg.de-access_log | grep "GET /NSuppe" | cut -d] -f2 | cut -d/ -f2 | cut -d' ' -f1 | sort | uniq -c | sort
1 NSuppe2005-268.jpg
1 NSuppe2005-270.jpg
[...]
8 NSuppe2005-287.jpg
8 NSuppe2005-294.jpg
8 NSuppe2005-400.jpg
10 NSuppe2005-247.jpg
11 NSuppe2005-248.jpg
11 NSuppe2005-252.jpg
12 NSuppe2005-263.jpg
20 NSuppe2005-245.jpg
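For comparison, here is a sketch of the same ranking with awk doing the extraction in one step. It assumes the log format shown above, where the seventh blank-separated field is the requested path. The function name rank_images, the sample log lines and the “/thumbs/” directory are made up for illustration. It also uses ‘sort -rn’, which compares the counts as numbers and puts the most requested image first; a plain sort compares text and relies on uniq padding the counts to the same width.

```shell
#!/bin/sh
# rank_images is a hypothetical helper name, not part of the original commands.
rank_images() {
    # $6 is the quoted request method ("GET), $7 is the requested path.
    # Only paths that start with /NSuppe are kept; the leading slash is removed.
    awk '$6 == "\"GET" && $7 ~ /^\/NSuppe/ { sub(/^\//, "", $7); print $7 }' "$1" |
        sort | uniq -c | sort -rn
}

# Made-up sample log so the sketch runs without a real server log.
sample_log=$(mktemp)
trap 'rm -f "$sample_log"' EXIT
cat > "$sample_log" <<'EOF'
192.168.141.114 - - [06/Feb/2005:15:57:30 +0100] "GET /NSuppe2005-245.jpg HTTP/1.1" 200 880123 "-" "-"
192.168.141.115 - - [06/Feb/2005:15:58:02 +0100] "GET /NSuppe2005-245.jpg HTTP/1.1" 200 880123 "-" "-"
192.168.141.116 - - [06/Feb/2005:16:01:11 +0100] "GET /NSuppe2005-245.jpg HTTP/1.1" 200 880123 "-" "-"
192.168.141.114 - - [06/Feb/2005:16:02:40 +0100] "GET /NSuppe2005-252.jpg HTTP/1.1" 200 980809 "-" "-"
192.168.141.114 - - [06/Feb/2005:16:03:05 +0100] "GET /thumbs/NSuppe2005-245.jpg HTTP/1.1" 200 4012 "-" "-"
EOF

# The thumbnail request does not start with /NSuppe, so it is not counted.
rank_images "$sample_log"
```

With the sample data, NSuppe2005-245.jpg comes out on top with three requests, and the thumbnail line is ignored.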
Created by stwaidele
Copyright (c) by the authors.
Prior to editing, authors agreed to license their contributions by the terms of the GPL.