Monday, 22 April 2013

Displaying a summary of Apache2 access logs by country using Ruby

Recently I had a case where a site was being hammered by script kiddies.
Nothing was compromised, I might add.
First off, I limited access to the site by country (Australia in this case).
Note that I wasn't blocking individual countries; I was blocking every country except Australia.

Then I wanted to get a summary of the Apache2 access log to see where these annoying little herberts were from.
An Apache2 access log entry looks something like this:


180.76.5.62 - - [22/Apr/2013:12:19:46 +1000] "GET /index.php?title=User:Coombayah HTTP/1.1" 403 409 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
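
As a rough illustration of how such an entry breaks apart, here is a minimal sketch that splits the sample line on whitespace and picks out the fields that matter for a per-country summary. The helper name parse_entry is made up for this example, and the field positions assume the request part ("GET /path HTTP/1.1") contains exactly three space-separated tokens, which a malformed request may not.

```ruby
# Sample entry copied from above.
line = '180.76.5.62 - - [22/Apr/2013:12:19:46 +1000] "GET /index.php?title=User:Coombayah HTTP/1.1" 403 409 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'

# parse_entry is a hypothetical helper, not part of any library.
parse_entry = lambda do |text|
  fields = text.split
  {
    ip:     fields[0],
    date:   fields[3].delete('['),  # "22/Apr/2013:12:19:46"
    offset: fields[4].delete(']'),  # "+1000"
    status: fields[8]               # HTTP status code, still a String
  }
end

entry = parse_entry.call(line)
puts entry[:ip]      # 180.76.5.62
puts entry[:status]  # 403
```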


Now where was that access from?
So I knocked up a *very* simple Ruby script to get that data.
The code below is 100% inelegant.
It's not meant to be elegant.
It's meant to illustrate the point.
A 'real' script would have much better coding and options for dates and ranges.

Output for today would look something like this:


2013-04-22 - 128
  119.63.193.131      2    0    2 2000667011 Japan
  119.63.193.132      2    0    2 2000667012 Japan
  123.125.71.112      2    0    2 2071807856 China
  123.125.71.74       2    0    2 2071807818 China
  176.31.9.218        4    0    4 2954824154 France
  192.80.187.162      4    0    4 3226516386 United States
  216.152.250.163     4    0    4 3633904291 United States
  216.152.250.187     4    0    4 3633904315 United States
  218.30.103.31       2    0    2 3659425567 China
  220.181.108.155     2    0    2 3702877339 China
  46.246.60.177       7    0    7  787889329 Sweden
  46.28.64.213       25    0   25  773603541 Ukraine
  66.249.74.135      10    0   10 1123633799 United States
  IPs with 1 access and failure: ["119.63.193.195", "119.63.193.196", "123.125.71.107", 
  "123.125.71.113", "123.125.71.69", "123.125.71.72", "123.125.71.75", "123.125.71.76", 
  "123.125.71.83", "123.125.71.91", "173.255.217.233", "180.76.5.10", "180.76.5.15", 
  "180.76.5.162", "180.76.5.62", "180.76.5.7", "180.76.6.227", "202.46.48.27", 
  "202.46.61.34", "220.181.108.143", "220.181.108.147", "220.181.108.148", 
  "220.181.108.158", "220.181.108.159", "220.181.108.161", "220.181.108.162", 
  "220.181.108.178", "220.181.108.181", "220.181.108.183", "220.181.108.186", 
  "42.98.185.25", "46.161.41.24", "78.46.250.165", "80.93.217.42", "85.216.108.89"]


As you can see, pretty basic.
Each main row shows the IP address, total accesses, successes (HTTP 200), failures (any other status), the IP address as a single integer (IP-Value) and the country.
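
The IP-Value column is just the dotted address folded into one 32-bit number, which is what makes a range lookup against a country table possible. A minimal sketch of that conversion follows; the name ip_value is only used here for illustration, and the IPAddr line is included as a cross-check from Ruby's standard library.

```ruby
require 'ipaddr'

# ip_value is a hypothetical helper name used only for this illustration.
ip_value = lambda do |dotted|
  # Fold the four octets into one integer: ((a*256 + b)*256 + c)*256 + d
  dotted.split('.').map(&:to_i).inject(0) { |acc, octet| acc * 256 + octet }
end

puts ip_value.call('46.28.64.213')        # 773603541, as in the Ukraine row above
puts IPAddr.new('46.28.64.213').to_i      # same number via the standard library
```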

First off you need an IP-to-country list.
In a prior post I mentioned getting zone range files.
So go to http://software77.net/geo-ip/ and get the full list.
The file will have a lot of useful comments at the top.
Save a copy of the file, with the comments removed, as IpToCountry.csv.
You should see a file roughly 126,000 lines long where each line looks something like this:


"16777216","16777471","apnic","1313020800","AU","AUS","Australia"


The fields are 'Address From', 'Address To', 'Registrar', 'Date Assigned', 'Country2', 'Country3', 'Country'.
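
If you would rather not edit the download by hand, the comment stripping and field mapping could be sketched roughly as below. This assumes the comment lines start with '#', which you should check against the file you actually download; the sample text is a stand-in, with only the data row taken from above.

```ruby
require 'csv'

# Stand-in for the downloaded file. The '#' comment prefix is an assumption.
raw = <<~DATA
  # A comment line at the top of the download
  # Another comment line
  "16777216","16777471","apnic","1313020800","AU","AUS","Australia"
DATA

# Keep only non-blank lines that are not comments.
data_lines = raw.each_line.reject { |l| l.start_with?('#') || l.strip.empty? }

rows = data_lines.map do |l|
  # Field order: Address From, Address To, Registrar, Date Assigned,
  # Country2, Country3, Country
  from, to, _registrar, _assigned, _cc2, _cc3, country = CSV.parse_line(l)
  { from: from.to_i, to: to.to_i, country: country }
end

puts rows.length            # 1
puts rows.first[:country]   # Australia
```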

And now the Ruby code.

First the requires:


require 'date'
require 'csv'


Since we are using *BRUTE FORCE* and not bothering to be elegant, we simply read the csv file into an array of hashes.


printf "Loading Ip to Country map\n"
ip_to_country = []
CSV.foreach('IpToCountry.csv') do |row|
  ip_to_country << { :from => row[0].to_i, :to => row[1].to_i, :country => row[6] }
end


The first pass now reads the access log into a hash of hashes:


unique_days = {}
f = File.new('logs/access.log', 'r')
while (line = f.gets)
  # Split on whitespace; only the IP, timestamp, offset and status code are used
  ip,tmp,tmp,dt,offset,verb,url,http,rcode,sz,tmp,browser = line.split
  d = DateTime.strptime( "#{dt} #{offset}", "[%d/%b/%Y:%H:%M:%S %Z]")
  date = d.strftime("%Y-%m-%d")
  # First entry seen for this date: create the per-day record
  if ! unique_days.has_key? date
    unique_days[date] = { :ip => {}, :total => 0 }
  end
  # First entry seen for this IP on this date: create its counters
  if ! unique_days[date][:ip].has_key? ip
    unique_days[date][:ip][ip] = { :total => 0, :succeeded => 0, :failed => 0 }
  end
  unique_days[date][:ip][ip][:total] += 1
  # Only a 200 response counts as a success
  if rcode == '200'
    unique_days[date][:ip][ip][:succeeded] += 1
  else
    unique_days[date][:ip][ip][:failed] += 1
  end
  unique_days[date][:total] += 1
end
f.close


I have eschewed elegance here for brute force and clarity.
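
For comparison, the same bookkeeping can be written with Hash default blocks, which create the per-day and per-IP records on first use so that every access, including the first one seen for a date, gets counted. This is only a sketch of an alternative, not a replacement for the code above; the record lambda and the sample calls are made up for illustration.

```ruby
# The outer default block creates a per-day record on first access;
# the inner one creates the counters for an IP on first access.
unique_days = Hash.new do |days, date|
  days[date] = {
    total: 0,
    ip: Hash.new { |ips, ip| ips[ip] = { total: 0, succeeded: 0, failed: 0 } }
  }
end

# record is a hypothetical helper used only for this illustration.
record = lambda do |date, ip, rcode|
  day = unique_days[date]
  counters = day[:ip][ip]
  counters[:total] += 1
  counters[rcode == '200' ? :succeeded : :failed] += 1
  day[:total] += 1
end

record.call('2013-04-22', '180.76.5.62', '403')
record.call('2013-04-22', '180.76.5.62', '200')
record.call('2013-04-22', '66.249.74.135', '403')

puts unique_days['2013-04-22'][:total]   # 3
```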

So now we have everything we need.
We traverse the data, dumping the report out:


unique_days.each do |date,h|
  printf "#{date} - #{h[:total]}\n"
  only_one_failed = []
  # Sorting is by IP as a string, so "46.x" sorts after "220.x"
  h[:ip].sort.each do |k,data|
    # Collect single failed hits and list them together at the end
    if data[:total] == 1 && data[:failed] == 1
      only_one_failed << k
      next
    end
    # Convert the dotted address to a single integer for the range lookup
    octets = k.split('.')
    ip_value = (octets[0].to_i * 256 * 256 * 256) + (octets[1].to_i * 256 * 256) + (octets[2].to_i * 256) + (octets[3].to_i)
    # Brute force: scan every range until one contains the address
    country = 'Unknown'
    ip_to_country.each do |row|
      if ip_value >= row[:from] && ip_value <= row[:to]
        country = row[:country]
        break
      end
    end
    printf "  %-16s %4d %4d %4d %10d %s\n", k, data[:total], data[:succeeded], data[:failed], ip_value, country
  end
  printf "  IPs with 1 access and failure: #{only_one_failed}\n"
end


And you have your report.
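
If the linear scan over the country table ever becomes too slow, one possible refinement is a binary search with Array#bsearch. The sketch below assumes the ranges do not overlap and sorts them by their start value first. Only the Australia range comes from the sample line earlier; the two placeholder ranges and the country_for name are invented for illustration.

```ruby
# Sample table. Only the Australia range is real (taken from the CSV line
# shown earlier); the placeholder rows are invented for this example.
ranges = [
  { from: 16777216, to: 16777471, country: 'Australia' },
  { from: 100,      to: 199,      country: 'Placeholder A' },
  { from: 300,      to: 399,      country: 'Placeholder B' }
].sort_by { |r| r[:from] }

# country_for is a hypothetical helper. It finds the first range whose end
# is at or past the address, then checks that the range really contains it.
country_for = lambda do |table, ip_value|
  row = table.bsearch { |r| r[:to] >= ip_value }
  row && row[:from] <= ip_value ? row[:country] : 'Unknown'
end

puts country_for.call(ranges, 16777300)  # Australia
puts country_for.call(ranges, 250)       # Unknown (falls in a gap)
```

With roughly 126,000 ranges this needs about 17 comparisons per address instead of a scan of the whole array.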
