I once needed to monitor the origin of HTTP requests and track it in real time from the web server log. Since this need was nothing more than pure curiosity, I was not willing to pay for it. So I started looking for free solutions for this purpose – converting IP to country. Here is how it all ended up…
First of all, I decided to do it in the following order:
- read the web server log periodically, every 5 seconds (a scheduling sketch follows this list)
- grep unique IP addresses
- call some API to “convert” IP into country code
- feed what’s collected into Zabbix
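The post does not show how the 5-second scheduling is done; since cron cannot go below one-minute granularity, one simple option is a wrapper loop like the sketch below. The script name collect_countries.sh is made up for illustration:
# keep running the collector roughly every 5 seconds
# /usr/local/bin/collect_countries.sh is a hypothetical name for the script described in this post
while true; do
    /usr/local/bin/collect_countries.sh
    sleep 5
done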
Reading Apache log and getting IP addresses
So the first part – reading the file – is based on this approach and done like this:
log_read="$(dirname "${0}")/.$(basename "${my_log}").${pattern}.read"
# get current file size in bytes
current_size=$(wc -c < "${my_log}")
# remember how many bytes you have now for next read
# when run for first time, you don't know the previous
[[ ! -f "${log_read}" ]] && echo "${current_size}" > "${log_read}"
bytes_read=$(cat "${log_read}")
echo "${current_size}" > "${log_read}"
# if rotated, let's read from the beginning
[[ ${bytes_read} -gt ${current_size} ]] && bytes_read=0
# get the portion
data=$(tail -c +$((bytes_read+1)) "${my_log}" | head -c $((current_size-bytes_read)) | grep -P "${pattern}")
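For completeness: the snippet assumes that ${my_log} and ${pattern} are defined earlier in the script. The original does not show those lines, so the values below are purely illustrative:
# illustrative values only – adjust to your own environment
my_log="/var/log/httpd/access_log"   # the Apache access log to follow
pattern="GET"                        # only log lines matching this pattern are kept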
At this point you will have a “fresh” portion of data each time you run the script – only what was appended between two runs. Now that you have the data, it is time to grep just the IP part of it:
while read -r line; do
    # do magic here
done <<< "$(echo "${data}" | grep -Po "^\S+" | sort -u)"
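The grep -Po "^\S+" part simply keeps the first whitespace-separated field of each line, which in the usual Apache common/combined log format is the client address. For example (an illustrative log line, not one from my server):
echo '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 2326' \
    | grep -Po "^\S+"
# prints: 203.0.113.7
# sort -u then deduplicates the resulting list of addresses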
Hooray, we have a list of unique IPs at this point! Time to see the countries behind them, so let’s proceed to the next part – converting IP to country.
API for IP to country conversion
So first of all, you need an API to feed your collected IPs to. I found ip-api.com, which is free. Unfortunately, it allows only 45 requests per minute on the free tier, which my use case sometimes exceeds – we are talking about hundreds of unique IP addresses per minute at peak times. So then I found another API with a free tier – ipinfo.io. However, its limit works differently – 50,000 free calls per month.
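To illustrate what the script expects from each service (based on the jq filters used later), here is roughly how the two calls are made. The IP and token below are placeholders:
# the IP is just a placeholder from the documentation range
ip="203.0.113.7"
# ip-api.com: free JSON endpoint, the country code lives in .countryCode
curl -s "http://ip-api.com/json/${ip}" | jq -r '.countryCode'
# ipinfo.io: needs an API token, the country code lives in .country
curl -s "https://ipinfo.io/${ip}?token=YOUR_TOKEN" | jq -r '.country'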
Here I decided to combine the two – use ip-api.com and, once (if) it reaches its limit, fall back to the reserve from ipinfo.io. That way only peak moments would “eat” some of the monthly quota, with a low probability of ever reaching the monthly limit. So I have something like this now:
while read -r line; do
    [[ ${line} == "" ]] && continue
    ip="${line}"
    if [[ ${free_limit} -gt 0 ]]; then
        country=$(curl "http://ip-api.com/json/${ip}" 2>/dev/null | jq -r '.countryCode')
        echo "$(date "+%Y-%m-%d %H:%M:%S") limit is ${free_limit}, calling ip-api.com, country: ${country}, ip: ${ip}" >> "${log_file}"
        free_limit=$((free_limit-1))
    else
        country=$(curl "ipinfo.io/${ip}?token=${ipinfo_token}" 2>/dev/null | jq -r '.country')
        echo "$(date "+%Y-%m-%d %H:%M:%S") limit is ${free_limit}, calling ipinfo.io, token: ${ipinfo_token}, country: ${country}, ip: ${ip}" >> "${log_file}"
    fi
    sender_data="${sender_data}\"Zabbix server\" httpd.countries[${trapper_key}] ${country}${nl}"
done <<< "$(echo "${data}" | grep -Po "^\S+" | sort -u)"
echo "${free_limit}" > "${free_limit_file}"
How do I keep track of "${free_limit}"?
You might wonder. Here is how:
free_limit=41
free_limit_file="$(dirname "${0}")/.free_limit.txt"
free_limit_time_file="$(dirname "${0}")/.free_limit_time.txt"
# create the state files on the first run
[[ ! -f "${free_limit_file}" ]] && echo "${free_limit}" > "${free_limit_file}"
[[ ! -f "${free_limit_time_file}" ]] && echo 60 > "${free_limit_time_file}"
# seconds elapsed since the time file was last touched
updated_before=$(($(date +%s)-$(stat -c %Y "${free_limit_time_file}")))
# seconds left in the current 1-minute window
update_left=$(($(cat "${free_limit_time_file}")-${updated_before}))
if [[ ${update_left} -le 0 ]]; then
    # window is over – start a new one and restore the full quota
    echo 60 > "${free_limit_time_file}"
    echo "${free_limit}" > "${free_limit_file}"
else
    # window still open – store the remaining seconds
    echo "${update_left}" > "${free_limit_time_file}"
fi
free_limit=$(cat "${free_limit_file}")
echo "$(date "+%Y-%m-%d %H:%M:%S") update left: ${update_left}s, free limit: ${free_limit}" >> "${log_file}"
Each time the script runs, I store two things about the ip-api.com calls: how many calls are left (the calls made during this run are subtracted from the previous “what was left” value) and how much time is left until the “reset” (remember, the limit is per minute, so 1-minute windows are used). Once the 1-minute window is over, the limit is reset.
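To make the window logic a bit more concrete, here is a hypothetical trace of a few consecutive runs (all timings and call counts are made up):
# t=0s   run A: time file holds 60, quota file holds 41; 10 lookups this run -> 31 written back
# t=5s   run B: 5s elapsed, update_left=55 -> same window, quota continues from 31
# t=10s  run C: update_left=50, and so on while the window is open
# t=65s  run D: update_left<=0 -> window over, 60s and 41 calls are written back to the files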
Sending to Zabbix
At this point we have the countries behind all unique IP addresses. As you can see, the loop section also includes a part where I prepare the data for the last step – sending it to Zabbix. I will use the zabbix_sender utility for this purpose. Once the data is collected, I simply do this:
[[ ${sender_data} != "" ]] && zabbix_sender -z 127.0.0.1 -i - <<< "${sender_data}"
And on the Zabbix side, create a “trapper” type item to receive what the script sends. That’s it!
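For reference, zabbix_sender with “-i -” reads whitespace-delimited lines of the form <hostname> <item key> <value> from standard input, so the ${sender_data} built in the loop ends up looking roughly like this (“mysite” is only a stand-in for whatever ${trapper_key} holds):
"Zabbix server" httpd.countries[mysite] DE
"Zabbix server" httpd.countries[mysite] US
"Zabbix server" httpd.countries[mysite] FR
The matching item on the “Zabbix server” host is then a “Zabbix trapper” item with the key httpd.countries[mysite] and a text-type value.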