hive-udf-geo-ip-jtg

What does it do?

Odds are that if you are using Hive, you are doing ETL on your web logs, and odds are you want to geo-locate your traffic. Hive has pluggable UDFs (User Defined Functions). We plugged GeoIP into the UDF framework, and voilà!

add file GeoIP.dat;
add jar geo-ip-java.jar;
add jar hive-udf-geo-ip-jtg.jar;
create temporary function geoip as 'com.jointhegrid.hive.udf.GenericUDFGeoIP';
select geoip(first, 'COUNTRY_NAME',  './GeoIP.dat' ) from a;
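
Once the function is registered, it can be used like any other Hive function. As a rough sketch, assuming the same table a and column first as in the query above, and assuming the function was created earlier in the same session, traffic could be counted per country like this (the aliases t, country, and hits are illustrative only):

```sql
-- Sketch only: assumes geoip was registered with the statements above.
-- Resolve each address to a country name, then count rows per country.
select country, count(*) as hits
from (
  select geoip(first, 'COUNTRY_NAME', './GeoIP.dat') as country
  from a
) t
group by country;
```

The subquery keeps the geoip call in one place so the lookup expression does not have to be repeated in the group by clause.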
          

This code is not a candidate for inclusion in Hive, because GeoIP is licensed under the GPL and Hive is licensed under Apache 2.

SVN (builds coming soon)

  • http://www.jointhegrid.com/svn/hive-udf-geo-ip-jtg
  • http://www.jointhegrid.com/svn/geo-ip-java