Installing HBase on CentOS is fairly painless thanks to those generous folks at Cloudera. Add their CDH4 repository and you're there: yum install hbase.
However, adding LZO compression for HBase is a little trickier. There are a few guides describing how to check out the source from GitHub, build the extension, and copy the resulting libraries into the right place, but I want a nice, simple RPM package to deploy.
Enter the hadoop-lzo-packager project on GitHub. Let's try to use this to build an RPM I can use to install LZO support for HBase.
Get the source code:
git clone git://github.com/toddlipcon/hadoop-lzo-packager.git
Install the deps:
yum install lzo-devel ant ant-nodeps gcc-c++ rpm-build java-devel
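Before kicking off the build, it can save some head-scratching to confirm the tools are actually on the PATH. A minimal sketch: the helper name missing_tools is made up for illustration and is not part of hadoop-lzo-packager, and the list of commands you pass it is your own guess at what the build needs.

```shell
#!/bin/sh
# missing_tools: print each named command that cannot be found on the PATH.
# Hypothetical helper for illustration only.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only when the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Calling missing_tools git ant gcc g++ rpmbuild prints nothing when everything is in place, and otherwise prints one missing command per line.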
Build the RPMs:
cd hadoop-lzo-packager
export JAVA_HOME=/usr/lib/jvm/java
./run.sh --no-deb
Et voilà: cloudera-hadoop-lzo RPMs ready for installation. But wait… The libs get installed to /usr/lib/hadoop-0.20. That's no good, I want them in /usr/lib/hbase.
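One way to spot this kind of surprise before installing anything is to list the files in the package and flag the ones outside the directory you expect. A sketch, assuming a made-up helper name (outside_prefix) and a placeholder package path:

```shell
#!/bin/sh
# outside_prefix: read file paths on stdin and print those that do not
# live under the given directory. Hypothetical helper for illustration.
outside_prefix() {
    prefix=${1%/}                        # drop a trailing slash, if any
    while IFS= read -r path; do
        case "$path" in
            "$prefix"/*) ;;              # under the prefix: nothing to report
            *) printf '%s\n' "$path" ;;  # anywhere else: print it
        esac
    done
}

# Usage (path/to/package.rpm is a placeholder):
#   rpm -qlp path/to/package.rpm | outside_prefix /usr/lib/hbase
```

rpm -qlp lists the files in a package file without installing it, so an empty result here would mean everything lands under the expected directory.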
So I went ahead and hacked run.sh and template.spec to allow the install dir on the target machine to be specified on the command line. I can now use a command line something like this:
./run.sh --name hbase-lzo --install-dir /usr/lib/hbase --no-deb
That produces a set of RPMs (binary, source, and debuginfo) with the base name hbase-lzo and libraries installed to /usr/lib/hbase.
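For the curious, option handling of this kind takes only a few lines of shell. What follows is a generic sketch of the pattern, not the actual code in run.sh: the function name, variable names, and default values are all assumptions picked to mirror the behaviour described above.

```shell
#!/bin/sh
# Generic sketch of long-option parsing; NOT the real run.sh.
# Variable names and default values are assumptions for illustration.
parse_args() {
    NAME=cloudera-hadoop-lzo
    INSTALL_DIR=/usr/lib/hadoop-0.20
    BUILD_DEB=1
    while [ $# -gt 0 ]; do
        case "$1" in
            --name|--install-dir)
                # both options take a value, so make sure one was given
                [ $# -ge 2 ] || { echo "$1 needs a value" >&2; return 1; }
                if [ "$1" = --name ]; then NAME=$2; else INSTALL_DIR=$2; fi
                shift 2 ;;
            --no-deb)
                BUILD_DEB=0
                shift ;;
            *)
                echo "unknown option: $1" >&2
                return 1 ;;
        esac
    done
}
```

The parsed install dir then only needs to be substituted into the spec template wherever the target path appears.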
My changes (plus another small change adding the necessary BuildRequires to the RPM spec template) are in my fork of the project on GitHub.