http://cloudfront.blogspot.in/2012/06/how-to-build-and-use-flume-ng.html
First of all, we have to write a configuration file for our agent. This agent will collect data from a file and dump it into an HBase table. A simple configuration file might look like this:
# Name the components of this agent
hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1

# Source: run tail -F and turn each new line of the file into an event
hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
hbase-agent.sources.tail.channels = ch1

# Sink: write events into the 'demo' table, column family 'cf'
hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = demo
hbase-agent.sinks.sink1.columnFamily = cf
hbase-agent.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1
hbase-agent.sinks.sink1.serializer.keyType = timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix = 1
hbase-agent.sinks.sink1.serializer.suffix = timestamp

# Channel: buffer events in memory between source and sink
hbase-agent.channels.ch1.type = memory
Save this in a file called hbase-agent.conf inside the conf/ directory of your Flume distribution. Now start Hadoop and HBase, and create a table called demo with a column family called cf. Then open another terminal, change your directory to your Flume home, and start the agent with the following command:
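For reference, creating the table from the HBase shell looks like this (the table and column-family names match the sink configuration above):

```
hbase(main):001:0> create 'demo', 'cf'
```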
$ bin/flume-ng agent -n hbase-agent -c conf/ -f conf/hbase-agent.conf
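Once the agent is running, any line appended to the tailed file becomes a new event. A quick way to try this is to append to the file from another terminal; the sketch below uses /tmp/demo.txt as a stand-in path for illustration (the agent above tails /home/mohammad/demo.txt):

```shell
# Stand-in path for illustration; substitute the path your exec source tails.
DEMO=/tmp/demo.txt

# Append a line; tail -F in the exec source picks it up and Flume
# delivers it through the memory channel to the HBase sink.
echo "value7" >> "$DEMO"
tail -n 1 "$DEMO"
```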
Now go back to your HBase shell and scan the demo table. If everything went well, you should see something like this:
hbase(main):004:0> scan 'demo'
ROW COLUMN+CELL
11339770815331 column=cf:col1, timestamp=1339770818340, value=value1
11339770815332 column=cf:col1, timestamp=1339770818342, value=value6
2 row(s) in 0.0500 seconds
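Note the row keys in the scan output: with keyType = timestamp, SimpleHbaseEventSerializer appears to build each key as the rowPrefix followed by the event timestamp in milliseconds, which is why "1" + 1339770815331 shows up as 11339770815331 above. A rough shell sketch of that construction (assuming GNU date for the %N millisecond format):

```shell
# Row key = rowPrefix ("1") + current epoch time in milliseconds,
# mimicking the serializer's keyType=timestamp behaviour.
# GNU date is assumed here: %s%3N yields 13-digit epoch milliseconds.
ROWKEY="1$(date +%s%3N)"
echo "$ROWKEY"
```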
NOTE: For this example I used a small text file called demo.txt containing the following lines:
value1
value2
value3
value4
value5
value6