Can an HBase table be partitioned based on time?

May 15, 2013

I need data based on a time range. Is there a way to partition an HBase table based on a time range? For example, I want the data from 9:00 to 9:05.

You can create a compound key of the form <timestamp><id>, and the entries in HBase will be ordered by timestamp. You can then create a scanner that starts at the beginning of the range and ends at the end of the range.
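As a rough sketch of the compound-key idea, the Python below simulates HBase's sorted row keys with a plain sorted list; it does not use the HBase client. The names `make_key` and `scan`, the 8-byte big-endian millisecond timestamp, and the sample ids are all assumptions made for illustration. A fixed-width big-endian encoding is assumed so that byte order matches time order.

```python
import bisect
import struct


def make_key(ts_millis: int, event_id: str) -> bytes:
    """Build a <timestamp><id> row key (hypothetical layout).

    The timestamp is packed as an unsigned 8-byte big-endian integer so that
    lexicographic byte order equals chronological order.
    """
    return struct.pack(">Q", ts_millis) + event_id.encode("utf-8")


def scan(sorted_keys, start_ts: int, stop_ts: int):
    """Return keys with start_ts <= timestamp < stop_ts, like a start/stop-row scan."""
    lo = bisect.bisect_left(sorted_keys, struct.pack(">Q", start_ts))
    hi = bisect.bisect_left(sorted_keys, struct.pack(">Q", stop_ts))
    return sorted_keys[lo:hi]


# Stand-in for a table: row keys kept in byte order, as HBase stores them.
NINE = 9 * 3600 * 1000        # 9:00 as milliseconds since midnight (assumed unit)
FIVE_MIN = 5 * 60 * 1000
table = sorted(
    make_key(ts, eid)
    for ts, eid in [
        (NINE - 1, "a"),
        (NINE, "b"),
        (NINE + 1000, "c"),
        (NINE + FIVE_MIN, "d"),
    ]
)

# Everything from 9:00 (inclusive) to 9:05 (exclusive).
hits = scan(table, NINE, NINE + FIVE_MIN)
ids = [k[8:].decode("utf-8") for k in hits]
```

With a real table, the same two byte arrays would be supplied as the scanner's start and stop rows.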

One issue you may face is that, with a high insert rate, a single server becomes a hotspot for all new entries. One way around this is to invert the key and ensure the first part is random: <sha1 of id><timestamp>. This has the advantage of distributing writes across the entire cluster, but the disadvantage of requiring a read of the entire table for a particular range.
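The trade-off of the inverted key can be illustrated with a small simulation, again in plain Python rather than against HBase. The helper names `salted_key` and `timestamp_of`, the `sensor-N` ids, and the use of the first key byte as a stand-in for region placement are assumptions for the sake of the example.

```python
import hashlib
import struct
from collections import Counter


def salted_key(event_id: str, ts_millis: int) -> bytes:
    """Build a <sha1 of id><timestamp> row key (hypothetical layout)."""
    digest = hashlib.sha1(event_id.encode("utf-8")).digest()  # always 20 bytes
    return digest + struct.pack(">Q", ts_millis)


def timestamp_of(key: bytes) -> int:
    """Recover the timestamp stored after the 20-byte SHA-1 prefix."""
    return struct.unpack(">Q", key[20:28])[0]


# 1000 events with consecutive timestamps 1000..1999.
keys = sorted(salted_key(f"sensor-{i}", 1_000 + i) for i in range(1000))

# Writes: count keys per leading byte as a crude proxy for how new rows
# would spread over the key space instead of piling up at one end.
buckets = Counter(k[0] for k in keys)

# Reads: a time range is no longer one contiguous key range, so every key
# has to be examined and filtered (the "read the entire table" cost).
in_range = [k for k in keys if 1_100 <= timestamp_of(k) < 1_200]
```

Here consecutive timestamps land under many different key prefixes, which is what spreads the writes, while the range query has to touch all 1000 keys to find its matches.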

If you use the first method of <timestamp><id>, your map job may not be able to split the work into as many chunks as you might like. By default, a table's work is split by region. If your time slice is small enough, a single region will serve all the data and you will not gain any parallelism in your query. You could potentially write a custom table split that parallelizes the query across more mappers than regions, but you are still reading all of the data from one region, and that can have drawbacks for parallelism as well.
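One way to picture such a custom split is as a function that cuts the requested time range into several contiguous start/stop row pairs, one per mapper. The sketch below only computes the boundaries; `split_time_range` is a hypothetical helper, the 8-byte big-endian timestamp prefix is an assumption, and wiring the result into an actual input format is left out.

```python
import struct


def split_time_range(start_ts: int, stop_ts: int, n_chunks: int):
    """Split [start_ts, stop_ts) into contiguous (start_row, stop_row) pairs.

    Returns at most n_chunks pairs; each pair could drive one scanner.
    """
    if n_chunks < 1 or stop_ts <= start_ts:
        raise ValueError("need n_chunks >= 1 and stop_ts > start_ts")
    span = stop_ts - start_ts
    n = min(n_chunks, span)  # never create empty chunks
    bounds = [start_ts + (span * i) // n for i in range(n + 1)]
    return [
        (struct.pack(">Q", lo), struct.pack(">Q", hi))
        for lo, hi in zip(bounds, bounds[1:])
    ]


# A five-minute window (in milliseconds) cut into four sub-ranges.
chunks = split_time_range(0, 300_000, 4)
```

Each stop row equals the next chunk's start row, so the sub-ranges cover the window without gaps or overlap. All of them may still be served by the same region.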

How you set up your table depends on your projected usage scenario, your read/write proportion, and how much performance you need from each.

If you append the id to the timestamp to ensure uniqueness, you can still have the scanner return all events for a given timestamp. HBase sorts keys lexicographically based on their byte representation. So, if your key is <timestamp>:<id>, you can set the scanner's start row to <timestamp> and its stop row to <timestamp+1> to get all events at that timestamp.
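A minimal sketch of that start/stop-row trick, using a sorted Python list in place of the table. The zero-padded 13-digit text timestamp is an assumption added here: without a fixed width, "9" would sort after "10" in byte order. The names `key` and `events_at` are hypothetical.

```python
import bisect


def key(ts: int, event_id: str) -> bytes:
    """Build a <timestamp>:<id> row key with a zero-padded timestamp (assumed width)."""
    return f"{ts:013d}:{event_id}".encode("ascii")


def events_at(sorted_keys, ts: int):
    """Return all keys for one timestamp: start row <ts>, stop row <ts+1>."""
    start_row = f"{ts:013d}".encode("ascii")
    stop_row = f"{ts + 1:013d}".encode("ascii")
    lo = bisect.bisect_left(sorted_keys, start_row)
    hi = bisect.bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


rows = sorted([key(99, "x"), key(100, "a"), key(100, "b"), key(101, "c")])
found = events_at(rows, 100)
```

The stop row is exclusive, so the key for timestamp 101 is left out even though it shares a long prefix with the stop row.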
