graph - Help designing a hash function to detect duplicate records? -

January 15, 2015

let me explain program far. rubiks cube solver. given scrambled cube (this initial state). becomes root node of graph. using iterative deepening depth first search "brute force" scrambled cube recognizable state can use pattern recognition solve.

as can imagine, large graph, come sort of hashing functionality detect duplicate nodes in graph (thus speeding traversal).

i largely unfamiliar hashing functions, here thinking... each node different state of rubik's cube. if come cube state (node) has seen, want skip on it. need hashing function takes me state variable checksum, state variable 54-character string. allowed characters y, r, g, o, b, w (which correspond colors).

any designing hash function appreciated.

for fastest duplicate detection , removal - avoid generating many of repeated positions in first place. easy , quicker generating , finding repeats. example if have moves f , b, if allow sub sequence fb don't allow bf, gives same result. if you've done 3f, don't follow f. can generate small look-up table allowed next moves, given last 3 moves.

for remaining duplicates want fast hash because there lot of positions. make hash go fast, others have commented, you want hashes from, representation of position, to small. there 12 edge cubies , there 8 corner cubies. representing each cubies position , orientation need take 5 bits per cubie, i.e. 100 bits (12.5 bytes) total. edges 4 bits position , 1 flip. corners 3 bits position , 2 spin. can ignore last edge cubie since position , flip fixed others. representation down 12 bytes position.

you have 70 real bits of information in rubik cube position, , 96 bits close enough 70 make counter productive hashing bits further. i.e. treat representation of board hash. may sound bit strange, question i'm envisaging @ same time experimenting less compact representation of cube more amenable pattern matching. in case the 12 byte value can treated if hash, advantage it's a hash never has collision. makes duplicate testing code , new value insertion shorter , simpler , faster. it's going cheaper md5 solutions suggested far.

there many other tricks use cut down work in searching repeated positions. have @ http://cube20.org/ ideas.

Search This Blog

shell

graph - Help designing a hash function to detect duplicate records? -

Comments

Post a Comment

Popular posts from this blog

Add email recipient to all new Trac tickets -

400 Bad Request on Apache/PHP AddHandler wrapper -

asp.net - repeatedly call AddImageUrl(url) to assemble pdf document -