algorithm - Gaussian Distribution + Hash Tables -

March 15, 2012

i had weird idea hashing function. problem statement is

you storing id-numbers of 162 students in class obtaining n marks out of 300 in course (for each n=0, 1, 2, ... 300) in hash table. devise simplest , least collision prone hash function such wasted memory cells minimum. here, collision when 2 students scoring n₁ , n₂ same slot in hash table.

one solution can use h(n) = (n*5 + 7) % 163 along chaining. there can @ 162 distinct marks.

edit there can several standard ways this. i'd try idea , check (maybe mathematically). might have lesser collisions lesser memory.

now, here's idea had. can assume distribution of marks gaussian. so, there more people near average score , lesser @ extremes.

so, can have hash function this:

h(n) = 0 (if n<100 || n>200)
h(n) = 1 (if 100<=n<125 || 175<=n<200)
h(n) = 2 (if 125<=n<140 || 160<=n<175)
h(n) = 3 (if 140<=n<160)

for such conditions (say, k), hash table have least number of collisions , least amount of space occupied.

now, guess.does make sense?is there way prove this? or wrong somewhere?

what doing manually here called in image processing histogram equalization. think makes absolutely sense, because make sure statistically slots used same probability, , you're minimizing collisions.

Search This Blog

shell

algorithm - Gaussian Distribution + Hash Tables -

Comments

Post a Comment

Popular posts from this blog

Add email recipient to all new Trac tickets -

400 Bad Request on Apache/PHP AddHandler wrapper -

asp.net - repeatedly call AddImageUrl(url) to assemble pdf document -