Java-Gaming.org Hi !
Featured games (83)
games approved by the League of Dukes
Games in Showcase (522)
Games in Android Showcase (127)
games submitted by our members
Games in WIP (589)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: 1 2 [3]
  ignore  |  Print  
  map performance  (Read 23574 times)
0 Members and 1 Guest are viewing this topic.
Offline Roquen
« Reply #60 - Posted 2010-12-30 07:54:28 »

An interesting addition to your set of graphs would be a plot of memory requirements.
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #61 - Posted 2010-12-30 10:18:25 »

Nate, yes but i didn't bother. 3 keys really doesn't make a huge difference. One of the appeals of cuckoo is that its quite simple. Adding a stash just adds more complications. In particular most of my usage does not require growing the hashMap once it gets to its proper size. In this case rehashes are very rare (like 1 in 10000). This can of course result in a map with a load factor below what was asked for. But again this really is just as a hash table. In these cases in my code i have made some real gains.

But in most cases Map performance just does not matter. Everything else you are doing is many times slower anyway. 

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #62 - Posted 2010-12-30 10:28:07 »

An interesting addition to your set of graphs would be a plot of memory requirements.

Memory charts would be sweet. Caliper doesn't seem to have anything built-in. :/

Nate, yes but i didn't bother. 3 keys really doesn't make a huge difference. One of the appeals of cuckoo is that its quite simple. Adding a stash just adds more complications. In particular most of my usage does not require growing the hashMap once it gets to its proper size. In this case rehashes are very rare (like 1 in 10000). This can of course result in a map with a load factor below what was asked for. But again this really is just as a hash table. In these cases in my code i have made some real gains.

But in most cases Map performance just does not matter. Everything else you are doing is many times slower anyway. 

I may try out a stash with 3 hashes and see how it affects puts. I am mostly interested in get performance, but this stuff is fun. Smiley I am pretty sure a stash will avoid many hundreds of pushes. I'm guessing even small maps will benefit, as I saw a 64 element map loop 70 times on a put.

Re: performance, besides the fun of it all, I think this could be nice to use on Android. There using HashMap sometimes isn't feasible, since it allocates on put and GC is slooow.

Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #63 - Posted 2010-12-30 11:16:48 »

I was under the impression that Android was catching up to hotspots. At any rate i do very little allocation. But my target was always the desktop or servers.

BTW i use a randomized "pushing", so i pick the "push" hash randomly from the 3. I don't see loops very often. But like you, I wanted fast contains/get. And cuckoo really shines there.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #64 - Posted 2010-12-30 13:13:56 »

Android has a JIT now, but GC is still slow AFAIK.

Ahhh, I had a bug. I don't see loops very often now. Adding 50k random ints the max number of pushes required for a single put is ~7 to ~15.

I have a feeling that the hash functions I stole from delt0r aren't different enough. I ran into this:

1  
2  
3  
hash1(1620722920) & 63 == 24
hash2(1620722920) & 63 == 24
hash3(1620722920) & 63 == 24


Here are my hash functions:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
private static final int PRIME1 = 0xbe1f14b1;
private static final int PRIME2 = 0xb4b82e39;
private static final int PRIME3 = 0xced1c241;
private int capacity = 64;
private int hashShift = 31 - Integer.numberOfTrailingZeros(capacity);
private int hash1 (int h) {
   h *= PRIME1;
   return h ^ (h >>> hashShift);
}
private int hash2 (int h) {
   h *= PRIME2;
   return h ^ (h >>> hashShift);
}
private int hash3 (int h) {
   h *= PRIME3;
   return h ^ (h >>> hashShift);
}


When all 3 hashes are the same, the cuckoo can get stuck in a loop until it gives up and doubles the size of the backing array. Any ideas on how the hash functions could be designed to avoid this?

Offline pjt33
« Reply #65 - Posted 2010-12-30 14:30:46 »

When all 3 hashes are the same, the cuckoo can get stuck in a loop until it gives up and doubles the size of the backing array. Any ideas on how the hash functions could be designed to avoid this?
Compute all three at once and fix them:
1  
2  
3  
4  
5  
6  
7  
8  
h1 = f1(hash) % capacity;
h2 = f2(hash) % (capacity - 1);
h3 = f3(hash) % (capacity - 2);
if (h2 >= h1) h2++;
int inc = 0;
if (h3 >= h1) inc++;
if (h3 >= h2) inc++;
h3 += inc;
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #66 - Posted 2010-12-30 15:31:19 »

well there is a proof that if a1 and a2 are different then
H(a1)=H(a2) with proablity of 1/(2^wordsize)
where H(x) is alsmost what i use...
(p*x)>>>(32-wordsize)
for odd p.

Hence the reason i used it. My tests I use 3 random 32 bit primes p. This "family" of hash functions has been shown to be good for a number of cases, in particular cuckoo hashing.

Of course using just 4 bits, even a perfect hash function has a probability of 1/256 for a 3 way collision. Perhaps more importantly the probability of no collision in all three functions is only 82%.
 

I have no special talents. I am only passionately curious.--Albert Einstein
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #67 - Posted 2010-12-30 15:48:50 »

Ops my math was wrong.. Well its correct for 4 bits. But 63 is 6bit... so the probabilities are 1/(64*64) for a 3 way collision, and 95.4% chance of no collision at all (1 in 20).

If you leave out the h^ part you would have all different results i think. Note you can tune the random primes.

BTW pjt33, % is horrendously slow.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #68 - Posted 2010-12-31 07:41:16 »

NOTE: Nates  post below was lost due to the attack. However i still get email notifications.
Quote
Leaving out the h^ didn't result in different results. Tuning the primes allows the above input to succeed, but fails with other input. If your math is right for the 3 way collision probability it seems difficult to have efficient memory usage for small maps without a stash. Even with larger maps, getting unlucky means you'll waste huge amount of memory (and time rehashing) without a stash.

Here is a paper about cuckoo hash functions:
http://www.siam.org/proceedings/soda/2009/SODA09_087_dietzfelbingerm.pdf
Unfortunately my greek isn't that good. Wink Could someone translate the essence of this to layman terms? Does this help us choose better hash functions?

This recent paper is neat, it talks about both queuing cuckoo moves and using the queue as a stash:
http://www.wisdom.weizmann.ac.il/~naor/PAPERS/deamortized_cuckoo.pdf

It does not matter how perfect your hash function. If you want high load factors (ie 90%) then you need large table sizes for the statistics. There is no way around it. Its roughly called the birthday paradox, which is that the probability of any 2 people having the same birthday in a room is much higher than people intuitively think.

Example: A table of size 1000 with 100 entries. For just one hash function the probability of at least one collision is 1-P{no collision} equals .99436.

But now lets consider just two entries. The probability that all 3 hash functions all distinct values is 0.997. The probability that all 100 entries all have 3 distinct values for all 3 hash functions however is just 0.741. In other words without even considering pairwise collisions there is a 25% chance that at least one entry returns the same value for 2 of its 3 hash functions.

We could continue. However I implemented, a "perfect hash function" (ie true random, well SecureRandom) and the results for my tests were the same the hash function I used above. I still typically leave the load factor at about 0.75 since puts a bit quicker here.


I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #69 - Posted 2010-12-31 09:30:50 »

I'm going to move my cuckoo discussion over to delt0r's thread. I'll start a new thread to post map performance charts as this one has become somewhat long and unkempt. Smiley

1  
throw new ThreadDeath();

Pages: 1 2 [3]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

trollwarrior1 (19 views)
2014-11-22 12:13:56

xFryIx (68 views)
2014-11-13 12:34:49

digdugdiggy (47 views)
2014-11-12 21:11:50

digdugdiggy (41 views)
2014-11-12 21:10:15

digdugdiggy (35 views)
2014-11-12 21:09:33

kovacsa (59 views)
2014-11-07 19:57:14

TehJavaDev (61 views)
2014-11-03 22:04:50

BurntPizza (60 views)
2014-11-03 18:54:52

moogie (75 views)
2014-11-03 06:22:04

CopyableCougar4 (76 views)
2014-11-01 23:36:41
Understanding relations between setOrigin, setScale and setPosition in libGdx
by mbabuskov
2014-10-09 22:35:00

Definite guide to supporting multiple device resolutions on Android (2014)
by mbabuskov
2014-10-02 22:36:02

List of Learning Resources
by Longor1996
2014-08-16 10:40:00

List of Learning Resources
by SilverTiger
2014-08-05 19:33:27

Resources for WIP games
by CogWheelz
2014-08-01 16:20:17

Resources for WIP games
by CogWheelz
2014-08-01 16:19:50

List of Learning Resources
by SilverTiger
2014-07-31 16:29:50

List of Learning Resources
by SilverTiger
2014-07-31 16:26:06
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!