Java-Gaming.org Hi !
Featured games (83)
games approved by the League of Dukes
Games in Showcase (539)
Games in Android Showcase (132)
games submitted by our members
Games in WIP (603)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: 1 [2] 3
  ignore  |  Print  
  map performance  (Read 23868 times)
0 Members and 1 Guest are viewing this topic.
Offline Momoko_Fan

Junior Devvie


Medals: 2



« Reply #30 - Posted 2010-02-14 22:46:45 »

Ok guys so if I want best performance on android what should I use? CuckoFastMap that was posted above here:
http://www.java-gaming.org/topics/map-performance/21395/msg/174900/view.html#msg174900
?

Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #31 - Posted 2010-02-15 00:54:55 »

The only way to really know is to test your specific code. How you use the map will affect performance and it will vary on different hardware. If you aren't having a performance problem, HashMap is fine (don't optimize prematurely). If you are having a performance problem, profiling or benchmarking is a must to really understand what is going on.

That said, if you have int keys, chances are FastIntMap will work well for you. If you have GC problems, CachingFastMap is a good bet.

I meant to do more with the cuckoo map but I got distracted and wandered off. I'll come back to it eventually!

Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #32 - Posted 2010-02-15 11:05:37 »

The cuckoo hashing has worst case O(1) performance, so in theory is as good or better than anything else. BUT the implementation in this thread is a work in progress. So if you used it you would need to expect bugs to be in the implementation.  It should be noted I am using a version in production code. But its something to think about.

I have no special talents. I am only passionately curious.--Albert Einstein
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #33 - Posted 2010-10-11 00:59:32 »

Finally updated with delt0r's cuckoo map.

Edit: updated charts again, this time with server VM, 5000 for warmpup, and 5000 for runs.






Offline Spasi
« Reply #34 - Posted 2010-10-11 22:43:02 »

Hey Nate,

I converted FastIntMap to FastLongMap for LWJGL's OpenCL package, then I wrote a small benchmark to compare it with Java's HashMap. For put(), I'm seeing a small performance gain (up to 10%) on JDK6-server and JDK7 VMs by extracting the rehashing code in its own method. Could be something specific to my benchmark/machine, but you may want to try it out.
Offline Spasi
« Reply #35 - Posted 2010-10-11 23:17:39 »

I just tried your benchmark. With count = 10000 (the default 100 runs everything in interpreted mode) and a slightly changed test loop:

1  
2  
3  
4  
5  
6  
7  
8  
9  
for (Test runnable : tests) {
    warmup = true;
    for (int i = 0; i < 5; i++)
        runnable.run();
    warmup = false;
    for (int i = 0; i < 5; i++)
        runnable.run();
    System.out.println();
}


I get these results on JDK7 b111:





Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #36 - Posted 2010-10-12 00:41:57 »

Thanks Spasi. Why does moving rehash to its own method improve things? Does the smaller method allow it to optimize better? Is there some general rule for this?

I made the rehash changes to FastIntMap, FastMap, and FastIdentityMap, set count=10000, and changed the loop as you indicated. I also tweaked CHashMap slightly to avoid some array lookups in "put" and avoid some hashes in "get" (fast.7z link above to source updated). My results are similar to yours using 1.6.0_16 server VM on my Intel i920 (OC to 3ghz):







Nice to see the cuckoo CHashMap "get" times are fast. CHashMap uses object keys while FastIntMap uses int keys, probably be a small boost there.

Results not on the server VM were very erratic.

It should be said again that there are so many variables involved that these benchmark results likely don't mimic how the map is used in a real app. Nothing beats benchmarking your actual app. The benchmarks also don't show remove, contains, iteration, garbage generation, etc.

That said, FastIntMap certainly seems consistently fast. The FastMap and FastIdentityMap are the same as FastIntMap just with object keys, so are similarly fast.

Offline Spasi
« Reply #37 - Posted 2010-10-12 01:13:50 »

Why does moving rehash to its own method improve things? Does the smaller method allow it to optimize better? Is there some general rule for this?

I'm not an expert on so low-level stuff, but I used -XX:+PrintAssembly and the shorter method uses registers differently. Larger code size might affect CPU caches negatively too. A general rule could be, if you have a method with half its code not doing anything for 99% of the calls, you might as well extract that half to a different method to help the VM optimize the other half better.

Results not on the server VM were very erratic.

I love JDK7's tiered compilation Wink
Online kappa
« League of Dukes »

JGO Kernel


Medals: 81
Projects: 15


★★★★★


« Reply #38 - Posted 2010-10-12 11:18:59 »

Why does moving rehash to its own method improve things? Does the smaller method allow it to optimize better? Is there some general rule for this?

iirc, small and simple methods get inlined by the JIT compiler automatically thus removing the method call altogether.

Quote
Method inlining increases performance by reducing the number of method calls your program makes. The Java VM inlining code inlines methods that return constants or only access internal fields.

To take advantage of method inlining you can do one of two things. You can either make a method look attractive to the VM to inline or manually inline a method if it doesn't break your object model. Manual inlining in this context means simply moving the code from a method into the method that is calling it.
Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #39 - Posted 2010-10-12 14:41:47 »

I just noticed that Nate's implementation of cuckoo hashing on the other page is wrong.

For each object you have 2 or more hash functions(i have 3). All objects are at one of  index given by hash&mask (assuming power of 2 for table size).  This means with 3 hash functions you just check 3 positions.

What is wrong (from my reading) of nate's code is that if all the slots are full (full collision), he rehashes the table to 2x the size. What you do instead is take one of the objects that are in the way, and check if one of its slots are empty, if so move it. you do this recursively. This will avoid excessive rehashing. You catch the case where you end up with object you started with by rehashing, but i can be proved this is very rare.

It should be noted that with 2 hash functions, the load factor should be 0.5. With 3 about 0.75.

In fact the performance of puts is greatly improved by starting with a very oversize hash map to avoid "cascading puts" and rehashing. Something i often do (say about 2x bigger than i expect it to need).

I have no special talents. I am only passionately curious.--Albert Einstein
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline Roquen
« Reply #40 - Posted 2010-10-12 15:17:26 »

Yeah, if you're not kicking out a placed entry, then you're performing a classic multi-hash instead of cuckoo hashing.  Is your .75 what you're seeing with your data set?  I'd expect (significantly) higher than that in general.
Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #41 - Posted 2010-10-12 15:28:35 »

Not sure i understand what you are asking? 0.75 is the limit of not requiring a rehash with high probability for 3 hash functions assuming pseudo randomness. Note that it can be higher but in a real hash map with a finite table size it can also be smaller, but with a very small probability.

If you have the pathological case of all keys returning the same hash code... your boned.

It should be noted that i have just noticed that my "rehashing" for 3 hashes is brain dead. I am fixing it now.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Roquen
« Reply #42 - Posted 2010-10-12 17:41:52 »

Looks like we're talking the same language.  I'd expect 3 hash functions on random data to be around .9
Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #43 - Posted 2010-10-12 19:25:08 »

Turns out the proofs show no resizes with load factor <0.75 the upper bound is around 0.8 IIRC.

Note i have updated my implementation with better hashing. Its faster in the general case, but slower in stupid pathological cases. For example I ran a test with Integers as the keys. However Integer(1).hashCode==1. This created nice access patterns for ordered keys because i assumed hashCode was somewhat random (it should be) and didn't rehash hashCode(). Now i hash the hashCode() value and random order is similar to "ordered" key puts.

I also noted big differences in performance with cache coherence issues -- a typical problem with hashmaps. Basically a get/put combination with the same key is more than 2x faster than with different keys.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #44 - Posted 2010-12-27 10:42:18 »

Ran across this interesting paper on map comparison, including cuckoo hashing:
http://crpit.com/confpapers/CRPITV91Askitis.pdf

Also, I wanted a hashmap that doesn't allocate for use on Android. I implemented one using open addressing and linear probing. I use separate arrays for keys/values. I tried interleaving keys/values but got worse performance (ObjectMap2 in the source). Source link in the first post has been updated.

The new classes are named ObjectMap, IdentityMap (Nate), and IntMap in the charts below. They seem to perform pretty well, with the added benefit of not allocating. It's too bad about ObjectMap#put, though get is fast. While there are some tests in FastMapTests, these new classes aren't yet tested all that thoroughly, so use with care. Also, the methods returning an iterator/iterable are not thread safe (see javadocs), which is easy to change if you don't want that.






Offline krasse
« Reply #45 - Posted 2010-12-27 14:51:43 »

How about iteration over the keys returned by the map's keySet()?

I often do that and I always use a LinkedHashMap then instead.

I'll be back soon...

Offline krasse
« Reply #46 - Posted 2010-12-27 15:18:10 »

Added another test for iteration for those maps that implement the Map interface:

1  
2  
3  
4  
5  
6  
start();
Set keySet = map.keySet();
for (Object o : keySet) {
  o.getClass();
}
end("iter");






Here are also my results for put and get:





LinkedHashMap seems to be superior for iteration but not as good for put/get on my machina.


Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #47 - Posted 2010-12-27 15:24:21 »

Something is definitely wrong with my implementation. Either that or you are using Sequential Integers as the keys, and some other imps don't hash properly (as mine wasn't when it was getting very "high" performance) and this will result is "perfect" hashing that will not represent true performance.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline krasse
« Reply #48 - Posted 2010-12-27 21:19:50 »

Something is definitely wrong with my implementation. Either that or you are using Sequential Integers as the keys, and some other imps don't hash properly (as mine wasn't when it was getting very "high" performance) and this will result is "perfect" hashing that will not represent true performance.

I don't know if you refer to my "iteration" test results but I simply use the same random integers in the put-test.

Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #49 - Posted 2010-12-28 11:31:22 »

sounds like its my end. I will look into it. It really shouldn't be that slow.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #50 - Posted 2010-12-28 12:03:37 »

I messed around and implemented the benchmarks with Caliper. The links are below. There is data for client/server VMs, 32/64bit VMs, 2000/10000/50000 map entries, and all combinations of those. Note that you can scroll down and uncheck stuff under the "Variables" section to hide information you don't care about. You can also use the left/right arrows in the table header to group the information differently. These benchmarks include d3ltor's latest cuckoo.jar, LinkedHashMap, and there are now benchmarks for entry iteration. I separated the maps that use object keys from the maps that use int keys.

Benchmark source code

MapGet
MapPut
MapIterate

IntMapGet
IntMapPut
IntMapIterate

Caliper is pretty cool. It is somewhat lacking in docs though. I also managed to crash something server side for the online results. I had to create a new account and file a bug. I'll probably use Caliper more often for benchmarking. The view of the data is awesome and the benchmarks are easy to write.

Some notes:

* Seems like d3ltor's cuckoo map has some issue with iteration.
* The put numbers are X puts and then a clear. Unfortunately I don't have a way to measure only the puts and not the clear.
* CachingFastMap had a bug in clear, will be fixed next time I do a run.
* IntMap uses Integer.MIN_VALUE to represent an unused key. This makes initializing the array ghetto. Need to try using zero internally (eg by adding Integer.MIN_VALUE to the key) to do the same, though this means a subtraction when getting a key out of the map.

Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #51 - Posted 2010-12-28 20:51:58 »

Its probably my code that crashed the server.  persecutioncomplex

I found my problem. I was being so stupid that i can't believe that i did that. I am sure that even a drunk monkey would not have done what i did..

For cuckoo hashing you use 3 (in my case) independant hashes. Only i did this...
1  
2  
3  
h1=hash(o.hashCode());
h2=hash(h1);
h3=hash(h2);

So i have in effect a single hash code. Since a collision in the first hash code will produce a collision for all 3. As a result my load factors where around the 5% range!

This is now fixed. Also i have added a Pure int key hash map too. Its quite a bit faster.

Hopefully Nate you could run my new code so i don't look so bad.  Undecided

I should add that my implementations do *not* return backed Collections for the map methods that return them. It creates new Collections. 

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #52 - Posted 2010-12-28 22:38:14 »

Glad this stuff proved useful. Smiley

The server only collects the data for a run and displays it. Bug is here:
http://code.google.com/p/caliper/issues/detail?id=107

Running new benchmarks now! I included CIntHashMap, fixed CachingFastMap, and added IntMap2 which internally uses 0 for an empty key (Integer.MIN_VALUE is still an invalid key).

BTW, the cuckoo.jar attached to the post wasn't as newer as the one at your URL. Maybe easier to just host in one place.

There is no need to calculate all 3 hashes up front. You could move the calculation to just before it is needed. Not a big deal, but every little bit helps.

Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #53 - Posted 2010-12-28 23:22:10 »

Updated the attachment. I will not be at this university for much longer. So rather than another dead link, i wanted the attachment.

I do a little bit of lazy hashing. Not much, but put, get and remove have it. Its does not make much difference. Cache access patterns and .hashCode() seem to be the big overheads.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #54 - Posted 2010-12-29 06:22:58 »

d3ltor, your cuckoo map looks much better now, though iteration is slow from putting everything into a Set first.

I wish I could see the int map results but the Caliper results site is still broken. Sad

Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #55 - Posted 2010-12-29 08:27:43 »

Iteration is something i almost never do with a hash map, and I often don't want backed copies. That's why it did it that. However since it does produce such poor performance, and its in fact breaking the contract of Map, maybe i should revise this?

Oh and Thanks Nate for all the hard work. ++

I have no special talents. I am only passionately curious.--Albert Einstein
Offline delt0r

JGO Knight


Medals: 30
Exp: 18 years


Computers can do that?


« Reply #56 - Posted 2010-12-29 10:04:22 »

I checked a few things. Turns out that Collections.* and other utils use iteration a lot, and this just would not work with my classes. Also i was iterating though things sometimes.

So now my Maps return fully backed Collections and iterators. Iteration is about 30-80x faster.

Nate: appreciation*=appreciation Wink

I have no special talents. I am only passionately curious.--Albert Einstein
Offline krasse
« Reply #57 - Posted 2010-12-29 10:20:12 »

Thanks Nate for all your work and for showing me all those tools that I didn't know existed! Smiley

Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #58 - Posted 2010-12-29 23:19:26 »

Benchmarks updated. CHashMap iteration is definitely more in line with the other maps now. Smiley

d3ltor, have you thought of using a "stash" for keys that would otherwise cause a rehash? Maybe this would let you get away with only 2 hashes and/or reduce rehashes. I wonder how often you see a rehash?

Offline Nate

« JGO Bitwise Duke »


Medals: 158
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #59 - Posted 2010-12-30 05:39:46 »

I've been messing with my own cuckoo implementation, doing int keys first. I've implemented both 2 and 3 hashes and I'm not seeing much of a difference, even with different load factors and initial capacities:

CuckooPut3
CuckooGet3

I expected a 0.9 load factor to have fewer collisions and cause the 2 hash cuckoo to rehash. Turns out in both cases a single rehash occurs due to unrecoverable collisions, the 3 hash cuckoo just runs into a problem later. I just realized I'm limiting the number of bucket "pushes" due to collisions to 100. This should probably be based on (or equal to?) the capacity. Sure enough, removing the limit allows the 3 hash cuckoo to have 50k items without a rehash (not shown in above benchmarks). Checking the max number of 3 hash cuckoo pushes for 50k random integer keys and 0.9 load, I see it takes ~90 to ~160 loops (a range due to the "random walk"). The max for 50 (not 50k) random int keys and 0.9 load is ~19 to ~70. So it seems for small capacities, the limit may be > capacity. What would be a good way to determine the limit?

The 3 hash cuckoo doesn't seem to incur much penalty for the extra hash, so I think I'll go with it. I'm still not sure about a stash. It could allow a lower "push" limit, which might save time on puts while also avoiding rehashing. For a 3 hash cuckoo, 50k random int keys and 0.9 load causes only 0 to ~5 collisions to loop more than 5 times, even though some of these collisions loop ~90 to ~160 times, as described above. So for a 3 hash cuckoo, the stash would stop some of this excessive looping at the cost of having to check every key in the stash if a key otherwise could not be found on a get.

Source link in the first post has been updated with the latest. My cuckoo classes aren't finished, but they and the benchmarks are there if anyone wants to check them out. Code reviews are appreciated!

Pages: 1 [2] 3
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

rwatson462 (35 views)
2014-12-15 09:26:44

Mr.CodeIt (26 views)
2014-12-14 19:50:38

BurntPizza (58 views)
2014-12-09 22:41:13

BurntPizza (93 views)
2014-12-08 04:46:31

JscottyBieshaar (53 views)
2014-12-05 12:39:02

SHC (69 views)
2014-12-03 16:27:13

CopyableCougar4 (71 views)
2014-11-29 21:32:03

toopeicgaming1999 (131 views)
2014-11-26 15:22:04

toopeicgaming1999 (122 views)
2014-11-26 15:20:36

toopeicgaming1999 (34 views)
2014-11-26 15:20:08
Resources for WIP games
by kpars
2014-12-18 10:26:14

Understanding relations between setOrigin, setScale and setPosition in libGdx
by mbabuskov
2014-10-09 22:35:00

Definite guide to supporting multiple device resolutions on Android (2014)
by mbabuskov
2014-10-02 22:36:02

List of Learning Resources
by Longor1996
2014-08-16 10:40:00

List of Learning Resources
by SilverTiger
2014-08-05 19:33:27

Resources for WIP games
by CogWheelz
2014-08-01 16:20:17

Resources for WIP games
by CogWheelz
2014-08-01 16:19:50

List of Learning Resources
by SilverTiger
2014-07-31 16:29:50
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!