Java-Gaming.org    
Featured games (91)
games approved by the League of Dukes
Games in Showcase (580)
games submitted by our members
Games in WIP (500)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: 1 [2]
  ignore  |  Print  
  Cuckoo hashing code[important update**2]  (Read 10277 times)
0 Members and 1 Guest are viewing this topic.
Offline Matthias

Senior Member


Medals: 3
Projects: 1


TWL - Themable Widget Library


« Reply #30 - Posted 2010-12-30 16:26:27 »

Using an identity hashmap for wrapper classes together with auto boxing is a guarantee for failure. As long as your keys are within the cached range it looks like it will work. But once you use keys outside the cached range (mostly 0..255) you get different objects for the same integer.

So if you use identity hashcode you must use ==. Then you can't use auto boxing.
Offline delt0r

JGO Coder


Medals: 22


Computers can do that?


« Reply #31 - Posted 2010-12-30 16:40:03 »

The problem with code that's not a black box, is that someone will use it as a black box. Especially if it implements java.util.Map. I generally don't extend from that when i want special behavior.  Note i do randomize my tests as well.

This code is intended to work as a drop in replacement for HashMap or IdenetityHashMap. As such you must make consistent use of Object.equals(), Object.hashCode(), System.idenetityHashCode() and ==. There is no way to have correct code otherwise.

Cuckoo hashing does have a limitation that means its not a true drop in for any chaining HashMap. In particular you cannot add 3 different objects that return the same hash code. It will rehash till you run out of ram.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Roquen
« Reply #32 - Posted 2010-12-30 17:15:14 »

I'm just throwing out general comments about various possible implementations. Take them for what they're worth.  Personally I have no problem with creating codebases with strict contracts that if violated will be "boom".
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline delt0r

JGO Coder


Medals: 22


Computers can do that?


« Reply #33 - Posted 2010-12-31 08:58:44 »

CommanderKeith said
Quote
So the problem is that if 4 or more objects are added and they have the same hashCode but are not == then the cuckoo map will rehash forever?

How about a system where the map uses hashCode() until the above is encountered and then as a fall back uses system.identityHashCode() only for those problem objects. Because identityHashCode really kills performance massively. Frame rate drops and takes 13% of cpu in my case when using -xprof vm option.
No. The problem is that == and equals means different things. Just as Object.hashCode() and  identityHashCode() do.

Example:
1  
2  
3  
4  
5  
6  
7  
8  
Integer one=new Integer(1);
Integer moreOne=new Integer(1);

assert one.hashCode()==moreOne.hashCode();
assert one.equals(moreOne);

assert System.idenetityHashCode(one)!=System.identityHashCode(moreOne);
assert one!=moreOne;

Line 7 will return false perhaps once every 2^32 iterations. Generally you want Object.equals(). But for the odd occasion you need "identity" for example using reflection for a deep clone, or with serialization.

Falling back from one method or the other is even worse. Its makes the semantics (meaning of hashMap) random.

The infinite allocation problem is *not* solved by using a different hash. I have a single number that must deterministically produce 3 pesudo random numbers. If the original number is the same then all three hash functions will always produce the same 3 values. This is compounded by the fact that java Number class wrappers use stupid hash values. They should at least contain "type" in the hash as well (ie Short.hashCode() should not give the same result as Long.hashCode() for the same value).

Nate said:
Quote
To avoid the infinite allocation, a stash could be used. This has a small performance hit for gets. I have an implementation I'll post in the "map performance" thread (soon) so we don't hijack d3ltor's thread.

If you control the objects, you could implement hashCode and cache the identityHashCode. But, why is your application so sensitive to map performance!?
 
Yes this would. While giving worst case performance of O(n). In which case perhaps chaining or linear probing is the better choice.

btw you are not hijacking anything.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

JGO Kernel


Medals: 129
Projects: 3
Exp: 14 years


Esoteric Software


« Reply #34 - Posted 2010-12-31 10:02:35 »

Yes this would. While giving worst case performance of O(n). In which case perhaps chaining or linear probing is the better choice.

Did you mean this is response to using a stash? The required stash size is only log(n) (see the paper linked below), however even then it is empty almost all the time, has only 1 to ~3 items when it isn't empty, and is only checked when the key was not found first in the 3 hashes. I'm pretty convinced a stash is the way to go. It has little overhead and handles pathological cases, making cuckoo robust.

btw you are not hijacking anything.

Okie, I'll post my cuckoo stuff here then. This can be the cuckoo thread and that other thread can be the chart thread. Smiley Actually, the chart thread is long, messy, has a lot of OT, and is in the Android section... maybe I'll make a new fresh one. Smiley

I guess there's no fixing the hash functions, n-way collisions will always be an issue. I tried 4 hashes and it happened less often, but still bothers me. Stash to the rescue!

I have implemented cuckoo maps (3 hash, random walk) with and without a stash. Implementing the stash was very little code. Benchmarks:

CuckooPut8
CuckooGet8

These benchmarks reuse the map instance so put can be more accurately measured, which means the benchmarks don't reflect the time it takes to rehash (and of course memory usage is not shown). I've run a few standalone tests with millions and millions of puts, and the stash prevents the map from ever rehashing due to 3-way hash collisions or loops. The required stash size is only log(n), according to this paper.

To answer my earlier question, I found that max(16, sqrt(n) / 4) for the max number of push iterations for a single put is a reasonable value to minimize the number of iterations while also minimizing the stash usage. This seems to scale well. Without the stash, it seems max(32, sqrt(n) / 2) is a good value to not cause a rehash most of the time.

I think I'll go with the stash version as it is more robust without much penalty. Next I'll implement the rest of the map methods (delete, etc) and an object key version, and then I'll run new benchmarks with all the maps. Smiley

Oops, forgot the link to my source:
http://n4te.com/temp/fast.7z

Offline delt0r

JGO Coder


Medals: 22


Computers can do that?


« Reply #35 - Posted 2010-12-31 11:02:27 »

I am suspicious of claims of a O(ln(n)) stash size, my guess that the proof requires strong assumptions about the hash functions and all the hash codes begin "truly" random, in this case chaining and linear probing is also O(ln(n)). At any rate that is now a treeMap, not a hash map. All my tests are based on 1Million entries or more.

For tables of size 16 or so, I doubt hashMaps would be quicker than just iterating. 

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Nate

JGO Kernel


Medals: 129
Projects: 3
Exp: 14 years


Esoteric Software


« Reply #36 - Posted 2010-12-31 11:13:37 »

I am suspicious of claims of a O(ln(n)) stash size, my guess that the proof requires strong assumptions about the hash functions and all the hash codes begin "truly" random, in this case chaining and linear probing is also O(ln(n)).

Well, the paper is there. Tongue There is some stuff about results from Braverman that proves "polylog(n)-wise independent hash functions are sufficient for" cuckoo hash maps using a stash/queue. I'm not really sure what that means though.

Possibly better, play with my Cuckoo3Stash class. I'm using a log(n) stash size and I am not seeing it exceeded in my tests. The stash is often not even used, and when it is only has a few items in it. Here is my class for ease of browsing or copy pasting into an IDE (yay JGO syntax highlighting!):

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
71  
72  
73  
74  
75  
76  
77  
78  
79  
80  
81  
82  
83  
84  
85  
86  
87  
88  
89  
90  
91  
92  
93  
94  
95  
96  
97  
98  
99  
100  
101  
102  
103  
104  
105  
106  
107  
108  
109  
110  
111  
112  
113  
114  
115  
116  
117  
118  
119  
120  
121  
122  
123  
124  
125  
126  
127  
128  
129  
130  
131  
132  
133  
134  
135  
136  
137  
138  
139  
140  
141  
142  
143  
144  
145  
146  
147  
148  
149  
150  
151  
152  
153  
154  
155  
156  
157  
158  
159  
160  
161  
162  
163  
164  
165  
166  
167  
168  
169  
170  
171  
172  
173  
174  
175  
176  
177  
178  
179  
180  
181  
182  
183  
184  
185  
186  
187  
188  
189  
190  
191  
192  
193  
194  
195  
196  
197  
198  
199  
200  
201  
202  
203  
204  
205  
206  
207  
208  
209  
210  
211  
212  
213  
214  
215  
216  
217  
218  
219  
220  
221  
222  
223  
224  
225  
226  
227  
228  
229  
230  
231  
232  
233  
234  
235  
236  
237  
238  
239  
240  
241  
242  
243  
244  
245  
246  
247  
248  
249  
250  
251  
252  
253  
254  
255  
256  
257  
258  
259  
260  
261  
262  
263  
264  
265  
266  
267  
268  
269  
270  
271  
272  
273  
274  
275  
276  
277  
278  
279  
280  
281  
282  
283  
284  
285  
286  
287  
288  
289  
290  
291  
292  
293  
294  
295  
296  
297  
298  
299  
300  
301  
302  
303  
304  
305  
306  
307  
308  
309  
310  
311  
312  
313  
314  
315  
316  
317  
318  
319  
320  
321  
322  
323  
324  
325  
326  
327  
328  
329  
330  
331  
332  
333  
334  
335  
336  
337  
338  
339  
340  
341  
342  
343  
344  
345  
346  
347  
348  
349  
350  
351  
352  
353  
354  
355  
356  
357  
358  
359  
360  
361  
362  
363  
364  
365  
366  
367  
368  
369  
370  
371  
372  
373  
374  
375  
376  
377  
378  
379  
380  
381  
382  
383  
384  
385  
386  
387  
388  
389  
390  
391  
392  
393  
394  
395  
396  
397  
398  
399  
400  
401  
402  
403  
404  
405  
406  
407  
408  
409  
410  
411  
412  
413  
414  
415  
416  
417  
418  
419  
420  
421  
422  
423  
424  
425  
426  
427  
428  
429  
430  
431  
432  
433  
434  
435  
436  
437  
438  
439  
440  
441  
442  
443  
444  
445  
446  
447  
448  
449  
450  
451  
452  
453  
454  
455  
456  
457  
458  
459  
460  
461  
462  
463  
464  
465  
466  
467  
468  
469  
470  
471  
472  
473  
474  
475  
476  
477  
478  
479  
480  
481  
482  
483  
484  
485  
486  
487  
488  
489  
490  
491  
492  
493  
494  
495  
496  
497  
498  
499  
500  
501  
502  
503  
504  
505  
506  
507  
508  
509  
510  
511  
512  
513  
514  
515  
516  
517  
518  
519  
520  
521  
522  
523  
524  
525  
526  
527  
528  
529  
530  
531  
532  
533  
534  
535  
536  
537  
538  
539  
540  
541  
542  
543  
544  
545  
546  
547  
548  
549  
550  
551  
552  
553  
554  
555  
556  
557  
558  
559  
560  
561  
562  
563  
564  
565  
566  
567  
568  
569  
570  
571  
572  
573  
574  
575  
576  
577  
578  
579  
580  
581  
582  
583  
584  
585  
586  
587  
588  
589  
590  
591  
592  
593  
594  
595  
596  
597  
598  
599  
600  
601  
602  
603  
604  
605  
606  
607  
608  
609  
610  
611  
612  
613  
614  
public class Cuckoo3Stash<V> {
   static private final int EMPTY = Integer.MIN_VALUE;
   static private final int PRIME2 = 0xbe1f14b1;
   static private final int PRIME3 = 0xced1c241;

   public int size;

   int[] keyTable;
   V[] valueTable;
   int capacity, stashSize;

   private float loadFactor;
   private int hashShift, mask, threshold;
   private int stashCapacity;
   private int pushIterations;

   private Entries entries;
   private Values values;
   private Keys keys;

   public Cuckoo3Stash (int initialCapacity, float loadFactor) {
      if (initialCapacity > 1 << 30) throw new IllegalArgumentException("initialCapacity is too large.");
      if (initialCapacity < 0) throw new IllegalArgumentException("initialCapacity must be greater than zero.");

      if (loadFactor <= 0) throw new IllegalArgumentException("initialCapacity must be greater than zero.");
      this.loadFactor = loadFactor;

      capacity = nextPowerOfTwo((int)(initialCapacity * (1 / loadFactor)));
      threshold = (int)(capacity * loadFactor);
      mask = capacity - 1;
      hashShift = 31 - Integer.numberOfTrailingZeros(capacity);
      stashCapacity = Math.max(3, (int)Math.ceil(Math.log(capacity)) + 1);
      pushIterations = Math.max(16, (int)Math.sqrt(capacity) / 4);

      keyTable = new int[capacity + stashCapacity];
      for (int i = capacity; i-- > 0;)
         keyTable[i] = EMPTY;

      valueTable = (V[])new Object[capacity + stashCapacity];
   }

   public static void main (String[] args) {
      Random r = new Random();
      for (int i = 0; i < 1500000; i++) {
         int c = 10;
         Cuckoo3Stash<Object> m = new Cuckoo3Stash(c, 0.9f);
         for (int ii = 0; ii < c; ii++)
            m.put(ii, "moo" + ii);
         Entries<Object> entries = m.entries();
         for (Entry<Object> e : entries) {
            System.out.println(e);
            if (e.value.equals("moo7")) entries.remove();
         }
         System.out.println();
         for (Entry<Object> e : m.entries()) {
            System.out.println(e);
         }
         System.out.println();
         m.remove(2);
         for (Entry<Object> e : m.entries()) {
            System.out.println(e);
         }
         m.clear();
         break;
      }
   }

   public V put (int key, V value) {
      int[] keyTable = this.keyTable;
      int mask = this.mask;

      // Check for existing keys.
     int index1 = key & mask;
      int existingKey1 = keyTable[index1];
      if (existingKey1 == key) {
         V oldValue = valueTable[index1];
         valueTable[index1] = value;
         return oldValue;
      }

      int index2 = hash2(key) & mask;
      int existingKey2 = keyTable[index2];
      if (existingKey2 == key) {
         V oldValue = valueTable[index1];
         valueTable[index2] = value;
         return oldValue;
      }

      int index3 = hash3(key) & mask;
      int existingKey3 = keyTable[index3];
      if (existingKey3 == key) {
         V oldValue = valueTable[index1];
         valueTable[index3] = value;
         return oldValue;
      }

      // Check for empty buckets.
     if (existingKey1 == EMPTY) {
         keyTable[index1] = key;
         valueTable[index1] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return null;
      }

      if (existingKey2 == EMPTY) {
         keyTable[index2] = key;
         valueTable[index2] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return null;
      }

      if (existingKey3 == EMPTY) {
         keyTable[index3] = key;
         valueTable[index3] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return null;
      }

      push(key, value, index1, existingKey1, index2, existingKey2, index3, existingKey3);
      return null;
   }

   // Skips checks for existing keys.
  private void putResize (int key, V value) {
      int[] keyTable = this.keyTable;
      int mask = this.mask;

      // Check for empty buckets.
     int index1 = key & mask;
      int existingKey1 = keyTable[index1];
      if (existingKey1 == EMPTY) {
         keyTable[index1] = key;
         valueTable[index1] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return;
      }

      int index2 = hash2(key) & mask;
      int existingKey2 = keyTable[index2];
      if (existingKey2 == EMPTY) {
         keyTable[index2] = key;
         valueTable[index2] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return;
      }

      int index3 = hash3(key) & mask;
      int existingKey3 = keyTable[index3];
      if (existingKey3 == EMPTY) {
         keyTable[index3] = key;
         valueTable[index3] = value;
         if (size++ >= threshold) resize(capacity << 1);
         return;
      }

      push(key, value, index1, existingKey1, index2, existingKey2, index3, existingKey3);
   }

   private void push (int key, V value, int index1, int existingKey1, int index2, int existingKey2, int index3, int existingKey3) {
      int[] keyTable = this.keyTable;
      V[] valueTable = this.valueTable;
      int mask = this.mask;

      // Push keys until an empty bucket is found.
     int insertKey = key, replacedKey;
      V insertValue = value, replacedValue;
      for (int i = 0, n = pushIterations; i < n; i++) {
         // Replace the key and value for one of the hashes.
        switch (random(3)) {
         case 0:
            replacedKey = existingKey1;
            replacedValue = valueTable[index1];
            keyTable[index1] = insertKey;
            valueTable[index1] = insertValue;
            break;
         case 1:
            replacedKey = existingKey2;
            replacedValue = valueTable[index2];
            keyTable[index2] = insertKey;
            valueTable[index2] = insertValue;
            break;
         default:
            replacedKey = existingKey3;
            replacedValue = valueTable[index3];
            keyTable[index3] = insertKey;
            valueTable[index3] = insertValue;
            break;
         }

         // If the replacedKey has an empty bucket, put it there and stop.
        index1 = replacedKey & mask;
         existingKey1 = keyTable[index1];
         if (existingKey1 == EMPTY) {
            keyTable[index1] = replacedKey;
            valueTable[index1] = replacedValue;
            if (size++ >= threshold) resize(capacity << 1);
            return;
         }

         index2 = hash2(replacedKey) & mask;
         existingKey2 = keyTable[index2];
         if (existingKey2 == EMPTY) {
            keyTable[index2] = replacedKey;
            valueTable[index2] = replacedValue;
            if (size++ >= threshold) resize(capacity << 1);
            return;
         }

         index3 = hash3(replacedKey) & mask;
         existingKey3 = keyTable[index3];
         if (existingKey3 == EMPTY) {
            keyTable[index3] = replacedKey;
            valueTable[index3] = replacedValue;
            if (size++ >= threshold) resize(capacity << 1);
            return;
         }

         insertKey = replacedKey;
         insertValue = replacedValue;
      }

      stash(insertKey, insertValue);
   }

   private void stash (int key, V value) {
      if (stashSize == stashCapacity) {
         // Too many pushes occurred and the stash is full, increase the table size.
        resize(capacity << 1);
         put(key, value);
         return;
      }
      // Store problem key in the stash.
     int index = capacity + stashSize;
      keyTable[index] = key;
      valueTable[index] = value;
      stashSize++;
   }

   public V get (int key) {
      int index = key & mask;
      if (keyTable[index] == key) return valueTable[index];

      index = hash2(key) & mask;
      if (keyTable[index] == key) return valueTable[index];

      index = hash3(key) & mask;
      if (keyTable[index] == key) return valueTable[index];

      index = capacity;
      for (int n = index + stashSize; index < n; index++)
         if (keyTable[index] == key) return valueTable[index];

      return null;
   }

   public V remove (int key) {
      int[] keyTable = this.keyTable;
      V[] valueTable = this.valueTable;

      int index = key & mask;
      if (keyTable[index] == key) {
         keyTable[index] = EMPTY;
         V oldValue = valueTable[index];
         valueTable[index] = null;
         size--;
         return oldValue;
      }

      index = hash2(key) & mask;
      if (keyTable[index] == key) {
         keyTable[index] = EMPTY;
         V oldValue = valueTable[index];
         valueTable[index] = null;
         size--;
         return oldValue;
      }

      index = hash3(key) & mask;
      if (keyTable[index] == key) {
         keyTable[index] = EMPTY;
         V oldValue = valueTable[index];
         valueTable[index] = null;
         size--;
         return oldValue;
      }

      index = capacity;
      for (int n = index + stashSize; index < n; index++) {
         if (keyTable[index] == key) {
            V oldValue = valueTable[index];
            stashRemove(index);
            size--;
            return oldValue;
         }
      }

      return null;
   }

   void stashRemove (int index) {
      // If the removed location was not last, move the last tuple to the removed location.
     int lastIndex = capacity + stashSize - 1;
      if (index < lastIndex) {
         keyTable[index] = keyTable[lastIndex];
         valueTable[index] = valueTable[lastIndex];
         valueTable[lastIndex] = null;
      } else
         valueTable[index] = null;
      stashSize--;
   }

   public void clear () {
      int[] keyTable = this.keyTable;
      V[] valueTable = this.valueTable;
      for (int i = capacity; i-- > 0;) {
         keyTable[i] = EMPTY;
         valueTable[i] = null;
      }
      size = 0;
      stashSize = 0;
   }

   /**
    * Returns true if the specified value is in the map. Note this traverses the entire map and compares every value, which may be
    * an expensive operation.
    */

   public boolean containsValue (Object value, boolean identity) {
      V[] valueTable = this.valueTable;
      if (value == null) {
         int[] keyTable = this.keyTable;
         for (int i = capacity + stashSize; i-- > 0;)
            if (keyTable[i] != EMPTY && valueTable[i] == null) return true;
      } else if (identity) {
         for (int i = capacity + stashSize; i-- > 0;)
            if (valueTable[i] == value) return true;
      } else {
         for (int i = capacity + stashSize; i-- > 0;)
            if (value.equals(valueTable[i])) return true;
      }
      return false;
   }

   public boolean containsKey (int key) {
      int index = key & mask;
      if (keyTable[index] == key) return true;

      index = hash2(key) & mask;
      if (keyTable[index] == key) return true;

      index = hash3(key) & mask;
      if (keyTable[index] == key) return true;

      index = capacity;
      for (int n = index + stashSize; index < n; index++)
         if (keyTable[index] == key) return true;

      return false;
   }

   /**
    * Increases the size of the backing array to acommodate the specified number of additional items. Useful before adding many
    * items to avoid multiple backing array resizes.
    */

   public void ensureCapacity (int additionalCapacity) {
      int sizeNeeded = size + additionalCapacity;
      if (sizeNeeded >= threshold) resize(nextPowerOfTwo((int)(sizeNeeded * (1 / loadFactor))));
   }

   private void resize (int newSize) {
      capacity = newSize;
      threshold = (int)(newSize * loadFactor);
      mask = newSize - 1;
      hashShift = 31 - Integer.numberOfTrailingZeros(newSize);
      stashCapacity = Math.max(3, (int)Math.ceil(Math.log(newSize)));
      pushIterations = Math.max(16, (int)Math.sqrt(newSize) / 4);

      int[] oldKeyTable = keyTable;
      V[] oldValueTable = valueTable;

      keyTable = new int[newSize + stashCapacity];
      int[] keyTable = this.keyTable;
      for (int i = newSize; i-- > 0;)
         keyTable[i] = EMPTY;

      valueTable = (V[])new Object[newSize + stashCapacity];

      size = 0;
      stashSize = 0;
      for (int i = 0, n = oldKeyTable.length; i < n; i++) {
         int key = oldKeyTable[i];
         if (key != EMPTY) putResize(key, oldValueTable[i]);
      }
   }

   private int hash2 (int h) {
      h *= PRIME2;
      return h ^ (h >>> hashShift);
   }

   private int hash3 (int h) {
      h *= PRIME3;
      return h ^ (h >>> hashShift);
   }

   public String toString () {
      if (size == 0) return "[]";
      StringBuilder buffer = new StringBuilder(32);
      buffer.append('[');

      int[] keyTable = this.keyTable;
      int i = keyTable.length;
      while (i-- > 0) {
         int key = keyTable[i];
         if (key == EMPTY) continue;
         buffer.append(key);
         buffer.append('=');
         buffer.append(valueTable[i]);
         break;
      }
      while (i-- > 0) {
         int key = keyTable[i];
         if (key == EMPTY) continue;
         buffer.append(", ");
         buffer.append(key);
         buffer.append('=');
         buffer.append(valueTable[i]);
      }
      buffer.append(']');
      return buffer.toString();
   }

   /**
    * Returns an iterator for the entries in the map. Remove is supported. Note that the same iterator instance is reused each
    * time this method is called.
    */

   public Entries<V> entries () {
      if (entries == null)
         entries = new Entries(this);
      else
         entries.reset();
      return entries;
   }

   /**
    * Returns an iterator for the values in the map. Remove is supported. Note that the same iterator instance is reused each time
    * this method is called.
    */

   public Values<V> values () {
      if (values == null)
         values = new Values(this);
      else
         values.reset();
      return values;
   }

   /**
    * Returns an iterator for the keys in the map. Remove is supported. Note that the same iterator instance is reused each time
    * this method is called.
    */

   public Keys keys () {
      if (keys == null)
         keys = new Keys(this);
      else
         keys.reset();
      return keys;
   }

   static public class Entry<V> {
      public int key;
      public V value;

      public String toString () {
         return key + "=" + value;
      }
   }

   static public class Entries<V> implements Iterable<Entry<V>>, Iterator<Entry<V>> {
      Entry<V> entry = new Entry();
      int index = -1;
      private final Cuckoo3Stash map;

      public Entries (Cuckoo3Stash map) {
         this.map = map;
      }

      public boolean hasNext () {
         int[] keyTable = map.keyTable;
         for (int n = map.capacity + map.stashSize; ++index < n;)
            if (keyTable[index] != EMPTY) return true;
         return false;
      }

      public Entry<V> next () {
         entry.key = map.keyTable[index];
         entry.value = (V)map.valueTable[index];
         return entry;
      }

      public void remove () {
         if (index >= map.capacity)
            map.stashRemove(index);
         else {
            map.keyTable[index] = EMPTY;
            map.valueTable[index] = null;
         }
         map.size--;
         index--;
      }

      public void reset () {
         index = -1;
      }

      public Iterator<Entry<V>> iterator () {
         return this;
      }
   }

   static public class Values<V> implements Iterable<V>, Iterator<V> {
      int index = -1;
      private final Cuckoo3Stash map;

      public Values (Cuckoo3Stash map) {
         this.map = map;
      }

      public boolean hasNext () {
         int[] keyTable = map.keyTable;
         for (int n = map.capacity + map.stashSize; ++index < n;)
            if (keyTable[index] != EMPTY) return true;
         return false;
      }

      public V next () {
         return (V)map.valueTable[index];
      }

      public void remove () {
         if (index >= map.capacity)
            map.stashRemove(index);
         else {
            map.keyTable[index] = EMPTY;
            map.valueTable[index] = null;
         }
         map.size--;
         index--;
      }

      public void reset () {
         index = -1;
      }

      public Iterator<V> iterator () {
         return this;
      }
   }

   static public class Keys<V> {
      int index = -1;
      private final Cuckoo3Stash map;

      public Keys (Cuckoo3Stash map) {
         this.map = map;
      }

      public boolean hasNext () {
         int[] keyTable = map.keyTable;
         for (int n = map.capacity + map.stashSize; ++index < n;)
            if (keyTable[index] != EMPTY) return true;
         return false;
      }

      public int next () {
         return map.keyTable[index];
      }

      public void remove () {
         if (index >= map.capacity)
            map.stashRemove(index);
         else {
            map.keyTable[index] = EMPTY;
            map.valueTable[index] = null;
         }
         map.size--;
         index--;
      }

      public void reset () {
         index = -1;
      }
   }

   static private int randomSeed = (int)System.currentTimeMillis();

   /**
    * Returns a random integer between 0 (inclusive) and range (exclusive).
    */

   static private final int random (int range) {
      int seed = randomSeed * 1103515245 + 12345;
      randomSeed = seed;
      return (seed >>> 15) * range >>> 17;
   }

   static private int nextPowerOfTwo (int value) {
      if (value == 0) return 1;
      if ((value & value - 1) == 0) return value;
      value |= value >> 1;
      value |= value >> 2;
      value |= value >> 4;
      value |= value >> 8;
      value |= value >> 16;
      return value + 1;
   }
}


Quote
At any rate that is now a treeMap, not a hash map. All my tests are based on 1Million entries or more.

For what definition of "treemap"? Eg, Java's TreeMap is a red-black tree, which my Cuckoo3Stash certainly isn't. AFAIK, it is still a cuckoo hash map. There are lots of papers about using a stash with cuckoo.

Quote
For tables of size 16 or so, I doubt hashMaps would be quicker than just iterating.  

Probably. Note sure why you mention this though?

Offline delt0r

JGO Coder


Medals: 22


Computers can do that?


« Reply #37 - Posted 2010-12-31 12:15:11 »

For your iterators, I don't think repeated calling of next() will in fact increment the returned item. ie you would always return the same item. You are not required to call hasNext() in order to get the next item.


I have no special talents. I am only passionately curious.--Albert Einstein
Offline pjt33
« Reply #38 - Posted 2010-12-31 12:41:54 »

Recovering a lost comment from the other thread:

Quote from: Nate
Here is a paper about cuckoo hash functions:
http://www.siam.org/proceedings/soda/2009/SODA09_087_dietzfelbingerm.pdf
Unfortunately my greek isn't that good. Wink Could someone translate the essence of this to layman terms? Does this help us choose better hash functions?
What this says basically is that hashes of the form h = (m * x + c) % M aren't good for cuckoo hashing. It doesn't give any suggestions for what are good hashes.
Offline delt0r

JGO Coder


Medals: 22


Computers can do that?


« Reply #39 - Posted 2010-12-31 14:03:15 »

Just read a little bit of that paper. It should be noted that the extra xor makes the hash function i use non linear, ie doesn't suffer from their warning. So it is at least better than a pure linear ax+c mod d. What makes it nonlinear is the combination of * and ^. Just one or the other is equivalent to a Galois Field (ie GF(2^n)).

You could add extra non linear structure. But from my tests I highly doubt that it will make any real difference. The weak point is that Java doesn't use good hash values for its hashCode() method. Shame really.

For my objects I often use longHashCode(). This works even when you have more than 1e9 entries.

I have no special talents. I am only passionately curious.--Albert Einstein
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline Nate

JGO Kernel


Medals: 129
Projects: 3
Exp: 14 years


Esoteric Software


« Reply #40 - Posted 2010-12-31 14:21:37 »

For your iterators, I don't think repeated calling of next() will in fact increment the returned item. ie you would always return the same item. You are not required to call hasNext() in order to get the next item.

Hmm, good point. I'll fix it up, thanks!

You could add extra non linear structure. But from my tests I highly doubt that it will make any real difference.

Ok, cool. What do you think about my hash for the int map? The first hash I just use the int, the 2nd and 3rd I use your hashes. Does this 1st hash have any significant drawbacks?

Offline Nate

JGO Kernel


Medals: 129
Projects: 3
Exp: 14 years


Esoteric Software


« Reply #41 - Posted 2011-01-05 23:58:39 »

Finished my cuckoo classes. Updated benchmarks:

MapGet2
MapPut2
MapIterate2
IntMapGet2
IntMapPut2
IntMapIterate2

I'm annoyed my gets are not as fast as delt0r's. Smiley Here is my code:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
public V get (int key) {
   if (key == 0) return zeroValue;
   int index = key & mask;
   if (keyTable[index] != key) {
      index = hash2(key);
      if (keyTable[index] != key) {
         index = hash3(key);
         if (keyTable[index] != key) return getStash(key);
      }
   }
   return valueTable[index];
}

private V getStash (int key) {
   int[] keyTable = this.keyTable;
   for (int i = capacity, n = i + stashSize; i < n; i++)
      if (keyTable[i] == key) return valueTable[i];
   return null;
}


Here is delt0r's code:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
public V get(int key) {
   int hash =hash(key);
   hash &=mask;
   if (key == keys[hash])
      return entrys[hash];
   int hash2 =hash2(key);
   hash2 &=mask;
   if (key == keys[hash2])
      return entrys[hash2];
   int hash3 =hash3(key);
   hash3 &=mask;
   if (key == keys[hash3])
      return entrys[hash3];
   return null;
}


My code takes almost twice as long, even when getStash is not actually called! I can replace "return getStash(key);" with "throw new RuntimeException("blah!");" and it doesn't change the speed. If I change "return getStash(key);" to "return null;" then it runs the same speed as delt0r's code. I guess it can't be optimized the same way? Seems crazy.

Offline CommanderKeith
« Reply #42 - Posted 2011-01-06 03:04:57 »

I'm using the CIdentityHashSet and to reduce System.identityHashCode() in my use-case I made every class of object that I was adding to the set over-ride the hashCode() method to the following:

1  
2  
3  
4  
int hash = System.identityHashCode(this);
public int hashCode(){
   return hash;
}


So the System.identityHashCode() method is called only once and is cached when the object is created. I then replaced all CIdentityHashSet calls to System.identityHashCode(object) with object.hashCode() and this gained some speed.

Btw I use CIdentityHashSet mostly for its fast contains(object) method and noticed that it doesn't do lazy calculation which gains some speed, so here it is for what it's worth:
1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
public boolean contains(Object e) {
   int code=e.hashCode();//System.identityHashCode(e);
  int hash = hash(code);
   hash =hash & mask;
   if (table[hash] == e) {
      return true;
   }
   int hash2 = hash2(code);
   hash2 =hash2 & mask;
   if (table[hash2] == e) {
      return true;
   }
   int hash3 = hash3(code);
   hash3 =hash3 & mask;
   if (table[hash3] == e) {
      return true;
   }
   return false;
}


I know a few posts here got deleted due to the JGO outage but I read most of them I think. Nate raised the point that my program shouldn't depend so much on map/set performance. For some reason it does, A* path finding does lots of contains() checks in the open and closed lists which I use the CIdentityHashSet for.

My code takes almost twice as long, even when getStash is not actually called! I can replace "return getStash(key);" with "throw new RuntimeException("blah!");" and it doesn't change the speed. If I change "return getStash(key);" to "return null;" then it runs the same speed as delt0r's code. I guess it can't be optimized the same way? Seems crazy.

Maybe try java 7 and see if it makes any difference since it looks like a weird hotspot quirk that might be flattened out with of java 7's more aggressive optimisations.

Thanks for the updates guys. Love those fancy graphs  Cool

Offline Nate

JGO Kernel


Medals: 129
Projects: 3
Exp: 14 years


Esoteric Software


« Reply #43 - Posted 2011-01-06 23:00:38 »

Quote
Nate raised the point that my program shouldn't depend so much on map/set performance. For some reason it does, A* path finding does lots of contains() checks in the open and closed lists which I use the CIdentityHashSet for.

I recently was using A* for a small map and used an int[] for the closed list and a Node[] and binary heap for the open list. To keep from allocating or clearing the int[] each time, I increment an int "id" and check "closed[y * width + x] != id". Each node also has an ID, and a similar check tells if the node is being encountered for the first time this run.

Quote
Maybe try java 7 and see if it makes any difference since it looks like a weird hotspot quirk that might be flattened out with of java 7's more aggressive optimisations.

Good idea, I'll give it a shot.

Offline CommanderKeith
« Reply #44 - Posted 2011-01-07 07:27:13 »

Quote
I recently was using A* for a small map and used an int[] for the closed list and a Node[] and binary heap for the open list. To keep from allocating or clearing the int[] each time, I increment an int "id" and check "closed[y * width + x] != id". Each node also has an ID, and a similar check tells if the node is being encountered for the first time this run.

Ah wow, that is a good idea! No need for sets/maps in the A* algorithm by letting the node track its own open/closed/unprocessed status. I tried it out and it sped up the algorithm lots. Clever thinking. Thanks Nate  Cool

I'm still using the maps/sets for other things so I'll keep track of your great work guys.
Cheers,
keith

Pages: 1 [2]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

xsi3rr4x (52 views)
2014-04-15 18:08:23

BurntPizza (49 views)
2014-04-15 03:46:01

UprightPath (64 views)
2014-04-14 17:39:50

UprightPath (46 views)
2014-04-14 17:35:47

Porlus (63 views)
2014-04-14 15:48:38

tom_mai78101 (88 views)
2014-04-10 04:04:31

BurntPizza (147 views)
2014-04-08 23:06:04

tom_mai78101 (244 views)
2014-04-05 13:34:39

trollwarrior1 (203 views)
2014-04-04 12:06:45

CJLetsGame (210 views)
2014-04-01 02:16:10
List of Learning Resources
by SHC
2014-04-18 03:17:39

List of Learning Resources
by Longarmx
2014-04-08 03:14:44

Good Examples
by matheus23
2014-04-05 13:51:37

Good Examples
by Grunnt
2014-04-03 15:48:46

Good Examples
by Grunnt
2014-04-03 15:48:37

Good Examples
by matheus23
2014-04-01 18:40:51

Good Examples
by matheus23
2014-04-01 18:40:34

Anonymous/Local/Inner class gotchas
by Roquen
2014-03-11 15:22:30
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!