Java-Gaming.org Hi !
Featured games (83)
games approved by the League of Dukes
Games in Showcase (538)
Games in Android Showcase (132)
games submitted by our members
Games in WIP (601)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1] 2 3
  ignore  |  Print  
  JAVA IS SLOW!  (Read 11359 times)
0 Members and 1 Guest are viewing this topic.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Posted 2003-12-10 18:08:26 »

Now that I've got your attention, I hope you'll stick around for the real topic.

I developed a fibonacci heap (based on the most venerable text "Introduction to Algorithms"), and it gets what you might normally think of as pretty decent performance. An average insert takes about 4 microseconds and an average extractMin takes about 12 microseconds. Compared to some other (good) Java fibonacci heaps out there, mine is about 25% faster.

The truth, though, is that performance is actually very slow. Our algorithm performs roughly 1 million inserts and extractMins and we expect to be able to do it in under a second. Obviously, that's not possible with the current timings. I did nearly a line for line translation of my fibonacci heap from Java to C++ (not optimizing for C++), and my C++ fibonacci heap gets performance that is nearly ten times faster than its Java counterpart. Inserts happen in 0.26 microseconds and extractMins take as long from 0.6 to 3 microseconds depending upon the size of the heap. (Realistically it'll be in the 0.6 microsecond range for us, because we're going to have fairly small heap sizes).

So the big question in my mind is WHY OH WHY? I've poured over and over my Java code, and I just can't see anything that can be done to improve the performance. The scary thing is that the algorithm actually requires a significant amount of heap allocation and doesn't do comparatively many array accesses, so it's actually already playing to Java's strengths.

I tested this with Sun's 1.4.1 and 1.4.2 VMs with and without the server options. I've tried with different heap sizes and different garbage collection options and the numbers I give above are the best I could get. I compiled my C++ code under VC++ 7.1 with default optimizations.

So, I challenge specifically Jeff or Chris to examine my heap and either a) explain exactly how it can be re-written to get the same performance or b) get Sun's vm team to fix the vm so that we can get the same kind of performance.

Currently, our only option is to ditch Java for C++.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline kevglass

« JGO Spiffy Duke »


Medals: 212
Projects: 24
Exp: 18 years


Coder, Trainee Pixel Artist, Game Reviewer


« Reply #1 - Posted 2003-12-10 18:18:25 »

You might want to post the two bits of code (Java/C++) ?

Kev

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #2 - Posted 2003-12-10 18:42:27 »

Yeah, I second that; please post the code!   Smiley
Did you profile it? Maybe something you didn't think of that slows things down in java.

Erik

Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #3 - Posted 2003-12-10 18:51:05 »

You might want to post the two bits of code (Java/C++) ?

I received permission to post the Java code. Let's just say that the C++ code looks mostly the same. It goes without saying that my company reserves the copyright to this code, so you should avoid copying it.

A couple of notes. 1) I extracted comments - code was too long for the board 2) Don't worry about the asserts 3) Performance numbers came from running inserts and extractMins a million times and taking average. Best times are with -server and sufficient heap (i.e. 256M). Tests took place on a 2.4GHZ PIV with 512M.

Thanks and God bless,
-Toby Reyelts

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
71  
72  
73  
74  
75  
76  
77  
78  
79  
80  
81  
82  
83  
84  
85  
86  
87  
88  
89  
90  
91  
92  
93  
94  
95  
96  
97  
98  
99  
100  
101  
102  
103  
104  
105  
106  
107  
108  
109  
110  
111  
112  
113  
114  
115  
116  
117  
118  
119  
120  
121  
122  
123  
124  
125  
126  
127  
128  
129  
130  
131  
132  
133  
134  
135  
136  
137  
138  
139  
140  
141  
142  
143  
144  
145  
146  
147  
148  
149  
150  
151  
152  
153  
154  
155  
156  
157  
158  
159  
160  
161  
162  
163  
164  
165  
166  
167  
168  
169  
170  
171  
172  
173  
174  
175  
176  
177  
178  
179  
180  
181  
182  
183  
184  
185  
186  
187  
188  
189  
190  
191  
192  
193  
194  
195  
196  
197  
198  
199  
200  
201  
202  
203  
204  
205  
206  
207  
208  
209  
210  
211  
212  
213  
214  
215  
216  
217  
218  
219  
220  
221  
222  
223  
224  
225  
226  
227  
228  
229  
230  
231  
232  
233  
234  
235  
236  
237  
238  
239  
240  
241  
242  
243  
244  
245  
246  
247  
248  
249  
250  
251  
252  
253  
254  
255  
256  
257  
258  
259  
260  
261  
262  
263  
264  
265  
266  
267  
268  
269  
270  
271  
272  
273  
274  
275  
276  
277  
278  
279  
280  
281  
282  
283  
284  
285  
286  
287  
288  
289  
290  
291  
292  
293  
294  
295  
296  
297  
298  
299  
300  
301  
302  
303  
304  
305  
306  
307  
308  
309  
310  
311  
312  
313  
314  
315  
316  
317  
318  
319  
320  
321  
322  
323  
324  
325  
326  
327  
328  
329  
330  
331  
332  
333  
334  
335  
336  
337  
338  
339  
340  
341  
342  
343  
344  
345  
346  
347  
348  
349  
350  
351  
352  
353  
354  
355  
356  
357  
358  
359  
360  
361  
362  
363  
364  
365  
366  
367  
368  
369  
370  
371  
372  
373  
374  
375  
376  
377  
378  
379  
380  
381  
382  
383  
384  
385  
386  
387  
388  
389  
390  
391  
392  
393  
394  
395  
396  
397  
398  
399  
400  
401  
402  
403  
404  
405  
406  
407  
408  
409  
410  
411  
412  
413  
414  
415  
416  
417  
418  
419  
420  
421  
422  
423  
424  
425  
426  
427  
428  
429  
430  
431  
432  
433  
434  
435  
436  
437  
438  
439  
440  
441  
442  
443  
444  
445  
446  
447  
448  
449  
450  
451  
452  
453  
454  
455  
456  
457  
458  
459  
460  
461  
462  
463  
464  
465  
466  
467  
import java.util.*;

/**
 * A FibonacciHeap specialized for keys of type int.
 *
 * Copyright SSA Global 12/10/2003
 *
 * @author Toby Reyelts
 *
 */

public class IntFibonacciHeap<V> {

  // Used in max degree calculations
  private static final double PHI = ( 1.0d + Math.sqrt( 5 ) ) / 2.0d;
  private static final double LOG_PHI = Math.log( PHI );

  private Node<V> min; // Both the minimum node and the pointer to the root list
  private int size; // The total number of nodes in the tree

  // Performance statistics
  private double totalInsertTime;
  private double totalDecreaseKeyTime;
  private double totalExtractMinTime;
  private double totalInsertCalls;
  private double totalDecreaseKeyCalls;
  private double totalExtractMinCalls;
  private double largestSize;

  public IntFibonacciHeap() {}

  public static class Node<V> {

    public Node( int key, V value ) {

      assert( key >= 0 );

      this.key = key;
      this.value = value;
      right = this;
      left = this;
    }

    private int degree;
    private Node<V> parent;
    private Node<V> child; // only valid when degree is > 0
    private Node<V> right;
    private Node<V> left;
    private boolean mark;
    private boolean deleted;

    private int key;
    protected V value;

    public int key() {
      return key;
    }

    public V value() {
      return value;
    }

    public V setValue( V value ) {
      V temp = this.value;
      this.value = value;
      return temp;
    }

    public String toString() {
      return key + "";
    }

    public String debugString() {
      return key + " parent: " + parent + " right: " + right + " left: " + left + " degree: " + degree + " child: " + child;
    }
  }

  public void insert( Node<V> node ) {

    assert( ! node.deleted );

    ++totalInsertCalls;

    assert( node != null );
    assertNullParents();

    assert( node.parent == null );
    assert( node.child == null );
    assert( node.right == node );
    assert( node.left == node );
    assert( node.mark == false );
    assert( node.degree == 0 );

    if ( min == null ) {
      min = node;
      ++size;
      return;
    }

    // Insert the new node
    node.left = min;
    node.right = min.right;
    min.right.left = node;
    min.right = node;

    // Update the minimum if necessary
    if ( node.key < min.key ) {
      min = node;
    }

    ++size;
    assertNullParents();
  }

  public Node<V> min() {
    return min;
  }

  public Node<V> extractMin() {

    ++totalExtractMinCalls;

    if ( min == null ) {
      return null;
    }

    assertNullParents();

    Node<V> child = min.child;
    Node<V> currentMin = min;

    if ( min.degree > 0 ) {
      for ( int i = 0; i < min.degree; ++i ) {
        child.parent = null;
        child = child.right;
      }

      min.degree = 0;
      min.child = null;

      Node<V> rightOfMin = min.right;
      Node<V> leftOfChild = child.left;

      min.right = child;
      child.left  = min;
      leftOfChild.right = rightOfMin;
      rightOfMin.left  = leftOfChild;
    }

    if ( size == 1 ) {
      min = null;
    }
    else {
      min.right.left = min.left;
      min.left.right = min.right;
      min = min.right; // pick an arbitrary min - we have to recalculate it anyway
      consolidate();
    }

    --size;

    assertNullParents();

    currentMin.deleted = true;

    return currentMin;
  }

  public int size() {
    assert( size >= 0 );
    return size;
  }

  private void assertNullParents() {
    /*
    int i = 0;

    if ( min == null ) {
      return;
    }

    Node<V> temp = min;

    do {
      assert( temp.parent == null );
      temp = temp.right;
      ++i;
      assert( i <= size );
    }
    while ( temp != min );
    */

  }

  private void consolidate() {

    assert( min != null );
    assert( min.parent == null );
    assertNullParents();
 
    int largestDegree = ( int ) Math.floor( Math.log( size ) / LOG_PHI );
    Node[] nodes = new Node[ largestDegree + 1 ];

    Node<V> x = min;

    while ( true ) {

      Node<V> current = x;
      assert( current.parent == null );

      Node<V> next = x.right;

      x.right.left = x.left;
      x.left.right = x.right;
      x.right = x;
      x.left = x;

      int degree = x.degree;

      while ( true ) {

        Node<V> y = ( Node<V> ) nodes[ degree ];

        if ( y == null ) {
          break;
        }

        assert( y.parent == null );

        if ( x.key > y.key ) {
          Node<V> temp = x;
          x = y;
          y = temp;
        }

        link( y, x );

        assert( x.parent == null );

        nodes[ degree ] = null;
        ++degree;
      }

      assert( x.parent == null );

      nodes[ degree ] = x;

      if ( next == current ) {
        break;
      }

      x = next;
    }

    min = null;
    int length = nodes.length;

    for ( int i = 0; i < length; ++i ) {

      Node<V> node = nodes[ i ];

      if ( node != null ) {

        assert( node.parent == null );

        if ( min == null ) {
          min = node;
          continue;
        }

        node.left = min;
        node.right = min.right;
        min.right.left = node;
        min.right = node;

        if ( node.key < min.key ) {
          min = node;
        }
      }
    }
  }

  private void link( Node<V> y, Node<V> x ) {

    assert( y != null );
    assert( x != null );
    assert( y.parent == null );
    assert( x.parent == null );

    y.parent = x;

    if ( x.degree == 0 ) {
      x.child = y;
      y.right = y;
      y.left = y;
    }
    else {
      y.left = x.child;
      y.right = x.child.right;
      x.child.right.left = y;
      x.child.right = y;
    }

    ++x.degree;
    y.mark = false;
  }

  public void decreaseKey( Node<V> node, int key ) {

    assert ( ! node.deleted );

    ++totalDecreaseKeyCalls;

    assert( node.key >= key );
    assertNullParents();

    node.key = key;

    Node<V> parent = node.parent;

    if ( parent != null && ( key < parent.key ) ) {
      cut( node, parent );
      cascadingCut( parent );
    }

    if ( key < min.key ) {
      min = node;
    }

    assertNullParents();
  }

  public void delete( Node<V> node ) {

    assert ( ! node.deleted );

    assertNullParents();

    Node<V> parent = node.parent;

    if ( parent != null ) {
      cut( node, parent );
      cascadingCut( parent );
    }

    min = node;

    extractMin();

    assertNullParents();
  }

  private void cut( Node<V> child, Node<V> parent ) {

    assertNullParents();

    child.parent = null;
    --parent.degree;
   
    if ( parent.degree == 0 ) {
      parent.child = null;
    }
    else {
      parent.child = child.right;
      child.right.left = child.left;
      child.left.right = child.right;
    }

    child.left = min;
    child.right = min.right;
    min.right.left = child;
    min.right = child;

    child.mark = false;

    assertNullParents();
  }

  private void cascadingCut( Node<V> node ) {

    Node<V> parent = node.parent;
   
    if ( parent == null ) {
      return;
    }

    if ( ! node.mark ) {
      node.mark = true;
    }
    else {
      cut( node, parent );
      cascadingCut( parent );
    }
  }

  public void print( Node<V> node, boolean debug ) {

    if ( node == null ) {
      return;
    }

    Node<V> begin = node;
    Node<V> current = node;

    while ( true ) {
      if ( debug ) {
        System.out.println( current.debugString() );
      }
      else {
        System.out.println( current );
      }
      if ( current.degree > 0 ) {
        print( current.child, debug );
      }
      current = current.right;
      if ( current == begin ) {
        return;
      }
    }
  }

  public void print( boolean debug ) {
    print( min, debug );
  }

  public double totalInsertTime() {
    return totalInsertTime;
  }

  public double totalExtractMinTime() {
    return totalExtractMinTime;
  }

  public double totalDecreaseKeyTime() {
    return totalDecreaseKeyTime;
  }

  public double totalInsertCalls() {
    return totalInsertCalls;
  }

  public double totalExtractMinCalls() {
    return totalExtractMinCalls;
  }

  public double totalDecreaseKeyCalls() {
    return totalDecreaseKeyCalls;
  }

  public double largestSize() {
    return largestSize;
  }

  public static void main( String[] args ) {
    IntFibonacciHeap<String> heap = new IntFibonacciHeap<String>();
    int[] ints = new int[] { 23, 7, 3, 3, 3, 14, 14, 7, 17, 30, 24, 26, 46, 35 };
    Node[] nodes = new Node[ ints.length ];

    for ( int i = 0; i < ints.length; ++i ) {
      Node<String> node = new Node<String>( ints[ i ], "" + i );
      heap.insert( node );
      nodes[ i ] = node;
    }

    while ( heap.size() > 0 ) {
      System.out.println( "Heap min: " + heap.extractMin() );
    }
  }
}

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline princec

« JGO Spiffy Duke »


Medals: 434
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #4 - Posted 2003-12-10 19:13:01 »

Errr.... that would appear to be JDK1.5 code? Which we can't compile?

Cas Smiley

Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #5 - Posted 2003-12-10 19:24:36 »

Quote
Errr.... that would appear to be JDK1.5 code? Which we can't compile?

It's JDK1.4+ code. You can compile and run this with Sun's freely available 1.2 adding generics prototype which is just an add on to the 1.4 compiler. Alternatively, you can join the CAP program and compile and run this with the 1.5 VM. Finally, you could just take the 60 seconds to get rid of the parameterization. It doesn't have any affect on the running time, because the heap doesn't mess with the values.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline endolf

JGO Coder


Medals: 7
Exp: 15 years


Current project release date: sometime in 3003


« Reply #6 - Posted 2003-12-10 19:39:17 »

Quote
Check out the free, open-source, JNI toolkit, Jace - http://jace.reyelts.com/jace


I would, but your dyndns account seems broken as it won't resolve reyelts.dyndns.org.

Endolf

Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #7 - Posted 2003-12-10 19:51:55 »

I would, but your dyndns account seems broken as it won't resolve reyelts.dyndns.org

Lol. My server just started suffering spontaneous rebooting fits, so the domain is down for a few days while I put together a new box. The project is also hosted at sourceforge - http://sf.net/projects/jace

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline Jeff

JGO Coder




Got any cats?


« Reply #8 - Posted 2003-12-11 05:31:07 »

Quote
The truth, though, is that performance is actually very slow. Our algorithm performs roughly 1 million inserts and extractMins and we expect to be able to do it in under a second. Obviously, that's not possible with the current timings. I did nearly a line for line translation of my fibonacci heap from Java to C++ (not optimizing for C++),


Hmm.  It may still be code that is better suited to C++, where a re-workign of the algorithym might make the Java fly.

But before we even go there, lets ask some basic Java microbenchamrking questions:

(1) Did you run this with the -Xcompile flag to force it to compile all the methods right away?

(2) How long did yo urun thsi etst for?  even with -Xcompile you are liley I think to pay some start-up penalty.  When benchmarking on the JDK tema we always did the foloowing:
(a) NEVER ran abench mark that ran for less then about 60 seconds.
(b) ALWAYS ran the benchmark 5 times in sucession fro mwithin the same Java program.
(c) THROUGH OUT the entire first run.

Unless you are doing these things its quite possible that you aren't warming up the VM and/or allowing the one-time cost of compilation to have an unreasonably large effect  on your final numbers.


Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline altair

Senior Newbie





« Reply #9 - Posted 2003-12-11 06:19:27 »

Toby,
How did you measure exactly ?
1 Million times insert/extract or 1 Million times
for ( int i = 0; i < ints.length; ++i )
insert()
while ( heap.size() > 0 )
extract()
?
or something else ?
The results seem to be pretty slow indeed. Did you try without the generics because the generics prototype may be slowing down everything ? There is nothing suspect in the code AFAIK.
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline princec

« JGO Spiffy Duke »


Medals: 434
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #10 - Posted 2003-12-11 11:07:36 »

Well, following the general guidelines above, I ran the following benchmark:
1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
  public static void benchmark() {
       
        int[] fn = new int[] { 23, 7, 3, 3, 14, 17, 30, 24, 26, 46, 35 };
        int[] ints = new int[1000000]; //{ 23, 7, 3, 3, 3, 14, 14, 7, 17, 30, 24, 26, 46, 35 };
        for (int i = 0; i < ints.length; i ++) {
              ints[i] = fn[Util.random(0, fn.length - 1)];
        }
        long then = Sys.getTime();
        IntFibonacciHeap heap = new IntFibonacciHeap();
        Node[] nodes = new Node[ ints.length ];
       
        for ( int i = 0; i < ints.length; ++i ) {
              Node node = new Node( ints[ i ], String.valueOf(i));
              heap.insert( node );
              nodes[ i ] = node;
        }
       
        Object[] mins = new Object[heap.size()];
        int count = 0;
        while ( heap.size() > 0 ) {
              mins[count ++] = heap.extractMin().value();
        }
       
        System.out.println("Time: "+((float)(Sys.getTime() - then) / (float)Sys.getTimerResolution()));
        int sum = 0;
        for (int i = 0; i < mins.length; i ++) {
              sum += Integer.parseInt((String) mins[i]);
        }
        System.out.println("Sum: "+sum);
       
  }

On a dual P3-1GHz using the following parameters:
1  
-server -Xcomp -Xmx128m -Xms128m

using the LWJGL hires timer and got an average time of 9.7s running for 1min30s. A vanilla run with just the client VM and no params got me an average 11.5s.

Is this too slow by a factor of 10 then? What does the C++ one manage?

Cas Smiley

Offline altair

Senior Newbie





« Reply #11 - Posted 2003-12-11 18:30:19 »

Princec,
You are measuring also the creation of 1000000 node objects, you should move the creation of those nodes (and the associated strings) out of the insert() loop.
Also, my understanding is that the heaps remain small but yours grows to 1M nodes.
Toby,
Have you tried the profiler (-Xrunhprof:cpu=times) to measure the time spent in your methods ?
With my Ahtlon 1.4Ghz and JVM 1.4.2-beta-b19, I get (no optimization options) :

CPU TIME (ms) BEGIN (total = 36706) Thu Dec 11 13:15:37 2003
rank   self  accum   count trace method
  1 13.62% 13.62%       1    27 test.IntFibonacciHeap.main
  2 11.45% 25.07%  700000    30 java.lang.String.concat
  3  9.96% 35.03% 1400000    22 java.lang.String.getChars
  4  9.58% 44.61%  700000     3 test.IntFibonacciHeap.insert
...
That means an insert around 5 microsec on my box.
With a longer test and Princec's JVM options, it is around 4.8 microsec.
Offline princec

« JGO Spiffy Duke »


Medals: 434
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #12 - Posted 2003-12-11 18:35:59 »

I included the creation time of the nodes (and the class instance itself) because it doesn't work without them being created, and the C version would do similar (although maybe not on the heap).

I'd like to see the C comparison now doing exactly the same thing. I'm certain C will be at least twice as fast but worried if it's 10 times as fast :/

Cas Smiley

Offline altair

Senior Newbie





« Reply #13 - Posted 2003-12-11 18:57:08 »

Actually I do not know if Toby's figures include node creation or not. But, if you include the creation of the nodes, most of the time is spent in there and not in the insert() method.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #14 - Posted 2003-12-11 20:07:04 »

Quote
Is this too slow by a factor of 10 then? What does the C++ one manage?

Yes. That is too slow by about a factor of 10.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #15 - Posted 2003-12-11 20:20:46 »

Quote
Actually I do not know if Toby's figures include node creation or not. But, if you include the creation of the nodes, most of the time is spent in there and not in the insert() method.

1) He should include node creation. It's something that is a required part of the insert. Most data structures (like HashMap) hide this inside of the insert, but with a fibonacci heap you have to give that Node back to the client anyway, so I just make the allocation the responsibility of the client.

2) The heap instance creation is just noise compared to a million inserts.

3) The String.valueOf really shouldn't be part of that though.

I decided to break down and use an actual commercial profiler to see what it would tell me. JProfiler tells me that, on average, the inserts are taking 4 microseconds, and the extract mins are taking 70 microseconds. (I believe the maximum heap size has changed significantly, since some changes we made to the algorithm). It also says that the fibonacci heap is responsible for 80% of the running time of the main algorithm.

When I get finished translating the algorithm to C++ (soon), I'll run some statistics on it and see what kind of results I get.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline altair

Senior Newbie





« Reply #16 - Posted 2003-12-16 04:21:19 »

In my case, the tests show that roughly 75% is due to the node creation, not the insertion per se.
If pre-creating the nodes is an option, then you will see a big increase in speed.
Offline Mark Thornton

Senior Devvie





« Reply #17 - Posted 2003-12-21 14:01:54 »

I suspect that the fibonacci heap (or at least your implementation of it) is not well suited to Java. Try implementing a basic binary heap instead. The array used in the consolidate method is probably very cheap in C++ (stack allocated), but costs quite a bit more in Java.
Offline jherber

Junior Devvie




Java games rock!


« Reply #18 - Posted 2004-01-13 17:19:07 »

macroscopically, you want to "public static final " all your methods.

microscopically, classes in java are expensive.  if you can come up with a tricky way to store your data in an array and then just change integer pointers into that array you will get very high performance.  

writing fast java code reminds me of writing fast c code - avoid objects!
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #19 - Posted 2004-01-13 17:37:03 »

Quote

writing fast java code reminds me of writing fast c code - avoid objects!


Ha! I remember the time when you had to also "avoid methods" for optimal performance, particularly methods in other classes or objects; I had a fractal generator that showed a considerable speed boost when all several thousand lines of code were placed in one class - I can't remember if I had the courage to manually inline the few recursive methods too (I don't think so; it would have made the code seriously painful to change...).

Thankfully there's normally no difference now between method calls to the current object and to other objects...

malloc will be first against the wall when the revolution comes...
Offline Mark Thornton

Senior Devvie





« Reply #20 - Posted 2004-01-13 18:17:27 »

Quote
macroscopically, you want to "public static final " all your methods.

A waste of time. These days the JIT compiler will do this without the need for hints. Objects aren't slow anymore either. In fact some techniques you might use to avoid them are more expensive than the code they were intended to improve. You will often be able to access a field in an object faster than an arbitrary element of an array (i.e. an access where the JIT can't guarantee is within the array bounds and thus must emit code to check the bounds).
Offline abies

Senior Devvie





« Reply #21 - Posted 2004-01-13 20:25:16 »

Quote
macroscopically, you want to "public static final " all your methods.


Current JITs are able to optimize even instance virtual calls perfectly in most cases. Anyway, EVEN if you don't believe jit to do it for you, you need only static modified - final one is superflous (all static methods are 'final').

Artur Biesiadowski
Offline Jeff

JGO Coder




Got any cats?


« Reply #22 - Posted 2004-01-13 23:36:41 »

Yup.  this " make everything static and final" business has been nothing but harmful to code for a few generations of JITs now.

Ofcourse if your stuck on an ancient VM then I won't hazard a guess what the results would be.

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Jeff

JGO Coder




Got any cats?


« Reply #23 - Posted 2004-01-13 23:43:38 »

Quote
writing fast java code reminds me of writing fast c code - avoid objects!


Um no.  Fast C code is generally slow Java code today
(assuming modern JITs).

C is fastest doing everything through array refernces.  In Java non-linear array referenes are expensive. C is slow generally at recursion, Java recursion can be blindingly fast. (Faster then a local array based stack in fact.)

Memory allocation in most C compilers is dirt slow.  Memory allocation in Java is highly highly optimized.

As already mentioned calling through a class or an interface is no different then calling through a static method.  This has been the case since at least JDK1.2

An old but still good introduction to some of the differences between how C and Java code execute can be found here.  Java VMs have actually imrpoved at running "c-like" code in some ways sicne this article was written but the base lessons are still pretty accurate:

http://www.aceshardware.com/Spades/read.php?article_id=153

There are also some things on this in Steve and my book.  In particular, the recursion test.

http://java.sun.com/docs/books/performance.



Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #24 - Posted 2004-01-13 23:58:06 »

Quote
I suspect that the fibonacci heap (or at least your implementation of it) is not well suited to Java. Try implementing a basic binary heap instead. The array used in the consolidate method is probably very cheap in C++ (stack allocated), but costs quite a bit more in Java.


If you can spot "problems" in my implementation, speak up now. That's why I went through the trouble of gaining permission to post my code in the first place.

Telling me to implement a d-heap is just skirting the issue and is utterly pointless as that data structure doesn't give me the algorithmic performance characteristics I need.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #25 - Posted 2004-01-14 00:01:17 »

Quote
In my case, the tests show that roughly 75% is due to the node creation, not the insertion per se.
If pre-creating the nodes is an option, then you will see a big increase in speed.


I would say that something is wrong with your timing then, because the profiler shows that the bulk of the time is spent in extractMin - not allocating nodes nor inserting nodes.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #26 - Posted 2004-01-14 00:11:22 »

Quote
The array used in the consolidate method is probably very cheap in C++ (stack allocated), but costs quite a bit more in Java.

You can't allocate the array on the stack unless you use something platform dependent like alloca. I pull the heap allocation out from the function call anyway. I have to memset the array in every single call to consolidate, so it's not exactly free.

I could do the same thing in Java, but it's actually more expensive to clear the array than it is to heap allocate it from scratch. Of course, its kind of hard to measure the resulting effect on garbage collection.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #27 - Posted 2004-01-14 00:28:31 »

Just thought I'd let people know that I've finished doing the port and the results are dramatic. On a small size problem set, the C++ code is about five times faster. The gap between performance in the C++ and Java code increases rapidly (perhaps geometrically?) as the size of the data set increases. (We work with data sets ranging from hundreds of thousands of objects to about 100 million). Unfortunately, I can't run the Java program with the largest data sets, because the virtual machine can't allocate over 1.7G of heap space.

The algorithm used in both programs is identical. Of course, both programs try to take advantage of any naturally well performing features their respective languages carry.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #28 - Posted 2004-01-14 00:48:57 »

Quote

Java recursion can be blindingly fast.


Are you trying to say that the current crop of VMs perform tail-recursion optimization? I've heard people (those who do things like implement Scheme on top of the JVM) complaining of the exact opposite - that it doesn't happen.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline princec

« JGO Spiffy Duke »


Medals: 434
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #29 - Posted 2004-01-14 06:45:49 »

Aha, now if the problem is scaling geometrically on Java  and not in C it points to something else that's scaling wrongly rather than your direct interpretation of the port. Could it be GC?

Ask yourself this though before losing more hair - would you rather have an implementation with guaranteed safe GC and null pointer detection that runs slower, or a super-fast but entirely unsafe version of the algorithm?

As this is a serverside task your doing IIRC, you're probably better off trading performance for reliability and investing your money in faster processors for the server. Programmer time is very expensive indeed compared to CPU upgrades!

This is, of course, no real excuse for the algorithm being so slow in Java. What parameters are you running your benchmark with?

Cas Smiley

Pages: [1] 2 3
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

rwatson462 (30 views)
2014-12-15 09:26:44

Mr.CodeIt (20 views)
2014-12-14 19:50:38

BurntPizza (42 views)
2014-12-09 22:41:13

BurntPizza (76 views)
2014-12-08 04:46:31

JscottyBieshaar (37 views)
2014-12-05 12:39:02

SHC (51 views)
2014-12-03 16:27:13

CopyableCougar4 (48 views)
2014-11-29 21:32:03

toopeicgaming1999 (115 views)
2014-11-26 15:22:04

toopeicgaming1999 (103 views)
2014-11-26 15:20:36

toopeicgaming1999 (30 views)
2014-11-26 15:20:08
Resources for WIP games
by kpars
2014-12-18 10:26:14

Understanding relations between setOrigin, setScale and setPosition in libGdx
by mbabuskov
2014-10-09 22:35:00

Definite guide to supporting multiple device resolutions on Android (2014)
by mbabuskov
2014-10-02 22:36:02

List of Learning Resources
by Longor1996
2014-08-16 10:40:00

List of Learning Resources
by SilverTiger
2014-08-05 19:33:27

Resources for WIP games
by CogWheelz
2014-08-01 16:20:17

Resources for WIP games
by CogWheelz
2014-08-01 16:19:50

List of Learning Resources
by SilverTiger
2014-07-31 16:29:50
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!