ags1


«
Posted
20160314 11:44:49 » 

I need to generate a big table of primes (well, I could just download them I suppose  but do I trust some random Internet person to get it right?). I wrote the following naive code: EDIT: Riven suggestion highlighted. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 public class FindPrimes {
public static void main(String[] args) { ++ List<Long> primes = new ArrayList<>(1000_000); ++ primes.add(2L); ++ long candidate = 3; while (primes.size() < 1000_000) { boolean isPrime = true; for (long prime : primes) { if (candidate % prime == 0) { isPrime = false; break; } } if (isPrime) { primes.add(candidate); System.out.println(primes.size() + "\t: " + candidate); } ++ candidate += 2; }
} } 
I've looked up https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes, but I don't see how that is going to be faster.... EDIT: I should of course set capacity of the ArrayList to 1000_000....




Riven


«
Reply #1  Posted
20160314 11:51:35 » 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 public class FindPrimes {
public static void main(String[] args) { List<Long> primes = new ArrayList<>(); long candidate = 2; while (primes.size() < 1000_000) { boolean isPrime = true; for (long prime : primes) { if (candidate % prime == 0) { isPrime = false; break; } } if (isPrime) { primes.add(candidate); System.out.println(primes.size() + "\t: " + candidate); }  candidate++; + candidate+=2; }
} } 
You're welcome, twice as fast

Hi, appreciate more people! Σ ♥ = ¾ Learn how to award medals... and work your way up the social rankings!



ags1


«
Reply #2  Posted
20160314 11:58:49 » 

It will be much faster, because the loop starts at 2, and with your change will then test 4, 6, 8, etc for primeness? ? But even with that change I don't think it will go significantly faster. The prime test loop exits on item 0 in this case, so that's not causing the slowdown.




Games published by our own members! Check 'em out!


ags1


«
Reply #3  Posted
20160314 12:21:50 » 

This goes faster: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
 public static void main(String[] args) { long[] primes = new long[1000_000]; primes[0] = 2; int primeCount = 1; long candidate = 3; while (primeCount < 1000_000) { boolean isPrime = true; for (long prime : primes) { if (prime == 0) { break; } if (candidate % prime == 0) { isPrime = false; break; } } if (isPrime) { primes[primeCount++] = candidate; System.out.println(primeCount + "\t: " + candidate); } candidate += 2; }
} 




KaiHH


«
Reply #4  Posted
20160314 15:09:11 » 

The sieve still is many many orders of magnitude faster. One factor being: It does not need the modulus operator. The other factor being, it is O(n log log n) instead of n². And it only requires 122 KB of memory instead of 7.6 MB for your 1,000,000 numbers. (the following is literally converted from the Wikipedia article to Java) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
 import java.util.BitSet; public class Sieve { public static void sieve() { int n = 1000000; BitSet candidates = new BitSet(n); candidates.set(2, n  1, true); int sqrtN = (int) Math.sqrt(n); for (int i = 2; i <= sqrtN; i++) if (candidates.get(i)) for (int j = i * i; j <= n; j += i) candidates.set(j, false); for (int i = 2; i < n; i++) if (candidates.get(i)) System.out.println(i); } } 
Just counting the number of primes between 0 and 100,000,000 (onehundred million) with this algorithm, by not sysout but instead incrementing a counter, takes just 886 Milliseconds and outputs 5,761,455.




Riven


«
Reply #5  Posted
20160314 15:59:51 » 

@KaiHH: a noticable difference is that ags1's code returns the first 1_000_000 primes, and your code returns all primes in the range [0, 1_000_000] You also mention that you don't do a modulo, but that is effectively done by flagging all multiples of a number (above a threshold) . This is a linear operation, while a modulo is a constant operation. Trying to find an undefined number of primes, and this inner loop will be the bottleneck (as it keeps on flagging endless amounts of 'slots' to false). I know this is a theoretic complaint, but still BTW: You can equally use a BitSet in ags1's algorithm, so save memory, at the cost of performance.

Hi, appreciate more people! Σ ♥ = ¾ Learn how to award medals... and work your way up the social rankings!



KaiHH


«
Reply #6  Posted
20160314 16:09:02 » 

@KaiHH: a noticable difference is that ags1's code returns the first 1_000_000 primes, and your code returns all primes in the range [0, 1_000_000]
No, it does not, because it would need to have pi^1(x) large of a long array, which for 1,000,000 is something between 10,000,000 and 100,000,000.EDIT: Sorry, I was wrong! Did not really see that the while loop went for as long as it took to really find N primes, and did not got over the size of the array. Sorry.




Riven


«
Reply #7  Posted
20160314 16:11:49 » 

@KaiHH: a noticable difference is that ags1's code returns the first 1_000_000 primes, and your code returns all primes in the range [0, 1_000_000]
No, it does not, because it would need to have pi(x)*ln(x) large of a long array. Can you explain what you mean? I'm just observing that you both have a 'collection' of size 1_000_000 in the end, but ags1's collection has 1_000_000 primes, while your collection holds 1_000_000 bits describing whether each slot holds a prime.

Hi, appreciate more people! Σ ♥ = ¾ Learn how to award medals... and work your way up the social rankings!



KaiHH


«
Reply #8  Posted
20160314 16:13:01 » 

Edited my post.




KaiHH


«
Reply #9  Posted
20160314 16:30:11 » 

@KaiHH: a noticable difference is that ags1's code returns the first 1_000_000 primes, and your code returns all primes in the range [0, 1_000_000]
Here is a version which only computes/outputs the first 'count' primes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
 public static void firstCountPrimes(int count) { int n = (int) (count * (Math.log(count) + Math.log(Math.log(count)) + 1)); BitSet notPrime = new BitSet(n); int sqrtN = (int) Math.sqrt(n); for (int i = 2; i <= sqrtN; i++) if (!notPrime.get(i)) for (int j = i * i; j <= n; j += i) notPrime.set(j, true); int primeCount = 1; for (int i = 2; primeCount <= count; i++) if (!notPrime.get(i)) { System.out.println(primeCount + " = " + i); primeCount++; } } 




Games published by our own members! Check 'em out!


HeroesGraveDev


«
Reply #10  Posted
20160314 18:55:31 » 

You can skip over all multiples of 2/3 by incrementing in steps of 6 (starting at 0) and checking the number above and the number below.




KaiHH


«
Reply #11  Posted
20160314 19:20:07 » 

Also look at Melissa O’Neill's "The Genuine Sieve of Eratosthenes."




ags1


«
Reply #12  Posted
20160314 19:30:00 » 

I got home and benchmarked my primitive efforts versus KaiHH BitSet boss code: http://pastebin.javagaming.org/f83ef6614431dI only did it to 100,000 primes because my code is rather slow. Run 1: Ags1 v1 took 87706ms and found highest prime: 1299709 Ags1 v2 took 12452ms and found highest prime: 1299709 KaiHH method took 27ms and found highest prime: 1299709 Run 2: Ags1 v1 took 85349ms and found highest prime: 1299709 Ags1 v2 took 12133ms and found highest prime: 1299709 KaiHH method took 29ms and found highest prime: 1299709 Run 3: Ags1 v1 took 84787ms and found highest prime: 1299709 Ags1 v2 took 12146ms and found highest prime: 1299709 KaiHH method took 26ms and found highest prime: 1299709




KaiHH


«
Reply #13  Posted
20160314 19:37:14 » 

I want to say, this is not my code (well code yes, but not algorithm). I just saw your post about finding the first N primes and then read Wikipedia. When it comes to such mathematical algorithms I never try to come up with an own algorithm. There usually were far more clever people back in the B.C. ages who found the most efficient "algorithms" by carving them into stone. EDIT: Also, I updated my last code. It's a bit faster, by inverting the meaning of the bits and therefore not having to initialize all bits to true at first. Might want to benchmark it again.




ags1


«
Reply #14  Posted
20160314 19:47:04 » 

I also updated my v2 code to use an int[] and not a long[]  primes are more common than I thought. I needed to think about your code but now I get it It is also interesting to me that cleaning up my quick and dirty v1 code (using a primitive array and getting rid of any and all autoboxing) gave a 7x speedup.




gouessej


«
Reply #15  Posted
20160315 01:17:03 » 

When it comes to such mathematical algorithms I never try to come up with an own algorithm.
I often have to write my own algorithms even though some more clever people are able to write more efficient ones than mine but because they are unable or they don't want to explain them precisely to the masses unlike me. For example, I had to write my own algorithm to solve this kind of problem: https://en.wikipedia.org/wiki/Largest_empty_rectangleI have found no robust implementation in Java that supports jagged arrays. When I don't understand a source code, I prefer not using it as if something breaks, I won't be able to repair it. Thank you for the code




delt0r


«
Reply #16  Posted
20160315 04:30:11 » 

I had some fast code for this and even larger primes using some fancy math crap. But that was back when I was a MSc student and computers were around 1000x slower than today. Used Ecliptic curves and stuff. I don't remember any of that stuff now.
First the first million primes, so pretty small stuff really. The Sieve is by far the best trade off in time to code and time to run.

I have no special talents. I am only passionately curious.Albert Einstein



Roquen


«
Reply #17  Posted
20160315 09:27:33 » 

I have to ask: why primes? I can't think of a single time I've used a prime number. Squarefree numbers and/or relatively prime numbers...that's a different story. That aside, note that finding X primes and finding the first X primes are different problems. EDIT: should have provided an example  https://en.wikipedia.org/wiki/Sieve_of_Atkin#Explanation




princec


«
Reply #18  Posted
20160315 09:41:48 » 

I have to ask: why primes? Increasingly it looks like... because they're there to be found! This seems to be quite an entertaining problem. Cas




KaiHH


«
Reply #19  Posted
20160315 10:04:45 » 

No one is disagreeing with you there, or even saying that both problems were the same. If you are trying to say, that you should use different algorithms to solve those two different problems, then your reference to the Sieve of Atkin does not quite support that, because it also is used to search for the primes in [0..N] and not for enumerating the first N primes, as is the Sieve of Eratosthenes. Or maybe I misunderstood that Atkin algorithm. But thanks for the algorithm reference! Btw.: Do you know a good algorithm to actually find the first N primes faster than _abusing_ the algorithms to find primes within [0..pi^1(N)]?




jonjava


«
Reply #20  Posted
20160315 10:52:04 » 

I have to ask: why primes?
Primes are important. Number theory, encryption  that kind of thing.




ags1


«
Reply #21  Posted
20160315 11:13:59 » 

I need the primes to test a number crunching program I am writing. I realized that I overlooked the need in my code to restrict factor checks to factors <= sqrt(candidate). so there is a good deal of useless calculation going on  for the millionth prime I am checking hundreds of thousands of unnecessary divisors. Correcting this makes the naive modulus method look somewhat better versus the Erastothenes sieve: Run 1: Ags1 v3 Took 2240ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 214ms to find 1000000 primes, highest being 15485863 Run 2: Ags1 v3 Took 2306ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 224ms to find 1000000 primes, highest being 15485863 Run 3: Ags1 v3 Took 2306ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 237ms to find 1000000 primes, highest being 15485863 Benchmark code: http://pastebin.javagaming.org/83ef674134d1eOnly one order of magnitude between them now, and the naive code is even fast enough for practical purposes Also, to win this "contest" I only need to change scale of the problem: Ags1 v3 Took 1ms to find 1000 primes, highest being 7919 Kaihh v2 Took 9ms to find 1000 primes, highest being 7919 Up till around 8000 primes, the BitSetbased solution performs a bit slower, of course only half a dozen milliseconds slower EDIT: If you put the two methods in a loop (which you wouldn't do, because the program has a practical purpose), then the BitSet methods pulls ahead again even for small prime result sets. JIT doing it's thing...




KaiHH


«
Reply #22  Posted
20160315 11:39:15 » 

Kaihh v3 "BitSet traversal optimization" 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
 public static void kaihh() { int n = (int) (NUM_PRIMES * (Math.log(NUM_PRIMES) + Math.log(Math.log(NUM_PRIMES)) + 1)); BitSet notPrime = new BitSet(n); int sqrtN = (int) Math.sqrt(n); for (int i = 2; i <= sqrtN; i = notPrime.nextClearBit(i + 1)) { for (int j = i * i; j <= n; j += i) { notPrime.set(j, true); } } int primeCount = 1; int lastPrime = 2; for (int i = 2; primeCount <= NUM_PRIMES; i = notPrime.nextClearBit(i + 1)) { lastPrime = i; primeCount++; } } 
EDIT: First traversal can be optimized the same way.




ags1


«
Reply #23  Posted
20160315 11:48:22 » 

Kaihh v3 Took 179ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 321ms to find 1000000 primes, highest being 15485863 Ags1 v3 Took 2209ms to find 1000000 primes, highest being 15485863
Kaihh v3 Took 209ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 359ms to find 1000000 primes, highest being 15485863 Ags1 v3 Took 2207ms to find 1000000 primes, highest being 15485863
Kaihh v3 Took 180ms to find 1000000 primes, highest being 15485863 Kaihh v2 Took 296ms to find 1000000 primes, highest being 15485863 Ags1 v3 Took 2427ms to find 1000000 primes, highest being 15485863




Roquen


«
Reply #24  Posted
20160315 13:00:42 » 

Btw.: Do you know a good algorithm to actually find the first N primes faster than _abusing_ the algorithms to find primes within [0..pi^1(N)]?
Nothing that's remotely current. If someone wants to knock themselves out: http://cr.yp.to/primetests/prime200420040212.pdf




