Java-Gaming.org    
Featured games (79)
games approved by the League of Dukes
Games in Showcase (477)
Games in Android Showcase (109)
games submitted by our members
Games in WIP (536)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1]
  ignore  |  Print  
  TrieHard - A Trie implementation!  (Read 6751 times)
0 Members and 1 Guest are viewing this topic.
Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Posted 2013-07-29 22:25:34 »

I'm not sure if someone will find a Trie implementation useful, but I just finished one here:

https://github.com/ClickerMonkey/TrieHard

If you can think of another use case other than determining exclusion/inclusion or auto-complete... please let me know!

I imagine this would only be useful for game libraries, not so much games.

Offline relminator
« Reply #1 - Posted 2013-07-30 12:07:58 »

Nice name.  I would assume the next build would be: "trie hard 2 - trie harder".
Offline actual

JGO Coder


Medals: 23



« Reply #2 - Posted 2013-07-30 13:11:26 »

So that's you on reddit Smiley.

Why are you using Boolean.TRUE and Boolean.FALSE rather than just true amd false?  I don't think I've ever seen a Boolean object used in production code before
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #3 - Posted 2013-07-30 14:09:13 »

When dealing with generic classes in Java (like Trie<String, Boolean>) you ARE dealing with the Boolean object. Sure you can pass true and false in there, but it's auto-boxed into the Boolean.TRUE and BOOLEAN.FALSE. The same applies to using Integer, you can pass in ints (i.e. x), but it auto-boxes it to new Integer(x).

I should change the example so it looks a little nicer, even though what I have is perfectly correct.

Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #4 - Posted 2013-08-03 04:54:20 »

TrieHard has been updated so a Trie can be used as a Map now. Insertion, removal, getting, and traversal have all had speed improvements.

TrieHard now outperforms all other Trie implementations I've found! (put+get)

Offline Several Kilo-Bytes

Senior Member


Medals: 11



« Reply #5 - Posted 2013-08-20 21:48:24 »

Some Strings cause an exception. I obtained a Trie through Tries.forString() and am using the Moby word list names.txt file for keys/values. I put each trim()ed String as the key and value in order. The exception is thrown for the 771st String "Ameli". Stare

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 5
   at java.lang.String.charAt(String.java:658)
   at org.magnos.trie.TrieSequencerCharSequence.hashOf(TrieSequencerCharSequence.java:54)
   at org.magnos.trie.TrieSequencerCharSequence.hashOf(TrieSequencerCharSequence.java:28)
   at org.magnos.trie.Trie.put(Trie.java:165)
Offline Nate

JGO Kernel


Medals: 145
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #6 - Posted 2013-08-21 14:52:48 »

I wrote a fancy trie once. It is readonly though, the encoder is separate. It only supports 75bit text but is ultra efficient disk- and memory-wise. Eg, it stores 293,365 words in 889,614 bytes.

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
71  
72  
73  
74  
75  
76  
77  
78  
79  
80  
81  
82  
83  
84  
85  
86  
87  
88  
89  
90  
91  
92  
93  
94  
95  
96  
97  
public class Trie {
   static public final int START = 0xC0000000;
   static public final int INVALID = 0x800000;
   static public final int LEAF = 0x80000000;

   static private final int END_OF_WORD = 0x80;
   static private final int HAS_MORE = 0x40;
   static private final int IS_LEAF = 0x20;

   byte[] data;

   public Trie () {
   }

   public Trie (FileHandle file) {
      DataInputStream input = new DataInputStream(file.read());
      try {
         data = new byte[input.readInt()];
         int bytes = 0;
         while (true) {
            int count = input.read(data, bytes, data.length - bytes);
            if (count == 0) break;
            bytes += count;
         }
      } catch (IOException ex) {
         throw new RuntimeException("Error reading file: " + file, ex);
      }
   }

   public int get (int c, int offset) {
      if (offset == LEAF) return INVALID;

      byte[] data = this.data;
      c -= 97;

      if (offset == START) {
         offset = c * 3;
         int index = (data[offset++] & 0xFF) << 16;
         index |= (data[offset++] & 0xFF) << 8;
         index |= data[offset] & 0xFF;
         return index;
      }
      if (offset < 0) offset = -offset;

      // Find matching char.
     int indexOffset = 0, charIndex = 0;
      byte b;
      while (true) {
         b = data[offset++];
         if ((b & 31) == c) break;
         if ((b & HAS_MORE) == 0) return INVALID;
         if ((b & IS_LEAF) == 0) indexOffset++; // A leaf has no indexes.
        charIndex++;
      }
      if ((b & IS_LEAF) != 0) return LEAF;

      int nextIndex;
      if (charIndex == 0 && (b & HAS_MORE) == 0) {
         // A single child has no indexes, use the next byte.
        nextIndex = offset;
      } else {
         // Skip remaining chars.
        while ((b & HAS_MORE) != 0)
            b = data[offset++];
         // Skip indexes before the one we want.
        int indexesStart = offset;
         for (int i = 0; i < indexOffset; i++) {
            if ((data[offset++] & 0x80) == 0) continue;
            if ((data[offset++] & 0x80) == 0) continue;
            offset++;
         }
         // Decode the one we want.
        b = data[offset++];
         nextIndex = b & 0x7F;
         if ((b & 0x80) != 0) {
            b = data[offset++];
            nextIndex |= (b & 0x7F) << 7;
            if ((b & 0x80) != 0) nextIndex |= (data[offset] & 0x7F) << 14;
         }
         nextIndex += indexesStart;
      }

      if ((data[nextIndex] & END_OF_WORD) != 0) return -nextIndex;
      return nextIndex;
   }

   public boolean contains (String text) {
      int index = START;
      for (int i = 0, n = text.length(); i < n; i++) {
         // System.out.println(text.charAt(i) + ": "
        // + (get(text.charAt(i), index) == INVALID ? "INVALID" : get(text.charAt(i), index)));
        index = get(text.charAt(i), index);
         if (index == INVALID) return false;
      }
      return index < 0;
   }
}

Offline Several Kilo-Bytes

Senior Member


Medals: 11



« Reply #7 - Posted 2013-08-22 02:23:15 »

Optimized read only structures are neat. Let me ask/confirm a few things:
Isn't your implementation 5 bit? (And uses lowercase letters)
What do you use to encode the trie? Are there multiple encodings?
I wonder how if you did have a 7-bit tree if it would be relatively simple to encode UTF-8.
I don't know if it qualifies as efficient for disk storage. It might be close to zip compression. Can the structure be compressed further?
Offline Nate

JGO Kernel


Medals: 145
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #8 - Posted 2013-08-22 02:54:23 »

Oops, is it 5 bit? Was so long ago that I wrote it and I just pasted. Cheesy

I have a tool somewhere that encodes the trie. IIRC it loads it up into inefficient data structures, does some processing, and then writes out the magic format the code above reads. Just one encoding.

You're right, zipping my 293,365 word trie file goes from 889,614 to 512,862 bytes. The raw data is 2.77MB, down to 797,139 bytes zipped.  Smiley

Offline Several Kilo-Bytes

Senior Member


Medals: 11



« Reply #9 - Posted 2013-08-24 00:04:59 »

@ClickerMonkey: I loaded the text file with the wrong encoding. (It was not UTF.) However loading the entire list changed the String for which the exception was thrown.

@Nate: I think you could use 7 bits per node if you used a terminator character. The structure would be bigger, but if the Strings were Huffman encoded you could fit more characters into fewer nodes. Not that I am suggesting that. Just pointing it out.
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #10 - Posted 2013-08-26 13:42:56 »

@Several Kilo-Bytes

So you are still having an exception? Can you post the full stack trace if so? Also, can you send me that list of words... the words inserted before the culprit are the ones to blame mostly. Also, make sure you are using the most up-to-date JAR, I fixed a NPE about two weeks ago. Thanks!

@Nate

That's a pretty awesome Trie! Is there a way to efficiently query all words that start with "moo" or something? I skimmed through the code, but since you didn't post the encoder it's harder to just stare at and understand how it works =P


Offline Several Kilo-Bytes

Senior Member


Medals: 11



« Reply #11 - Posted 2013-08-26 18:23:34 »

I use the Moby word lists because it's public domain and already in plain text format. My JRE is up to date. I downloaded the source as a zip file. The stack trace is:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 3
   at java.lang.String.charAt
   at org.magnos.trie.TrieSequencerCharSequence.hashOf(TrieSequencerCharSequence.java:54)
   at org.magnos.trie.TrieSequencerCharSequence.hashOf(TrieSequencerCharSequence.java:28)
   at org.magnos.trie.Trie.put(Trie.java:165)
        ...
Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #12 - Posted 2013-08-26 19:50:12 »

Awesome, found the bug and fixed it! It only happened in very specific scenarios... but it's all better now. The builds and code has been updated on GitHub.

Offline Nate

JGO Kernel


Medals: 145
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #13 - Posted 2013-08-26 22:43:41 »

That's a pretty awesome Trie! Is there a way to efficiently query all words that start with "moo" or something? I skimmed through the code, but since you didn't post the encoder it's harder to just stare at and understand how it works =P
You could walk the trie and get out words that start with, but only a contains method is implemented. The encoder is junk, super inefficient in pretty much all ways, but since it runs offline it isn't a huge deal. It'd be better to use something like your trie then write out the readonly encoding. I'd post the encoder but I'm not yet sure if I'll do something with the trie.

Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #14 - Posted 2013-08-27 15:49:39 »

I also just added TrieSet, it implements java.util.Set and is pretty easy to create:

1  
2  
3  
4  
5  
TreeSet<String> set = new TreeSet<String>(Tries.forStrings());

set.add( "meow" );

assert ( set.contains( "meow" ) );

Offline omadawn

Junior Newbie





« Reply #15 - Posted 2013-11-07 09:29:35 »

I can't think of anything I'd need this for in a game but I have a non-game use for it. I was actually looking at trie a few weeks ago wondering if there was a java lib out there that would make it easy to pull off.
* omadawn heads over to github.

TY
Offline ClickerMonkey

JGO Coder


Medals: 20


Game Engineer


« Reply #16 - Posted 2013-11-15 19:40:22 »

No problem!

Pages: [1]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

CogWheelz (18 views)
2014-07-30 21:08:39

Riven (26 views)
2014-07-29 18:09:19

Riven (15 views)
2014-07-29 18:08:52

Dwinin (13 views)
2014-07-29 10:59:34

E.R. Fleming (34 views)
2014-07-29 03:07:13

E.R. Fleming (12 views)
2014-07-29 03:06:25

pw (43 views)
2014-07-24 01:59:36

Riven (44 views)
2014-07-23 21:16:32

Riven (30 views)
2014-07-23 21:07:15

Riven (31 views)
2014-07-23 20:56:16
List of Learning Resources
by SilverTiger
2014-07-31 18:29:50

List of Learning Resources
by SilverTiger
2014-07-31 18:26:06

List of Learning Resources
by SilverTiger
2014-07-31 13:54:12

HotSpot Options
by dleskov
2014-07-08 03:59:08

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:58:24

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:47:22

How do I start Java Game Development?
by ra4king
2014-05-17 11:13:37

HotSpot Options
by Roquen
2014-05-15 09:59:54
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!