Java-Gaming.org Hi !
Featured games (83)
games approved by the League of Dukes
Games in Showcase (511)
Games in Android Showcase (119)
games submitted by our members
Games in WIP (577)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1]
  ignore  |  Print  
  anything better to use than a StringBuffer  (Read 1677 times)
0 Members and 1 Guest are viewing this topic.
Offline Kommi

Junior Duke




All opinions will be lined up and shot!


« Posted 2006-01-06 16:55:10 »

Hey I have a small app that constantly searchers through a dataBuffer looking for words and sentenses. One operation would be to get the location of the 20th occurance of the word 'the'. I was wondering if there is a better structure to use rather than a stringBuffer. Maybe some kind of Hash structure? Since the buffer is large, the indexOf() operations takes time to execute, and I am trying to improve the performance. Anyone have any suggestions?

Kommi
Offline swpalmer

JGO Coder


Exp: 12 years


Where's the Kaboom?


« Reply #1 - Posted 2006-01-06 18:32:04 »

StringBuilder in Java 5 is an unsynchronized version.

If you are searching the same string multiple times there may be some optimizations to do.  How large is the databuffer?  You might consider getting the array of chars and writing your own search method.

When searching for long strings, like sentences in your case, there are some good optimization techniques to speed up the linear search.

Eg. lets say the data buffer contains the string:

"The quick brown fox jumps over the lazy dog."

and you are searching for the string "over"

You can do more than just a linear search for the 'o' and then check for the 'ver'...
you can check for a match on the LAST character and based on the character that you find in the buffer at the position where you will compare with the last character of your search string, you can decide to advance your search window by more than one for the next compare.

e.g.
"The "
"over"

the 'r' does not match the space (' '), but not only that, you know that there is no space in your search string so you canjump your search window ahead by 4 characters right away.
"quic"
"over"

There is no 'c' in "over" so you can jump ahead by 4 again...

"k bro"
"over"

This time you compare the 'r' from "over" with the 'o' from "brown"... there IS an 'o' in over, but it is at the first position only.. so you can advance your search window by 3.

"own "
"over"

again there is no space in "over"... advance by 4... etc.

When you find a character that is in your search phrase more than once, you advance your search window to align that character from the data buffer with the LAST occurance of the character in your search phrase.

Offline Jeff

JGO Coder




Got any cats?


« Reply #2 - Posted 2006-01-06 19:13:04 »

As far as data structures, there are probably much better approaches but to understand which to use you need to analyze the expected behavior of your program.

If the data is large but not enormous, and you are going to be doing a grat many such searches,  you might consider scanning the entire text once at building a hash table based index to all the words in it.

If the data is going to be too big, ofcourse, this will run into memory problems and you might need to do more sophisticated disk-based indexing...  Simialrly if your only going to do one or two searchs then the cost of scanning the data and building the index might be more then the cost of your brute force search.

JK

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline Kommi

Junior Duke




All opinions will be lined up and shot!


« Reply #3 - Posted 2006-01-09 16:42:43 »

Hum Jeff I am interested in trying the approach you are talking about. Lets say I store the following paragraph into a hash table (paragraph out of random cnn article)

"Iran's nuclear programs are a source of contention with the West. Iran's hard-line conservative government insists the programs are aimed at peaceful purposes, and that it has the right to restart nuclear facilities and enrich uranium for the production of nuclear energy."

How exaclty should I store it in a hash table? I am guessing the id of each word is just the word number in the paragraph.

How would I search for a sentence in a hash table? I am guessing I get all the id's for all the words and check to see if they are next to each other? Is that really going to be faster than an indexOf operation?

Kommi
Offline Jeff

JGO Coder




Got any cats?


« Reply #4 - Posted 2006-01-09 18:07:32 »

Hmm. i missed that you are  lookign for more then one word at a time.

What I think I would do as a first cut is this:

Create a hash table of the form Map<String,List<Long>>.
The key is a word, the entry is a list of all the starting locations of that word in your text data.
I'd use StringTokenizer to scan through the String finding the words and building the table.

When you want to find a sentance, I'd use the table to find the first word  in the sentance, then do a comparison between the string you are trying to match and the substring of the entire text that begins at that location.




Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Kommi

Junior Duke




All opinions will be lined up and shot!


« Reply #5 - Posted 2006-01-09 18:42:58 »

ah ok, so I have a hash map with all of the locations of all the words. That way I dont have to do an indexOf operation. Then just use the StringBuffer object to compare against my seach string, checking at each location of the first word. Got it, thanks.

Kommi
Pages: [1]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

Longarmx (50 views)
2014-10-16 18:59:02

Norakomi (40 views)
2014-10-16 06:22:06

Norakomi (31 views)
2014-10-16 06:20:20

lcass (36 views)
2014-10-15 07:18:58

TehJavaDev (67 views)
2014-10-13 15:39:48

TehJavaDev (65 views)
2014-10-13 15:35:47

TehJavaDev (57 views)
2014-10-13 15:32:37

BurntPizza (72 views)
2014-10-11 14:24:42

BurntPizza (44 views)
2014-10-11 14:10:45

BurntPizza (84 views)
2014-10-11 13:30:10
Understanding relations between setOrigin, setScale and setPosition in libGdx
by mbabuskov
2014-10-09 13:35:00

Definite guide to supporting multiple device resolutions on Android (2014)
by mbabuskov
2014-10-02 13:36:02

List of Learning Resources
by Longor1996
2014-08-16 01:40:00

List of Learning Resources
by SilverTiger
2014-08-05 10:33:27

Resources for WIP games
by CogWheelz
2014-08-01 07:20:17

Resources for WIP games
by CogWheelz
2014-08-01 07:19:50

List of Learning Resources
by SilverTiger
2014-07-31 07:29:50

List of Learning Resources
by SilverTiger
2014-07-31 07:26:06
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!