|
|
Wednesday, July 16, 2003 |
Kata 6.Joey did it too, and I just couldn't resist. Here's Dave's Kata (Dave Thomas is a prof of mine from umpteen years ago). I was more interested in how fast I could get it to go under Java, since Java doesn't have a great reputation as a "cruncher" language. I haven't done all the tricks yet, but I think that this is pretty acceptable:
You can see the VM heats up after the first run...gets going nicely. Curiously, Dave says that there are 2530 there. I guess I found four more. Or maybe there's a criteria I'm not aware of. Pretty fast! Each of those runs also includes reading in the entire file -- I didn't optimize out the read. My machine is an AMD 2.5Ghz. Those times are using JDK 1.4.2. When you consider that this is also doing full character translation (byte-->char), it's pretty impressive. Here's my slightly scrambled answer, in case you want to do it for yourself first. Just format this in your favorite editor and you can read it. import java.io.BufferedReader; import java.io.FileNotFoundException; import java.io.FileReader; import java.io.IOException; import java.util.ArrayList; import java.util.HashMap; import java.util.Iterator; final public class Anagramer { public static void main(String[] args) { for (int i = 0; i < 10; i++) { System.gc(); go(); } } public static void go() { try { HashMap letterCounts = new HashMap(); ArrayList anagrams = new ArrayList(); BufferedReader b = new BufferedReader(new FileReader("c:/wordlist.txt")); int wc = 0; long startTime = System.currentTimeMillis(); String word = b.readLine(); while (word != null) { word = word.toLowerCase(); if (word.length() > 0) { wc++; //if (wc % 1000 == 0) { // System.out.println( // "Process " // + wc // + " words; found " // + anagrams.size() // + " so far."); //} LetterCount check = new LetterCount(word); LetterCount lc = (LetterCount) letterCounts.get(check); if (lc == null) { letterCounts.put(check, check); } else { if (lc.addWord(word)) { anagrams.add(lc); } } } word = b.readLine(); } System.out.println("Discovered " + anagrams.size() + " anagrams in " + (System.currentTimeMillis() - startTime) + " ms in " + wc + " words."); b.close(); //for (Iterator iter = anagrams.iterator(); iter.hasNext();) { // LetterCount lc = (LetterCount) iter.next(); // System.out.println(lc.toString()); //} } catch (FileNotFoundException e) { } catch (IOException ie) { } } static private int[] generateCount(String word) { int[] ret = new int[26]; for (int i = 0; i < word.length(); i++) { char c = word.charAt(i); if (c >= 'a' && c <= 'z') { int code = (int) c - (int) 'a'; ret[code]++; } } return ret; } final static class LetterCount { String firstWord; int[] counters; ArrayList words; int hash; public LetterCount(String word) { counters = generateCount(word); firstWord = word; int h = 0; for (int i = 0; i < counters.length; i++) { h = 31*h + (counters[i]+'a'); } hash = h; } public String toString() { String result = firstWord; if (words != null) { for (Iterator iter = words.iterator(); iter.hasNext();) { String word = (String) iter.next(); result = result + " " + word; } } return result; } public boolean addWord(String word) { if (words == null) { words = new ArrayList(); } words.add(word); return words.size() == 1; } public boolean equals(Object obj) { if (((LetterCount)obj).hash == hash) { for (int i = 0; i < counters.length; i++) { if (counters[i] != ((LetterCount) obj).counters[i]) { return false; } } return true; } else { return false; } } public int hashCode() { return hash; } } } 4:59:03 PM |
Just4Log.I need to agree with the BileBlog. All this class modification crap is going on because there's some basic stuff that's just missing from the Java VM. In the case of Just4log, it's all about the fact that the VM doesn't do lazy evaluation of what's passed in. The VM certainly could do this, although that might break the Java spec. Is the VM allowed to NOT evaluate an argument to a function if it can prove that it is not needed? Does the spec require that the side effects occur? Languages like Haskell explicity make every expression lazy. The compiler and runtime are free to avoid evaluation of just about anything they want to. Of course, you can force evaluation. 2:03:23 PM |