Wednesday, July 16, 2003


Kata 6.

Joey did it too, and I just couldn't resist.  Here's Dave's Kata (Dave Thomas is a prof of mine from umpteen years ago).  I was more interested in how fast I could get it to go under Java, since Java doesn't have a great reputation as a "cruncher" language.

I haven't done all the tricks yet, but I think that this is pretty acceptable:

  1. Discovered 2534 anagrams in 416 ms in 45425 words.
  2. Discovered 2534 anagrams in 139 ms in 45425 words.
  3. Discovered 2534 anagrams in 293 ms in 45425 words.
  4. Discovered 2534 anagrams in 170 ms in 45425 words.
  5. Discovered 2534 anagrams in 123 ms in 45425 words.
  6. Discovered 2534 anagrams in 123 ms in 45425 words.
  7. Discovered 2534 anagrams in 123 ms in 45425 words.
  8. Discovered 2534 anagrams in 123 ms in 45425 words.
  9. Discovered 2534 anagrams in 123 ms in 45425 words.
  10. Discovered 2534 anagrams in 123 ms in 45425 words.

You can see the VM heats up after the first run...gets going nicely.  Curiously, Dave says that there are 2530 there.  I guess I found four more.  Or maybe there's a criteria I'm not aware of.  Pretty fast!  Each of those runs also includes reading in the entire file -- I didn't optimize out the read. 

My machine is an AMD 2.5Ghz.  Those times are using JDK 1.4.2.

When you consider that this is also doing full character translation (byte-->char), it's pretty impressive. 

Here's my slightly scrambled answer, in case you want to do it for yourself first.  Just format this in your favorite editor and you can read it.

import java.io.BufferedReader; import java.io.FileNotFoundException; import java.io.FileReader; import java.io.IOException; import java.util.ArrayList; import java.util.HashMap; import java.util.Iterator; final public class Anagramer { public static void main(String[] args) { for (int i = 0; i < 10; i++) { System.gc(); go(); } } public static void go() { try { HashMap letterCounts = new HashMap(); ArrayList anagrams = new ArrayList(); BufferedReader b = new BufferedReader(new FileReader("c:/wordlist.txt")); int wc = 0; long startTime = System.currentTimeMillis(); String word = b.readLine(); while (word != null) { word = word.toLowerCase(); if (word.length() > 0) { wc++; //if (wc % 1000 == 0) { // System.out.println( // "Process " // + wc // + " words; found " // + anagrams.size() // + " so far."); //} LetterCount check = new LetterCount(word); LetterCount lc = (LetterCount) letterCounts.get(check); if (lc == null) { letterCounts.put(check, check); } else { if (lc.addWord(word)) { anagrams.add(lc); } } } word = b.readLine(); } System.out.println("Discovered " + anagrams.size() + " anagrams in " + (System.currentTimeMillis() - startTime) + " ms in " + wc + " words."); b.close(); //for (Iterator iter = anagrams.iterator(); iter.hasNext();) { // LetterCount lc = (LetterCount) iter.next(); // System.out.println(lc.toString()); //} } catch (FileNotFoundException e) { } catch (IOException ie) { } } static private int[] generateCount(String word) { int[] ret = new int[26]; for (int i = 0; i < word.length(); i++) { char c = word.charAt(i); if (c >= 'a' && c <= 'z') { int code = (int) c - (int) 'a'; ret[code]++; } } return ret; } final static class LetterCount { String firstWord; int[] counters; ArrayList words; int hash; public LetterCount(String word) { counters = generateCount(word); firstWord = word; int h = 0; for (int i = 0; i < counters.length; i++) { h = 31*h + (counters[i]+'a'); } hash = h; } public String toString() { String result = firstWord; if (words != null) { for (Iterator iter = words.iterator(); iter.hasNext();) { String word = (String) iter.next(); result = result + " " + word; } } return result; } public boolean addWord(String word) { if (words == null) { words = new ArrayList(); } words.add(word); return words.size() == 1; } public boolean equals(Object obj) { if (((LetterCount)obj).hash == hash) { for (int i = 0; i < counters.length; i++) { if (counters[i] != ((LetterCount) obj).counters[i]) { return false; } } return true; } else { return false; } } public int hashCode() { return hash; } } }
4:59:03 PM    

Just4Log.

I need to agree with the BileBlog.  All this class modification crap is going on because there's some basic stuff that's just missing from the Java VM.

In the case of Just4log, it's all about the fact that the VM doesn't do lazy evaluation of what's passed in.  The VM certainly could do this, although that might break the Java spec.  Is the VM allowed to NOT evaluate an argument to a function if it can prove that it is not needed?  Does the spec require that the side effects occur?

Languages like Haskell explicity make every expression lazy.  The compiler and runtime are free to avoid evaluation of just about anything they want to.  Of course, you can force evaluation.
This is where the key difference between procedures and functions occur.  A procedure creates a side effect, and must always be evaluated.  A function generates a result, but often has side effects too (which it shouldn't).  Languages need to differentiate between the two.


2:03:23 PM    

Overclocking.

There are two conventional barriers that apply to overclocking mainboards.  First, the circuits usually have timing limits that prevent them from being used beyond certain boundaries.  Errors often increase as these boundaries are approached.  Second, there is the heat dissipation problem -- as chipsets are run faster and faster, they lose their ability to discard all the heat being generated.

I was just thinking that what we need is a "continuous" overclocking feature, as a standard part of the motherboard.  Rather than operate at just one speed, set in the BIOS, the OS monitors the usage state.  When the system is idle, processor and bus speed can be "idled down", to save power and reduce heat.  My laptop computer already does this with processor speed; why can't it be applied to the bus speeds as well?

If heat starts to become a problem, the system can automatically reduce the speeds and dissipate heat faster.

The advantage here is that when the system is under heavy load, we can deliver a "burst" of extra power right when it's needed, perhaps speeding up the CPU and bus by a significant percentage.  At the conclusion of the activity or when measured conditions merit, both are retarded.

Most computers spend most of their time idling.  During idling we could potentially dramatically reduce bus speed and processor speed.

Let's hope that a motherboard manufacturer tries this!

The only question is, what parts of the OS need to be aware of a continuously variable timing on the bus?  Most should already be largely immune to changing processor speeds, because laptops have been doing that forever.


1:16:16 PM