Types
of collections
The
standard Java 1.0
and 1.1 library comes with a bare minimum set of collection classes, but
they’re probably enough to get by with for many of your programming
projects. (As you’ll see at the end of this chapter, Java 1.2
provides a radically redesigned and filled-out library of collections.)
Vector
The
Vector
is quite simple to use, as you’ve seen so far. Although most of the time
you’ll just use
addElement( )
to insert objects,
elementAt( )
to get them out one at a time, and
elements( )
to get an
Enumeration
to the sequence, there’s also a set of other methods that can be useful.
As usual with the Java libraries, we won’t use or talk about them all
here, but be sure to look them up in the electronic documentation to get a feel
for what they can do.
Crashing
Java
The
Java standard collections contain a
toString( )
method so they can produce a
String
representation of themselves, including the objects they hold. Inside of
Vector,
for example, the
toString( )
steps through the elements of the
Vector
and calls
toString( )
for each one. Suppose you’d like to print out the address of your class.
It seems to make sense to simply refer to
this
(in
particular, C++ programmers are prone to this approach):
//: CrashJava.java
// One way to crash Java
import java.util.*;
public class CrashJava {
public String toString() {
return "CrashJava address: " + this + "\n";
}
public static void main(String[] args) {
Vector v = new Vector();
for(int i = 0; i < 10; i++)
v.addElement(new CrashJava());
System.out.println(v);
}
} ///:~
It
turns out that if you simply create a
CrashJava
object and print it out, you’ll get an endless sequence of exceptions.
However, if you place the
CrashJava
objects in a
Vector
and print out that
Vector
as shown here, it can’t handle it and you don’t even get an
exception; Java just crashes. (But at least it didn’t bring down my
operating system.) This was tested with Java 1.1. What’s
happening is automatic type conversion for
Strings.
When you say:
"CrashJava
address: " + this
The
compiler sees a
String
followed by a ‘
+’
and
something
that’s not a
String,
so it tries to convert
this
to a
String.
It does this conversion by calling
toString( ),
which produces a recursive
call. When this occurs inside a
Vector,
it appears that the stack
overflows without the exception-handling mechanism getting a chance to respond.
If
you really do want to print the address of the object in this case, the
solution is to call the
Object
toString( )
method, which does just that. So instead of saying
this,
you’d say
super.toString( ).
(This only works if you're directly inheriting from
Object
or if none of your parent classes have overridden the
toString( )
method).
BitSet
A
BitSet
is really a
Vector
of bits, and it is used if you want to efficiently store a lot of on-off
information. It’s efficient only from the standpoint of size; if
you’re looking for efficient access, it is slightly slower than using an
array of some native type.
In
addition, the minimum size of the
BitSet
is that of a long: 64 bits. This implies that if you’re storing anything
smaller, like 8 bits, a
BitSet
will be wasteful, so you’re better off creating your own class to hold
your flags.
In
a normal
Vector,
the collection will expand as you add more elements. The
BitSet
does this as well – sort of. That is, sometimes it works and sometimes it
doesn’t, which makes it appear that the Java version 1.0 implementation of
BitSet
is just badly done. (It is fixed in Java 1.1.)
The following example shows how the
BitSet
works
and demonstrates the version 1.0 bug:
//: Bits.java
// Demonstration of BitSet
import java.util.*;
public class Bits {
public static void main(String[] args) {
Random rand = new Random();
// Take the LSB of nextInt():
byte bt = (byte)rand.nextInt();
BitSet bb = new BitSet();
for(int i = 7; i >=0; i--)
if(((1 << i) & bt) != 0)
bb.set(i);
else
bb.clear(i);
System.out.println("byte value: " + bt);
printBitSet(bb);
short st = (short)rand.nextInt();
BitSet bs = new BitSet();
for(int i = 15; i >=0; i--)
if(((1 << i) & st) != 0)
bs.set(i);
else
bs.clear(i);
System.out.println("short value: " + st);
printBitSet(bs);
int it = rand.nextInt();
BitSet bi = new BitSet();
for(int i = 31; i >=0; i--)
if(((1 << i) & it) != 0)
bi.set(i);
else
bi.clear(i);
System.out.println("int value: " + it);
printBitSet(bi);
// Test bitsets >= 64 bits:
BitSet b127 = new BitSet();
b127.set(127);
System.out.println("set bit 127: " + b127);
BitSet b255 = new BitSet(65);
b255.set(255);
System.out.println("set bit 255: " + b255);
BitSet b1023 = new BitSet(512);
// Without the following, an exception is thrown
// in the Java 1.0 implementation of BitSet:
// b1023.set(1023);
b1023.set(1024);
System.out.println("set bit 1023: " + b1023);
}
static void printBitSet(BitSet b) {
System.out.println("bits: " + b);
String bbits = new String();
for(int j = 0; j < b.size() ; j++)
bbits += (b.get(j) ? "1" : "0");
System.out.println("bit pattern: " + bbits);
}
} ///:~
The
random number generator is used to create a random
byte,
short,
and
int,
and each one is transformed into a corresponding bit pattern in a
BitSet.
This works fine because a
BitSet
is 64 bits, so none of these cause it to increase in size. But in Java 1.0,
when the
BitSet
is greater than 64 bits, some strange behavior occurs. If you set a bit
that’s just one greater than the
BitSet’s
currently-allocated storage, it will expand nicely. But if you try to set bits
at higher locations than that without first just touching the boundary,
you’ll get an exception, since the
BitSet
won’t expand properly in Java 1.0. The example shows a
BitSet
of 512 bits being created. The constructor allocates storage for twice that
number of bits. Then if you try to set bit 1024 or greater without first
setting bit 1023, you’ll throw an exception in Java
1.0. Fortunately, this is fixed in Java 1.1,
but avoid using the
BitSet
if you write code for Java 1.0.
Stack
A
Stack
is sometimes referred to as a “last-in, first-out” (LIFO)
collection. That is, whatever you “push” on the
Stack
last is the first item you can “pop” out. Like all of the other
collections in Java, what you push and pop are
Objects,
so you must cast what you pop.
What’s
rather odd is that instead of using a Vector
as a building block to create a
Stack,
Stack
is
inherited from
Vector.
So it has all of the characteristics and behaviors of a
Vector
plus
some extra
Stack
behaviors. It’s difficult to know whether the designers explicitly
decided that this was an especially useful way to do things, or whether it was
just a naïve design.
Here’s
a simple demonstration of
Stack
that reads each line from an array and pushes it as a
String:
//: Stacks.java
// Demonstration of Stack Class
import java.util.*;
public class Stacks {
static String[] months = {
"January", "February", "March", "April",
"May", "June", "July", "August", "September",
"October", "November", "December" };
public static void main(String[] args) {
Stack stk = new Stack();
for(int i = 0; i < months.length; i++)
stk.push(months[i] + " ");
System.out.println("stk = " + stk);
// Treating a stack as a Vector:
stk.addElement("The last line");
System.out.println(
"element 5 = " + stk.elementAt(5));
System.out.println("popping elements:");
while(!stk.empty())
System.out.println(stk.pop());
}
} ///:~
Each
line in the
months
array
is inserted into the
Stack
with
push( ),
and later fetched from the top of the stack with a
pop( ).
To make a point,
Vector
operations
are also performed on the
Stack
object. This is possible because, by virtue of inheritance, a
Stack
is
a
Vector.
Thus, all operations that can be performed on a
Vector
can also be performed on a
Stack,
such as
elementAt( ).
Hashtable
A
Vector
allows you to select from a sequence of objects using a number, so in a sense
it associates numbers to objects. But what if you’d like to select from a
sequence of objects using some other criterion? A
Stack
is an example: its selection criterion is “the last thing pushed on the
stack.” A powerful twist on this idea of “selecting from a
sequence” is alternately termed a map,
a dictionary,
or an associative
array
.
Conceptually, it seems like a vector, but instead of looking up objects using a
number, you look them up using
another
object
!
This is often a key process in a program.
The
concept shows up in Java as the
abstract
class
Dictionary
.
The interface for this class is straightforward:
size( )
tells you how many elements are within,
isEmpty( )
is
true
if there are no elements,
put(Object
key, Object value)
adds a value (the thing you want), and associates it with a key (the thing you
look it up with).
get(Object
key)
produces the value given the corresponding key, and
remove(Object
key)
removes the key-value pair from the list. There are enumerations:
keys( )
produces an
Enumeration
of the keys, and
elements( )
produces an
Enumeration
of all the values. That’s all there is to a
Dictionary. A
Dictionary
isn’t terribly difficult to implement. Here’s a simple approach,
which uses two
Vectors,
one for keys and one for values:
//: AssocArray.java
// Simple version of a Dictionary
import java.util.*;
public class AssocArray extends Dictionary {
private Vector keys = new Vector();
private Vector values = new Vector();
public int size() { return keys.size(); }
public boolean isEmpty() {
return keys.isEmpty();
}
public Object put(Object key, Object value) {
keys.addElement(key);
values.addElement(value);
return key;
}
public Object get(Object key) {
int index = keys.indexOf(key);
// indexOf() Returns -1 if key not found:
if(index == -1) return null;
return values.elementAt(index);
}
public Object remove(Object key) {
int index = keys.indexOf(key);
if(index == -1) return null;
keys.removeElementAt(index);
Object returnval = values.elementAt(index);
values.removeElementAt(index);
return returnval;
}
public Enumeration keys() {
return keys.elements();
}
public Enumeration elements() {
return values.elements();
}
// Test it:
public static void main(String[] args) {
AssocArray aa = new AssocArray();
for(char c = 'a'; c <= 'z'; c++)
aa.put(String.valueOf(c),
String.valueOf(c)
.toUpperCase());
char[] ca = { 'a', 'e', 'i', 'o', 'u' };
for(int i = 0; i < ca.length; i++)
System.out.println("Uppercase: " +
aa.get(String.valueOf(ca[i])));
}
} ///:~
The
first thing you see in the definition of
AssocArray
is that it
extends
Dictionary
.
This means that
AssocArray
is
a type of
Dictionary,
so you can make the same requests of it that you can a
Dictionary.
If you make your own
Dictionary,
as is done here, all you need to do is fill in all the methods that are in
Dictionary.
(And you
must
override all the methods because all of them – with the exception of the
constructor – are abstract.)
The
Vectors
keys
and
values
are linked by a common index number. That is, if you call
put( )
with a key of “roof” and a value of “blue” (assuming
you’re associating the various parts of a house with the colors they are
to be painted) and there are already 100 elements in the
AssocArray,
then “roof” will be the 101 element of
keys
and “blue” will be the 101 element of
values.
And if you look at
get( ),
when you pass “roof” in as the key, it produces the index number
with
keys.indexOf( ),
and then uses that index number to produce the value in the associated
values
vector.
The
test in
main( )
is simple; it’s just a map of lowercase characters to uppercase
characters, which could obviously be done in a number of more efficient ways.
But it shows that
AssocArray
is functional.
The
standard Java library contains only one embodiment of a
Dictionary,
called
Hashtable.[34]
Java’s
Hashtable
has the same basic interface as
AssocArray
(since
they both inherit
Dictionary),
but it
differs
in one distinct way: efficiency. If you look at what must be done for a
get( ),
it seems pretty slow to search through a
Vector
for the key. This is where
Hashtable
speeds things up. Instead of the tedious linear search for the key, it uses a
special value called a
hash
code
.
The
hash code is a way to take some information in the object in question and turn
it into a “relatively unique”
int
for that object. All objects have a hash code, and hashCode( )
is a method in the root class Object.
A
Hashtable
takes
the
hashCode( )
of the object and uses it to quickly hunt for the key. This results in a
dramatic performance improvement.
[35]
The
way
that a
Hashtable
works is beyond the scope of this book
[36]
– all you need to know is that
Hashtable
is a fast
Dictionary,
and that a
Dictionary
is a useful tool.
As
an example of the use of a
Hashtable,
consider a program to check the randomness of Java’s Math.random( )
method. Ideally, it would produce a perfect distribution of random numbers, but
to test this you need to generate a bunch of random numbers and count the ones
that fall in the various ranges. A
Hashtable
is perfect for this, since it associates objects with objects (in this case,
the values produced by
Math.random( )
with the number of times those values appear):
//: Statistics.java
// Simple demonstration of Hashtable
import java.util.*;
class Counter {
int i = 1;
public String toString() {
return Integer.toString(i);
}
}
class Statistics {
public static void main(String[] args) {
Hashtable ht = new Hashtable();
for(int i = 0; i < 10000; i++) {
// Produce a number between 0 and 20:
Integer r =
new Integer((int)(Math.random() * 20));
if(ht.containsKey(r))
((Counter)ht.get(r)).i++;
else
ht.put(r, new Counter());
}
System.out.println(ht);
}
} ///:~
In
main( ),
each time a random number is generated it is wrapped inside an
Integer
object so that handle can be used with the
Hashtable.
(You can’t use a primitive with a collection, only an object handle.) The
containsKey( )
method checks to see if this key is already in the collection. (That is, has
the number been found already?) If so, the get( )
methods gets the associated value for the key, which in this case is a
Counter
object. The value
i
inside the counter is then incremented to indicate that one more of this
particular random number has been found.
If
the key has not been found yet, the method put( )
will place a new key-value pair into the
Hashtable.
Since
Counter
automatically initializes its variable
i
to one when it’s created, it indicates the first occurrence of this
particular random number.
To
display the
Hashtable,
it is simply printed out. The
Hashtable
toString( )
method moves through all the key-value pairs and calls the
toString( )
for each one. The
Integer
toString( )
is pre-defined, and you can see the
toString( )
for
Counter.
The output from one run (with some line breaks added) is:
{19=526, 18=533, 17=460, 16=513, 15=521, 14=495,
13=512, 12=483, 11=488, 10=487, 9=514, 8=523,
7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475,
0=505}
You
might wonder at the necessity of the class
Counter,
which seems like it doesn’t even have the functionality of the wrapper
class
Integer.
Why not use
int
or
Integer?
Well, you can’t use an
int
because all of the collections can hold only
Object
handles. After seeing collections the wrapper classes might begin to make a
little more sense to you, since you can’t put any of the primitive types
in collections. However, the only thing you
can
do with the Java wrappers
is to initialize them to a particular value and read that value. That is,
there’s no way to change a value once a wrapper object has been created.
This makes the
Integer
wrapper immediately useless to solve our problem, so we’re forced to
create a new class that does satisfy the need.
Creating
“key” classes
In
the previous example, a standard library class (
Integer)
was used as a key for the
Hashtable.
It worked fine as a key, because it has all the necessary wiring to make it
work correctly as a key. But a common pitfall occurs when using
Hashtables
when you create your own classes to be used as keys. For example, consider a
weather predicting system that matches
Groundhog
objects to
Prediction
objects. It seems fairly straightforward: you create the two classes and use
Groundhog
as the key and
Prediction
as the value:
//: SpringDetector.java
// Looks plausible, but doesn't work right.
import java.util.*;
class Groundhog {
int ghNumber;
Groundhog(int n) { ghNumber = n; }
}
class Prediction {
boolean shadow = Math.random() > 0.5;
public String toString() {
if(shadow)
return "Six more weeks of Winter!";
else
return "Early Spring!";
}
}
public class SpringDetector {
public static void main(String[] args) {
Hashtable ht = new Hashtable();
for(int i = 0; i < 10; i++)
ht.put(new Groundhog(i), new Prediction());
System.out.println("ht = " + ht + "\n");
System.out.println(
"Looking up prediction for groundhog #3:");
Groundhog gh = new Groundhog(3);
if(ht.containsKey(gh))
System.out.println((Prediction)ht.get(gh));
}
} ///:~
Each
Groundhog
is given an identity number, so you can look up a
Prediction
in the
Hashtable
by saying “Give me the
Prediction
associated
with
Groundhog
number 3.” The
Prediction
class contains a
boolean
that is initialized using
Math.random( ),
and a
toString( )
that interprets the result for you. In
main( ),
a
Hashtable
is filled with
Groundhogs
and their associated
Predictions.
The
Hashtable
is
printed so you can see that it has been filled. Then a
Groundhog
with an identity number of 3 is used to look up the prediction for
Groundhog
#3.
It
seems simple enough, but it doesn’t work. The problem is that
Groundhog
is inherited from the common root class
Object
(which is what happens if you don’t specify a base class, thus all
classes are ultimately inherited from
Object).
It is
Object’s
hashCode( )
method that is used to generate the hash code for each object, and by default
it just uses the address of its object. Thus, the first instance of
Groundhog(3)
does
not
produce a hash code equal to the hash code for the second instance of
Groundhog(3)
that we tried to use as a lookup.
You
might think that all you need to do is write an appropriate override for hashCode( ).
But it still won’t work until you’ve done one more thing: override
the equals( )
that is also part of
Object.
This method is used by the
Hashtable
when trying to determine if your key is equal to any of the keys in the table.
Again, the default
Object.equals( )
simply compares object addresses, so one
Groundhog(3)
is not equal to another
Groundhog(3). Thus,
to use your own classes as keys in a
Hashtable,
you must override both
hashCode( )
and
equals( ),
as shown in the following solution to the problem above:
//: SpringDetector2.java
// If you create a class that's used as a key in
// a Hashtable, you must override hashCode()
// and equals().
import java.util.*;
class Groundhog2 {
int ghNumber;
Groundhog2(int n) { ghNumber = n; }
public int hashCode() { return ghNumber; }
public boolean equals(Object o) {
if ((o != null) && (o instanceof Groundhog2))
return
ghNumber == ((Groundhog2)o).ghNumber;
else return false;
}
}
public class SpringDetector2 {
public static void main(String[] args) {
Hashtable ht = new Hashtable();
for(int i = 0; i < 10; i++)
ht.put(new Groundhog2(i),new Prediction());
System.out.println("ht = " + ht + "\n");
System.out.println(
"Looking up prediction for groundhog #3:");
Groundhog2 gh = new Groundhog2(3);
if(ht.containsKey(gh))
System.out.println((Prediction)ht.get(gh));
}
} ///:~
Note
that this uses the
Prediction
class from the previous example, so
SpringDetector.java
must be compiled first or you’ll get a compile-time error when you try to
compile
SpringDetector2.java
. Groundhog2.hashCode( )
returns the ground hog number as an identifier. (In this example, the
programmer is responsible for ensuring that no two ground hogs exist with the
same ID number.) The
hashCode( )
is
not required to return a unique identifier, but the
equals( )
method must be able to strictly determine whether two objects are equivalent.
The
equals( )
method does two sanity checks: to see if the object is
null,
and if not, whether it is an instance of
Groundhog2
(using the
instanceof
keyword, which is fully explained in Chapter 11). It should be a
Groundhog2
to
even continue executing
equals( ).
The comparison, as you can see, is based on the actual
ghNumbers.
This time, when you run the program, you’ll see it produces the correct
output. (Many of the Java library classes override the
hashcode( )
and
equals( )
methods to be based upon their contents.)
Properties:
a type of Hashtable
In
the first example in this book, a type of
Hashtable
was used called
Properties.
In that example, the lines:
Properties p = System.getProperties();
p.list(System.out);
called
the
static
method getProperties( )
to get a special
Properties
object that described the system characteristics. The method
list( )
is a method of
Properties
that
sends the contents to any stream output that you choose. There’s also a
save( )
method to allow you to write your property list to a file in a way that it can
be retrieved later with the
load( )
method.
Although
the Properties
class is inherited from
Hashtable,
it also
contains
a second
Hashtable
that acts to hold the list of “default” properties. So if a
property isn’t found in the primary list, the defaults will be searched.
The
Properties
class is also available for use in your programs (an example is
ClassScanner.java
in Chapter 17). You can find more complete details in the Java library
documentation.
Enumerators
revisited
We
can now demonstrate the true power of the Enumeration:
the ability to separate the operation of traversing a sequence from the
underlying structure of that sequence. In the following example, the class
PrintData
uses an
Enumeration
to move through a sequence and call the toString( )
method for every object. Two different types of collections are created, a Vector
and a Hashtable,
and they are each filled with, respectively,
Mouse
and
Hamster
objects.
(These classes are defined earlier in the chapter; notice you must have compiled
HamsterMaze.java
and
WorksAnyway.java
for the following program to compile.) Because an
Enumeration
hides
the structure of the underlying collection,
PrintData
doesn’t know or care what kind of collection the
Enumeration
comes
from:
//: Enumerators2.java
// Revisiting Enumerations
import java.util.*;
class PrintData {
static void print(Enumeration e) {
while(e.hasMoreElements())
System.out.println(
e.nextElement().toString());
}
}
class Enumerators2 {
public static void main(String[] args) {
Vector v = new Vector();
for(int i = 0; i < 5; i++)
v.addElement(new Mouse(i));
Hashtable h = new Hashtable();
for(int i = 0; i < 5; i++)
h.put(new Integer(i), new Hamster(i));
System.out.println("Vector");
PrintData.print(v.elements());
System.out.println("Hashtable");
PrintData.print(h.elements());
}
} ///:~
Note
that
PrintData.print( )
takes advantage of the fact that the objects in these collections are of class
Object
so it can call
toString( ).
It’s more likely that in your problem, you must make the assumption that
your
Enumeration
is walking through a collection of some specific type. For example, you might
assume that everything in the collection is a
Shape
with a
draw( )
method. Then you must downcast from the
Object
that
Enumeration.nextElement()
returns to produce a
Shape.
[34]
If you plan to use RMI (described in Chapter 15), you should be aware that
there’s a problem when putting remote objects into a
Hashtable.
(See
Core
Java
,
by Cornell & Horstmann, Prentice-Hall 1997).
[35]
If these speedups still don’t meet your performance needs, you can
further accelerate table lookup by writing your own hash table routine. This
avoids delays due to casting to and from
Objects
and synchronization built into the Java Class Library hash table routine. To
reach even higher levels of performance, speed enthusiasts can use Donald
Knuth’s
The
Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition
to replace overflow bucket lists with arrays that have two additional benefits:
they can be optimized for disk storage characteristics and they can save most
of the time of creating and garbage collecting individual records
.
[36]
The best reference I know of is
Practical
Algorithms for Programmers
,
by Andrew Binstock and John Rex, Addison-Wesley 1995.