Datagrams
The examples you’ve seen so far use the Transmission Control Protocol (TCP, also
known as stream-based sockets), which is designed for reliability: either the
data gets there or you are told that the connection failed. It retransmits lost
data, the network underneath it can route around a router that goes down, and
bytes are delivered in the order they are sent. All this control and reliability
comes at a cost: TCP has a high overhead.
There’s
a second protocol, called User
Datagram Protocol
(UDP), which doesn’t guarantee that the packets will be delivered and
doesn’t guarantee that they will arrive in the order they were sent.
It’s called an “unreliable
protocol” (TCP is a “reliable
protocol”), which sounds bad, but because it’s much faster it can
be useful. There are some applications, such as an audio signal, in which it
isn’t so critical if a few packets are dropped here or there but speed is
vital. Or consider a time-of-day server, where it really doesn’t matter
if one of the messages is lost. Also, some applications might be able to fire
off a UDP message to a server and can then assume, if there is no response in a
reasonable period of time, that the message was lost.
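The last point, treating silence as loss, can be sketched with a receive timeout. This is only an illustration under stated assumptions: the class name TimeoutProbe, the helper sendAndAwaitReply( ), and the 200 ms limit are invented for the sketch, and it relies on DatagramSocket.setSoTimeout( ) making receive( ) throw a SocketTimeoutException when nothing arrives in time.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical helper (not from the original text): sends one datagram
// and waits a bounded time for any reply.
class TimeoutProbe {
  // Returns true if a reply arrived within timeoutMillis, false if the
  // wait timed out and the message is presumed lost.
  static boolean sendAndAwaitReply(DatagramSocket socket, byte[] data,
      InetAddress dest, int destPort, int timeoutMillis)
      throws IOException {
    socket.setSoTimeout(timeoutMillis); // receive() gives up after this long
    socket.send(new DatagramPacket(data, data.length, dest, destPort));
    DatagramPacket reply = new DatagramPacket(new byte[1000], 1000);
    try {
      socket.receive(reply);
      return true;
    } catch (SocketTimeoutException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    InetAddress local = InetAddress.getLoopbackAddress();
    // A peer that is bound but never answers, standing in for a lost reply.
    try (DatagramSocket silentPeer = new DatagramSocket(0, local);
         DatagramSocket client = new DatagramSocket()) {
      boolean answered = sendAndAwaitReply(client, "hello".getBytes(),
        local, silentPeer.getLocalPort(), 200);
      System.out.println(
        answered ? "reply received" : "no reply; assume lost");
    }
  }
}
```

What counts as a “reasonable period of time” depends on the application; the sender can retry or give up once the timeout fires.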
The
support for datagrams in Java has the same feel as its support for TCP sockets,
but there are significant differences. With datagrams, you put a DatagramSocket
on both the client and server, but there is no analogy to the
ServerSocket
that waits around for a connection. That’s because there is no
“connection,” but instead a datagram just shows up. Another
fundamental difference is that with TCP sockets, once you’ve made the
connection you don’t need to worry about who’s talking to whom
anymore; you just send the data back and forth through conventional streams.
However, with datagrams, the datagram packet must know where it came from and
where it’s supposed to go. That means you must know these things for each
datagram packet that you load up and ship off.
A DatagramSocket sends and receives the packets, and the DatagramPacket
contains the information. When you’re receiving a datagram, you need only
provide a buffer in which the data will be placed; the information about the
Internet address and port number where the information came from will be
automatically initialized when the packet arrives through the DatagramSocket.
So the constructor for a DatagramPacket to receive datagrams is:
DatagramPacket(buf, buf.length)
in which buf is an array of byte.
Since buf is an array, you might wonder why the constructor couldn’t figure out
the length of the array on its own. I wondered this, and can only guess that
it’s a throwback to C-style programming, in which of course arrays can’t tell
you how big they are. (The argument does let you pass a length smaller than the
array, which limits how many bytes a received datagram can occupy.)
You
can reuse a receiving datagram; you don’t have to make a new one each
time. Every time you reuse it, the data in the buffer is overwritten.
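Reuse of a receiving packet can be tried over the loopback interface. The following is a sketch, not part of the original example: the class ReusePacket and its roundTrip( ) method are invented names, and the 2-second timeout is only there so the program cannot hang if a datagram goes missing. It shows why getLength( ) matters once the buffer has been overwritten.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Illustrative only: class and method names are not from the original text.
class ReusePacket {
  // Sends each message over loopback and receives it into ONE reused
  // packet. Returns what was decoded after each receive.
  static String[] roundTrip(String... messages) throws IOException {
    InetAddress local = InetAddress.getLoopbackAddress();
    byte[] buf = new byte[1000];
    DatagramPacket dp = new DatagramPacket(buf, buf.length); // reused
    String[] decoded = new String[messages.length];
    try (DatagramSocket receiver = new DatagramSocket(0, local);
         DatagramSocket sender = new DatagramSocket()) {
      receiver.setSoTimeout(2000); // don't block forever on a lost packet
      for (int i = 0; i < messages.length; i++) {
        byte[] out = messages[i].getBytes(StandardCharsets.US_ASCII);
        sender.send(new DatagramPacket(out, out.length,
          local, receiver.getLocalPort()));
        receiver.receive(dp);
        // Only the first getLength() bytes belong to this datagram;
        // bytes after that may be left over from an earlier, longer one.
        decoded[i] = new String(buf, 0, dp.getLength(),
          StandardCharsets.US_ASCII);
      }
    }
    return decoded;
  }

  public static void main(String[] args) throws IOException {
    for (String s : roundTrip("first message", "second"))
      System.out.println(s);
  }
}
```

After the second receive the buffer still holds the tail of the first message beyond the sixth byte, which is why the decoding step uses getLength( ) rather than the whole array.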
The
maximum size of the buffer is restricted only by the allowable datagram packet
size, which limits it to slightly less than 64Kbytes. However, in many
applications you’ll want it to be much smaller, certainly when
you’re sending data. Your chosen packet size depends on what you need for
your particular application.
When you send a datagram, the DatagramPacket must contain not only the data,
but also the Internet address and port where it will be sent. So the
constructor for an outgoing DatagramPacket is:
DatagramPacket(buf, length, inetAddress, port)
This time, buf (which is a byte array) already contains the data that you want
to send out. The length might be the length of buf, but it can also be shorter,
indicating that you want to send only that many bytes. The other two arguments
are the Internet address where the packet is going and the destination port
within that machine. [64]
You
might think that the two constructors create two different objects: one for
receiving datagrams and one for sending them. Good OO design would suggest that
these should be two different classes, rather than one class with different
behavior depending on how you construct the object. This is probably true, but
fortunately the use of
DatagramPackets
is simple enough that you’re not tripped up by the problem, as you can
see in the following example. This example is similar to the
MultiJabberServer
and
MultiJabberClient
example for TCP sockets. Multiple clients will send datagrams to a server,
which will echo them back to the same client that sent the message.
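Before the full example, the outgoing form of the constructor can be tried in isolation. This is a minimal sketch: the class OutgoingPacketDemo and the helper firstBytes( ) are invented for illustration, and nothing is actually sent. It only shows that the length argument, not the array size, decides how many bytes the packet would carry.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;

// Illustrative only: the names here are not from the original text.
class OutgoingPacketDemo {
  // Builds a packet that will carry only the first `count` bytes of buf.
  static DatagramPacket firstBytes(byte[] buf, int count,
      InetAddress dest, int destPort) {
    return new DatagramPacket(buf, count, dest, destPort);
  }

  public static void main(String[] args) {
    byte[] buf = "hello, world".getBytes();
    DatagramPacket p = firstBytes(buf, 5,
      InetAddress.getLoopbackAddress(), 1711);
    // Bytes that would be sent:
    System.out.println("length = " + p.getLength());
    // The whole backing array is still attached to the packet:
    System.out.println("array  = " + p.getData().length);
  }
}
```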
To
simplify the creation of a
DatagramPacket
from a
String
and vice-versa, the example begins with a utility class,
Dgram,
to do the work for you:
//: Dgram.java
// A utility class to convert back and forth
// between Strings and DatagramPackets.
import java.net.*;
public class Dgram {
  public static DatagramPacket toDatagram(
    String s, InetAddress destIA, int destPort) {
    // Deprecated in Java 1.1, but it works
    // (it keeps only the low-order byte of each char):
    byte[] buf = new byte[s.length() + 1];
    s.getBytes(0, s.length(), buf, 0);
    // The correct Java 1.1 approach, but it was
    // broken when this was written (it truncated
    // the String):
    // byte[] buf = s.getBytes();
    return new DatagramPacket(buf, buf.length,
      destIA, destPort);
  }
  public static String toString(DatagramPacket p){
    // The Java 1.0 approach:
    // return new String(p.getData(),
    //  0, 0, p.getLength());
    // The Java 1.1 approach:
    return
      new String(p.getData(), 0, p.getLength());
  }
} ///:~
The first method of Dgram takes a String, an InetAddress, and a port number and
builds a DatagramPacket by copying the contents of the String into a byte
buffer and passing the buffer into the DatagramPacket constructor. Notice the
“+1” in the buffer allocation: this was necessary to prevent truncation, and it
means every datagram carries one extra trailing zero byte. The four-argument
getBytes( ) method of String is a special operation that copies the low-order
byte of each char of a String into a byte buffer, discarding the high-order
byte. This method is now deprecated; Java 1.1 has a “better” way to do this,
the no-argument getBytes( ), but it’s commented out here because it truncated
the String when this was written. So you’ll get a deprecation message when you
compile it under Java 1.1, but the behavior will be correct. (The truncation
was a bug in the Java 1.1 release available at the time; in current releases
the no-argument getBytes( ) returns all of the String’s characters, encoded in
the platform’s default character set.)
The Dgram.toString( ) method shows both the Java 1.0 approach and the Java 1.1
approach (which is different because there’s a new kind of String constructor).
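On a current Java release the same two conversions can be written without the deprecated call and without the extra byte. The following is a sketch of one alternative, not the book’s code: the class name Dgram8 is hypothetical, and the choice of UTF-8 is an assumption (both ends simply have to agree on a character set).

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// A sketch of an alternative to Dgram; the name Dgram8 is hypothetical.
class Dgram8 {
  static DatagramPacket toDatagram(
      String s, InetAddress destIA, int destPort) {
    // Encode with an explicit charset so both machines agree:
    byte[] buf = s.getBytes(StandardCharsets.UTF_8);
    return new DatagramPacket(buf, buf.length, destIA, destPort);
  }

  static String toString(DatagramPacket p) {
    // Decode only the bytes that belong to this datagram:
    return new String(p.getData(), p.getOffset(), p.getLength(),
      StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    DatagramPacket p = toDatagram("Client #0, message #0",
      InetAddress.getLoopbackAddress(), 1711);
    System.out.println(toString(p));
  }
}
```

Unlike the deprecated getBytes( ) overload, this keeps characters outside the 8-bit range intact, at the cost of some characters occupying more than one byte in the datagram.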
Here
is the server for the datagram demonstration:
//: ChatterServer.java
// A server that echoes datagrams
import java.net.*;
import java.io.*;
import java.util.*;
public class ChatterServer {
  static final int INPORT = 1711;
  private byte[] buf = new byte[1000];
  private DatagramPacket dp =
    new DatagramPacket(buf, buf.length);
  // Can listen & send on the same socket:
  private DatagramSocket socket;
  public ChatterServer() {
    try {
      socket = new DatagramSocket(INPORT);
      System.out.println("Server started");
      while(true) {
        // Block until a datagram appears:
        socket.receive(dp);
        String rcvd = Dgram.toString(dp) +
          ", from address: " + dp.getAddress() +
          ", port: " + dp.getPort();
        System.out.println(rcvd);
        String echoString =
          "Echoed: " + rcvd;
        // Extract the address and port from the
        // received datagram to find out where to
        // send it back:
        DatagramPacket echo =
          Dgram.toDatagram(echoString,
            dp.getAddress(), dp.getPort());
        socket.send(echo);
      }
    } catch(SocketException e) {
      System.err.println("Can't open socket");
      System.exit(1);
    } catch(IOException e) {
      System.err.println("Communication error");
      e.printStackTrace();
    }
  }
  public static void main(String[] args) {
    new ChatterServer();
  }
} ///:~
The ChatterServer contains a single DatagramSocket for receiving messages,
instead of creating one each time you’re ready to receive a new message. The
single DatagramSocket can be used repeatedly. This DatagramSocket has a port
number because this is the server and the client must have an exact address
where it wants to send the datagram. It is given a port number but not an
Internet address because it resides on “this” machine: with no address
specified, the socket accepts datagrams sent to that port on any of the
machine’s addresses (in this demonstration the clients reach it through
localhost).
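The difference between giving a DatagramSocket only a port and giving it a port plus an address can be checked directly. This is a small sketch under stated assumptions: the class BindDemo and its helper are invented names, and port 0 is used so the system picks any free port rather than the server’s 1711.

```java
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketException;

// Illustrative only: the names here are not from the original text.
class BindDemo {
  // True if the socket is bound to the wildcard ("any local") address.
  static boolean boundToWildcard(DatagramSocket s) {
    return s.getLocalAddress().isAnyLocalAddress();
  }

  public static void main(String[] args) throws SocketException {
    try (DatagramSocket portOnly = new DatagramSocket(0);
         DatagramSocket loopbackOnly = new DatagramSocket(
           0, InetAddress.getLoopbackAddress())) {
      System.out.println("port only:     wildcard = "
        + boundToWildcard(portOnly));
      System.out.println("port+loopback: wildcard = "
        + boundToWildcard(loopbackOnly));
    }
  }
}
```

A socket bound only to the loopback address would not receive datagrams arriving from other machines, which is one reason a server is usually given just a port.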
In the infinite while loop, the socket is told to receive( ), whereupon it
blocks until a datagram shows up, and then sticks it into our designated
receiver, the DatagramPacket dp. The packet is converted to a String along with
information about the Internet address and port where the packet came from.
This information is displayed, and then an extra string is added to indicate
that it is being echoed back from the server.
Now
there’s a bit of a quandary. As you will see, there are potentially many
different Internet addresses and port numbers that the messages might come from
– that is, the clients can reside on any machine. (In this demonstration
they all reside on the
localhost,
but the port number for each client is different.) To send a message back to
the client that originated it, you need to know that client’s Internet
address and port number. Fortunately, this information is conveniently packaged
inside the DatagramPacket that carried the message, so all you have to do is
pull it out using getAddress( ) and getPort( ), which are used to build the
DatagramPacket echo that is sent back through the same socket that’s doing the
receiving. In addition, when the socket sends the datagram, the Internet
address and port of this machine travel with it, so that when the client
receives the message, it can use getAddress( ) and getPort( ) to find out where
the datagram came from. The one case in which getAddress( ) and getPort( )
don’t tell you where a datagram came from is a datagram that you created in
order to send: there they return the destination address and port that you
gave to the constructor. This is an essential part of datagrams: you don’t need
to keep track of where a message came from because it’s always stored inside
the received datagram. In fact, the most reliable way to program is if you
don’t try to keep track, but instead always extract the address and port from
the datagram in question (as is done here).
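What getAddress( ) and getPort( ) report can be checked with a small loopback experiment. This is a sketch, not part of the original example: the class PacketAddressDemo and its ports( ) method are invented names. Per the DatagramPacket documentation, a packet built for sending reports its destination, while a received packet reports its sender.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Illustrative only: the names here are not from the original text.
class PacketAddressDemo {
  // Returns four ports, in this order:
  //   [0] getPort() of the outgoing packet
  //   [1] the receiver's local port (the destination)
  //   [2] getPort() of the received packet
  //   [3] the sender's local port (the origin)
  static int[] ports() throws IOException {
    InetAddress local = InetAddress.getLoopbackAddress();
    try (DatagramSocket receiver = new DatagramSocket(0, local);
         DatagramSocket sender = new DatagramSocket(0, local)) {
      receiver.setSoTimeout(2000); // don't block forever on a lost packet
      byte[] data = {42};
      DatagramPacket out = new DatagramPacket(
        data, data.length, local, receiver.getLocalPort());
      int outPort = out.getPort(); // destination, set by the constructor
      sender.send(out);
      DatagramPacket in = new DatagramPacket(new byte[16], 16);
      receiver.receive(in);
      return new int[] { outPort, receiver.getLocalPort(),
        in.getPort(), sender.getLocalPort() };
    }
  }

  public static void main(String[] args) throws IOException {
    int[] p = ports();
    System.out.println("outgoing packet names the destination: "
      + (p[0] == p[1]));
    System.out.println("received packet names the sender:      "
      + (p[2] == p[3]));
  }
}
```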
To
test this server, here’s a program that makes a number of clients, all of
which fire datagram packets to the server and wait for the server to echo them
back.
//: ChatterClient.java
// Tests the ChatterServer by starting multiple
// clients, each of which sends datagrams.
import java.lang.Thread;
import java.net.*;
import java.io.*;
public class ChatterClient extends Thread {
  // Can listen & send on the same socket:
  private DatagramSocket s;
  private InetAddress hostAddress;
  private byte[] buf = new byte[1000];
  private DatagramPacket dp =
    new DatagramPacket(buf, buf.length);
  private int id;
  public ChatterClient(int identifier) {
    id = identifier;
    try {
      // Auto-assign port number:
      s = new DatagramSocket();
      hostAddress =
        InetAddress.getByName("localhost");
    } catch(UnknownHostException e) {
      System.err.println("Cannot find host");
      System.exit(1);
    } catch(SocketException e) {
      System.err.println("Can't open socket");
      e.printStackTrace();
      System.exit(1);
    }
    System.out.println("ChatterClient starting");
  }
  public void run() {
    try {
      for(int i = 0; i < 25; i++) {
        String outMessage = "Client #" +
          id + ", message #" + i;
        // Make and send a datagram:
        s.send(Dgram.toDatagram(outMessage,
          hostAddress,
          ChatterServer.INPORT));
        // Block until it echoes back:
        s.receive(dp);
        // Print out the echoed contents:
        String rcvd = "Client #" + id +
          ", rcvd from " +
          dp.getAddress() + ", " +
          dp.getPort() + ": " +
          Dgram.toString(dp);
        System.out.println(rcvd);
      }
    } catch(IOException e) {
      e.printStackTrace();
      System.exit(1);
    }
  }
  public static void main(String[] args) {
    for(int i = 0; i < 10; i++)
      new ChatterClient(i).start();
  }
} ///:~
ChatterClient is created as a Thread so that multiple clients can be made to
bother the server. Here you can see that the receiving DatagramPacket looks
just like the one used for ChatterServer. In the constructor, the
DatagramSocket is created with no arguments since it doesn’t need to advertise
itself as being at a particular port number. The socket belongs to “this
machine” (for the example, it talks to the server through localhost) and the
port number will be automatically assigned, as you will see from the output.
This DatagramSocket, like the one for the server, will be used both for sending
and receiving.
The
hostAddress
is the Internet address of the host machine you want to talk to. The one part
of the program in which you must know an exact Internet address and port number
is the part in which you make the outgoing
DatagramPacket.
As is always the case, the host must be at a known address and port number so
that clients can originate conversations with the host.
Each thread is given a unique identification number (although the port number
automatically assigned to the thread’s socket would also provide a unique
identifier). In run( ), a message String is created that contains the thread’s
identification number and the message number this thread is currently sending.
This String is used to create a datagram that is sent to the host at its
address; the port number is taken directly from a constant in ChatterServer.
Once the message is sent, receive( ) blocks until the server replies with an
echoing message. All of the information that’s shipped around with the message
allows you to see that what comes back to this particular thread is derived
from the message that originated from it. In this example, even though UDP is
an “unreliable” protocol, you’ll see that all of the datagrams get where
they’re supposed to. (This will usually be true for localhost and lightly
loaded LAN situations, but you might begin to see some failures for non-local
connections.)
When
you run this program, you’ll see that each of the threads finishes, which
means that each of the datagram packets sent to the server is turned around and
echoed to the correct recipient; otherwise one or more threads would hang,
blocking until their input shows up.
You
might think that the only right way to, for example, transfer a file from one
machine to another is through TCP sockets, since they’re
“reliable.” However, because of the speed of datagrams they can
actually be a better solution. You simply break the file up into packets and
number each packet. The receiving machine takes the packets and reassembles
them; a “header packet” tells the machine how many to expect and
any other important information. If a packet is lost, the receiving machine
sends a datagram back telling the sender to retransmit.
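The numbering and reassembly steps can be sketched in memory, without any sockets. Everything here is an assumption made for illustration: the class Chunker, the 4-byte sequence number placed in front of each chunk, and the idea that the expected count and payload size come from the header packet. It is one possible layout, not a standard protocol.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: names and packet layout are not from the original.
class Chunker {
  static final int HEADER = 4; // bytes used for the sequence number

  // Breaks the file into numbered chunks of at most payloadSize bytes.
  static List<byte[]> split(byte[] file, int payloadSize) {
    List<byte[]> chunks = new ArrayList<>();
    for (int seq = 0, off = 0; off < file.length;
         seq++, off += payloadSize) {
      int n = Math.min(payloadSize, file.length - off);
      ByteBuffer bb = ByteBuffer.allocate(HEADER + n);
      bb.putInt(seq).put(file, off, n);
      chunks.add(bb.array());
    }
    return chunks;
  }

  // Copies each chunk into place by its sequence number and returns the
  // sequence numbers that never arrived (candidates for retransmission).
  static List<Integer> reassemble(List<byte[]> received,
      int expectedCount, int payloadSize, byte[] out) {
    boolean[] seen = new boolean[expectedCount];
    for (byte[] chunk : received) {
      ByteBuffer bb = ByteBuffer.wrap(chunk);
      int seq = bb.getInt();
      bb.get(out, seq * payloadSize, bb.remaining());
      seen[seq] = true;
    }
    List<Integer> missing = new ArrayList<>();
    for (int i = 0; i < expectedCount; i++)
      if (!seen[i]) missing.add(i);
    return missing;
  }

  public static void main(String[] args) {
    byte[] file = "The quick brown fox".getBytes();
    List<byte[]> chunks = split(file, 8);
    int expected = chunks.size();    // would travel in the header packet
    Collections.shuffle(chunks);     // datagrams may arrive in any order
    byte[] lost = chunks.remove(0);  // pretend one datagram was dropped
    byte[] out = new byte[file.length];
    System.out.println("retransmit: "
      + reassemble(chunks, expected, 8, out));
    chunks.add(lost);                // the retransmission arrives
    reassemble(chunks, expected, 8, out);
    System.out.println(new String(out));
  }
}
```

A real transfer would also need timeouts on both sides, since the retransmission request itself is a datagram and can be lost.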
[64]
TCP and UDP ports are considered separate: the same port number can be in use
by both protocols at once. That is, you can simultaneously run a TCP and UDP
server on port 8080 without interference.