Overview of JXTA
August 21, 2002
Messages
Messaging in JXTA is done in two different ways. First is the standard way
that would be expected with XML. The messages are packets that contain a payload
of data formatted to follow XML standards.
The second type of message is a very economical binary message. Despite the
desire to use XML for all JXTA messages, the reality is that there are many
messages sent and received. Sending XML messages in large volumes is
very inefficient. Also, because messages are usually sent from application to
application, it is simple to standardize on the contents of the message. The
remainder of the protocols still use XML.
The use of binary messages in an XML protocol may seem counterintuitive. The
truth is that there are more advantages than just compactness. The first is that
data can be compressed using standard techniques. Compressing data, such as
text, can greatly reduce the time needed to transmit it.
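As a rough illustration of the compression point, the following sketch compresses a repetitive text payload with GZIP from the standard Java library. The class name `MessageCompression` and the sample payload are hypothetical and are not part of JXTA; the sketch only shows how a standard technique could shrink a message body before it is sent in binary form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not a JXTA class.
class MessageCompression {

    /** Builds a repetitive text payload, standing in for a verbose message body. */
    static byte[] samplePayload(int lines) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            text.append("<Peer><Name>example</Name></Peer>\n");
        }
        return text.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses a payload with GZIP before it is placed in a binary message. */
    static byte[] compress(byte[] payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the original bytes on the receiving side. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = samplePayload(200);
        byte[] packed = compress(original);
        System.out.println(original.length + " bytes -> " + packed.length + " bytes");
    }
}
```

The saving depends heavily on the data: repetitive text like this compresses well, while data that is already compressed gains little.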
Another reason for binary data is that many messages are already binary. For
example, a document-sharing program will most likely share binary documents. If
you were transferring messages via XML, the data would need to be converted to
XML.
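To make the conversion cost concrete, here is a minimal sketch that assumes Base64 as the way to carry binary data inside an XML element. The source does not say which encoding would be used, so Base64, the `Payload` element, and the `BinaryInXml` class are all assumptions chosen for illustration.

```java
import java.util.Base64;

// Hypothetical helper, not a JXTA class. Base64 is an assumed encoding.
class BinaryInXml {

    /** Wraps raw bytes in a made-up XML element by Base64-encoding them. */
    static String wrapInXml(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<Payload encoding=\"base64\">" + encoded + "</Payload>";
    }

    /** Extracts and decodes the bytes from the element produced above. */
    static byte[] unwrapFromXml(String element) {
        int start = element.indexOf('>') + 1;    // end of the opening tag
        int end = element.lastIndexOf("</");     // start of the closing tag
        return Base64.getDecoder().decode(element.substring(start, end));
    }

    public static void main(String[] args) {
        byte[] document = new byte[300];
        for (int i = 0; i < document.length; i++) {
            document[i] = (byte) i;
        }
        String xml = wrapInXml(document);
        System.out.println(document.length + " raw bytes became "
                + xml.length() + " characters of XML");
    }
}
```

Base64 turns every 3 bytes into 4 characters, so the encoded form is about a third larger than the raw data before the tags are even counted. Sending the bytes directly avoids both the growth and the encode/decode step.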
Another reason for binary messages is encryption. Because the data needs to
be converted to binary for encryption, moving straight to and from binary
instead of XML makes sense.
Identifiers
JXTA has a wide selection of different identifiers. Identifiers range from
large, unique identifiers to names and URLs. Identifiers are used like pointers
or references. In the reference platform, identifiers are used for indexing,
filenames, and searching.
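The idea of a large, unique identifier used as an index key can be sketched as follows. The `urn:example:uuid-` prefix and the `IdentifierIndex` class are hypothetical; the sketch does not reproduce the actual format of JXTA identifiers, only the general pattern of generating a unique ID and using it for lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch; the ID format below is NOT the real JXTA ID format.
class IdentifierIndex {
    static final String PREFIX = "urn:example:uuid-";

    private final Map<String, String> advertisementsById = new HashMap<>();

    /** Creates a new unique identifier from a random UUID (32 hex digits). */
    static String newId() {
        return PREFIX + UUID.randomUUID().toString().replace("-", "");
    }

    /** Stores an advertisement under a fresh identifier and returns that identifier. */
    String store(String advertisement) {
        String id = newId();
        advertisementsById.put(id, advertisement);
        return id;
    }

    /** Looks an advertisement up by identifier; returns null if it is unknown. */
    String lookup(String id) {
        return advertisementsById.get(id);
    }
}
```

Because the identifier is a plain string, the same value could serve as a map key, a filename, or a search term, which matches the three uses listed above.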
Rendezvous Peer
A rendezvous is a peer that processes queries from other peers. The
rendezvous can also delegate queries to other peers, which must also be
rendezvous. A key purpose of rendezvous is to facilitate searching of
advertisements beyond a peer's local network. Rendezvous usually have more
resources than other peers and can store large amounts of information about the
peers around them. In a peer network, information is scattered among peers and not
stored entirely on any single machine, such as a server. Instead, there are
rendezvous that distribute the storage of the advertisements.
Rendezvous peers can also act as relays of searches. The rendezvous peer can
forward discovery requests to other rendezvous peers that receive their
information from peers with whom they have exchanged advertisements. Each
rendezvous will forward on a request if it does not have the information
requested.
A typical search is illustrated in Figure 2.1.
The remote search starts from Peer 1, which first queries local Peers 2 and
3 via IP Multicast. These Peers (2 and 3) are most likely on the local LAN and
are quickly accessed. Next, if these Peers do not have the specified resource,
a rendezvous peer is searched. If the rendezvous peer does not have the advertisement,
successive rendezvous peers are searched. Note that besides the peers local
to the query peer, only rendezvous peers are searched.

Figure 2.1: Rendezvous query routing.
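The search order described for Figure 2.1 can be sketched as a small in-memory simulation. This is only a conceptual model under stated assumptions: the `RendezvousSearch` and `Peer` classes are hypothetical, local peers are checked directly instead of over IP Multicast, and each rendezvous simply forwards to the rendezvous it knows about.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Conceptual model only; these are not JXTA classes.
class RendezvousSearch {

    static class Peer {
        final String name;
        final Set<String> advertisements = new HashSet<>();
        // Only meaningful for peers acting as rendezvous.
        final List<Peer> knownRendezvous = new ArrayList<>();

        Peer(String name) {
            this.name = name;
        }
    }

    /**
     * Returns the name of the first peer found holding the advertisement,
     * or null. Local peers are asked first, then the rendezvous chain.
     */
    static String search(String advertisement, List<Peer> localPeers, Peer firstRendezvous) {
        for (Peer local : localPeers) {
            if (local.advertisements.contains(advertisement)) {
                return local.name;
            }
        }
        return searchRendezvous(advertisement, firstRendezvous, new HashSet<>());
    }

    /** Asks one rendezvous, which forwards the request if it has no answer. */
    private static String searchRendezvous(String advertisement, Peer rendezvous,
                                           Set<Peer> visited) {
        if (rendezvous == null || !visited.add(rendezvous)) {
            return null; // nothing to ask, or already asked (avoids loops)
        }
        if (rendezvous.advertisements.contains(advertisement)) {
            return rendezvous.name;
        }
        for (Peer next : rendezvous.knownRendezvous) {
            String found = searchRendezvous(advertisement, next, visited);
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}
```

The `visited` set is a design choice of this sketch: rendezvous can know each other in both directions, so some loop protection is needed to stop a request from circulating forever.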
IP Multicast
IP Multicast is a one-to-many messaging protocol. IP Multicast is used to
send one copy of data to a group address, reaching all recipients who are
configured to receive it.
IP Multicast has two benefits for P2P applications. First, because multicast
uses a group address instead of individual IP addresses, a peer sending a
message can do so without knowledge of the listening peers' addresses. The
result is that all peers within the multicast network can respond to the caller
with answers to the query and even their IP addresses for direct communication.
IP Multicast's second benefit is the reduction of bandwidth. Because all
peers can see a single message, there is no need to send a copy of the message
to each peer. This is very important when sending large amounts of data to a
group of peers.
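A minimal sketch of sending one query to a multicast group with the standard Java networking classes follows. The group address `230.0.0.1`, port `4446`, and the `MulticastQuery` class are hypothetical values chosen for illustration, not settings used by JXTA.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; group and port are illustrative, not JXTA defaults.
class MulticastQuery {
    static final String GROUP = "230.0.0.1";
    static final int PORT = 4446;

    /** Builds a single packet addressed to the group, not to any one peer. */
    static DatagramPacket buildQuery(String query) {
        try {
            byte[] data = query.getBytes(StandardCharsets.UTF_8);
            InetAddress group = InetAddress.getByName(GROUP);
            return new DatagramPacket(data, data.length, group, PORT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends the packet once; the network delivers it to every listening peer. */
    static void send(DatagramPacket packet) {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(packet);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sender never names a recipient. Listening peers would join the same group (for example with `java.net.MulticastSocket`) and could reply directly to the source address of the packet they receive.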
A drawback to using multicasting is that some firewalls and routers block
multicast messages. There is some support for sending multicast messages via
Internet backbones between Internet providers, but it is often a service for
which you must pay extra. Other barriers to IP Multicast include personal
firewalls and subnet routers. This is why JXTA supports more than just
IP Multicast.
In general, the multicast support available behind a firewall is sufficient
for most P2P needs. You can take advantage of localized multicast support by
sending one copy of a message to a designated peer within each network, which
then rebroadcasts it to the local peers via multicast.
Only a rendezvous allows searching beyond a local network. A peer has the
option of being a rendezvous, but it is not required. There is a side benefit of
being a rendezvous: the peer retains a cached copy of the results that other
rendezvous return in answer to requests.
On the negative side of being a rendezvous, the peer will use more memory and
higher bandwidth. Because of the possibly high number of requests and the
resources consumed by a large database of advertisements, the case can be made
for a dedicated rendezvous peer. In corporate installations, the rendezvous
could also perform the duties of gateway and router for the corporate intranet.
The effect would be similar to the use of a traditional router. Additional
scaling could use a rendezvous in each of the corporation's subnets.
The need for dedicated rendezvous peers depends on security and the scale of
P2P applications used. The P2P network topology should be examined on a
case-by-case basis and monitored regularly.
It is important to mention here that a P2P network becomes more efficient as
services are duplicated among a large number of peers. However, there may be a
point where additional rendezvous do not add to efficiency.
Rendezvous are used when a peer is searching for an advertisement or when
other services use the rendezvous mechanism to route messages, so the need by a
peer is not constant. A rendezvous connected to the Internet will be exposed to
possibly thousands of peers. Within the bounds of a firewall-isolated network,
configuring all peers as rendezvous will probably not add much load from
rendezvous tasks. Note that this observation may prove
incorrect if there are many requests that have large search scopes.
Rendezvous are also used for application-specific queries. In Chapter 9,
"Synchronizing Data Between Peers," peers use other peers acting as
rendezvous to propagate information about new appointments in a calendar as well
as to synchronize an address book.
Note
When considering any network topology or
deciding if your peers should be a gateway, router, or rendezvous, be sure to
check the current implementation for its efficiency and capabilities. Remember
also that some applications may specifically use these services in unique ways.
The JXTA platform will continue to evolve and should eventually cover most
topologies or be configurable for many situations. But each environment is
unique, so experimentation and monitoring of actual use may be the best way to
configure your peers, especially in a large corporate environment.
Router Peer
A router in JXTA is any peer that supports the peer endpoint protocol. Not
all peers need to implement the protocol because, like traditional network
routers, you only need a few to support a large network. JXTA routers are very
similar to traditional routers. The primary difference is that a P2P network is
less stable and includes many addresses that are not static.
Figure 2.2 is an example of how a route
is created. The request for a route starts at Peer 1 with the request for a
route passed to available routers until the complete path to Peer 8 is built.
Note that because a peer can be a gateway as well as a router, Peer 2 includes
itself as a node in the route. Please understand that this is just conceptually
how routers work, not how they are actually implemented. Routers can support
caching or complex algorithms. For example, instead of the first router forwarding
a request for the rest of the route, this router may already have a cached version
of the route available.

Figure 2.2: Conceptual example of peer endpoint routers creating a route between Peer 1 and
Peer 8.
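The route-building idea, including the cached route mentioned above, can be sketched with a breadth-first search over known links. This is a conceptual model, not the peer endpoint protocol: the `RouteBuilder` class is hypothetical, and the intermediate peers used in any example are made up because the figure's exact topology is not given in the text.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Conceptual model only; not how JXTA routers are implemented.
class RouteBuilder {
    private final Map<String, List<String>> links = new HashMap<>();
    private final Map<String, List<String>> routeCache = new HashMap<>();
    int searches = 0; // counts how often a route had to be computed

    /** Records that two peers can reach each other directly. */
    void link(String a, String b) {
        links.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        links.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** Returns the hops from source to destination, or an empty list if none. */
    List<String> route(String from, String to) {
        String key = from + "->" + to;
        List<String> cached = routeCache.get(key);
        if (cached != null) {
            return cached; // a router may already hold the route
        }
        searches++;
        Map<String, String> previous = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        previous.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                break;
            }
            for (String next : links.getOrDefault(current, Collections.emptyList())) {
                if (!previous.containsKey(next)) {
                    previous.put(next, current);
                    queue.add(next);
                }
            }
        }
        if (!previous.containsKey(to)) {
            return Collections.emptyList();
        }
        LinkedList<String> path = new LinkedList<>();
        for (String hop = to; hop != null; hop = previous.get(hop)) {
            path.addFirst(hop); // walk back from the destination to the source
        }
        routeCache.put(key, path);
        return path;
    }
}
```

A second request for the same route is answered from the cache without another search. A real router would also have to discard cached routes when peers disappear or change address, which this sketch leaves out.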
Gateway Peer
A gateway is a peer that acts as a communications relay. Don't confuse
gateways with rendezvous. A gateway is used to relay messages between peers, not
requests.
Gateways are like radio repeaters or a middleman between peers used to relay
messages. Gateways are critical to connectivity because of firewalls, NAT
devices, and network proxies. Gateways can store messages and wait for their
intended recipient to collect the messages.
Gateways exist because the Internet is very messy. The mess is caused by the
fact that we have all sorts of security and barriers that prevent a common way
to communicate between peers. Another bit of this mess is the difference between
protocols supported by peers. Some peers may use TCP, others may use HTTP. With
wireless, we would need Wireless Application Protocol (WAP) as well. The gateway
supports as many of these protocols as possible so that it can act as a
middleman between different types of protocols. JXTA started with support for
TCP and HTTP, but other gateways are in development.
Gateways are key to getting around most of the security on the Internet. Firewalls,
proxy servers, and NAT devices are the common security barriers. Figure
2.3 shows how the gateway Peer 2 is used to interface between Peer 1 and
Peer 3. The gateway translates HTTP messages from Peer 1 to TCP for delivery
to Peer 3. When messages are sent from Peer 3, they are sent via TCP to Peer
2, which holds the message until Peer 1 makes an HTTP request to retrieve the
data.

Figure 2.3: Example of gateway participation in a single pipe.
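The store-and-forward behavior of the gateway in Figure 2.3 can be sketched as a per-recipient mailbox. The `GatewayMailbox` class and its method names are hypothetical; the actual TCP and HTTP transports are left out so that only the holding and collecting of messages is shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only; transports are omitted and names are made up.
class GatewayMailbox {
    private final Map<String, Deque<String>> pending = new HashMap<>();

    /** Called when a message arrives (e.g. over TCP) for a peer that can only poll. */
    void accept(String recipient, String message) {
        pending.computeIfAbsent(recipient, k -> new ArrayDeque<>()).add(message);
    }

    /**
     * Called when the recipient contacts the gateway (e.g. with an HTTP request).
     * Returns the held messages in arrival order and clears the mailbox.
     */
    List<String> collect(String recipient) {
        Deque<String> queue = pending.remove(recipient);
        return queue == null ? new ArrayList<>() : new ArrayList<>(queue);
    }
}
```

In the figure's terms, Peer 3 would trigger `accept("Peer 1", ...)` and Peer 1 would trigger `collect("Peer 1")` each time it polls. A real gateway would also need limits on how long and how many messages it holds.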
Note
JXTA has started using the term relay to
merge the terms rendezvous, router and gateway. Relay will also be used to
include concepts like proxy, transcoding, and related "JXTA network
helpers," rather than proliferating a bunch of overlapping terms. It is
also expected that there will be specialized peers or even commercial appliances
that have just these functions, so a single name is more marketable. We will use
the terms interchangeably for now, depending on the focus of discussion.
Why We Need Relays (Routers and Gateways)
Although we have touched on the subjects, we need to specifically cover why
relays are required for a P2P network. The following sections discuss each of
the barriers to a P2P network. Each of these creates a need for us to abstract
the network and create a virtual network where the P2P system provides routing
and messaging via HTTP tunneling or by switching transport protocols.
Firewalls
Firewalls are most often found at larger companies, but they are also now found
in homes that use special firewall routers. There are also personal firewalls,
so called because they run on the user's personal computer. Firewalls are often
configured to filter almost everything except HTTP, and HTTP only allows
communications that are initiated by the client.
For example, when you request a Web page, your browser connects to a Web
server, sends the request, receives the requested page, and then disconnects.
At no time does the Web server initiate a connection to the Web browser.
Because there is only one direction of communication that can be initiated,
the gateway acts as a virtual agent that accepts messages for later delivery.
Therefore, if a peer attempts to talk to a peer that can only initiate HTTP
communications, the gateway holds the message until the HTTP peer contacts the
gateway and asks for messages addressed to it.
NAT (Network Address Translator)
A Network Address Translator (NAT) device is just as disruptive as a
firewall. Most broadband routers for cable and DSL use NAT.
NAT lets you use a single IP address for a whole network of computers.
Because Internet providers charge per unique IP address, many sites use a NAT
to save money. The NAT sits between the public Internet and a local area
network (LAN), where it rewrites IP addresses and port numbers in IP headers
on the fly so that the packets all appear to carry the public IP address of
the NAT device instead of the actual source or destination. This causes multiple
problems with applications that pass addresses and ports back and forth across
the NAT. The NAT simply cannot detect and correct an address carried inside a
message to reflect the mapped address. Essentially, this means that if you are
behind a NAT, you will have trouble telling anyone what your address really is.
Many NATs, for security reasons, allow incoming traffic from an outside
address only if an outgoing packet has already been sent to that outside
address. This acts as a poor man's firewall, because it prevents anyone
from directly connecting to your computer without you first initiating
communications. It is like a phone that cannot receive calls but will still
let you call anyone.
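The two NAT behaviors described so far, address rewriting and blocking unsolicited incoming traffic, can be modeled with a small translation table. The `SimpleNat` class, the starting port 40000, and the addresses (taken from documentation ranges) are all hypothetical, and real NAT devices vary in how strictly they filter.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Conceptual model of one common NAT behavior; real devices differ.
class SimpleNat {
    private final String publicIp;
    private int nextPort = 40000; // hypothetical first public port
    private final Map<String, Integer> publicPortByInternal = new HashMap<>();
    private final Map<Integer, String> internalByPublicPort = new HashMap<>();
    private final Set<String> contacted = new HashSet<>();

    SimpleNat(String publicIp) {
        this.publicIp = publicIp;
    }

    /**
     * An internal host ("ip:port") sends a packet to a remote host.
     * Returns the source address the remote host will actually see.
     */
    String sendOut(String internalAddress, String remoteAddress) {
        Integer port = publicPortByInternal.get(internalAddress);
        if (port == null) {
            port = nextPort++;
            publicPortByInternal.put(internalAddress, port);
            internalByPublicPort.put(port, internalAddress);
        }
        contacted.add(port + "|" + remoteAddress); // remember who was contacted
        return publicIp + ":" + port;
    }

    /**
     * A remote host sends a packet to one of the NAT's public ports.
     * Returns the internal destination, or null if the packet is dropped
     * because no outgoing packet was sent to that remote host first.
     */
    String receive(String remoteAddress, int publicPort) {
        if (!contacted.contains(publicPort + "|" + remoteAddress)) {
            return null;
        }
        return internalByPublicPort.get(publicPort);
    }

    public static void main(String[] args) {
        SimpleNat nat = new SimpleNat("203.0.113.5");
        // The remote peer sees the NAT's address, not 192.168.0.10.
        System.out.println(nat.sendOut("192.168.0.10:5000", "198.51.100.7:80"));
        // A stranger's packet is dropped.
        System.out.println(nat.receive("198.51.100.99:80", 40000));
    }
}
```

Note that only the packet header is translated in this model. If the internal host writes `192.168.0.10:5000` into the message body as its contact address, the remote peer receives an address it cannot reach, which is the problem described above.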
A socket connection may not be assigned the same external address/port on
subsequent connections. This means that you cannot be sure of a return path for
messages, which makes it very difficult to create a two-way conversation.
The gateway gets around the NAT the same way it gets around a firewall. By
using HTTP, the gateway on the other side of the NAT ensures that
peers can communicate with the peer behind the NAT.
Proxy Server
A proxy server is a device that sits between the Internet and a LAN. Proxy
servers provide services like filtering, caching, and monitoring of
traffic.
The result of having a network proxy is similar to that of a NAT. The proxy
device can limit addresses as well as map them to others, as a NAT does. Proxy
devices can be as sophisticated as a firewall and limit certain types of
communication. For example, a proxy service can prevent you from accessing a
forbidden Web site. Some proxy servers can even detect viruses embedded in
incoming e-mail before they ever reach your e-mail in-box.
The gateway can usually get around a proxy server by bridging the gap with
HTTP. Some very sophisticated systems can be programmed to detect and prevent
such traffic. Some companies only allow pure HTML to pass and discard all other
types of data. In some cases, you may not be able to use JXTA applications
without specific configuration and permission of the network administrator.
DHCP
Many companies, and especially Internet service providers, use Dynamic
Host Configuration Protocol (DHCP). DHCP allocates IP addresses dynamically. The
effect is that each time the DHCP server reboots or a user's computer
reboots, the IP address may change. The address can also change if the IP
address lease expires. The effect is that even a known address is a temporary
address. Because of DHCP, even when you are behind a firewall, peers may still
not have addresses on which you can depend. This makes it difficult when
communicating across the firewall to peers on the other side.
The problem of changing addresses is greatly eased by router peers.
Router peers are able to create new routes between peers when addresses
change.
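One way to picture how changing addresses can be tolerated is to address peers by a stable identifier and look up the current network address only at send time. The sketch below assumes that approach; the `PeerAddressTable` class, the peer ID string, and the addresses are hypothetical and do not describe the actual router implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual model only; names and addresses are made up.
class PeerAddressTable {
    private final Map<String, String> currentAddress = new HashMap<>();

    /** A peer announces the address it holds right now (e.g. after a new DHCP lease). */
    void announce(String peerId, String address) {
        currentAddress.put(peerId, address);
    }

    /** Resolves a stable peer ID to its latest known address, or null if unknown. */
    String resolve(String peerId) {
        return currentAddress.get(peerId);
    }

    public static void main(String[] args) {
        PeerAddressTable table = new PeerAddressTable();
        table.announce("peer-42", "10.0.0.17:9701");
        table.announce("peer-42", "10.0.0.58:9701"); // address changed after a reboot
        System.out.println(table.resolve("peer-42")); // the sender still uses "peer-42"
    }
}
```

Senders keep using the same identifier, and only the table entry changes when a lease expires or a machine reboots.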