Introduction
Docbrokers may be thought of as the ‘tour guides’ for docbases and
their clients. A client must always ask a docbroker for information
about how to contact a certain docbase, and if the docbroker doesn't
know, the client can do nothing apart from trying to find another
docbroker to ask.
When they initialize, docbases inform one or more docbrokers of their
name and the host network address and port on which they are listening
for connection requests. They periodically refresh this information so
as to allow the docbroker to keep a reasonably accurate picture of
what docbases are available under what names, on what hosts and
through which network ports. When a client requests the contact
information for a particular named docbase, the docbroker consults its
local table for the entry matching the name. If the entry is found then
the host network address and port number are returned to the client,
which can then open a connection directly to the docbase.
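The registration and lookup cycle described above can be sketched as a small in-memory table. This is only an illustration of the idea, not the docbroker's actual implementation: the class name, the method names and the staleness limit (max_age_seconds) are all assumptions made for the sketch.

```python
import time


class DocbrokerTable:
    """Toy model of a docbroker's table of known docbases (illustrative only)."""

    def __init__(self, max_age_seconds=300.0):
        # max_age_seconds is a made-up limit: entries that have not been
        # refreshed within this time are treated as unavailable.
        self.max_age_seconds = max_age_seconds
        self._entries = {}  # docbase name -> (host, port, last_seen)

    def register(self, name, host, port, now=None):
        """Record a docbase at start-up, and again on each periodic refresh."""
        now = time.time() if now is None else now
        self._entries[name] = (host, port, now)

    def lookup(self, name, now=None):
        """Return (host, port) for a named docbase, or None if unknown or stale."""
        now = time.time() if now is None else now
        entry = self._entries.get(name)
        if entry is None:
            return None
        host, port, last_seen = entry
        if now - last_seen > self.max_age_seconds:
            return None
        return host, port


table = DocbrokerTable()
table.register("mydocbase", "10.0.0.200", 1000)
print(table.lookup("mydocbase"))    # the client would connect to this host and port
print(table.lookup("otherdocbase")) # unknown name: the client must ask elsewhere
```

A client that receives None here is in the position described in the introduction: it has to find another docbroker to ask.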
However, it is possible to configure a docbroker so that the
information returned to a client is not the network address and port
number that was supplied by the docbase, but is a transformation of
these fields. This facility turns out to be extremely useful when the
installation is running with Network Address Translation (NAT).
Network Address Translation
Although an in-depth discussion of Network Address Translation is
beyond the scope of this article, it is probably useful to include a
brief overview of the mechanisms involved, since this can make
troubleshooting faulty configurations a lot easier.
For many reasons, but usually because of a lack of IP address space or
because of network security concerns, enterprises may choose to run in
a network address translating environment. That is, the range of
addresses used on the enterprise intranet may be totally unrelated to
the range of real ‘internet’ addresses available to the enterprise.
For this example consider a fictitious enterprise that leases a small
range (64) of addresses from its ISP (say 192.10.20.0 / 26), but runs
a substantial internal network (say 10.0.0.0 / 8). This allows the
enterprise over 16 million internal network addresses, without having
to pay a fortune to its ISP. [This “slash” notation combines the network
address (the part before the ‘/’) and the length of the subnet mask in
bits (the part after the ‘/’) into a single value. In this case the subnet
masks would be 255.255.255.192 for the /26 network and 255.0.0.0 for the
/8 network.]
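The arithmetic behind the slash notation can be checked with Python's standard ipaddress module. The helper name describe is made up for this sketch; the two networks are the ones used in the example above, written without the spaces around the slash.

```python
import ipaddress


def describe(cidr):
    """Return (subnet mask, number of addresses) for a network in slash notation."""
    net = ipaddress.ip_network(cidr)
    return str(net.netmask), net.num_addresses


for cidr in ("192.10.20.0/26", "10.0.0.0/8"):
    mask, count = describe(cidr)
    print(f"{cidr}: mask {mask}, {count} addresses")
```

The /26 network yields 64 addresses and the /8 network yields 16,777,216, which is where the "over 16 million" figure comes from.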
Since it is not possible to make every internal address correspond to a real
internet address, the enterprise uses a Network Address Translation
gateway, where a dedicated piece of hardware inspects all IP traffic
passing through it and ‘translates’ the network addresses according to
its configuration so as to allow communication between the two networks.
To illustrate this process, consider a host on the internal network that is
attempting to contact a web server. It will send a packet to the
server, but this packet will be intercepted by the translation
gateway. The gateway will select a currently unused address from its
configured range and allocate it to this new connection request. It
will then rewrite the source address (and potentially, the source port
number too) before forwarding the message to the internet where it
eventually arrives at the web server. The web server makes a response
to the ‘source’ address, which is received by the translation
gateway. After consulting its tables of current connections, the
gateway will rewrite the destination address (and potentially port
number) of the response before forwarding the response on to the
internal network, where it eventually arrives back at the original
host machine.
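The outbound case can be sketched as follows. This is a deliberately simplified model, not a description of any real gateway: it ignores port numbers, allocates one public address per internal host rather than per connection, and never releases an address. The class and method names, and the web server address 203.0.113.10, are assumptions made for illustration.

```python
class OutboundNat:
    """Toy model of outbound address translation; ports are ignored for brevity."""

    def __init__(self, public_addresses):
        self._unused = list(public_addresses)  # pool of 'real' internet addresses
        self._public_for = {}    # internal address -> allocated public address
        self._internal_for = {}  # public address -> internal address

    def to_internet(self, packet):
        """Rewrite the source address of a packet leaving the internal network."""
        src = packet["src"]
        if src not in self._public_for:
            if not self._unused:
                raise RuntimeError("no unused public address available")
            public = self._unused.pop(0)
            self._public_for[src] = public
            self._internal_for[public] = src
        return {**packet, "src": self._public_for[src]}

    def to_intranet(self, packet):
        """Rewrite the destination address of a response, or return None."""
        internal = self._internal_for.get(packet["dst"])
        if internal is None:
            return None  # no current mapping: nowhere to deliver the packet
        return {**packet, "dst": internal}


nat = OutboundNat(["192.10.20.1", "192.10.20.2"])
request = nat.to_internet({"src": "10.0.0.5", "dst": "203.0.113.10"})
response = nat.to_intranet({"src": "203.0.113.10", "dst": request["src"]})
print(request["src"], response["dst"])
```

The web server only ever sees the allocated public address; the gateway's table is what lets the response find its way back to the internal host.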
In this example the host on the internal network is able to contact
the web server because all traffic that is not addressed to the
internal network gets sent to the internet through the translation
gateway. However, it is not possible for a host ‘outside’ the
internal network to establish a connection with a host on the internal
network, since the destination network is not part of the internet
address space. In technical terms, the destination is ‘not routable’;
in layman's terms, ‘you can't get there from here’. For internet clients
wishing to make use of docbases that exist on corporate intranets,
this is a problem, since even if these clients know both the network
address and port number to use, they can't get there from here.
To get around this obstacle, installations using translated networks
that wish to offer services need to make these available within their
‘real’ internet address range. This allows potential clients to know
where to send their requests. The other part of the solution is once
again the translation gateway, which is configured to accept messages
sent to the ‘real’ internet address range and forward these to hosts
on the internal network.
Making Docbases available outside the local network
As discussed above, docbrokers usually respond to queries by
forwarding the network address and port information that was supplied
to them by the relevant docbase when it initialized. However, for an
enterprise that needs to make some of its docbases available to
clients via the internet, and that uses NAT to access the internet,
this approach will not work. The docbroker will reliably inform its clients
of the network addresses and ports needed to reach the various docbases.
However, these addresses will be local, untranslated addresses and
consequently unreachable from the internet. What needs to happen is that the
docbroker returns a translated version of the network address and port
information that matches the configuration of the NAT tables at the
internet gateway, in order for the external client to be able to
address the internal docbase.
The translation gateway cannot handle all of this work on its
own. This is because these gateways read and transform only the address
and port fields in the headers of the packets they forward. They are not capable
of picking apart an arbitrary packet and transforming any sequence of
bytes that ‘looks like’ an IP address. Since the contents of a
docbroker response are not examined by the translation gateway, the
docbroker itself needs to help out. An example may make things clearer.
Again, let’s assume that the internal network is 10.0.0.0 / 8. In
this case the actual docbase ‘mydocbase’ is hosted at 10.0.0.200, port
1000. Again, the public IP address space for the enterprise is
192.10.20.0 / 26. The NAT gateway is configured to translate incoming messages for
192.10.20.30 on port 2000 into messages addressed to 10.0.0.200 on
port 1000. [It should be noted that there does not need to be a
physical host at address 192.10.20.30 in order for this mechanism to
work.]
So, an internet client (address 1.2.3.4) calls the docbroker and requests
the docbase information for ‘mydocbase’.
The docbroker responds with a message indicating that the docbase may
be found at 192.10.20.30, port 2000.
The client sends a message to 192.10.20.30, port 2000 to open a
connection to the docbase. As with all IP messages, the client
includes its own address and port information so that the recipient of
the message can send a response. In this example, the client picks
port 3000 as the port on which it will listen for a response. Having
sent the message, the client waits for a response addressed to itself
(1.2.3.4) on port 3000 being sent from 192.10.20.30, port 2000.
The NAT gateway receives the message, recognises the address and port
combination from its configuration tables and rewrites the destination
address to 10.0.0.200 and the destination port to 1000. It also makes
a note of the sender address and port number, (1.2.3.4, 3000). It then
sends the transformed message across the internal network to the
docbase.
On receiving the message, the docbase sends its response back to
1.2.3.4 on port 3000. On its way to the internet this message is
received by the NAT gateway, which finds the destination address and
port combination (1.2.3.4, 3000) in its local table and so rewrites
the source address and port information to make it appear that the
response has actually come from 192.10.20.30, port 2000, since this is
what the client will be expecting. The gateway then sends the message
across the internet to the client, which is then content that it has
established contact with the docbase.
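The exchange just described can be traced in a short sketch. It models only the gateway's part (the static forwarding rule and the table of current connections), using the addresses and ports from the example. The class name StaticNat and its method names are hypothetical, and packets are reduced to a pair of (address, port) tuples.

```python
class StaticNat:
    """Toy model of a gateway with static inbound forwarding rules."""

    def __init__(self, forwarding):
        # forwarding maps a public (address, port) to an internal (address, port).
        self._forwarding = dict(forwarding)
        # Current connections: client (address, port) -> public (address, port).
        self._connections = {}

    def from_internet(self, packet):
        """Rewrite the destination of an incoming packet, or return None."""
        internal = self._forwarding.get(packet["dst"])
        if internal is None:
            return None  # no rule configured for this address and port
        self._connections[packet["src"]] = packet["dst"]  # note the sender
        return {**packet, "dst": internal}

    def from_intranet(self, packet):
        """Rewrite the source of a response so it matches what the client expects."""
        public = self._connections.get(packet["dst"])
        if public is None:
            return None  # not a response to a known connection
        return {**packet, "src": public}


gateway = StaticNat({("192.10.20.30", 2000): ("10.0.0.200", 1000)})

# The client sends to the address and port that the docbroker gave it.
inbound = gateway.from_internet(
    {"src": ("1.2.3.4", 3000), "dst": ("192.10.20.30", 2000)})
print(inbound["dst"])   # where the docbase actually receives the message

# The docbase replies to the client; the gateway disguises the source.
outbound = gateway.from_intranet(
    {"src": ("10.0.0.200", 1000), "dst": ("1.2.3.4", 3000)})
print(outbound["src"])  # what the client sees as the sender
```

Note that nothing in this sketch tells the client which public address and port to use in the first place. That information has to come from the docbroker, which is why the docbroker must return the translated values.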
Docbroker Configuration
The docbroker software will read its configuration from an
initialization file when it is invoked with the ‘-init_file’
argument. To make the docbroker translate the information it returns
to clients, this initialization file needs a section along the
following lines:
[TRANSLATION]
port = '<port-specification-list>'
host = '<host-specification-list>'
The grammar to define these entries is:
<port-specification-list> ::= <port-specification>[','<port-specification>]
<port-specification> ::= <to-port>'='<from-port>
<to-port> ::= decimal integer
<from-port> ::= decimal integer
<host-specification-list> ::= <host-specification>[','<host-specification>]
<host-specification> ::= <to-address>'='<from-address>
<to-address> ::= dotted-quad IP address
<from-address> ::= dotted-quad IP address
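To make the grammar concrete, here is a minimal parsing sketch. It is not the docbroker's own parser: the function names are invented, the input is assumed to be the value with the surrounding quotes already removed, and the list is assumed to allow any number of comma-separated entries. Each entry is read as to=from, and the result is keyed on the ‘from’ value because that is the value that has to be matched.

```python
import ipaddress


def parse_port_list(spec):
    """Parse '<to-port>=<from-port>[,...]' into a dict {from_port: to_port}."""
    mapping = {}
    for item in spec.split(","):
        to_port, from_port = item.strip().split("=")  # ValueError if malformed
        mapping[int(from_port)] = int(to_port)
    return mapping


def parse_host_list(spec):
    """Parse '<to-address>=<from-address>[,...]' into {from_address: to_address}."""
    mapping = {}
    for item in spec.split(","):
        to_addr, from_addr = (part.strip() for part in item.split("="))
        # Reject anything that is not a dotted-quad IPv4 address.
        ipaddress.IPv4Address(to_addr)
        ipaddress.IPv4Address(from_addr)
        mapping[from_addr] = to_addr
    return mapping


print(parse_port_list("2000=1000"))
print(parse_host_list("192.10.20.30=10.0.0.200"))
```

The order of the two sides is easy to get backwards: the translated (external) value comes first and the value supplied by the docbase comes second.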
In order to configure the docbroker to respond as outlined in the
example, the translation section in the initialization file would
appear as:
[TRANSLATION]
port = '2000=1000'
host = '192.10.20.30=10.0.0.200'
When the docbroker receives a request for information where the
response matches a ‘from’ field, it will replace that information with
the contents of the corresponding ‘to’ field. For responses that do
not match any configured ‘from’ field, no translation will take place
and the response will be the information as originally supplied by the
docbase.
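The matching rule can be sketched as a pair of dictionary lookups. This assumes that host and port translations are applied independently of one another, which is an inference from the configuration format rather than something confirmed by the product documentation; the function name translate is hypothetical.

```python
def translate(host, port, host_map, port_map):
    """Replace host and port with their 'to' values where a 'from' value matches.

    host_map and port_map are keyed on the 'from' value. Anything without a
    matching entry is returned exactly as the docbase supplied it.
    """
    return host_map.get(host, host), port_map.get(port, port)


# The mappings that correspond to the example configuration above.
host_map = {"10.0.0.200": "192.10.20.30"}
port_map = {1000: 2000}

print(translate("10.0.0.200", 1000, host_map, port_map))  # both fields match
print(translate("10.0.0.50", 1500, host_map, port_map))   # nothing matches
```

With the example configuration, the first call gives the external address and port from the walk-through, and the second is passed through unchanged.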
Neither the port nor the host specification is required on its own,
although at least one should be present. If simple address translation is
sufficient and no port translation is required, then only the ‘host’
line need be included in the configuration.
It is important to note that every docbase that is made available
through this translation mechanism needs to be configured with its own
unique port number. It is not possible to arrange for two docbases on
different hosts, say 10.0.0.200 and 10.0.0.201 but both configured to
use port 4000, to be made available through this translation
mechanism. One of the docbases would need to be reconfigured to use
(say) port 4001. Fortunately, this reconfiguration is usually simple,
although it is a system-dependent task.
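Before writing the translation section it may be worth checking a planned set of docbases for this kind of clash. The sketch below is a hypothetical helper, not part of any Documentum tooling; the docbase names are invented, and the hosts and ports follow the example in the text.

```python
from collections import defaultdict


def find_port_conflicts(docbases):
    """Return {port: [docbase names]} for every port used by more than one docbase.

    docbases maps a docbase name to its (host, port) pair.
    """
    names_by_port = defaultdict(list)
    for name, (_host, port) in docbases.items():
        names_by_port[port].append(name)
    return {port: sorted(names)
            for port, names in names_by_port.items()
            if len(names) > 1}


planned = {
    "docbase_a": ("10.0.0.200", 4000),
    "docbase_b": ("10.0.0.201", 4000),  # same port as docbase_a, different host
    "docbase_c": ("10.0.0.202", 4001),
}
print(find_port_conflicts(planned))
```

An empty result means every docbase has a port of its own; any entry in the result names the docbases that would need to be moved to a different port.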
It should be understood that the docbroker will apply these
translations in all cases where the response to a request for
information matches a translation clause. This means that clients
which are on the same intranet as the docbroker will receive bogus
answers and will not be able to contact the docbases even though they
are apparently available. To avoid this, most enterprises will run at
least two docbrokers – one for internal clients and one for external
clients.
By default a Documentum installation does not provide a docbroker
initialization file, and so the standard scripts provided by
Documentum to start the docbroker need modification before the
docbroker will read one. By convention, the initialization file for
the docbroker is called docbroker.ini and is stored in the $HOME
directory of the Documentum installation owner. For most
installations, the script dm_launch_docbroker needs the line where the
docbroker is invoked to be modified along the following lines:
./dmdocbroker -init_file $HOME/docbroker.ini >> $logfile 2>&1 &