WebCache Intro and Q&A

WebCache is a Documentum tool that allows quick access to
content and (optionally) its associated attributes by storing
them on a flat filesystem and in an RDBMS, respectively.

WebCache is intended to be used to allow a website to use
content that exists in Documentum, but without the overhead
of talking directly to the server. Once you have configured
WebCache in your environment, you can write programs for your
website that do anything you want with the content and its
associated attributes.

WebCache consists of two major components:

WebCache Source: The machine on which the docbase resides.
This component of WebCache is responsible for sending changed
data and attributes to the WebCache target, either at periodic
intervals (nightly, daily, hourly), or triggered by the manual
invocation of a dm_job.

WebCache Target: This consists of at least one component:
a copy of the content from the source docbase which is to be
cached. Optionally, it may also include a database component
(the RDBMS doesn’t have to reside on the same machine), which
provides attribute information on the content that has been
cached to the filesystem.

This article describes several of the details about
Documentum WebCache in a question/answer format.

Where do documents go on the target?

After a WebCache operation, the content files are saved
on the filesystem of the WebCache target. The root
location of where these are stored can be configured to be
anywhere. They are structured in directories, based on
the path within the Source docbase.

How, exactly, are attributes stored in the Target database?

All attributes of WebCached content are stored somewhere inside
two special tables. The first part of the name of the tables
can be configured to whatever you want. In this example,
and throughout this document, we will assume WebCache has
been configured to name these tables starting with PROPS.

Single-valued attributes are stored in a table called PROPS_S:


More columns are added to the single-valued attribute table if you configure
WebCache to use additional attributes from your source documents (any
attribute can be used, but you must specify which ones).

The A_WEBC_URL column is the unique identifier of the content that is
being described. It looks like a path. The A_WEBC_URL is a key into
the multi-valued property table, too.

Multi-valued attributes are stored in a table called PROPS_R


In this example, STATES is the name of a repeating attribute of a
custom object type. There will be as many rows in this table for
a given A_WEBC_URL as there are STATES for that document.

Things to keep in mind about repeating attributes:

  • The order of repeating attributes is preserved. That means,
    if you put a bunch of values in a specific order in Documentum,
    you can expect to find them in the same order within the WebCache
    Target DB’s PROPS_R table.
  • If you have multiple repeating attributes, for each A_WEBC_URL,
    there will be as many rows as the largest number of values held
    by any one of those attributes. Where an attribute has fewer
    values, its entries in the remaining rows are NULL.
    For example, if you have repeating attributes ABC and XYZ, and
    a certain document has 3 ABC values and 10 XYZ values, there
    will be ten rows for the associated A_WEBC_URL, seven of
    which have NULL as the value for ABC.
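
The padding behaviour described above can be sketched in a few lines of Python. This is only an illustration, not WebCache's actual schema: it uses an in-memory SQLite database so the example is self-contained, and the POS column, the build_repeating_rows helper, and the sample values are all assumptions made for the sketch.

```python
import sqlite3
from itertools import zip_longest


def build_repeating_rows(url, abc_values, xyz_values):
    """Return one row per position; the shorter list is padded with None (SQL NULL)."""
    return [
        (url, pos, abc, xyz)
        for pos, (abc, xyz) in enumerate(zip_longest(abc_values, xyz_values))
    ]


url = "TestFolder/testdoc1"
# 3 ABC values and 10 XYZ values, as in the example above.
rows = build_repeating_rows(url, ["a1", "a2", "a3"], [f"x{i}" for i in range(1, 11)])

conn = sqlite3.connect(":memory:")
# POS is a hypothetical column used here only to make the ordering explicit.
conn.execute("CREATE TABLE PROPS_R (A_WEBC_URL TEXT, POS INTEGER, ABC TEXT, XYZ TEXT)")
conn.executemany("INSERT INTO PROPS_R VALUES (?, ?, ?, ?)", rows)

total = conn.execute(
    "SELECT COUNT(*) FROM PROPS_R WHERE A_WEBC_URL = ?", (url,)
).fetchone()[0]
null_abc = conn.execute(
    "SELECT COUNT(*) FROM PROPS_R WHERE A_WEBC_URL = ? AND ABC IS NULL", (url,)
).fetchone()[0]
print(total, null_abc)  # prints: 10 7
```

When querying a real target database, the practical consequence is that a query for ABC values should filter out NULLs rather than assume every row carries a value.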

How do you control which documents are published to the cache?

Each WebCache configuration object in the docbase defines a
starting folder, a version, and (optionally) an effective label.

There is no configurable “where” clause. However, if you have Documentum
WebPublisher installed, you can publish one document at a time, given its object ID.
So, you could emulate “where” clause behavior by writing your own
program to return a list of content in Documentum, then call this
special WebPublisher publish method for each of those objects.
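
The shape of such a program might look like the sketch below. It is deliberately generic: publish_where, the predicate, and the publish_one callable are hypothetical names, and the call that actually publishes a single object is left as a stand-in because this article does not show it. The object IDs and attribute values are made up.

```python
from typing import Callable, Dict, Iterable, List


def publish_where(
    candidates: Iterable[Dict[str, str]],
    predicate: Callable[[Dict[str, str]], bool],
    publish_one: Callable[[str], None],
) -> List[str]:
    """Publish every candidate that satisfies the predicate, one object at a time."""
    published = []
    for obj in candidates:
        if predicate(obj):
            # publish_one stands in for the single-object publish call.
            publish_one(obj["r_object_id"])
            published.append(obj["r_object_id"])
    return published


# Made-up query results; a real program would fetch these from the docbase.
candidates = [
    {"r_object_id": "0900000000000001", "object_name": "testdoc1", "subject": "news"},
    {"r_object_id": "0900000000000002", "object_name": "testdoc2.txt", "subject": "draft"},
    {"r_object_id": "0900000000000003", "object_name": "testdoc3", "subject": "news"},
]

calls = []  # records what would have been published
published = publish_where(candidates, lambda o: o["subject"] == "news", calls.append)
print(published)
```

Keeping the selection logic (the predicate) separate from the publish call makes it easy to test the "where" condition without touching the docbase.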

WebCache optionally pays attention to the a_effective_label, a_effective_date, and
a_expiration_date attributes of each document. If a document has an
a_effective_label matching the effective label specified in the
WebCache configuration object, it will be made available on the target
only for the period of time between the a_effective_date and the
a_expiration_date specified for that document.
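
As a rough model of that rule, the following sketch assumes the availability window opens at a_effective_date and closes at a_expiration_date, with both boundaries inclusive. The inclusive boundaries, the function names, and the treatment of each attribute as a single value are simplifying assumptions for illustration, not documented WebCache behavior.

```python
from datetime import datetime


def window_applies(doc, config_label):
    """True when the document's effective label matches the configured one."""
    return doc["a_effective_label"] == config_label


def within_window(doc, now):
    """True when `now` falls between the effective and expiration dates (inclusive)."""
    return doc["a_effective_date"] <= now <= doc["a_expiration_date"]


# Made-up document attributes for the demonstration.
doc = {
    "a_effective_label": "web",
    "a_effective_date": datetime(2001, 6, 1),
    "a_expiration_date": datetime(2001, 7, 1),
}

matches = window_applies(doc, "web")
too_early = within_window(doc, datetime(2001, 5, 31))
in_range = within_window(doc, datetime(2001, 6, 15))
too_late = within_window(doc, datetime(2001, 7, 2))
print(matches, too_early, in_range, too_late)  # prints: True False True False
```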

Note: Because Documentum WebPublisher uses these special attributes,
they shouldn’t be used in conjunction with WebCache if WebPublisher is
running on the source docbase.

How does WebCache support multiple renditions?

Multiple formats for objects are supported. In the WebCache
configuration object in the docbase, you specify which formats
should be published when more than one exists.

When multiple formats exist, they are placed in the same directory.
The ‘primary’ format’s filename is the object name. Other formats of the
same object are named as the object name (minus the extension, if one exists),
plus the dos_extension of the format (from dm_format).
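
The naming rule can be written down as a small function. This is a sketch of the rule as described here, not WebCache's own code; rendition_filename and its parameters are names chosen for the illustration.

```python
import os


def rendition_filename(object_name, dos_extension, primary):
    """Return the target filename for one format of an object."""
    if primary:
        # The primary format keeps the object name unchanged.
        return object_name
    # Other formats: drop any existing extension, then add the format's dos_extension.
    stem, _old_ext = os.path.splitext(object_name)
    return f"{stem}.{dos_extension}"


print(rendition_filename("testdoc2.txt", "txt", primary=True))   # prints: testdoc2.txt
print(rendition_filename("testdoc2.txt", "htm", primary=False))  # prints: testdoc2.htm
print(rendition_filename("testdoc1", "htm", primary=False))      # prints: testdoc1.htm
```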

For example, say you have a document called testdoc2.txt.
In the docbase, here are the relevant attributes:

object_name   format   dos_extension
------------  -------  -------------
testdoc2.txt  crtext   txt
testdoc2.txt  html     htm

The target database gets a unique A_WEBC_URL entry for each
format. The I_FULL_FORMAT value is also propagated to
the target DB:

A_WEBC_URL               I_FULL_FORMAT
-----------------------  -------------
TestFolder/testdoc2.txt  crtext
TestFolder/testdoc2.htm  html

The A_WEBC_URL represents the path to the document, relative to the root
directory of WebCache’s file dumps on the target.
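
A website program that looks up a row in the database and then opens the matching file might join the two pieces as sketched below. The root directory /var/webcache and the function name are assumptions for the example; the check against paths escaping the root is a defensive addition, not something WebCache requires.

```python
from pathlib import Path


def resolve_cached_file(target_root, a_webc_url):
    """Map an A_WEBC_URL value to a file path under the target's root directory."""
    root = Path(target_root).resolve()
    # Strip a leading slash so the value is always treated as relative to the root.
    candidate = (root / a_webc_url.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"{a_webc_url!r} resolves outside {root}")
    return candidate


# /var/webcache is a hypothetical root; use whatever your target is configured with.
print(resolve_cached_file("/var/webcache", "TestFolder/testdoc2.htm"))
```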

Note: Because documents are given more or less “standard” extensions
during the WebCache process, a webserver that serves them directly
from the target should already deliver most of them with the correct
MIME type. For non-standard extensions, you may need to add the
mappings manually to your webserver’s configuration. The relevant MIME
type data can be gathered from the mime_type and dos_extension fields
in the docbase’s dm_format table.
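
If your webserver is Apache httpd, one way to turn that data into configuration is to generate AddType directives from (mime_type, dos_extension) pairs. The sketch below assumes you have already exported those pairs from dm_format by some means; the sample rows are made up and addtype_lines is a hypothetical helper.

```python
def addtype_lines(format_rows):
    """Build Apache AddType directives from (mime_type, dos_extension) pairs."""
    by_mime = {}
    for mime_type, dos_extension in format_rows:
        # Skip formats that lack either value; they cannot be mapped.
        if mime_type and dos_extension:
            by_mime.setdefault(mime_type, set()).add(dos_extension)
    return [
        "AddType {} {}".format(mime, " ".join("." + ext for ext in sorted(exts)))
        for mime, exts in sorted(by_mime.items())
    ]


# Made-up sample rows standing in for an export of dm_format.
sample_rows = [
    ("text/html", "htm"),
    ("text/plain", "txt"),
    ("application/x-example", "xmp"),
    (None, "dat"),
]

lines = addtype_lines(sample_rows)
for line in lines:
    print(line)
```

Other webservers have their own syntax for the same mapping, so only the formatting step would change.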

Is it possible to publish from two distinct WebCache sources
to one target?

Documentum advises against this: files from the two sources
can overwrite each other on the target, leaving the cache in
an inconsistent state.

Even so, it is sometimes desirable to have data and attributes from
separate docbases available on the same website.

You could do this by:

  • (for content files) Setting up multiple targets,
    and presenting them as one target on the webserver
    side. You can create symbolic links into the target
    content directories from your webserver, or (a more
    drastic approach) write a website front-end that doesn’t
    hit the filesystem based on the URL, but instead takes
    the request and decides which target webcache file area
    to retrieve it from.
  • (for attributes) Publishing to differently named
    tables for each of the multiple targets. Set database
    triggers on these tables that propagate changes to
    a master table on the fly. This way, you’ll have only
    one table to query for attributes, instead of two.
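
The trigger idea for attributes can be sketched as follows. SQLite is used only so the example runs on its own; trigger syntax differs between database products, so treat this as an outline rather than something to paste into your RDBMS. The table names SITEA_S, SITEB_S and MASTER_S and the TITLE column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Master table: SOURCE records which per-target table a row came from.
conn.execute(
    "CREATE TABLE MASTER_S (SOURCE TEXT, A_WEBC_URL TEXT, TITLE TEXT, "
    "PRIMARY KEY (SOURCE, A_WEBC_URL))"
)

for table in ("SITEA_S", "SITEB_S"):
    # One attribute table per target, each mirrored into MASTER_S by triggers.
    conn.executescript(f"""
        CREATE TABLE {table} (A_WEBC_URL TEXT PRIMARY KEY, TITLE TEXT);
        CREATE TRIGGER {table}_ins AFTER INSERT ON {table} BEGIN
            INSERT INTO MASTER_S VALUES ('{table}', NEW.A_WEBC_URL, NEW.TITLE);
        END;
        CREATE TRIGGER {table}_upd AFTER UPDATE ON {table} BEGIN
            UPDATE MASTER_S SET A_WEBC_URL = NEW.A_WEBC_URL, TITLE = NEW.TITLE
            WHERE SOURCE = '{table}' AND A_WEBC_URL = OLD.A_WEBC_URL;
        END;
        CREATE TRIGGER {table}_del AFTER DELETE ON {table} BEGIN
            DELETE FROM MASTER_S
            WHERE SOURCE = '{table}' AND A_WEBC_URL = OLD.A_WEBC_URL;
        END;
    """)

# Simulate publishing activity on the two per-target tables.
conn.execute("INSERT INTO SITEA_S VALUES ('news/a.htm', 'A')")
conn.execute("INSERT INTO SITEB_S VALUES ('docs/b.htm', 'B')")
conn.execute("UPDATE SITEA_S SET TITLE = 'A2' WHERE A_WEBC_URL = 'news/a.htm'")
conn.execute("DELETE FROM SITEB_S WHERE A_WEBC_URL = 'docs/b.htm'")

master = conn.execute(
    "SELECT SOURCE, A_WEBC_URL, TITLE FROM MASTER_S ORDER BY SOURCE"
).fetchall()
print(master)  # prints: [('SITEA_S', 'news/a.htm', 'A2')]
```

Keeping a SOURCE column in the master table avoids collisions when both docbases publish a document with the same A_WEBC_URL.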

What gets copied to the cache when a source document is linked
to another folder?

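If the links reside under the same WebCache-configured root source
folder, a copy is made on the target for each instance of the document,
and for each copy, a set of attributes exists in the database, if
RDBMS functionality is enabled for WebCache. In the listing below,
testdoc1 is linked into both the root folder and InnerFolder, so it
appears in both places on the target: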

$ ls -la
total 48
drwxr-xr-x 3 dmadmin staff 512 Jun 1 13:57 .
drwxr-xr-x 3 dmadmin staff 512 May 24 15:51 ..
drwxr-xr-x 2 dmadmin staff 512 Jun 1 13:57 InnerFolder
-rw-r--r-- 1 dmadmin staff 23 May 24 16:07 testdoc1
-rw-r--r-- 1 dmadmin staff 18603 May 24 15:51 testdoc2.htm
-rw-r--r-- 1 dmadmin staff 46 May 24 15:51 testdoc2.txt
$ cd InnerFolder
$ ls -la
total 6
drwxr-xr-x 2 dmadmin staff 512 Jun 1 13:57 .
drwxr-xr-x 3 dmadmin staff 512 Jun 1 13:57 ..
-rw-r--r-- 1 dmadmin staff 23 Jun 1 13:57 testdoc1

Can contentless objects be exported?

Although the documentation states that they can, it is currently not
possible (as of WebCache version 4.2). This has been reported as
a bug.

A workaround is to attach a 0-byte piece of content to objects that would
otherwise have none. If a version later than 4.2 has been released since
this article was published, the workaround may no longer be needed.

Does WebCache copy virtual documents or multiple versions of
the same document?

When a document is copied, only one version gets pushed to a given
WebCache target.

When a virtual document is copied, if all its components are
present in the to-be-webcached directory, they will be copied,
but the parent/child relationships will not.

To get around the virtual document limitation, you could:

  • Not use/rely on virtual documents for your website
    — or —
  • Instead of using the built-in relationship management mechanism
    for parent/child VDoc relationships, you could use your own
    attribute (i.e. a new attribute called child_object_ids and/or

To get around the multiple versions limitation, you could:

  • Create a job in Documentum to split up the versions
    beforehand, into separate folders. Then run multiple
    WebCache jobs, one for each source folder.
    — or —
  • Create a job in Documentum to split up the versions into
    different objects beforehand, each with a name that indicates
    its version. Then copy them out into the to-be-webcached folder
    and run a WebCache job. You should get all versions that way.
    — or —
  • Create multiple WebCache targets for the same source. Each
    target would be configured to copy a specific version.
    Of course, this creates the problem that multiple targets
    aren’t seen as one cohesive set. See the answer to the
    question “Is it possible to publish from two distinct
    WebCache sources to one target?” for ideas on making two
    targets appear as one.
