Fascinating Facts about FAST and FTDQL

If you worked with earlier versions of Documentum Content Server (5.2.x and below) and are now using 5.3.x, you probably already know that with the release of 5.3, Documentum switched its full-text engine from Verity to FAST.  Because the FAST engine is entirely different from the Verity engine, Documentum has tried to document some of FAST's inner workings.  Most of the information I describe here can be found in Ed Bueché's Tuning Webtop: Advanced Search FTDQL Behavior.

Fascinating Fact #1:

Since FAST is no longer directly tied to the installation of Content Server, you have the option of not installing FAST.  If you do not install the full-text search engine, simple search performs a case-sensitive database search against the object_name, title, and subject attributes of all dm_sysobject objects.
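To make the database fallback concrete, here is a minimal sketch of the kind of DQL such a search could amount to. The exact query Webtop generates is not given in the source, so the LIKE matching, the selected columns, and the helper name simple_search_dql are all my assumptions for illustration.

```python
def simple_search_dql(term):
    """Build a DQL string approximating a database-only simple search.

    Assumption: the search is a substring match (LIKE) on the three
    attributes named in the post. The real generated query may differ.
    """
    # Double any single quotes so the term stays inside the string literal.
    safe = term.replace("'", "''")
    attrs = ("object_name", "title", "subject")
    where = " OR ".join("{0} LIKE '%{1}%'".format(a, safe) for a in attrs)
    return "SELECT r_object_id, object_name FROM dm_sysobject WHERE " + where


if __name__ == "__main__":
    print(simple_search_dql("Invoice"))
```

Because the comparison runs in the database, whether "Invoice" also matches "invoice" depends on the database's own case handling, which is why the post describes this fallback as case sensitive.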

Fascinating Fact #2:

One of the key features of the FAST engine is that, by default, all attribute values are indexed along with the content.  In Verity, we had to explicitly define which attributes to include in the index.  The reasoning behind this is that Documentum determined that, in most cases, the full-text engine would search attribute information significantly faster than the database engine.  This performance improvement is evident in the following two scenarios:

  1. Searching for both attribute value(s) along with full text search
  2. Case insensitive search for attribute value(s)

Fascinating Fact #3:

The performance improvement comes at a cost.  In the following scenarios, search performance actually declines:

  1. ID based searches
  2. Date ranges
  3. Queries using the FOLDER(DESCEND) predicate

If your users mainly search on attribute values using some ID value (e.g. invoice number) and are not concerned with full-text search, then you should change the default configuration, in which advanced search uses full-text search.
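The trade-offs in Facts #2 and #3 can be summed up as a small decision rule. The sketch below is only my reading of the scenarios listed above, not Documentum's actual logic; the SearchCriteria class, its field names, and preferred_engine are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchCriteria:
    """Hypothetical description of what a user's search contains."""
    full_text_terms: str = ""             # words to look for in content
    case_insensitive_attrs: bool = False  # case-insensitive attribute match
    has_id_lookup: bool = False           # e.g. invoice number
    has_date_range: bool = False
    uses_folder_descend: bool = False     # FOLDER(DESCEND) predicate


def preferred_engine(c):
    """Return 'fulltext' or 'database' based on the scenarios in the post."""
    # Fact #2: full text wins when content search or case-insensitive
    # attribute matching is involved.
    if c.full_text_terms or c.case_insensitive_attrs:
        return "fulltext"
    # Fact #3: these query shapes were reported as slower through FAST.
    if c.has_id_lookup or c.has_date_range or c.uses_folder_descend:
        return "database"
    # Otherwise keep the 5.3 default, which is full-text search.
    return "fulltext"


if __name__ == "__main__":
    print(preferred_engine(SearchCriteria(has_id_lookup=True)))
```

If most of your users' searches land in the "database" branch, that is a hint the default advanced search configuration is worth revisiting.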

Folder Hierarchy Structure – KIS (Keep It Simple)

When setting up a cabinet/folder structure in a repository, many people ask how many cabinets and folders they should create initially. As you can probably guess, there is no magic answer.

If your company is well organized, it makes sense to establish cabinets that mimic your business units (e.g. departments). You may have to abstract higher (e.g. division) if you are setting up a repository for the enterprise, especially if your company is very large. If your company is not well organized or your application is not centered around business units, you may want to allow a free-form folder structure. See my other blog entry on locked-down vs free-form folder structure.

I follow the KIS methodology and try to limit the number of public (visible) cabinets to at most 10. I try to apply the same philosophy to folders and subfolders as well. With any more than this, navigation can become cumbersome.

Be careful not to create too many tiers in your folder hierarchy. This too becomes cumbersome for users if they have to drill down a lot. I try to limit the number of tiers to at most 5. This might not be practical if you are setting up a repository for an enterprise. Then again, you might want to get a consultant if you are trying to do this on your own without any prior experience.
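If you want to check an existing or planned structure against these rules of thumb, a short script can do it. This is a minimal sketch that works on a plain list of folder paths. Counting the cabinet itself as the first tier is my assumption, and check_hierarchy is a hypothetical helper, not a Documentum API.

```python
MAX_CABINETS = 10  # rule of thumb from this post
MAX_TIERS = 5      # assumption: the cabinet counts as tier 1


def check_hierarchy(folder_paths):
    """Report the cabinet count and any paths deeper than MAX_TIERS."""
    cabinets = set()
    too_deep = []
    for path in folder_paths:
        # "/Finance/Invoices/2007" -> ["Finance", "Invoices", "2007"]
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue
        cabinets.add(parts[0])
        if len(parts) > MAX_TIERS:
            too_deep.append(path)
    return {
        "cabinet_count": len(cabinets),
        "too_many_cabinets": len(cabinets) > MAX_CABINETS,
        "too_deep": too_deep,
    }


if __name__ == "__main__":
    paths = ["/Finance/Invoices/2007", "/Finance/A/B/C/D/E"]
    print(check_hierarchy(paths))
```

Treat the two limits as adjustable constants. For an enterprise-wide repository you may well decide that a higher number of tiers is acceptable.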
