Fascinating Facts about FAST and FTDQL

For those of you who have worked with previous versions of Documentum Content Server (5.2.x and below) and are now using 5.3.x, you already know that with the release of 5.3, Documentum switched its full-text engine from Verity to FAST.  Because the FAST engine is entirely different from the Verity engine, Documentum has tried to document some of FAST's inner workings.  Most of the information I am describing can be found in Ed Bueché's Tuning Webtop: Advanced Search FTDQL Behavior.

Fascinating Fact #1:

Since FAST is no longer directly tied to the installation of Content Server, you have the option of not installing FAST.  If you do not install the full-text search engine, simple search performs a case-sensitive database search against the object_name, title, and subject attributes of all dm_sysobjects.
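
To make the database fallback concrete, here is a minimal Python sketch that builds the kind of DQL such a search could translate to. The exact query Webtop generates is not shown in the source material, so the LIKE-based matching, the selected columns, and the function names are all assumptions made for illustration.

```python
def escape_dql_literal(value: str) -> str:
    """Escape a string for use inside a single-quoted DQL literal."""
    # DQL, like SQL, escapes an embedded single quote by doubling it.
    return value.replace("'", "''")


def simple_search_dql(term: str) -> str:
    """Build a database-only query over the three simple-search attributes.

    Assumption: substring matching with LIKE. The comparison is case
    sensitive because the database, not the full-text engine, evaluates it.
    Note that % and _ inside the term would still act as LIKE wildcards.
    """
    pattern = escape_dql_literal(term)
    attributes = ("object_name", "title", "subject")
    conditions = " OR ".join(f"{name} LIKE '%{pattern}%'" for name in attributes)
    return f"SELECT r_object_id, object_name FROM dm_sysobject WHERE {conditions}"


if __name__ == "__main__":
    print(simple_search_dql("Invoice"))
```

Because the comparison runs in the database, a search for "invoice" built this way would not match an object named "Invoice".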

Fascinating Fact #2:

One of the key features of the FAST engine is that, by default, all attribute values are indexed along with the content.  In Verity, we had to explicitly define which attributes to include in the index.  The reasoning is that Documentum determined that, in most cases, the full-text engine would search attribute information significantly faster than the database engine.  This performance improvement is evident in the following two scenarios:

  1. Searching on attribute value(s) together with a full-text search
  2. Case-insensitive search for attribute value(s)
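
The first scenario can be sketched as a single DQL statement that combines a SEARCH DOCUMENT CONTAINS clause with an attribute condition. The Python helper below only assembles the string. The object type, the selected columns, and the function names are assumptions for illustration, and whether a particular query is actually served entirely by the full-text engine depends on the FTDQL rules described in the tuning guide.

```python
import re

# Attribute names are interpolated into the query, so restrict them to
# plain identifiers instead of trusting caller input.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_dql(value: str) -> str:
    """Return value as a single-quoted DQL literal (quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def fulltext_with_attribute_dql(phrase: str, attribute: str, value: str) -> str:
    """Combine a full-text condition with one attribute equality test."""
    if not _IDENTIFIER.match(attribute):
        raise ValueError(f"not a valid attribute name: {attribute!r}")
    return (
        "SELECT r_object_id, object_name FROM dm_document "
        f"SEARCH DOCUMENT CONTAINS {quote_dql(phrase)} "
        f"WHERE {attribute} = {quote_dql(value)}"
    )


if __name__ == "__main__":
    print(fulltext_with_attribute_dql("budget", "subject", "finance"))
```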

Fascinating Fact #3:

The performance improvement comes at a cost.  In the following scenarios, search performance actually declines:

  1. ID based searches
  2. Date ranges
  3. Queries using the FOLDER(DESCEND) predicate.

If your users mainly search on attribute values using some ID value (e.g., an invoice number) and are not concerned with full-text search, then you should change the default advanced search configuration so that it no longer uses full-text search.
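
The trade-off above can be summarized as a small decision rule. The Python sketch below is only a heuristic restating of the three scenarios; the function name, its parameters, and the "database"/"fulltext" labels are hypothetical and do not correspond to any Documentum or Webtop setting.

```python
def recommended_engine(
    fulltext_terms: str = "",
    has_id_lookup: bool = False,
    has_date_range: bool = False,
    uses_folder_descend: bool = False,
) -> str:
    """Suggest which engine is likely the better fit for a search.

    Heuristic only: a search that needs content matching has to go to the
    full-text engine; otherwise ID lookups, date ranges, and
    FOLDER(DESCEND) queries are the cases where the database is expected
    to perform better.
    """
    if fulltext_terms.strip():
        return "fulltext"
    if has_id_lookup or has_date_range or uses_folder_descend:
        return "database"
    return "fulltext"


if __name__ == "__main__":
    # An invoice-number lookup with no content terms.
    print(recommended_engine(has_id_lookup=True))
```

If most of your searches fall into the "database" bucket under a rule like this, that is a sign the default full-text configuration is working against you.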

15 responses to “Fascinating Facts about FAST and FTDQL”

  1. Dear Johnny,
    We have implemented a custom search (extending advsearch 5.3), and I would like to know the best way to improve its performance.
    We only provide metadata-based search, not content-based search.

  2. The other cost associated with the new FAST engine is literally the cost ($$$). The memory and storage requirements are such that it really needs to be hosted on a separate physical host to work properly.

    We also discovered that multi-threading the indexing process greatly increases indexing performance but also memory requirements. By adding additional document processors (using the FAST admin web tool – System Management page), we got about a 10-fold increase in indexing speed.

    -Darla

  3. Hi dhruv,
    If you are not interested in content-based search, then just disable FTDQL and/or customize simple search to search against your metadata.

    Darla,
    Thanks for sharing your experience. I didn't think increasing the document processors could generate a 10-fold increase in indexing speed.

  4. Hi Johnny,

    Darla talks of the web admin tool for FAST. Do you know what that is? What's the best practice for adding document processors?

    Dave

  5. The admin tool is a web app that is part of the Index Server install. I don't know what the URL is off the top of my head. Have you read through the Index Server install guide?

  6. Dear Johnny,

    Do you know how we could keep FAST from indexing our job logs and reports?
    We have publishing jobs that run on a tight schedule and generate logs for possible troubleshooting, but we don't want those logs to be indexed by FAST.
    The point is that the log files of these jobs (or of all jobs) are dm_document objects, so unregistering events for the dm_document type may lead to the problem that no dm_document objects are indexed any more.
    Is there any solution to keep job log files/reports from being picked up by the index agent for indexing?

    many thanks in advance
    abdolreza

  7. Hi abdolreza,
    The easiest way to solve this problem is to upgrade to 5.3 SP5, which was released just last night. The new version of DA allows you to configure which object types should be indexed. If you can't upgrade to SP5, I think there is a manual (yet tedious) procedure that you can follow to correct this. You have to contact Documentum Tech Support to get this process; I haven't done this before.

  8. Hi Johnny,
    Many thanks for your response and sorry for my delayed answer.
    Our current docbase version is 5.3 SP3. You have mentioned that one workaround to solve the indexing problem is to upgrade to 5.3 SP5.
    Could we upgrade only the Index Server to 5.3 SP5, or should we upgrade both (docbase and Index Server)? We are in a controlled environment, so upgrading docbase versions triggers a lot of other side effects. It would be good if it were possible to combine docbase version 5.3 SP2 with Index Server 5.3 SP5. Is that possible?

    Many thanks & best regards
    abdolreza

  9. Hi abdolreza,
    You should contact Documentum Support to get a definitive answer before proceeding.

  10. We are using Documentum 5.3 SP4 with FAST indexing enabled. Via the Webtop Advanced Search screen I perform 3 searches:
    1. ‘Application Name’ attribute ‘=’ the string ‘cdo’. Returns case-insensitive matches (i.e., matches ‘CDO’).
    2. ‘Application Name’ attribute ‘contains’ the string ‘cdo’. Returns case-insensitive matches (i.e., matches ‘CDO’).
    3. ‘Application Name’ attribute ‘in’ the string ‘cdo’. Returns no results. I need to enter the string in upper case (‘CDO’) for the query to work.
    I have confirmed this behavior in out-of-the-box DA with a dm_document object.
    Can anyone confirm this behavior? It seems to be inconsistent, and I am also unable to find it documented anywhere.
    Thanks

  11. This is a support question. I suggest you post your observations on EMC Support Forums to get better feedback.

  12. In Documentum 5.3, is it possible to configure the index to include only specific attributes, rather than all attributes by default?

    thanks

  13. Hi Deepak,
    The index server either indexes all of the attributes or NONE (including content). If you configure none, then you would have to manually add database indexes to the specific attributes that you want to index.

  14. Johnnygee…by chance are you available for consulting/contract work?

  15. Hi johnny,
    Do you know of any issues with performing a search by a word in a scanned document?

    We have a strange thing going on: when we perform a search by a word (using FTDQL), some of the results do not seem to be relevant to the search criteria.

    Any ideas?

    10x
