Designing custom object types and their hierarchy

One of the most frequently asked questions when implementing Documentum is how best to design custom object types. Content Server presents everything as objects, even though the data is stored in a relational database. Most custom object types are derived from SysObject (aka dm_sysobject). The SysObject is the basis for most objects in a repository and contains the most common attributes that are associated with content (eg object_name, r_creation_date, owner_name, etc). When designing custom types and attributes, there are some common questions you might ask:

  • How many subtypes should I create?
  • How many sub-levels from dm_sysobject can I create without impacting performance?
  • When should I create a subtype vs a custom attribute that stores the object type?

Here are some guidelines that I use based on the 10 years I have been working with Documentum.

  1. Create a corporate/enterprise object type that will contain the custom attributes that are applicable across the enterprise.  This custom type is a subtype of dm_document and is not visible to normal users.  Visible custom types will be derived from this corporate object type.
  2. When creating subtypes, always be concerned about the total number of object types that will be presented to a user during the import/content creation process.  From a conceptual level, it might make sense to create 20+ object types that map to the different documents available at the corporate level.  However, if there are too many object types, it becomes troublesome later on to train users on which type to select when importing.  Try to keep the number of visible object types to 7-10.
  3. In older versions of Content Server, creating custom object types several levels deep from dm_sysobject would significantly impact performance when querying.  This is no longer a real problem these days given the optimization of the Content Server and enhancements to database performance.  The main driver in keeping the object hierarchy flat (vs deep) is purely management of custom attributes.  Keeping a flat object hierarchy forces users to think about object types and attributes on an enterprise level vs dept level.  Organizations that have deep object hierarchies may encounter object migration issues when they need to consolidate object types, due to the sheer number of custom types that have proliferated.  Try not to exceed 5 sub-levels below dm_sysobject.
  4. Create a subtype only to add custom attributes to the parent type.  Do not create sibling types if they do not have unique attributes that would distinguish them from one another.  If you just want to identify the kind of document, use a custom attribute (eg media_type).
  5. Do not create custom subtypes and/or attributes unless there is a business need to capture that information.  Intuitively, you might want to capture all information/metadata related to a document and therefore you create a custom type with all of these custom attributes.  Consider this: who is going to enter all of those attribute values?  Limit the attributes to those that are essential for business purposes (eg searching).  If more information is required, try to think of automated ways to populate the information (eg only require zip code, have an automated program that would populate city and state).  The same principle applies to custom types.  Do not create them unless there is a business requirement to only search for this custom type and exclude other types.
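
Guidelines 2 to 4 can be turned into a rough checklist that is run against a proposed type model before anything is created in the repository. The following is only a minimal sketch in Python: it does not call any Documentum API, and the TypeDef class, the review function and the type names (corp_type, invoice, contract, legacy_contract) are all hypothetical. The "adds no attributes" check is a simple approximation of guideline 4.

```python
from dataclasses import dataclass

ROOT = "dm_sysobject"
BUILTIN_PARENTS = {"dm_document": ROOT}  # built-in types the custom model hangs off
MAX_DEPTH = 5            # guideline 3: at most 5 sub-levels below dm_sysobject
MAX_VISIBLE_TYPES = 10   # guideline 2: keep visible types to roughly 7-10


@dataclass(frozen=True)
class TypeDef:
    """Hypothetical description of one proposed custom type."""
    name: str
    supertype: str
    attributes: frozenset = frozenset()  # attributes added at this level only
    visible: bool = True                 # shown to users at import time?


def depth_below_root(name, types):
    """Count how many levels a type sits below dm_sysobject."""
    depth = 0
    while name != ROOT:
        name = types[name].supertype if name in types else BUILTIN_PARENTS[name]
        depth += 1
    return depth


def review(types):
    """Return guideline warnings for a proposed model (name -> TypeDef)."""
    warnings = []
    visible = [t.name for t in types.values() if t.visible]
    if len(visible) > MAX_VISIBLE_TYPES:
        warnings.append(
            f"{len(visible)} visible types; aim for {MAX_VISIBLE_TYPES} or fewer")
    for t in types.values():
        if depth_below_root(t.name, types) > MAX_DEPTH:
            warnings.append(
                f"{t.name} is more than {MAX_DEPTH} levels below {ROOT}")
        if t.visible and not t.attributes:
            warnings.append(
                f"{t.name} adds no attributes; consider an attribute on its parent")
    return warnings


# Hypothetical model: a hidden corporate type with three visible subtypes.
model = {t.name: t for t in [
    TypeDef("corp_type", "dm_document", frozenset({"dept"}), visible=False),
    TypeDef("invoice", "corp_type", frozenset({"invoice_num"})),
    TypeDef("contract", "corp_type", frozenset({"contract_num"})),
    TypeDef("legacy_contract", "corp_type"),  # nothing of its own to distinguish it
]}

for warning in review(model):
    print(warning)
```

In this made-up model only legacy_contract is flagged, since it adds nothing that an attribute on corp_type could not capture. The thresholds are the ones suggested in the guidelines above and can be adjusted to taste.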

There are many more guidelines I use that are very dependent on the type of application and business rules.  Feel free to share your experiences.

21 responses to “Designing custom object types and their hierarchy”

  1. Just to be sure I understand this fully…. You create custom types based on the document stored. These custom types (or table in DB world) contain attributes (fields) that hold the information within the document. Or is there another layer?

    dm_document
    |_corporate_document
    |_custom_type

    or

    dm_document
    |_custom_type

    Based on that, if you have 1 type of document stored in Documentum (invoices, contracts, whatever), is that the corporate or custom?

  2. I am working in a large financial company with branches worldwide. My manager proposed a flexible approach for custom object types. That is, using the registered reference tables synchronized from normalized database and saving primary key(s) of reference tables as required attribute(s) in custom object types. And more, custom document inherits those primary key(s) from custom folder it links to.

    The advantage is that clients may rename the description whenever they want. But a big performance problem comes as a trade-off: we display the description in the front end, so every time a Business Object is called to get/set properties, link/unlink, etc, a database join is required.

    Any suggestion?

  3. ktjoker,
    I would create this
    dm_document
    |_corp_type
    |_invoice
    |_contract

    This would allow me to define common attributes at the corp_type level (eg dept) and then individual attributes at the lower level (eg invoice_num for invoice and contract_num for contract).

  4. ledward,
    Unfortunately Documentum object model functionality is not designed to be dynamic in nature. This is more evident when defining attributes as optional for one lifecycle state and then required later on. I have no suggestions at this moment, but if I come up with something, I’ll definitely post it.

  5. Johnnygee

    So both invoice and contract custom types are siblings under corp_type? Based off this approach, would good examples of attributes stored at the corp_type be customer information since it can be found in both invoice and contracts?

    Also, Can searches be performed against this model without having to pick an object type? If so, can the following occur:

    1- User wants to search all documents for a particular customer (attributes reside in corp_type).

    2- User wants to search for documents that may be linked to attributes that are in both invoice and contracts? For example, a search page with attributes that are in both, and they enter information in those attributes.

    Thanks,

  6. ktjoker:
    1) Yes
    2) Yes
    3) If you are performing a simple search in Webtop, then the selection of an object type is not necessary, since all of the attribute values are indexed regardless of object type. If you are performing an Advanced Search, then the user needs to select object type in order to get list of attributes that he/she can search against. If you are creating a custom application, you have full control over what a user can search against.

  7. Thanks Johnnygee.

    So for the advanced search, the user will have to choose the corp_type, contracts, or invoice object type. If that is the case, then I’m assuming that data is denormalized within each of the custom types. Example: contracts can be multiple levels in a DB structure. Both those levels would exist in flat format in the contracts custom type.

    I guess for our purposes it would be good idea to have the following since the users feel they want to 1) search all contracts based on attributes that can be in all 2) search for only current contracts 3) search just legacy.

    dm_Document
    |_Customer
    |_Contracts
    |_Legacy Contracts

    I put Customer cause this is a common (or global) attribute set.
    Thanks,

  8. The database structure is normalized. Custom attributes defined at corp_type are part of the corp_type tables. Custom attributes defined at contracts are part of the contracts tables. When selecting attributes for contracts, content server performs the appropriate joins to get all of the custom attributes for contracts and the inherited attributes from parent object types (eg corp_type, dm_document, and dm_sysobject).

    Your model should be fine as long as current contracts will never change to legacy (ie legacy contracts never change). Otherwise, you will have to change the object types whenever current becomes legacy.
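
The inheritance joins described in this reply can be pictured with a small Python sketch. It assumes one table per type level, which simplifies how a repository actually lays out its tables, and the SUPERTYPES map and both helper functions are hypothetical, not part of any Documentum API.

```python
# Hypothetical parent map: type name -> supertype (None marks the root).
SUPERTYPES = {
    "dm_sysobject": None,
    "dm_document": "dm_sysobject",
    "corp_type": "dm_document",
    "contracts": "corp_type",
}


def type_chain(type_name, supertypes=SUPERTYPES):
    """Return the type followed by each of its ancestors up to the root."""
    chain = []
    current = type_name
    while current is not None:
        chain.append(current)
        current = supertypes[current]
    return chain


def join_count(type_name, supertypes=SUPERTYPES):
    """Joins needed to combine one table per level of the chain (simplified)."""
    return len(type_chain(type_name, supertypes)) - 1


print(type_chain("contracts"))
print(join_count("contracts"))
```

Each extra level in the hierarchy adds one more entry to the chain, which is why a deeper tree means more joins when all attributes of an object are fetched.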

  9. Johnnygee

    Thanks for input. I meant normalized one more level down. Contracts type can have multiple levels (1 to many attachments per agreement which is actual contract).

    So…

    Contracts
    |_Attachments

    I’m assuming that this isn’t necessary (not sure if good approach though) and it is fine to keep it flat since the users will request to search for all contracts for a particular customer. They want to see all agreements and attachments for each contract or virtual document. Also, I’ve read it isn’t good to have many levels in the structure for performance reasons.

    Thanks

  10. It should be kept flat. See #5 item in my original post. Also, I have seen virtual documents used in the scenario you described.

  11. Johnnygee

    Great feedback, thanks.

  12. My gut feel on this is that I don’t like creating attributes if only a fraction of document types (i.e. real documents) utilize them. I also don’t like creating attributes that may lead to indexers using them for multiple definitions/purposes (e.g. “reference_no”, which could be defined differently across the enterprise).

    I think the ironic thing here is that you have Documentum, which is a highly object-oriented architecture, but when it comes time to design custom object types, some basic OO principles tend to take a back seat to things like search optimization and indexing. But it’s a balance, for sure. And nobody wants to do additional indexing when it’s possible to auto-generate values.

    I hear feedback from novice business users and developers who balk at the notion of more than a couple of levels deep (after dm_document). This concern, to me, without any other justification besides the older issue of performance, is misplaced.

  13. As far as #4 – What if you had some types of documents that are logically different, yet share the same attributes, each having a different lifecycle? Would you still stick with a custom attribute to set them apart, then use some other means to auto-apply the appropriate lifecycle based on the value set in the custom attribute?

  14. Besides having different lifecycles, do you plan on processing the documents differently? For example in WDK, you might want to disable the permissions tab for one type, but not another. This is the one situation where I think the overhead of creating additional types would be beneficial, since you can scope WDK components by type very easily. If you don’t plan on processing the document types differently, I don’t see why an attribute would not suffice. In these situations, I have actually created a doc_type attribute on past projects.

  15. Thanks johnnygee – You do bring up an excellent point in that we do want to utilize some of the attribute configuration to display the ‘doc_types’ differently, so in this case it’s probably worth going forward with different object types.

  16. Hi Johnny,

    As per this article, I understand that the object model hierarchy has NO impact on the performance of the system in a 5.3 environment. It only has to do with the management of the object hierarchy.

    Would this be a correct assumption?

    I had seen that in DCTM 5.2.5 the deeper the hierarchy, the greater the performance impact.

    Regards,
    Bonson

  17. There is no real SIGNIFICANT impact. There will always be a performance hit when joining subtypes. The deeper the tree, the more table joins the content server has to make to retrieve all of object attributes.

  18. Hi Johnny,

    I understood that there should be no performance drawback anymore when working with a deep hierarchy (although 5 levels seems to be a commonly accepted number of levels). Do you think there would be a performance issue when only 1 (custom) document type is used for let’s say 3 million documents and the rest of the categorization is done at attribute level?

    Regards,
    Dave

  19. Hi Dave,
    3 million object instances of one type is no issue. I’m not sure what you mean by “categorization” at attribute level. If you are asking whether it makes sense to create a custom attribute (eg doctype) to store the “object type” vs create a custom object type, then the answer is yes (see guideline #4).

  20. Hi Johnny,
    I’m currently learning about Documentum while applying for work. I’ve faced quite a lot of errors while trying but I have no idea what went wrong.

    May I know is there any restriction on the naming of the custom object type? I’ve added new custom object type with custom attribute. However, if I named them as dm_xxx, they are not considered custom type and I could not delete them. I’ve tried using DQL DROP TYPE “dm_xxx” but it said type can’t be dropped. May I know how can I resolve this? Thanks!

  21. Hi Johnny,
    I’ve just read a forum and it said object type name can’t start with “dm”. My mistake… however, is it possible to delete it?
