Well, if you have deployed Documentum before, you have probably heard this from at least one of your clients. The great thing about Google is the speed and simplicity of search execution (I won’t discuss search relevancy, since that can be very subjective). The problem with Google as it relates to ECM is that all content is created equal. In other words, all content that Google indexes is treated identically: no individual security is assigned to each document, web page, etc. I am aware that you can configure Google to filter content located in different file stores/locations to limit access to certain groups of users. However, this is not true document-level security.
In the ECM world, one must be able to apply different permissions to different documents, and these permissions must be honored when searches are performed. This used to be problematic for Google, until the release of the Google Search Appliance (GSA) 5.0. With GSA 5.0, there is now an Enterprise Connector for EMC Documentum.
The connector first crawls the Documentum repository using a superuser account. The crawling process extracts metadata and content from the repository, which GSA uses to generate its index. Users perform searches against the GSA, not against the Content Server directly.
When GSA returns results, it prompts the user to enter his/her credentials using the Webtop interface. The connector passes these credentials, along with the search results, to the Content Server. The Content Server evaluates whether the user has permission to view each document in the search results and then returns the authorized result set to the connector.
Finally, GSA displays only the results that the user has access to. Problem solved! Maybe…
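To make the flow above concrete, here is a toy Python sketch of the idea: index everything with a superuser crawl, then filter each hit against the repository's permissions at query time. This is not the connector's actual API. Every name here (`ToyRepository`, `ToyIndex`, `can_view`, `search_as`) is hypothetical, and the `readers` set is only a stand-in for a real Documentum ACL.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    title: str
    readers: set  # hypothetical stand-in for a Documentum ACL


class ToyRepository:
    """Plays the role of the Content Server: stores documents, answers permission checks."""

    def __init__(self, docs):
        self._docs = {d.doc_id: d for d in docs}

    def crawl(self):
        # The superuser crawl sees every document, regardless of permissions.
        return list(self._docs.values())

    def can_view(self, user, doc_id):
        doc = self._docs.get(doc_id)
        return doc is not None and user in doc.readers


class ToyIndex:
    """Plays the role of the GSA: built from the crawl, queried by end users."""

    def __init__(self, repo):
        self._repo = repo
        self._entries = [(d.doc_id, d.title.lower()) for d in repo.crawl()]

    def search(self, term):
        # Unfiltered search: every indexed document is treated the same.
        term = term.lower()
        return [doc_id for doc_id, title in self._entries if term in title]

    def search_as(self, user, term):
        # Ask the repository about each hit and keep only the authorized ones.
        return [d for d in self.search(term) if self._repo.can_view(user, d)]


repo = ToyRepository([
    Document("d1", "Budget 2008", {"alice"}),
    Document("d2", "Budget guidelines", {"alice", "bob"}),
    Document("d3", "Holiday schedule", {"bob"}),
])
index = ToyIndex(repo)

print(index.search("budget"))            # ['d1', 'd2']
print(index.search_as("bob", "budget"))  # ['d2']
```

Note that in this sketch `search_as` makes one permission check per hit, so the cost of filtering grows with the size of the raw result set.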
I have not implemented this solution from Google myself, and I still have a few questions:
- What is the performance like, given that the connector has to communicate with the Content Server to evaluate permissions on the result set? This shouldn’t be a concern if the result set is small, but what happens if the result set is large?
- Why is Webtop needed? It would be ideal if the connector provided an interface to authenticate directly with the Content Server.
- How is document versioning handled (i.e. updated content, newer versions, etc.)? Are renditions supported?