Happy Birthday Beach Street!

Today marks the 10th year since Beach Street Consulting was formed.  I have been very fortunate to belong to this family of friends for the past 7.5 years.  We have been through ups and downs, like many other IT consulting companies.  The things that have remained constant are our sincere desire to help our customers, our belief in making every employee feel like they are part of a family, and our strong interest in building solutions and solving complex problems.

Looking forward to many more years to come.

Johnny Gee, CTO

Review – Alfresco 4 Enterprise Content Management Implementation.

Alfresco 4 Enterprise Content Management Implementation

It’s been over 3 years since I reviewed Alfresco 3 Enterprise Content Management Implementation.  Since then, Munwar Shariff has published a revised version, Alfresco 4 Enterprise Content Management Implementation.  I thought it was a good opportunity to catch up on the latest features of Alfresco 4.  Unlike with previous book reviews, I personally purchased this book.

If you are a newcomer to Alfresco, then this book is for you.  It covers all the basics of Alfresco 4.1.2 Enterprise Edition.  If you are familiar with Alfresco 3 or you have purchased the previous Alfresco 3 book, there is not really much of a change.  In fact, the table of contents of the Alfresco 4 book is pretty much the same as that of the Alfresco 3 book.  You can see my previous blog post for highlights of Alfresco 3.

I will cover the new stuff that has been introduced in the Alfresco 4 version of the book.

Chapter 1 introduces the new Solr search and the Activiti workflow engine for Business Process Management.  There is an entire chapter dedicated to Activiti later on.  It also includes brief descriptions of newer products like Alfresco in the cloud and Alfresco Workdesk, as well as enhancements to the iPad and Android mobile apps.

Chapter 9 discusses various methods of integrating with other applications.  In this version of the book, the author provides more in-depth coverage of Web Scripts.  There are several examples of Web Scripts that the reader can use as a starting point for integrating with other systems.  The chapter also describes the REST APIs that can be used.

Chapter 10 has been entirely rewritten (as compared to the Alfresco 3 book).  It reads more like an administrator training manual and assumes that the reader understands the purpose of Alfresco Share vs Alfresco Explorer.

I got the most value out of this book from Chapters 8 and 13.

Chapter 8 covers both simple workflows and advanced workflows.  Support for advanced workflows is now provided by the Activiti workflow engine.  The author does an excellent job introducing BPMN and the workflow basics of Activiti.  There are step-by-step instructions on how to use the Activiti Process Designer to define a custom workflow, as well as potential extensions to the individual tasks in the workflow.

Chapter 13 introduces the integration between Ephesoft and Alfresco.  If you are not familiar with scanning products like Captiva and Kofax, Ephesoft is an open source product that integrates directly with Alfresco.  Not only does it have basic features such as OCR and indexing, it also has the ability to classify forms and automatically extract data from them.

All of the major ECM vendors have an advanced workflow engine and integration with an advanced scanning solution.  With the introduction of Activiti and tight integration with Ephesoft, Alfresco 4 is quickly catching up to EMC, IBM, and Microsoft.  I am glad that the author covered these two topics; my only wish is that he had covered them in more detail.

Momentum DevCon Theme = Harmonization

Before I dive into the conference, I want to first acknowledge that I have not been blogging for quite some time.  Some of this has to do with work-life balance and family; the rest is that the technology has not really excited me enough to spend time writing a blog.  It’s much simpler to “tweet” an interesting article or make a comment about reading something interesting.  A blog entry requires time to think about what’s important to you and, more importantly, what you want to share with others.  It’s been a long time since I had the opportunity to go to DevCon and meet the engineers and product managers of IIG.  Coming out of this conference, there’s a certain excitement starting to brew about what’s coming down the path and how the product of today (and tomorrow) is going to impact the future strategy of EMC Documentum.

So how is EMC achieving harmonization?  Here are some key examples:

1) xCP 2.1 (coming out next year) will allow xCP 2.x applications to co-exist with xCP 1.x apps as well as non-xCP apps (including WDK-based apps).  Ideally, this would have been part of the 2.0 release, but given that 2.0 was a major code release (from 1.x), I understood the need to push the interoperability to 2.1.

2) When xCP 2.2 comes out (timeline unknown), it will be interoperable with D2 4.x apps.  The infrastructure to support this will be included in 2.1, but the actual implementation will have to wait until 2.2, since EMC is trying to incorporate some of the real-time configuration capabilities of D2 into xCP.  We will start seeing D2/xCP integration sooner, though: there are plans to incorporate some of the xCP reporting UIs into D2 (possibly D2 v4.3).

3) The Syncplicity Connector to Documentum will support bi-directional synchronization.  No longer are we limited to just “pushing” content out of Documentum to Syncplicity.  Users can make updates to the documents stored in their Syncplicity folder and the documents will be sync’d back to the Documentum repository.

4) Syncplicity today supports both off-premise storage hosting (i.e., Syncplicity servers) and on-premise storage via Isilon.  EMC plans to expand the storage capabilities of Syncplicity to allow admins to select their on-premise medium of choice via EMC ViPR technology.

5) Captiva 7.1 has a mobile toolkit that will allow developers to build image enhancement solutions on both the iOS and Android platforms.

6) Finally, we know that EMC knows private cloud.  We have heard them talk about hybrid cloud.  DevCon was the first time I heard of EMC working towards public cloud.  Specifically, IIG is building public solutions on NGIS using xCP Designer.  These solutions will be truly multi-tenant and scalable across the globe with multiple data centers.  They will focus on collaboration and will be vertically focused, with built-in integrations.

I am very excited that multiple products are maturing to the point where the product roadmap is leading to unification and harmony between all the products.  This is not going to happen overnight, but I feel the future is bright.

Honored to be one of the first 75 EMC Elect Community Members

I have been honored in the past by EMC for my contributions to the EMC Support Forums (specifically Documentum).  Today, I am honored that EMC has selected me to be one of the members of the 2013 EMC Elect.  I’m looking forward to interacting with other EMC Elect/Experts in the other EMC product fields.  I am hoping that this program will evolve into something like Microsoft MVP, but only time will tell.

I’m hoping to get to EMC World this year and maybe check out the VIP treatment that an EMC Elect member receives :)  There is also a private EMC Elect community that has been created and is supposed to have exclusive content.  I will let you know if there are some goodies in there that may inspire other folks to work toward becoming an EMC Elect.  New members will be added every year, so if you were not accepted this year, you have all of 2013 to work toward Elect status.

EMC World: From Outsider Perspective

Unfortunately, I was unable to attend EMC World this year.  Luckily, there were numerous bloggers who did a great job highlighting the conference.  There was a lot of talk about the Syncplicity acquisition as well as showcasing of xCP 2.0 and D2.  What I found most interesting was Ron Miller’s interview with Rick Devenuti, who is President of EMC IIG.  Ron’s article “Documentum sees a future in the cloud – Looking to SMB market as growth area” quoted Rick as saying that “…By marketing Documentum as Platform as a Service (PaaS), EMC can suddenly attract a new group of customers beyond its traditional base because third parties can build applications geared specifically to SMB requirements.”

This is similar to something I wrote over two years ago after attending the EMC Writer’s Summit in NYC.  Back then we talked about what would happen if ECM became a commodity.  I felt that if 1) the technology was truly integrated into the platform AND 2) partners could quickly build/configure solutions AND 3) the solutions could be easily hosted, there would be a large untapped market that EMC could capitalize on, specifically SMB.  I believe 1) is finally here with EMC onDemand, 2) is finally here with xCP 1.6 (and even more so with xCP 2.0 coming out later this year), and 3) is going to be here soon with Syncplicity integration.

A lot of the bloggers talked about the SSO, IRM, and synchronization technology that EMC will get from Syncplicity, but what isn’t mentioned as much is the business knowledge that Syncplicity brings to EMC, specifically around the operating costs and pricing model for cloud-based solutions.  Syncplicity’s knowledge of the freemium model, as well as their 4+ years of datacenter operations, is key to becoming competitive in the SMB market.  EMC could benefit from developing a different licensing model for Documentum (possibly subscription vs perpetual license) to accommodate the minimal upfront costs that SMBs typically require.

I’m looking forward to the day that we can offer our solutions to SMB!

Review – Alfresco Share

Alfresco Share by Amita Bhandari, Vinita Choudhary, and Pallika Majumdar

After reading the Preface, I was excited to see three things about this book:

1) it covered the latest UI/application for Alfresco
2) it was written for a business user (not a developer)
3) it was the second book written by the same group of people who authored “Alfresco 3 Enterprise Content Management Implementation”

I enjoyed their first book and was looking forward to reading their “sequel”.  Before diving into the review, I want to acknowledge that the publisher invited me to review the latest book about Alfresco and provided me with a free copy of the book to review.  This new book had a different spin than most of the other Alfresco books that I have reviewed.  The authors built a case study and provided tips on how to implement a collaboration strategy for a company.  It was supposed to focus on business requirements and not delve deep into technical syntax.  Unfortunately, I was somewhat disappointed by how little of the book stuck to this goal.

The authors started off well in Chp 1 by describing various collaboration features in Share and by framing the case study around developing a marketing site for a new product.

Unfortunately, Chp 2-4 diverged from the business user perspective and covered installation, architecture, and system administration.  The same information has been presented in other Alfresco books.  It was very similar to reading a sequel that spends the first third of the book rehashing what occurred in the original book.  If this is the first Alfresco book you buy, then the material is relevant.  However, I doubt anyone who is investigating Alfresco as an ECM platform will buy only this one book.

Chp 5-7 covered site management, collaboration features in more detail, and the importance of the document library.  The authors did a good job setting up the case study from a screen shot perspective, but did not really relate how business requirements dictated which features to use and which settings to configure.  A technical book oriented towards developers typically glosses over the business requirements in favor of focusing on the technical aspects of the product or feature.  I feel that the authors did exactly this and lost sight of their original goal of writing for a business user.

Chp 8 was the best chapter in the book.  It covered how to implement workflow within Share and presented the content in terms of the use case.  The authors created very good diagrams that defined the marketing business process.  They described the various review states (or subspaces) and how the users in the various review groups participated in the workflow.  The diagrams were well thought out and conveyed the business requirements in a way that most business users would understand.

Chp 9-10 covered advanced features and deployment of Share.  Again, I feel these chapters may not be as useful to a business user, but I did learn something new about Share – there is integration with Google Docs.

In summary, I really liked the idea of the book; however, I was disappointed with how the book turned out.  Most books about technology contain content provided by the software vendor.  They have to; it is part of educating readers about the software.  My personal perspective is that a good book has a well-defined audience (e.g. beginner developer, advanced developer, or non-technical user) and presents the material in a manner that this audience can understand.  This book contains all the relevant material about Share, but does not live up to the potential of what it could have been.

In parting, I know sequels can be harder to write, so I hope I do not discourage the authors from continuing to write more books.  They know the material well; they just need to tweak the presentation.

Email & Records Management – Putting square peg in a round hole

There are many email archiving products that allow companies to apply retention and disposition to emails, including EMC SourceOne.  The problem with these kinds of products is that they tend to address email archiving from a storage perspective instead of a records management perspective.  Here is a list of records management challenges that are unique to emails:

  1. Single copy – Most emails tend to have multiple recipients (in the To, Cc, or Bcc fields).  So, if retention rules are defined for multiple users, you can potentially have duplicate email records.  However, most email archiving products do a good job of de-duplicating emails, so you do not have multiple copies of the same record.
  2. Categorization/Classification – Retention in records management is typically associated with a fileplan.  Based on the content of a document, the process the document is derived from or applied to, or the creator of the document, the document is classified to the appropriate record series in the fileplan.  The challenge with email is that it’s typically short, so there may not be enough content to classify it correctly.  Some emails can be categorized to a process (e.g., invoice sent) and can then be placed in the appropriate record series and retained for the appropriate amount of time.  Most emails have to be categorized based on who the owner is (e.g., CEO) and retained for a generic amount of time.

    Some email archiving products allow you to define rules to better categorize emails; however, this requires a “true understanding” of the business.  For example, if you define a rule to “immediately delete emails whose subject contains party”, it would do a good job filtering out birthday party invites; however, if your company is in the business of hosting parties, you would not want to use this rule for categorization.  Other email archiving products use heuristics to analyze a sample of your company’s email, and you teach the system which kinds of emails should be records and which kinds can be considered non-records.  The challenge with heuristics is that you need a good sample, which tends to mean a large sample of emails.  Plus, you will need to train the system on how to categorize.  Training a heuristic system is easier than developing rules for categorization; however, gathering a good sample is typically harder than coming up with the examples used to develop rules.

There are a few vendors that apply both strategies of categorization: they use heuristics first and then apply rules.  I believe this is the best approach to solving a classification problem with limited content.
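To make the "heuristics first, then rules" idea a bit more concrete, here is a minimal Python sketch.  It is purely my own illustration and not how any particular vendor works: the `Email` and `Rule` classes, the word-overlap score standing in for a trained heuristic, the threshold, and the record series names are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Email:
    sender: str
    subject: str
    body: str


@dataclass
class Rule:
    name: str
    matches: Callable[[Email], bool]
    record_series: str  # hypothetical fileplan record series


# Hypothetical vocabulary "learned" from a sample of emails marked as records.
RECORD_TERMS: Set[str] = {"invoice", "payment", "contract", "due"}

# Hypothetical business rule: invoice emails go to the invoice series.
RULES: List[Rule] = [
    Rule("invoice", lambda e: "invoice" in e.subject.lower(), "FIN-INVOICES"),
]


def heuristic_score(email: Email, record_terms: Set[str]) -> float:
    """Toy stand-in for a trained heuristic: share of words seen in known records."""
    words = (email.subject + " " + email.body).lower().split()
    if not words:
        return 0.0
    return sum(word in record_terms for word in words) / len(words)


def categorize(email: Email, rules: List[Rule], record_terms: Set[str],
               threshold: float = 0.2) -> Optional[str]:
    """Return a record series, or None if the email looks like a non-record."""
    # Step 1: the heuristic decides record vs non-record.
    if heuristic_score(email, record_terms) < threshold:
        return None
    # Step 2: rules pick the record series for emails that look like records.
    for rule in rules:
        if rule.matches(email):
            return rule.record_series
    # No rule matched: fall back to a generic series (also an assumption).
    return "GENERAL-CORRESPONDENCE"
```

With these made-up settings, an email with the subject "Invoice 1042 sent" lands in `FIN-INVOICES`, while a birthday party invite never reaches the rules at all.  A company that hosts parties for a living would need a different sample and different rules, which is exactly the "true understanding" of the business mentioned above.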

  3. Owner of email/record – Whereas a document tends to have a well-defined owner/creator, the owner of an email can be vague.  For example, an email author may send a request to a group of people for the latest version of an SOP document.  Several people reply with different versions of the desired document (due to the lack of version control or of a content management system).  Assuming that email is categorized based on the email author, retention may differ between authors based on their roles (e.g., CEO emails are retained for 7 years, everyone else’s for 3 years).  Therefore, you can have a situation where the same SOP document gets retained for different lengths of time because of how the email is categorized.

This challenge is not so much a technical issue as a matter of how the rules of classification are defined.  In the world of records management, a document has a real owner.  The request for a document is not typically treated as a record itself, so you do not run into these kinds of issues.  You may find that a document fits into multiple record series, but at the end of the day, the records managers decide where the document fits most appropriately.
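A small Python sketch shows how owner-based retention produces that inconsistency.  The 7-year and 3-year periods come from the example above; the function name, the role lookup table, and the dates are my own assumptions for illustration.

```python
from datetime import date

# Retention periods from the example above: CEO 7 years, everyone else 3 years.
RETENTION_YEARS_BY_ROLE = {"CEO": 7}
DEFAULT_RETENTION_YEARS = 3


def disposition_date(received: date, owner_role: str) -> date:
    """Date on which an email owned by someone in `owner_role` may be disposed."""
    years = RETENTION_YEARS_BY_ROLE.get(owner_role, DEFAULT_RETENTION_YEARS)
    try:
        return received.replace(year=received.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return received.replace(year=received.year + years, day=28)


# The same SOP attachment, replied to by two different authors on the same day.
sop_from_ceo = disposition_date(date(2013, 5, 1), "CEO")
sop_from_analyst = disposition_date(date(2013, 5, 1), "Analyst")
```

Here `sop_from_ceo` comes out four years later than `sop_from_analyst`, even though both emails carry the same SOP document.  Nothing in the code is wrong; the inconsistency comes entirely from classifying by owner.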

  4. Versioning/supersession – SOP documents typically have a shelf-life and get replaced on a periodic basis.  When a new version of the document is declared as a record, the new record supersedes the old record.  This is a straightforward concept.  How does this apply to email?  Emails are normally not versioned; you typically have an email chain going back and forth between various users.  Which one of the emails should be the record and which should be superseded?  The reality is that supersession does not really fit with emails (hence the title of this blog post).

Some vendors treat an email chain as a single unit and apply a single retention period based on the earliest or latest date in the chain.  The idea of superseding an email does not really exist in email archiving products.  Again, this is not so much a technical issue as a consequence of emails being very different from typical documents/records.
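The chain-level approach can be sketched in a few lines of Python.  Again, this is only an illustration under my own assumptions (the function name, the `anchor` parameter, and the dates are invented), not a description of any specific product.

```python
from datetime import date
from typing import Iterable


def chain_disposition_date(message_dates: Iterable[date], retention_years: int,
                           anchor: str = "latest") -> date:
    """One disposition date for a whole chain, anchored on its earliest or latest message."""
    dates = list(message_dates)
    if not dates:
        raise ValueError("an email chain needs at least one message")
    if anchor not in ("earliest", "latest"):
        raise ValueError("anchor must be 'earliest' or 'latest'")
    start = max(dates) if anchor == "latest" else min(dates)
    try:
        return start.replace(year=start.year + retention_years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return start.replace(year=start.year + retention_years, day=28)


# A three-message chain retained for 3 years.
chain = [date(2013, 1, 10), date(2013, 1, 12), date(2013, 3, 2)]
by_latest = chain_disposition_date(chain, 3, anchor="latest")
by_earliest = chain_disposition_date(chain, 3, anchor="earliest")
```

Note that every message in the chain gets the same date either way; no message ever supersedes another, which is the mismatch with document-style records described above.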

I strongly believe that companies need to include email as part of their overall records management strategy; however, their current records management system may not be up to par for handling email archiving.  Likewise, email archiving systems should not be used as records management systems.  I feel that in most situations, a company needs both types of systems, but it also should have a well thought out, unified records strategy that includes both documents and emails.

Johnny Gee