Machine Learning Is Ready to Tackle Content
Distributed work comes with its own particular set of challenges. Collaboration software, such as Microsoft Teams and Slack, is now the backbone of how organizations work. But while these tools make distributed work possible, they also amplify existing issues with online collaboration.
One unwanted side effect is the proliferation of documents stored within these online channels. While some collaborative channels are well-defined, others are not. With so many channels storing content, organizations are now asking, “What's out there?”
User Experience > Content Management
Tools like Teams and Slack were designed for a good user experience, which is in part why they are much more widely used than similar tools that came before.
Even if the content captured is not well-managed, high adoption is a good thing. So while disorganized content is a problem, organizations cannot manage content that isn't in the system in the first place.
Vendors designed these tools for communication, not for managing content, particularly content that needs to be kept for a long time. In fact, in some of these channels, finding content from even a couple of days ago can be a challenge.
How can organizations determine which content they truly need to keep, and how can they find it in the first place?
Machine Learning Identifies and Classifies Content
Identifying content and determining its topic and importance is the perfect job for machine learning (ML).
Machine learning works a little differently from more traditional artificial intelligence (AI) tools that scan for keywords in defined parts of a document. ML algorithms use natural language processing to better identify content and classify it appropriately.
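To make the contrast concrete, here is a minimal Python sketch, not any vendor's implementation. It compares a fixed keyword rule with a tiny naive Bayes text classifier, a deliberately simple stand-in for the far richer language models that commercial tools use. The labels ("contract", "chatter"), the training sentences and all function and class names are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_classify(text):
    """Rule-based baseline: 'contract' only if that literal word appears."""
    return "contract" if "contract" in tokenize(text) else "other"


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, labeled_docs):
        for text, label in labeled_docs:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data; a real deployment needs far more examples.
TRAINING = [
    ("This agreement is signed by both parties and sets payment terms", "contract"),
    ("The contract renewal terms and liability clauses are attached", "contract"),
    ("Signed agreement covering vendor liability and payment schedule", "contract"),
    ("Lunch is in the kitchen, grab a slice before it is gone", "chatter"),
    ("Anyone up for coffee after the standup meeting", "chatter"),
    ("Happy birthday, cake is in the kitchen", "chatter"),
]

model = NaiveBayes()
model.fit(TRAINING)

message = "Attached is the signed agreement with the updated payment terms"
print(keyword_classify(message))
print(model.predict(message))
```

The sample message never uses the word "contract", so the keyword rule labels it "other", while the trained model labels it "contract" based on the surrounding vocabulary. That is the gap the article describes, shown at toy scale.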
Don't Forget the ROT
Just as important as identifying and preserving important content is the identification and removal of redundant, obsolete and trivial (ROT) content.
Many content-focused ML tools originated because of the need for organizations to remove ROT from their older systems prior to migration and decommissioning.
But bear in mind that some ROT content may be people's convenience copies, and simply deleting them may cost more in lost productivity than the organization saves in storage and management costs. Even if a person doesn't need the content anymore, noticing that it is missing may lead them to start investigating what else is missing, thus finding a new way to waste time.
Communication is essential to prevent this. People need to know what is happening. There may come a time when the ROT content must be removed, so let people know in advance when and why this will be happening to avoid headaches down the line.
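As a rough illustration of the "flag first, delete later" approach, the Python sketch below builds a report of ROT candidates instead of removing anything, so the list can be shared with content owners in advance. It uses plain heuristics rather than ML (an exact-duplicate hash, an age cutoff and a size cutoff), and the `Doc` fields, function name and both thresholds are assumptions chosen for the example, not recommendations.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Doc:
    path: str
    content: bytes
    last_modified: date


def flag_rot(docs, today, max_age_days=3 * 365, min_bytes=20):
    """Return (path, reasons) pairs for ROT candidates. Nothing is deleted."""
    first_seen = {}  # content hash -> path of the first document with it
    report = []
    for doc in docs:
        reasons = []
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest in first_seen:
            reasons.append(f"redundant: same content as {first_seen[digest]}")
        else:
            first_seen[digest] = doc.path
        if today - doc.last_modified > timedelta(days=max_age_days):
            reasons.append("obsolete: older than the assumed age cutoff")
        if len(doc.content) < min_bytes:
            reasons.append("trivial: almost no content")
        if reasons:
            report.append((doc.path, reasons))
    return report


# Hypothetical documents pulled from collaboration channels.
sample_docs = [
    Doc("team/budget.txt", b"Quarterly budget report for the finance team", date(2024, 1, 10)),
    Doc("chat/budget-copy.txt", b"Quarterly budget report for the finance team", date(2024, 2, 1)),
    Doc("chat/note.txt", b"ok", date(2024, 3, 1)),
    Doc("archive/plan.txt", b"Archived migration plan for the legacy system", date(2015, 5, 1)),
]

for path, reasons in flag_rot(sample_docs, today=date(2024, 6, 1)):
    print(path, "->", "; ".join(reasons))
```

Because the output is only a report, a convenience copy such as the duplicated budget file can be reviewed with its owner before anyone decides to remove it.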
A Few Notes of Caution Before You Move Forward With ML
First, more vendors claim to use ML than actually use it.
That said, determining whether a vendor can meet your needs is more important than how they do it. One consideration is that vendors using ML technology may be able to more easily evolve to meet future requirements.
Take the time to try out the technology before committing. Finding and successfully classifying important content is often the most challenging task for vendors. To be successful, the ML models need to be trained to correctly identify documents, and that can take time.
Another consideration is the degree of training and expertise required to do the job. That cost varies by technology, and it won't appear on a vendor's price sheet.
All that being said, using ML technology is faster, cheaper and more accurate than having people classify every piece of content. Keeping everything is expensive and wasteful. Deleting everything too soon risks losing information the business needs or that your organization may be legally required to keep.
There was a time when ML wasn’t practical or as effective at managing documents as people. For most content, that time has passed. It is now time to seriously look at ML products as essential tools in managing the ever-expanding collection of content being placed in our collaboration tools.
About the Author
Laurence Hart is a director of consulting services at CGI Federal, with a focus on leading digital transformation efforts that drive his clients’ success. A proven leader in content management and information governance, Laurence has over two decades of experience solving the challenges organizations face as they implement and deploy information solutions.