Can Auto-Tagging Save Us From Metadata?
For professional content managers, metadata is a trusted playbook for all our information assets. Like the index cards at the library or the parts catalogue at the hardware store, metadata assigns meaningful names to things we need to find. It turns a junk drawer into a well-ordered, clearly labelled sorter. Metadata tells us where information lives and moves, how important it is for the business and how to handle and safeguard it. But for many end users, metadata is poorly understood. They often consider it an annoyance to be avoided by any means possible and opt instead to rely entirely on search engines or the latest promise of auto-tagging.
People don’t seem to like talking about metadata. When I chaired a metadata special interest group, it was a small, intrepid sub-team within the higher-profile data management group. Metadata conferences used to meet around a table. Now they fill convention centers and headline popular webinars. Metadata isn't going anywhere.
Auto-Tagging Is Not Ready
Clients often ask, “What about that new auto-tagging? My employees don’t want anything to do with metadata.” Replacing managed metadata with auto-tagging in its current form means handing the classification of your critical records to the same "intelligence" that delivers misguided ads to you or awkwardly tries to auto-correct your texts. Experts describe artificial intelligence (AI) in its current state as like a child: it tends to take everything literally and has no context to help it apply the information it’s presented in a way that makes sense.
In one early attempt at auto-tagging, Flickr, the once-popular photo-sharing website, sparked outrage over how its image recognition technology was labeling photos. Intended to help users organize and find their photos, it started out by applying mostly harmless but irritating (and useless) tags such as ‘monochrome’ or ‘people.’ Its labelling results grew increasingly offensive, marking a concentration camp image with tags like ‘sport’ and people’s photos with tags like ‘ape.’ The episode revealed not only how complex it is to classify and label things, but also the flawed and unethical rules and algorithms humans built into the auto-tagging decisions.
Auto-Tagging Has a Place in Information Management
Auto-tagging is making inroads, especially in digital asset management, as a way to recognize and catalogue design drawings and rich images. After several opportunities for it to ‘learn’ based on trial and error, auto-tagging technologies can begin to find patterns and extract data that’s already there.
Auto-tagging is also well-suited for documents with some structure, such as forms. As part of the scanning and ingestion of forms, auto-tagging can extract key metadata with little training of the AI engine. It’s a different story with free-form text documents. In an ideal world, contracts would be standardized across companies, and extraction of key data would be a reality for a lot of our business operations. But this is not realistic, because business contracts do not follow a standard structure. Extracting key data from free-form documents is the next frontier.
Some exciting new technologies extract key metadata from free-form text, but these require some degree of human intervention to check the results, especially for formal and legal documents. These technologies reduce the armies of people checking for critical information but can’t replace them entirely.
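To make the contrast between forms and free-form text concrete, here is a minimal sketch in Python. It is not how any particular auto-tagging product works; the field names, the patterns and the sample documents are all hypothetical, and simple pattern matching stands in for the AI engine. The point is only that a structured form yields its metadata readily, while free-form text leaves fields empty that a person then has to check.

```python
import re

# Hypothetical field patterns for a simple, structured form (assumptions for illustration).
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|Number)\s*:\s*(\S+)", re.IGNORECASE),
    "invoice_date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
    "vendor": re.compile(r"Vendor\s*:\s*(.+)"),
}


def extract_metadata(text):
    """Return (metadata, missing_fields) for one document."""
    metadata = {}
    missing = []
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            metadata[field] = match.group(1).strip()
        else:
            # Nothing found: leave the field for a human reviewer.
            missing.append(field)
    return metadata, missing


# Made-up sample documents: one structured form, one free-form sentence.
form_text = """Invoice No: INV-1042
Date: 2021-03-15
Vendor: Acme Supplies
"""
free_text = "Per our agreement dated March 15, Acme will supply parts as discussed."

form_meta, form_missing = extract_metadata(form_text)
free_meta, free_missing = extract_metadata(free_text)

print(form_meta, form_missing)
print(free_meta, free_missing)
```

In this sketch the form fills every field, while the free-form sentence fills none and all three fields are routed to a person for review.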
The Difference Good Metadata Can Make
Unless you’re Google, your search results are probably long lists with an iffy relation to your topic. Most of my clients tell me they give up on searching on their own and start calling people to see if they remember "that case they worked on two years ago." Search is not the whole answer, and neither is personal memory.
Users are looking for ways around having to map their knowledge in categories (taxonomies) and give them standard names (metadata). Resistance to adopting metadata may be the result of how it’s designed and used. If populating the metadata becomes a burden with little apparent benefit, it will be difficult to get buy-in from the people we’re counting on to keep it stocked. The availability of even basic meaningful metadata impacts the user experience for everyone. I have seen companies where a new executive logs on to the company system for the first time and says, "that’s beautiful," because anyone can easily browse and navigate the company memory. I’ve also seen what employees call “nightmares.” That’s the difference metadata can make.
However you incorporate auto-tagging into your information management strategy, you will still need to think about how you interact with your information throughout its lifecycle and movement. A good taxonomy — a map of your knowledge — is an important part of preparing for auto-tagging, just as it is a prerequisite for labeling or metadata decisions. Content users and managers need to agree on an overall structure for their information sets, and then you can use those categories to label files. You should not expect to buy a taxonomy or metadata off the shelf and have it up and running by Monday (I’ve heard things like that). It takes time and expertise. But the value of good thoughtful structuring is transformational.
When one client took the plunge from folders and sub-folders (and personal hard drives) to a shared, metadata-driven information experience, the following design principles drove the effort from the start:
- Use metadata to measurably improve users’ experience and fix their headaches.
- Provide them with the literacy to make decisions about their own structures.
- Minimize the burden of populating the metadata values.
We surveyed teams to find out which terms they used most often to search for things and what would really help them. From there, we came up with a handful of meaningful metadata fields and retired others that no one was actually using. Most users search on just a couple of metadata fields to get the personalized, lean list of files that they need on a regular basis.
By providing users with pick lists to select metadata values, and by pre-populating metadata fields based on context and profile, we cut the time it takes to upload a file with full metadata to just seconds.
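The pick-list and pre-population idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the client's actual system: the field names, the pick-list values, the `UserProfile` class and the folder-naming convention are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical pick lists: the only values a user may choose for each field.
PICK_LISTS = {
    "department": ["Finance", "Legal", "Operations"],
    "document_type": ["Contract", "Invoice", "Report"],
}


@dataclass
class UserProfile:
    name: str
    department: str


def prefill_metadata(profile, folder_path):
    """Suggest metadata values from the user's profile and the upload location."""
    suggested = {"author": profile.name}
    # Profile: reuse the user's department if it is a valid pick-list value.
    if profile.department in PICK_LISTS["department"]:
        suggested["department"] = profile.department
    # Context: assume folders are named after the plural of a document type.
    folder_names = {part.lower() for part in folder_path.split("/")}
    for doc_type in PICK_LISTS["document_type"]:
        if doc_type.lower() + "s" in folder_names:
            suggested["document_type"] = doc_type
            break
    return suggested


def invalid_fields(metadata):
    """Return the pick-list fields whose values are missing or not allowed."""
    return [field for field, allowed in PICK_LISTS.items()
            if metadata.get(field) not in allowed]


profile = UserProfile(name="Dana", department="Legal")
suggested = prefill_metadata(profile, "clients/acme/contracts")

print(suggested)
print(invalid_fields(suggested))
```

Here the user only confirms or corrects the suggested values; anything typed outside the pick lists is flagged by `invalid_fields` before the upload is accepted.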
Don't Make Metadata an Afterthought
The term "meta" is Greek for "after." When Aristotle wrote a new work following "Physics," it was given the title ‘that work after that physics thing,’ i.e., metaphysics, for simplicity. Metadata should not be treated as an afterthought or a cost to be avoided, but rather as the playbook for new opportunities, whether it includes AI-powered auto-tagging or extraordinary human-powered insights.
About the Author
Andrea Malick is a Research Director in the Data and Analytics practice at Info-Tech, focused on building best practices knowledge in the Enterprise Information Management domain, with corporate and consulting leadership in content management (ECM) and governance.
Andrea has been launching and leading information management and governance practices for 15 years, in multinational organizations and medium-sized businesses.