In Brief
- Strategic Oversight and Collaboration: As principal taxonomist, Ahren shifted from hands-on taxonomy construction to guide the overall taxonomy strategy. He collaborates with various teams to ensure taxonomies align with business objectives and are used effectively.
- Data Consolidation and Ownership: A key part of taxonomy's role is to reduce data duplication. Ahren and his team do this by consolidating existing data, identifying clear business owners, and formalizing documentation and transition processes to maintain data integrity amid organizational changes.
- Balancing Innovation with Governance: While exploring new technologies like generative AI to enhance taxonomy processes, the role requires careful human oversight to manage legal, cultural and technical nuances, ensuring that rapid innovation does not compromise data quality.
Nike's principal taxonomist Ahren Lehnert joins Three Dots to discuss his role and taxonomy's part in driving organizational alignment. He and his team work with subject matter experts throughout the organization to create and maintain taxonomies that keep information findable, up-to-date and scalable.
Ahren and Siobhan talk data clean-up, meeting the needs of a global company, why working directly with subject matter experts is a non-negotiable, departmental experiments with GenAI and how best practices and learnings are shared across departments. Tune in for more.
Table of Contents
- What Does a Principal Taxonomist Do?
- Identifying Taxonomy Needs and Consolidating Data
- Finding the Business Owner for a Taxonomy
- How Internal Mobility Affects Taxonomy Work
- How Does Taxonomy Answer Business Users' Needs?
- Flexibility vs. Consistency in the Taxonomy
- Is Generative AI Providing Value for Taxonomists?
- GenAI for Taxonomy Creation?
- Job Security in the Era of Generative AI
- Memories of the Semantic Web
- The Book of Your Organization
What Does a Principal Taxonomist Do?
Siobhan Fagan: Hi everybody and welcome to today's episode of Three Dots. My name is Siobhan Fagan. I am the editor in chief of Reworked and I am very happy to introduce my guest today. Today I'm speaking with Ahren Lehnert who is the principal taxonomist at Nike. Welcome, Ahren!
Ahren Lehnert: Thank you for having me, I'm really excited to be on.
Siobhan: I am super happy to nerd out with you on all things taxonomy today. So for our audience, I think it would be helpful for you to just describe a little bit about your work at Nike as principal taxonomist.
Ahren: I'm at a point in my career where I do less of the taxonomy construction myself and I'm more at the level of looking at strategy. What is the current state of our taxonomies? What kinds of projects are we using them for? And where are we going? What is the strategy going forward? What are the opportunities within our company to use taxonomy in order to meet whatever goals and KPIs and other stated strategy objectives we have?
So I'm at that point where I am working as a team with taxonomists who are doing a lot of the taxonomy work itself, but I'm setting the strategy and guiding the path forward.
Identifying Taxonomy Needs and Consolidating Data
Siobhan: So can you give us an example of when there would be a new need for a taxonomy at Nike?
Ahren: It doesn't happen too often that we need a whole new taxonomy, because we're at a point where we have robust taxonomies, we have several of them already, and so we've had this data in for several years. And so the need for net new taxonomies doesn't come up that often. However, it can occur. And whether that's a whole taxonomy of itself or a structure within a taxonomy, that's something else.
So for example, one of the general endeavors that we have across the company is to reduce the duplication of data. That's a very common thing in a large organization, right, especially in an organization the size of Nike, where data gets recreated in various locations. Sometimes it's legacy, they're old systems that never got deprecated. Sometimes people create different tables for different purposes, not realizing maybe that there's data already out there.
And so a lot of what we do is seek people out, or sometimes they seek us, to look for new data sets that they might want to use. So rather than recreating things that already exist, we try to say, OK, what already exists out there? Can we deprecate that and instead move it into a taxonomy so that we have one centralized place to work with that data?
A good example of something like that might be currency, where we have systems that deal with that, but we have talked with teams about maybe moving currency and those standard rates into the taxonomy. That's a bit of a complicated task because a lot of those are calculated values. And so you have to ask yourself, Is it appropriate for taxonomy? Do these values change too often to be in taxonomy?
And so we have those conversations. Other times it is about, so at Nike, we haven't really had a lot of mergers and acquisitions. Obviously, Nike owns both Converse and Jordan. And so those are separate brands and sometimes you have to consolidate and expand the taxonomies to bring those in because they kind of are their own companies in a sense. So that happens where something is an enterprise-level work and it can't just be Nike swoosh branded products. It has to be Jordan and Converse as well.
Finding the Business Owner for a Taxonomy
Siobhan: You raised something in a previous conversation about the way you work with these different teams and the way that you make sure that the taxonomy is up to date, that it's actually meeting the business needs. You said that you always look for a business owner in that team. Can you talk a little bit about that process of how you identify the business owner for the taxonomy and where it goes from there?
Ahren: Yes. So sometimes the business owner seeks us out, and they realize that they need taxonomy work. So often they self-identify, they come out and say, I own this process or I own this tool or I own this data set or I own this functionality and I want to use taxonomy.
Other times we know that there's data out there that really should be taxonomy data and maybe nobody's come and asked us. Honestly, a lot of it is about looking at org charts and talking to people and networking and asking who owns this data? And again, in a large company this is fairly typical, where finding ownership and tracing ownership for a process isn't always straightforward. You would think it should be, right? Like, I'll talk to this director or I'll talk to this product manager, they'll know, and they don't always know.
Sometimes that's because it was owned by somebody in the past and the ownership didn't get transferred. Or there's other reasons why people might not be associated with that product or that data set anymore. Maybe they've shifted roles, maybe they've moved on from the company. So that's work that we do and that's sort of beyond, or let me say in addition to the scope of taxonomy work is that taxonomists really have to know how to network and talk to people. And just like you would research anything else, finding a data owner is part of that research.
Siobhan: I'm thinking about this because Nike has about 83,000 employees, give or take. So that's got to be needle in the haystack-type work. And you're probably wishing there was a taxonomy of people.
Ahren: About that: we do actually have a pretty good, and it's separate, but we do have pretty good internal resources for finding people. And also looking up the org chart. Now the org chart can change and then sometimes that can lag, right? Or somebody leaves and it's not updated right away. We have pretty decent resources internally to be able to find people, although it can be challenging. Sometimes a role doesn't always match up to what they do, the title.
So sometimes it can be a little challenging to find people. But there's quite a few people at Nike who have been around for a long time. And so that's another way you find things — the good old traditional knowledge management is you go, I know who to ask. They'll know the answer and I can either call them or email them or instant message them and probably get to that other person. And maybe it'll be two or three steps to find that next person.
How Internal Mobility Affects Taxonomy Work
Siobhan: You've already mentioned this a few times that people are always moving positions. Internal mobility is something that we talk about a lot on Reworked as a beneficial thing, but in cases like this it sounds like it might make your job a bit more complicated. So what happens if you're working with somebody on a taxonomy and they are suddenly moved over to another department where they're no longer dealing with the same kind of topics or perhaps just moving on to another job — are you left adrift?
Ahren: Well, because we have been left adrift in the past, we're starting to address that much better now. We're saying, one, as you mentioned before, we want to have that owner. And not just have that owner today, but to have plans and have documentation in place for that transition. And that can be for many reasons. It can be they switch positions. As you said, that's beneficial. We have a lot of people at Nike who have worked in completely different roles, maybe not what they started doing at Nike, but they've done other things because they were interested.
That kind of mobility is great. But if they leave and they drop whatever work they were doing and it doesn't get transitioned, then that's problematic.
So we've been putting in place a lot more process and a lot more documentation to make sure that we know who that person is today and to communicate with them. We have regular meetings with these people too. We put the onus on them to say, you're moving, if you're transitioning, what are the next steps? You can't just simply leave, right? Because then we position the risks: If you leave, here's the data that's at risk, and here are the downstream systems that are at risk by you leaving the company, by you moving, by whatever happens.
So we have ways of working with them, and some of it's very formal and some of it's very informal, where it's just a conversation to say, who's taking over for you?
If you don't have that person, who do we keep in touch with until that person takes over from you? Because maybe it's a new hire, right? And we have to wait for the new person to come in. Or maybe someone else is shifting into the position and they're still getting their feet on the ground and they don't know all the people to talk to, right?
So we're trying to formalize that process more and more to make sure that ownership is continuous. And some of that has been from painful learnings where we had an owner and then they vanished and then we had to scramble to find out what to do. Now I think we're much more accountable and we ask for more accountability from the people we work with.
Siobhan: I think it's a relatable topic for a lot of people outside of the taxonomy framework because it is this risk of knowledge leaving an institution when people leave the company or when they move on to another department or similar. So this process that you're formalizing: are any of your colleagues in other departments saying, maybe we need to adopt that for other uses or is this becoming a standard practice?
Ahren: I think so. We are as much pushing the envelope forward ourselves as other teams are doing the same in, say, enterprise data governance and those kinds of fields where they're like, this has been a habitual problem. Not in taxonomy, but in closely related data issues where somebody built a table and maybe they're the only owner of it and they haven't opened up — and there's data in it. And again, they leave or they leave the company or they switch positions. And now you have all this unaccounted for data.
Is somebody using it and consuming it? We don't necessarily know. There are many teams that are doing this kind of work. So we're making our own process, but we're also learning from other teams on what are they doing to hold people accountable and using some of their best practices as well.
Siobhan: Would you ever move forward with a taxonomy without a business owner? Is there ever a use case where you would be like, we need to plow on without?
Ahren: I'll be honest, it can be frustrating because there are things that we know are good taxonomy practice and that we should have or should do. And we are knowledgeable enough within the company as taxonomists because we work with other people that we could just build it. We could just say, they will use it.
But it's this accountability and that can be frustrating. But it also is a work overload mitigator for us, right? We only have so much bandwidth. If you say to a taxonomist: you are now responsible for updating all of the athlete names. Why should that be the taxonomist's job, right? Even though we know it should be done, even though there should be a cleaner process for doing something like that, we can't take that kind of work on because at the end of the day, we're not the domain experts.
We're not really the data owners of that kind of information. We shouldn't, and we don't have the time to update all of those things. So even though there are places where we feel like we should build out, we have to take a step back, restrain ourselves a bit and not build it out. Only because then it will age.
We have built things in the past. And again, this is some learnings where people suspected quite rightly that this would be necessary information. They built it in the taxonomies. And there it sits. And now here we are three, four, five years later, not knowing whether anybody's using it, not knowing where it came from really, or the use case behind it. We're pretty good about documenting editorial notes and scope notes and definitions and saying, here's when I added this and here's why.
But there have been some things added where things got lost in translation at some point, and we don't know why it's there. And so you have something in the taxonomy that you're not willing to deprecate or delete because you're not sure who's using it. But it's also been long enough that you're not sure who to follow up with either. So we have to track those things down.
So we really try to avoid building things as much as we think we're probably going to need them in the future. We are trying not to build those things until we have the owners.
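The record-keeping Ahren describes (definitions, scope notes, editorial notes and a named owner) can be pictured as a small data structure plus a check for terms that need follow-up. This is only a minimal sketch: the `TaxonomyTerm` fields, the `needs_follow_up` helper and the one-year threshold are assumptions made for illustration, not Nike's actual schema or process.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TaxonomyTerm:
    """Hypothetical record for one taxonomy term and its documentation."""
    label: str
    definition: str
    scope_note: str
    editorial_note: str                     # when and why the term was added
    added_on: date
    owner: Optional[str] = None             # business owner; None means nobody is accountable
    last_confirmed: Optional[date] = None   # last time the owner confirmed the term is in use


def needs_follow_up(term: TaxonomyTerm, today: date, max_age_days: int = 365) -> bool:
    """Return True when a term has no owner or has not been confirmed recently."""
    if term.owner is None or term.last_confirmed is None:
        return True
    return (today - term.last_confirmed).days > max_age_days
```

A term added years ago with no owner is flagged immediately, which is the situation described above: nobody is sure who uses it, so nobody is willing to deprecate it.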
Siobhan: That's gotta be a hard thing to resist though, where you're like, I see that need, but I'm gonna hold off.
Ahren: Right. Especially with the whole build it and they will come prospect. Because often the potential stakeholder doesn't know what question to ask for us to answer. And to show them that it's already possible could get them to adopt taxonomy use. So there's a tension there of, well, if I built this thing, they might come and see the value and use it.
But if I build it and they don't see the value and they don't use it, then we built something that no one is using and then we have to maintain it and there's no owners.
How Does Taxonomy Answer Business Users' Needs?
Siobhan: Can we flip to the perspective of the people who are actually relying on your taxonomies? How exactly would their interaction go? Say that I'm a product manager and I need to make up a variation on a sneaker that already exists and we need to see the history of the sneakers and how they've changed and what might be a good thing. Like how would your taxonomy play in there?
Ahren: You know, there's actually multiple places for product data and it's not always consolidated in one place.
So to find the history of things can be a little trickier to understand like when it came up and where it's all documented because these things can be documented. You know, the process is an ongoing improvement, I'll say, in that we have legacy processes for when a company was smaller, pre-COVID, all of these things that we are working to improve these processes and streamline them. So finding something like that might be a little difficult.
But if they're coming to us and saying, I have metadata needs, what do you have? We can basically do show and tell. We're walking through the taxonomies and then if they say, it doesn't seem like you have the data I'm looking for, because we work for so many people, we can usually point them to the right place if we don't have it.
We play the intermediary for people to come to as a librarian might. It's like, what are you looking for? We either have it or we don't have it. And then we can point you to the right place to find that kind of data. It's frequently a multi-step and multi-person and multi-team process to get to some answers. Sometimes you just find the answer right away, but other times we have to go to these people who then redirect us to these people. Then we have a meeting to understand all the places it is because our information architecture is complex and data lives in various places, sometimes incorrectly, sometimes correctly. Does it need to be in multiple systems for multiple reasons, for multiple consumption, and where best for that person to get their data might be an open question. Because it does exist in different locations to serve various purposes.
Siobhan: Nike is a global brand and you are creating this taxonomy that has to work at a standardized level across countries. How do you build in some allowances for regional differences?
Ahren: That's very challenging and we know that quite well. That happens in taxonomy and it happens in other systems. Because you're absolutely right that when you think about what's allowable in a country, both for data, but also just culturally, right? For instance, digital assets with athletes with tattoos. It's fine to use in the U.S. but there are places you cannot use those. And so therefore, they need a version of that asset that either doesn't have that person in it, or it's been scrubbed and more or less redacted where you don't see the tattoos. The same thing with our data: we do translate our vocabularies into multiple languages and run that through a translation service within the company so they can localize things.
But there are bigger questions too that are sort of beyond our scope, which is we're providing you the taxonomy data. We're providing it in all the languages that Nike needs, so we're not missing anything there. But how they choose to use that and which products they choose to market, that's really their decision.
There is this Nike headquarters, North America view of Here's where we're going, but then there's a lot of freedom in the geos to say, This product won't work here, or This color won't work here, or We have our own holidays and we have to promote something else.
So we do have something to do with that, but a lot of times that's a bigger picture with a lot of teams to say, OK, we're creating product data in multiple locations. We're creating variants of assets to serve out to these geos. It is very complex and we're just one cog in this very large machine. But I think mostly our part is that translation. And then they can choose to use taxonomy terms or not use taxonomy terms on the products as they see fit.
Flexibility vs. Consistency in the Taxonomy
Siobhan: So is that the case in all things, where you're providing this taxonomy and people can kind of be like, OK, I'm going to do this or not? Or does there have to be a certain amount of controlled vocabulary to it?
Ahren: That is very challenging as well. We can build the taxonomies to serve the needs and to make sure that we're checking the boxes and that things are done correctly. So if there's anything in the taxonomy that needs to be checked by legal, for example, we do our due diligence, we make sure that it's OK to use that term and this goes back to ownership. We can't just randomly add something because maybe we're not aware that there is a legal risk or that it's culturally sensitive, or there may be reasons beyond our scope of why we shouldn't add a term or why we should add it in a different way.
So there is this sort of, I'd say, healthy tension between controlling the vocabulary centrally and making sure we have the right terms and the geos doing what they need to do to market products. There's rigor where there needs to be rigor and looseness where there needs to be looseness, where there's some things where we're like, yeah, do what you like. It's your geo, right? You do what you need to do. And then there's the things where you cannot do that contractually or legally, whatever the case.
A lot of how our taxonomy data gets used in consuming systems or by consumers is really up to them. We can't police everyone, therefore, we have to have overarching guidelines around enterprise data governance and security and all the regional laws and regulations. So if I give you a Nike taxonomy and you live somewhere where you're under GDPR, you have to know GDPR to know how to use this taxonomy. We can't really tell you. Now we can enforce the laws in the taxonomy if somebody asks us to, right? But once that data leaves our system, it's kind of up to them to know how best to use it.
Siobhan: That comes back to the point that you are working with these subject matter experts throughout the business to develop these taxonomies. And it's why you're doing it, because you can create these structures. You can understand what kind of terminology you can see, like rising terminology in the language and stuff like that. But you can't actually get all of those nuances because you're only so many people.
Have you ever had a taxonomy discussion get really heated?
Ahren: Definitely. There's a certain level of decorum, and there's multiple flavors of that. So one flavor is simply, we're going to build our own taxonomy. Because we don't really have the authority to tell someone you must use taxonomy, this is more of a negotiation. We say, it's better for you if you use taxonomy, it's better for the enterprise, it's better for everybody, but we can't force you to do it. So if they say, well, we're just gonna build our own thing, then it's up to us to basically convince them.
I've never been in a conversation here where it really got ugly, where you could tell people were disagreeing, but I think that there is a good sense in this organization culturally that we're not going to behave like that. We may disagree, but we're going to find a path forward somehow.
The other flavor of this is really around the term, where they say, The term must be this. And you say, Well, that isn't really the best term. Or it doesn't fit in with other things, or we have terms that conflict with that. People can be really adamant. That doesn't happen very often, because usually the terms we get, because it's an owner, are pretty straightforward.
But it can happen where it conflicts with somebody else where you say, this doesn't really work. Sometimes that comes down to technical restrictions where really we should be delivering it in a way that we would maybe have a different form of the term or it would have a qualifier or something, but their system can't absorb it in that way. Then we have to do something that violates taxonomy best practices with the hope that we can fix that.
We document that, that this was a decision we had to make to keep the business running. We don't think this is the way forward and we would not like to keep it this way. How do we come back and fix this?
A lot of times these things get de-escalated pretty well and pretty quickly. At any organization, there's going to be major choices: Should we be using this system or this system? Right? They do similar things, but maybe they don't do 100% overlap. So what do we do about that? So there have been difficult, challenging conversations. It never really gets awful. Not here, anyway.
Siobhan: Doesn't get too spicy. It makes sense though, because in your role, you have the bigger picture view. And I think most likely what's happening is when people hold onto these terms very tightly, it's just because they are seeing their exact domain and not necessarily its relation to everything else, which is sort of what you're there for. So you have to, as you said, have a lot of people skills as well as these other skills.
Is Generative AI Providing Value for Taxonomists?
Siobhan: So I have to ask the token generative AI question. Are you using generative AI in taxonomy creation or in any way with your taxonomies yet?
Ahren: We have several in-flight proofs of concept. I recently attended KMWorld and heard from other organizations, and I feel like a lot of people are about in the same place where nobody has been a breakout innovator or leader in using it.
Some people have found use cases that were a little bit better for what they do and so they can go farther with it than another company simply because they have a completely different use case.
We do have several things and like many other organizations, we've made AI and GenAI specifically a priority to work on because people are using it — and like other organizations, we do have use cases that could benefit from it.
We're fortunate to have both the taxonomy side and the AI/ML resources in-house, which isn't always, at least in my experience, very common. You had taxonomy and you didn't always have data scientists or machine learning experts, right? You'd have to rely on a vendor. So we have a lot of luxury in that sense.
What it really comes down to is saying, OK, GenAI, everybody's doing it, we're gonna do it. But are we doing it because we're jumping on the bandwagon or we have real use cases? And what are those use cases? We make sure that we put some real clear boundaries and guidelines around this and clear requirements to say, OK, this is in fact a use case which can generate something and be interesting.
Where we've been very fortunate is that although the AI/ML teams and resources could build a lot of this themselves, they recognize the value of taxonomy and we're talking to them all the time. That to me is fantastic because they could just say, We're not going to work with you. We're gonna get our data somewhere else. We're gonna do something else because maybe it's scrappy or faster. But I think everybody has seen that the two have to work together.
And again, we're navigating where is it OK that you just simply don't use taxonomy? Where do you need it and will you use it and will it be robust enough and do we have to do anything in addition to make it more usable for you? I think we're in pretty good position. Again, it's very early days where we're coming up with some things.
There have been some pilots, but I wouldn't be able to estimate a timeline of when you [the public] would start to see those things, or even if you would know that you're starting to see those things. But I think we're doing fairly well in terms of having projects with definitive goals going forward.
And like anywhere else, one thing that's really exciting about Nike is we're not afraid to innovate and we're not afraid to fail. So if we run a project and we get to the end of it and say, nope, that's not going to work or we're not ready for it, or whatever the case is, people are OK with saying, OK, let's not spend any more time on this. We're not ready for it, or it's not gonna give us what we need.
So rather than fail spectacularly by pushing forward and making sure this becomes something that we make public and it flops, we push it forward internally to see whether it's got longevity to it before we put it out there. I think it's great that we're OK with, Well, that didn't work and there aren't huge repercussions. You didn't take the company in a terrible direction and we can do something else.
GenAI for Taxonomy Creation?
Siobhan: That approach completely makes sense especially in a case like this where we don't know all of the potential of the technology. And the only way you're going to learn is by testing out these use cases. The fact that you are identifying specific areas where you're like, Hey, this is where I see it potentially being able to be an additive and to potentially exponentially increase our team size in theory, and then trying it.
Why jump on it just because, Hey, generative AI, here we are. But you do think it has true potential? Part of this is for years we've been hearing AI will come and will automatically tag everything. Nobody will have to tag again in their lives and everyone's like, Yay! No more metadata! So we've been hearing this for so long, but you are seeing real value here.
Ahren: Well, I think in all credit to the people developing these tools out there so that we can use them in our organization. Because I mentioned this KMWorld conference. A few years ago, you would go to this and start talking about AI and you would get eye rolls. You get people being fearful: It's going to come take my job. There was a lot of skepticism. When I went this year [2024], it was more: It's here. It's going to stay. We're not getting rid of it.
How are you going to use it? It's a tool, let's use it. And that's such a short span of time. I attended KMWorld the year before or two years before that. And so in two years, the technology is better. The use cases are clearer. The ability to bring it in and actually use it. It's not this mysterious complex thing as much as it used to be.
You don't necessarily have to rely on a vendor and trust that they know what they're doing. Granted, many people do because it's faster, but it's not a requirement to do that. You can say, We have people internally, what guardrails would we have to put in place to use a publicly available thing like ChatGPT or similar? How can we use that without putting our company at risk? There are better best practices around that now, where you can do it instead of saying, Well, we can't use that, we're just going to close everything off or we've put ourselves at complete risk.
I'll credit our organization for jumping on that very quickly to say, People are going to want to use this. We need to jump in real fast and say, here's how to use it safely and effectively. And we're going to give you the tools to use it safely and effectively because you cannot take part of our code base and check it in ChatGPT in a public place. But we don't want to take that tool away from you. So how do we do it right?
I love that approach where we say, we're not just going to ban it; we're going to do it right. We have to slow it down just a little bit, but not so much that you can't get some benefit from it.
I already see the practical applications for these tools. Taxonomy is by nature a little slow. You have to vet your terms, you add them and you find the business owner to do all these things. But then there are trending things that happen on TikTok or in other social media, and they're moving very fast and the searches move fast.
A wonderful example of that is Barbie pink. You can go to Google Trends and you can see where it spiked, went through the roof for about two months. And then it trailed off. It's not that people don't want Barbie pink anymore, but that was hot. That was really important. And you knew it was coming and you knew if you had products that were pink, you better get them on your website, right?
We don't. We're Nike. We don't sell Barbie products. We don't have a relationship with Mattel. We cannot use the term Barbie on our products because that's not legal. But we can show you pink products because we have them. And that's what we're going to do.
A lot of this becomes overly manual because you're trying to react to it, and we are trying to push forward on that to capture those trending concepts and respond to them. This is one of those places I was mentioning where sometimes there's taxonomy and you say, here's products and here's things that we have. And maybe we supply information about which products we have and the various kinds of pink color that we have. But we're not going to provide you terms that you shouldn't use on our website.
We cannot use a term like Olympics. We cannot use a term like Barbie. These are patented, copyrighted terms. But what we can do is use large language models, use generative AI, to extract concepts, see which of those are ephemeral, compare them to our taxonomies and sort of mash the two together to be able to say, all right, I can slap something onto these products right away, or I can group products based on these trends much more quickly than a human being can do it. Then I can let that trend disappear when it disappears. I don't have to maintain it.
There's a lot of value there to be able to respond much more quickly.
Same thing with generating copy. We have copywriters like most people do. Maybe you can generate copy that reflects trending concepts more quickly than people can write it, right? We've got our product copy, we've been doing it for years, so we've got something to work from. But then we have trending concepts, and maybe you can work some of them into the product copy that's already been written, so that it boosts the findability of a product.
There's a lot of things that you can do that will speed time to market. We don't tend to do a lot of image generation because we have existing products and we want to be particular about this product goes on this type of model or is presented in this kind of lighting or has this kind of background. So we want to be very intentional about that.
Internally, we could use a tool like that to generate new images and then run it through and say, yes, this image is OK for public consumption. But we are not ready to just say, Hey, somebody typed something in, automatically generate an image of our product somewhere. It's too risky still. We could get ourselves into some big trouble. So we're not doing anything quite that experimental, but I could see that potentially happening in the future.
Siobhan: The whole idea of being able to act more quickly and also to use the large language model to actually understand the relationships between your existing taxonomy and these trends, where otherwise you would have to be combing through and asking, does this actually fit? Is this actually a thing? That absolutely makes sense.
Job Security in the Era of Generative AI
Siobhan: You did say when you were talking about KM World about everyone fearing for their job — does this mean that you're feeling pretty good about your job in light of generative AI?
Ahren: I feel more confident about my job. I'm actually surprised by this in a good way, in that there was talk several years ago that you won't need a taxonomy or you won't have to build a taxonomy. This machine will just build it for you. And there are tools that can build and supplement taxonomies. But the prevailing attitude right now is that there must always be a human in the loop and that the best practices of taxonomists as people are always going to be necessary.
Maybe someday these things will be so flawless that they can just generate a taxonomy without any input. Maybe that'll happen, but I'm not too concerned about that. I think that people will need to be involved to curate these things.
These generative AI tools have gotten better. I've seen some of them in action where it says, here's my top-level concept, show me potential child concepts for this. But they're not domain specific.
Memories of the Semantic Web
Ahren: And that's been a problem with the whole semantic web. The semantic web we were promised and that I wish we had, it hasn't really panned out. There's semantic aspects, but it never really panned out because there was this notion that once you build an oil and gas taxonomy or an apparel taxonomy, that everybody will use it.
But nobody wants the generic one. They want theirs, right?
And they don't even want to start with the generic one necessarily and build it to make it theirs. Often they start from scratch and say, I'm not interested in this one that's publicly available.
I'll walk this back a little bit. Pharma, biopharma, medical, those have well-established taxonomies, and you should use them. But in a lot of other industries, we don't want to use that. So this idea that something is out there for us to use and we would just adopt it, that never really happened, unfortunately. I am not fearful for my job or jobs like this because I think people will always need to be involved.
Siobhan: I never thought I'd be looking back fondly on 2008, but that's around when I was thinking the semantic web is going to happen. I remember there was a lot of momentum then.
Ahren: Yes, and some of it kind of happened when you look at Google's Knowledge Graph, that's very semantic web. People do use things like WCA data, but not to any extent. I think there was really this notion that at some point we will build it once, we'll get it right and we'll never have to build it again. I've never seen that to be true.
Siobhan: Not yet.
The Book of Your Organization
Ahren, I really appreciate you taking the time. Is there anything that we didn't cover today that you wanted to touch on? Any final thoughts, final advice for our audience?
Ahren: I've been really thinking about taxonomy as a bridge to knowledge management, as an artifact itself, not just that it points to knowledge. You're tagging knowledge, but that itself is knowledge, right? It's never finished, it's always changing, it's always evolving.
When you look at your taxonomy, it is an indicator of your organizational structure, your organizational mindset and domain viewpoint. It's this thing that must exist. And that's even more important in a world in which we're using large language models to say, Here's the view of the world, here's the view of my organization. I've built this taxonomy, I've built this ontology, this structure to reflect the view.
It's like the book of your organization. And I have built that framework. Now I can compare the two, and that's going to be very important. So I think taxonomies still have a place, especially when they're an organizational viewpoint, to bounce it up against the things that are publicly available to put your perspective as an additional layer on this general or more publicly accessible perspective.
Siobhan: I love this idea of the book of your organization, that's a wonderful way to look at the taxonomy. So thank you for that.
Well, Ahren, I really appreciate you taking the time to chat with me and hope to catch up with you again sometime soon.
Ahren: Thank you, great conversation. Very happy to be here. Thanks very much!
Siobhan: Thank you so much for joining us today. If you enjoyed today's show, please share it with a friend. Word of mouth marketing is the best marketing that anyone can ask for.