5 Ways to Use Machine Learning in Knowledge Management

Most knowledge subject to a formal management process is available in written form. Such text collections contain more information than what is written in each individual document. Analyzing the collections with AI methods like Machine Learning (ML) can support the knowledge management process in multiple ways and make it possible to gain additional insights and advantages. This article provides some ideas on how to gain additional benefits from Knowledge Management (KM) using ML.

Example 1: Problem Management

Let’s start with knowledge in its second most common form: informal text with little to no structure, in documents that are not subject to any KM process. (The most common form is inside people’s heads.) Examples are error reports, maintenance reports, and repair work protocols. They may come from technicians trying to solve an issue or from users contacting support. A lot of valuable information is hidden in a collection of such documents. Which errors are most common? Which solution has the highest success rate? Are there areas where additional employee or user training might be needed?

This is how you get that information. First, tag each report with the error it describes. This can be done with either supervised or unsupervised ML. In supervised ML, you first manually tag a large enough number of documents and then train the system to guess the error from the description. If the error is described with a code, this part is trivial. If the description is free text listing symptoms, it is more complicated.

If the number of possible errors is high and symptoms vary a lot, unsupervised learning might be the better choice. The system groups the descriptions by similarity of symptoms, and afterwards you manually tag each group with the corresponding error. The drawback is that you might not get exactly one group per error. There might be cases where the system can only narrow down the possible errors but cannot provide a clear decision.

Once the system can tell which issue a document is about, you can monitor how often each issue occurs. Here are some examples of the advantages you get from this:

You can find the weaknesses of a product by looking at its most common errors.
You know there is an issue with a new update or a special product batch if numbers rise right after the release.
You can optimize your training programs to focus on the common issues.
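
The supervised variant of the tagging step can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the reports, the error tags, and the NaiveBayesTagger class are all made up for this example, and a small Naive Bayes classifier stands in for whatever model you would actually train.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a report and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTagger:
    """Tiny multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # reports per error tag
        self.word_counts = defaultdict(Counter)  # word frequencies per tag
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            n_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of every word.
            score = math.log(n_docs / total_docs)
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, manually tagged reports (the training set).
tagged_reports = [
    ("printer shows paper jam and stops feeding", "paper_jam"),
    ("paper stuck in tray, jam warning on display", "paper_jam"),
    ("device does not power on, no lights", "power_failure"),
    ("no power after pressing the button, display stays dark", "power_failure"),
]
texts, labels = zip(*tagged_reports)
tagger = NaiveBayesTagger().fit(texts, labels)

# Untagged incoming reports: tag them, then count how often each issue occurs.
new_reports = [
    "jam warning again, paper stuck",
    "unit stays dark and will not power on",
    "paper jam in tray two",
]
predicted = [tagger.predict(report) for report in new_reports]
issue_counts = Counter(predicted)
print(issue_counts.most_common())
```

With real data you would need far more tagged reports per error, and you would check the tagger against a held-out set before trusting the counts.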

Example 2: Suggesting Solutions

Guessing the error was only step one. Step two is to train the system not only to recognize the error, but also to suggest a solution. Take the repair protocols for an error and train the system to detect similar procedures. (If you have information on whether the solution worked, use only the successful procedures.) For each group of similar procedures, you write an instruction. You can do this either manually, especially for common or high-impact issues, or use a text generation algorithm to create an instruction based on the repair descriptions you already have.

Because the system can tell the error from the description, it can now suggest the most common solution to that error (and, if you want, also the second most common in case the first one did not work or is not applicable for some reason). This makes the investigation process much more efficient. As a bonus, your technicians no longer need to write the repair protocols themselves, as the system can pre-fill them in most cases.

How well a system like this works depends on several factors. The most important are the number of available documents, the number of errors, and the variety among the solution descriptions. The more documents per error and the less variety, the better the system will perform. Even with a great AI model, you should not blindly trust the suggestions. But having them is a big advantage.
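
Once repair protocols have been grouped into procedures, picking the suggestions is mostly counting. The sketch below assumes the grouping has already happened; the repair_log entries, the procedure names, and both helper functions are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical repair log: (error tag, procedure group, did the repair work?).
repair_log = [
    ("paper_jam", "clear_tray", True),
    ("paper_jam", "clear_tray", True),
    ("paper_jam", "replace_roller", True),
    ("paper_jam", "restart_device", False),
    ("power_failure", "replace_fuse", True),
]


def build_suggestions(log):
    """Count successful procedures per error, ignoring failed attempts."""
    per_error = defaultdict(Counter)
    for error, procedure, worked in log:
        if worked:
            per_error[error][procedure] += 1
    return per_error


def suggest(per_error, error, n=2):
    """Return up to n procedures for an error, most frequent first."""
    counts = per_error.get(error, Counter())
    return [procedure for procedure, _ in counts.most_common(n)]


suggestions = build_suggestions(repair_log)
print(suggest(suggestions, "paper_jam"))
```

For the sample log this prints the most common successful fix first and the runner-up second. The failed "restart_device" attempt never appears because unsuccessful repairs are filtered out before counting.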

Example 3: Efficient Search

The next level is having the information in some sort of central system like a wiki, SharePoint, or the knowledge module of a ticketing system. In that system you most likely have some search implemented to allow users to quickly find what they need. Search engines are very sophisticated these days and sometimes even employ AI technologies for purposes like ranking or spellchecking.

Good result ranking is especially important for search. If the result you are looking for is at position 24 of the result list, that is hardly better than it not being included at all. Neither the number of times your terms are used in a document nor its click rate necessarily determines its usefulness in your situation. What you need are the pages most used in cases like yours. While ranking results, the search engine should consider which pages users with a similar interest read, which results they skipped or closed after a short look, and which document they finally used. Web analytics and user tracking can provide such data.

To find out which users are looking for the same information, several techniques can be used. Comparing the search terms is straightforward, but might suffer from the use of synonyms, different languages, or even misuse of terms. Defining and training intents is an alternative. The technique is primarily used in chatbots to extract the information need from free text input, but as the use case is similar in search, it can easily be transferred. Collect search queries that aim for the same information, use them to train the system to recognize the intent, and then let the search check whether a new query matches one of the intents. If so, rank the results usually used by users with this intent higher. The drawback of this method is that defining intents is not that easy. However, there are other ML techniques that can suggest new intents to add based on the search requests.
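
The re-ranking step can be illustrated with a small sketch. Note the simplification: instead of a trained intent model, it matches queries to intents by plain token overlap, which is exactly the straightforward comparison that struggles with synonyms. The intents, example queries, page names, and usage counts are all invented for this example.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical intents: example queries plus how often users with that
# intent ended up using each page (taken from analytics data).
intents = {
    "reset_password": {
        "queries": ["reset my password", "forgot password"],
        "used_pages": {"password-reset-guide": 40, "account-faq": 5},
    },
    "vpn_setup": {
        "queries": ["set up vpn", "vpn connection at home"],
        "used_pages": {"vpn-install": 30},
    },
}


def match_intent(query, threshold=0.3):
    """Return the best-matching intent name, or None if nothing is close."""
    query_tokens = tokens(query)
    best, best_score = None, 0.0
    for name, intent in intents.items():
        score = max(jaccard(query_tokens, tokens(q)) for q in intent["queries"])
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= threshold else None


def rerank(query, results):
    """Move pages that users with the same intent actually used to the top."""
    intent = match_intent(query)
    if intent is None:
        return list(results)  # no known intent: keep the engine's order
    usage = intents[intent]["used_pages"]
    # sorted() is stable, so pages with equal usage keep their original order.
    return sorted(results, key=lambda page: -usage.get(page, 0))


engine_results = ["account-faq", "it-news", "password-reset-guide"]
print(rerank("I forgot my password", engine_results))
```

In a real system, match_intent would be replaced by the trained intent classifier, and the usage counts would be refreshed regularly from tracking data.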

Example 4: Personalization

For KM systems with a wide range of users, the challenge is to provide everyone with what they need and keep them updated on changes, without making them search for it among content not relevant to them or burying them in notifications. You need to personalize your content delivery. Content should know to which users it is relevant.

To get there, we again collect information via web analytics and user tracking. This time we are interested in who uses which content. Then we use ML to build user groups based on user behavior. In most scenarios, every user will be a member of multiple groups. Once the groups are learned, the system can add users to them automatically. However, assigning users manually should be possible in addition to that.

For the content, you do the same. You train the system to predict which user groups might be interested in a document by looking at the groups interested in similar documents. Now, when a new document is added, the system can notify the users it is relevant for, place it at a prominent spot in their view of the interface, and hide it from other users.
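
The content side of this can be sketched as a nearest-neighbor lookup. The example assumes the user groups already exist and that tracking data tells us which groups read which document. The documents, group names, and the predict_groups helper are hypothetical, and token overlap again stands in for a proper document similarity model.

```python
import re
from collections import Counter


def tokens(text):
    """Return the set of lowercase word tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical existing documents with the user groups that read them.
documents = {
    "vpn-install": {
        "text": "how to install and configure the vpn client",
        "groups": {"remote_workers", "it_staff"},
    },
    "expense-policy": {
        "text": "travel expense policy and reimbursement rules",
        "groups": {"sales", "managers"},
    },
    "vpn-troubleshooting": {
        "text": "fix vpn client connection errors",
        "groups": {"remote_workers"},
    },
}


def predict_groups(new_text, k=2, min_similarity=0.1):
    """Let the k most similar documents vote for the groups to notify."""
    new_tokens = tokens(new_text)
    scored = sorted(
        ((jaccard(new_tokens, tokens(doc["text"])), name)
         for name, doc in documents.items()),
        reverse=True,
    )
    votes = Counter()
    for similarity, name in scored[:k]:
        if similarity >= min_similarity:  # ignore unrelated documents
            votes.update(documents[name]["groups"])
    return dict(votes)


print(predict_groups("new vpn client update rollout"))
```

Groups with more votes are stronger candidates for a notification or a prominent placement; groups with no votes would not see the document highlighted.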

Example 5: Quality Monitoring

User feedback is vital to KM. Without it, you are missing out on an important stakeholder group and risk the acceptance of the KM program. There are many ways to gather feedback: ratings, surveys, user tracking, and more. The best way to gather feedback is enabling comments. Comments allow users to give a detailed opinion. They can ask questions, point out minor weaknesses, and give input, which engages them directly in the improvement process. And in contrast to a survey, comments do not need much preparation and, on a small scale, little interpretation.

However, when the number of comments on your content grows large, moderating discussions can become time-consuming. In addition, it becomes nearly impossible to grasp the overall mood of all comments on a document. Luckily, both issues can be addressed with the same method: tag a set of comments with the information you need for your comment monitoring, then train the system to recognize these categories from the text. In a marketing context, this is called sentiment analysis, since the desired information is whether customers like or dislike a brand or product. In KM, however, other categories are important, e.g. whether a comment is a question, a critique, a suggestion, or praise. Questions and critique should be addressed by the moderator or content owner within a short period of time, while a suggestion might be relevant only at the next content review. Praise, while being the reaction you hope for, might not require any reaction at all. Sorting comments this way using ML decreases the workload for moderators and content owners.

The same information can be used for quality monitoring. While a high number of comments tells you that a piece of content is noticed, it does not tell you whether that is because the content is useful and important or because it is insufficient. The ratio of the different kinds of comments can tell a lot more. The meaning of praise and critique is obvious. High numbers of questions and suggestions mean the content is important and used (and thus has some quality) but might need refinement. This way, you can use the comments to monitor the quality of the content, improving it where the need is greatest and noticing early if the quality drops or falls short of expectations.

These were only 5 examples of the benefits of combining KM and ML. The implementation of AI often fails because of missing data, but KM can provide data in the form of text. And with the data available, all you need to start is knowing what you want to know. There is so much more possible if you keep in mind that in KM the whole provides more information than its parts.
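
As a small illustration of the comment ratios from Example 5, the sketch below assumes every comment has already been assigned one of the four categories by a trained classifier. The helper names and the threshold are made-up values for this example, not recommendations.

```python
from collections import Counter

CATEGORIES = ("question", "critique", "suggestion", "praise")


def comment_profile(categories):
    """Return the share of each comment category for one document."""
    counts = Counter(categories)
    total = sum(counts[c] for c in CATEGORIES)
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}


def needs_refinement(profile, threshold=0.5):
    """Flag content whose comments are mostly questions and suggestions.

    The threshold is a hypothetical value; tune it on your own content.
    """
    return profile["question"] + profile["suggestion"] >= threshold


# Hypothetical classifier output for the comments on one document.
labels = ["question", "question", "suggestion", "praise"]
profile = comment_profile(labels)
print(profile)
print(needs_refinement(profile))
```

Tracking these shares per document over time shows where refinement is needed most and makes a sudden rise in critique visible early.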