Knowledge discovery is a well-known discipline in the data mining field. But what happens when you add a layer of intelligence? Read on to learn more.
Knowledge discovery is the process of extracting knowledge or understanding about a dataset. Traditionally, we use knowledge discovery to identify patterns within relational databases. However, you can also apply it to other data sources including text, graphs, and software code.
Applications of knowledge discovery
There are several well-known use cases for knowledge discovery. Let’s look at a few of them and then see how AI can augment the process.
Patent searching and discovery
Patents are big money. When you take out a patent, you get the exclusive right to sell products and services based on the described invention. During the patenting process, it is essential to show that the idea being patented is new and that it involves an inventive step. To achieve this, you need to hunt for any earlier patents or publications that describe something similar, known as prior art. This process is known as patent searching.
Patent searching can be very laborious and time-consuming. Typically, you need to employ an expert for this and can expect to pay thousands. Even then, there’s no guarantee they will find every piece of prior art. Alternatively, you can use a computer to help with the process. Traditional computer-powered patent searching uses text matching and other techniques to try to identify similar patents. However, you can also apply AI to the problem. To do this, you combine natural language processing (NLP) with machine learning to identify which patents are likely to overlap with the current one. As an extension of this, you can use the same techniques to spot opportunities for new patents by performing a gap analysis on existing ones. As a result, AI is becoming an important tool for patent discovery.
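To make the traditional text-matching approach concrete, here is a minimal sketch in Python. It scores each earlier document against a new application using TF-IDF weights and cosine similarity. The function names and the sample texts are invented for illustration, and a real patent search tool would use far richer techniques than this.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF: rare terms weigh more than common ones.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_prior_art(application, corpus):
    """Return (score, document) pairs, most similar document first."""
    vectors = tfidf_vectors([application] + corpus)
    query, rest = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), doc) for vec, doc in zip(rest, corpus)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up example texts, not real patents.
corpus = [
    "Method for brewing coffee with a heated water chamber",
    "Unmanned aerial vehicle with camera-based obstacle detection during flight",
    "Bicycle frame made from recycled steel",
]
application = "A drone that uses a camera to detect and avoid obstacles during flight"

for score, doc in rank_prior_art(application, corpus):
    print(f"{score:.2f}  {doc}")
```

Notice that the match relies on shared words such as “camera” and “flight”. The words “drone” and “unmanned aerial vehicle” contribute nothing, even though they mean the same thing. That gap is what the NLP-based approach tries to close.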
Case deflection
Your customer service agents are often inundated with simple support requests. Dealing with these can cost your company a lot of time and hence money. Alternatively, you can use case deflection to try to reduce the number of support requests you have to handle. You have probably seen these sorts of systems. The simplest ask you a series of questions “to help ensure the right person deals with your request”. Before you can actually submit the support request, they will display a set of “related help topics”.
Smarter versions use keyword searching to provide you with relevant help topics and threads from community support channels. For instance, if you type “How do I reset my password”, they will find articles relating to password reset. The most intelligent systems use AI-powered knowledge discovery coupled with natural language processing. These can provide you with answers to even quite complex requests.
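The keyword-searching approach can be sketched in a few lines of Python. This is an illustrative toy, not how any particular support product works. The stopword list, the function names, and the sample help articles are all assumptions made for the example.

```python
import re

# A tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"how", "do", "i", "my", "a", "an", "the", "to", "is", "can", "for", "of", "in"}


def keywords(text):
    """Return the set of lowercase words in the text, minus stopwords."""
    return {word for word in re.findall(r"[a-z]+", text.lower()) if word not in STOPWORDS}


def suggest_articles(request, articles, limit=3):
    """Return titles of articles that share keywords with the request, best first."""
    wanted = keywords(request)
    scored = []
    for title, body in articles.items():
        overlap = len(wanted & keywords(title + " " + body))
        if overlap:
            scored.append((overlap, title))
    # Highest overlap first; ties are broken alphabetically by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]


# Made-up help articles for illustration.
articles = {
    "Reset your password": "Use the reset link on the sign-in page to choose a new password.",
    "Update billing details": "Change the card used for your subscription.",
    "Change your email address": "Edit your profile to update the address we sign you in with.",
}

print(suggest_articles("How do I reset my password", articles))
print(suggest_articles("I forgot my login", articles))
```

The first request finds the password article because it shares the words “reset” and “password”. The second request finds nothing, because no article contains “forgot” or “login”, even though a person would see at once that it is the same problem. Handling requests like that is where natural language processing earns its keep.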
Legal discovery
If you are prosecuting or defending a corporate legal case, you need to find all documents relating to the case. In a large corporation, you may have tens of thousands of documents and millions of emails to search through. You could choose to do this in a relatively simple fashion and just construct keyword searches. But this will still give you hundreds of thousands of pages of text to read. Knowledge discovery techniques can speed this process up. Rather than just returning results based on your keywords, they give you results that are contextually related to the case.
AI takes this to another level. Using AI-powered knowledge discovery, you can search all the documents and emails to find all relevant pieces of text. These can be ranked in order of significance and displayed in context. In effect, you are teaching the computer to mimic how you would assess the relevance of each document.
Knowledge discovery and the COVID-19 fight
Knowledge discovery also has very topical applications. We are all being hugely affected by the COVID-19 pandemic. COVID-19 is caused by the SARS-CoV-2 virus. Fighting this disease requires us to do one of three things. Firstly, we can try to find a vaccine. This will be used to provoke the body into creating antibodies to fight the virus. Secondly, we can identify an existing drug that has the potential to reduce the impact of the virus. Or thirdly, we can identify a new molecule that might form the basis of a new drug treatment.
AI-powered drug discovery
Molecular graphs let you describe complex molecules and proteins in a way that a computer can understand. In a molecular graph, each atom is a node and each chemical bond is an edge between two nodes. Once you have the concept of a molecular graph, you can apply graph mining to help with various problems. For instance, you can use it to identify chemically similar molecules. Or you can identify new candidate drugs for treating diseases. You can also use AI-powered knowledge discovery to process the scientific literature, and combine both approaches to identify potential candidate drugs for treating COVID-19.
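To give a feel for how this works, here is a small Python sketch that stores a molecule as a graph (a list of atoms plus a list of bonds) and compares molecules by the bonds they have in common. The bond-counting “fingerprint” and the function names are simplifications invented for this example. Real cheminformatics toolkits use much more sophisticated fingerprints, and hydrogen atoms are left out here to keep things short.

```python
from collections import Counter

BOND_SYMBOLS = {1: "-", 2: "=", 3: "#"}  # single, double, triple


def bond_features(atoms, bonds):
    """Summarize a molecular graph as a multiset of bond types, e.g. 'C-O'.

    atoms is a list of element symbols (the nodes).
    bonds is a list of (atom_index, atom_index, bond_order) tuples (the edges).
    """
    features = Counter()
    for i, j, order in bonds:
        # Sort the two elements so that C-O and O-C count as the same feature.
        first, second = sorted((atoms[i], atoms[j]))
        features[f"{first}{BOND_SYMBOLS[order]}{second}"] += 1
    return features


def tanimoto(features_a, features_b):
    """Tanimoto similarity: shared features divided by all features."""
    shared = sum((features_a & features_b).values())  # element-wise minimum
    total = sum((features_a | features_b).values())   # element-wise maximum
    return shared / total if total else 0.0


# Heavy atoms only; hydrogens are omitted for brevity.
ethanol = bond_features(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
propanol = bond_features(["C", "C", "C", "O"], [(0, 1, 1), (1, 2, 1), (2, 3, 1)])
acetone = bond_features(["C", "C", "C", "O"], [(0, 1, 1), (1, 2, 1), (1, 3, 2)])

print(round(tanimoto(ethanol, propanol), 2))  # 0.67
print(round(tanimoto(ethanol, acetone), 2))   # 0.25
```

Ethanol and propanol are both alcohols, so they score as more similar to each other than ethanol and acetone do. Graph mining applies the same idea at a much larger scale, across very large collections of molecules.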
AI protein folding
One of the most complex problems in biomedicine is protein folding. Solving it is essential for understanding how our bodies work and how diseases and viruses affect us. Every protein folds into a characteristic 3D structure, and this structure is key to how the protein works. For instance, if you look at coronaviruses like SARS-CoV-2, you can see a large number of protein spikes sticking out. These spikes have a shape that binds to receptors on the cells in your body. This allows the virus to enter your cells and hijack them. Our antibodies work by recognizing these spikes and latching on to them, which stops them from binding to the receptors. The protein folding problem is about predicting the 3D shape a protein will adopt from its sequence of amino acids, so you can understand how it behaves and what might bind to it. Until recently, this problem required sheer computing power and a lot of time. But now, we can use deep learning to predict structures far more quickly.
Using Sonasoft Saibre for Knowledge Discovery
As we have seen, AI can significantly boost the performance of knowledge discovery. However, as you may well have found, creating machine learning models for AI projects can be a long and complex task. Fortunately, Sonasoft Saibre can dramatically speed this process up. Saibre is a universal AI platform that autonomously learns from your raw data. You can give Saibre data in almost any format including text, numerical data, graphs, or images.
Saibre takes your data and processes it to identify any interesting patterns and correlations. From these it tries to form some hypotheses. In other words, it tries to learn why each pattern exists. Uniquely, it then tests its hypotheses for causality and rejects any that are weak. Next, it creates and tests a large number of machine learning models that try to replicate the identified patterns. Once Saibre has a working model, it packages it up in a usable form (which we call a bot).
You can use Saibre bots for a huge number of AI tasks. One of the most useful is knowledge discovery. For instance, we have a demo that shows how Saibre can ingest a complex set of company HR policies and then answer freeform questions such as “Can I take more time off this year?” Knowledge discovery is also a key part of our AURA engine for support case management. If you would like to see a demo, reach out to us.