April 22, 2022

What can I do with my data?

Caroline Zabarowski

One of the most common questions we hear from potential clients is what sort of AI application we can develop for them. In other words, they want us to answer the question “What can I do with my data?” Unfortunately, until we actually see your data, we can’t give a proper answer. So instead, let’s look at how we apply a methodology called CRISP-DM when we explore your data.

What is CRISP-DM?

CRISP-DM stands for Cross-Industry Standard Process for Data Mining. It is a well-established framework to deliver AI projects with real business value. There are 6 stages within CRISP-DM, as shown in the diagram: business understanding, data understanding, data preparation, modeling, evaluation (called validation later in this post), and deployment. The arrows show the most common flow through the process, but it is possible to iterate through the steps more than once.

For this blog we will focus on the first three stages.

Business understanding

We need to fully understand your business before we can begin to explore what you can do with your data. This includes exploring your use cases for AI, understanding your priorities, and knowing how your data is generated and consumed. In some cases, clients don’t know exactly what AI can do for them. In those cases, we combine this phase with the data understanding phase.

Types of AI application

As a rough guide, AI applications we develop tend to fall into three categories:

  1. Forecasting: Here, we use historical data, current data, and current conditions to forecast some future event. This could be forecasting demand for electricity, predicting the correct stock levels, or predicting staffing levels.
  2. Classification: These models assign data items to categories based on multiple data dimensions. This supports tasks like credit scoring, recommendations, and lead scoring.
  3. Anomaly detection: With these models, the aim is to spot data items that don’t fit the usual pattern. Two classic use cases for this are fraud detection and identifying unusual activity in your backend systems.
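
To make the third category concrete, here is a minimal Python sketch of anomaly detection on a single numeric series. It is an illustration only, not a description of any production model: the data, the function name `flag_anomalies`, and the choice of a modified z-score (based on the median absolute deviation, with the commonly used cutoff of 3.5) are all assumptions made for this example.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # No spread to measure against, so nothing can be flagged.
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical daily transaction amounts with one unusual entry.
daily_amounts = [102, 98, 105, 101, 99, 100, 540, 103, 97]
anomalies = flag_anomalies(daily_amounts)
print(anomalies)
```

A median-based score is used here rather than a mean-based one because a single extreme value can inflate the mean and standard deviation enough to hide itself. Real fraud or system-monitoring use cases typically involve many data dimensions and need a richer model than this.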

Data understanding

The first aim here is to identify and import all your data into SAIBRE (our AI ecosystem). This is important so that we can start to look at the overall properties of the data. What is it actually showing? Is it time series or discrete? Are there missing records? How was it generated? We also start to visualize the data and try to identify patterns and correlations.
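
As a rough illustration of the questions above, the following Python sketch profiles a tiny, made-up dataset: it counts missing values per column, looks for gaps in a daily time series, and measures the correlation between two columns. The records, column names, and helper functions are hypothetical and use only the standard library; this is not SAIBRE code, and a real study would work with far larger data and dedicated tooling.

```python
from collections import Counter
from datetime import date, timedelta
from math import sqrt

# Hypothetical daily records; None marks a missing value, and one day is absent.
records = [
    {"date": "2022-01-01", "temperature": 4.0, "demand": 310.0},
    {"date": "2022-01-02", "temperature": 6.0, "demand": 295.0},
    {"date": "2022-01-03", "temperature": None, "demand": 300.0},
    {"date": "2022-01-05", "temperature": 10.0, "demand": 250.0},
    {"date": "2022-01-06", "temperature": 12.0, "demand": None},
]


def missing_counts(rows):
    """Count missing (None) values per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            counts[column] += value is None  # True adds 1, False adds 0
    return dict(counts)


def missing_dates(rows):
    """List the calendar days between the first and last record that have no row."""
    seen = {date.fromisoformat(row["date"]) for row in rows}
    day, last = min(seen), max(seen)
    gaps = []
    while day <= last:
        if day not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


def pearson(rows, x, y):
    """Pearson correlation over the rows where both columns are present."""
    pairs = [(r[x], r[y]) for r in rows if r[x] is not None and r[y] is not None]
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / sqrt(var_x * var_y)


missing = missing_counts(records)
gaps = missing_dates(records)
correlation = pearson(records, "temperature", "demand")
print(missing, gaps, round(correlation, 3))
```

In this toy data, demand falls as temperature rises, so the correlation is strongly negative. Whether a pattern like that is meaningful or accidental is exactly the kind of question that needs a subject matter expert.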

This phase of CRISP-DM is often run in parallel with the business understanding phase. Typically, we can only be certain of which use cases will work once we start exploring the data. Therefore, we work closely with your subject matter experts to assess whether the patterns we are seeing are just accidental or will be useful to an AI model. At this stage, we can properly answer your question “What can I do with my data?”

Data preparation

Next, we start to explore the data in a bit more detail. The aim is to engineer the data into a form that is suitable for modeling. This may require merging tables or using techniques like imputation to fill in missing data. As part of this, we start to construct our data processing pipelines that will be used to handle the data in production. 
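
The two techniques mentioned above, merging tables and imputation, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tables, the column names, the helpers `merge_on` and `impute_mean`, and the choice of mean imputation, which is only one of several possible strategies and not necessarily the one used in any given project.

```python
from statistics import mean

# Hypothetical tables that share a store_id key; None marks a missing value.
stores = [
    {"store_id": 1, "region": "North"},
    {"store_id": 2, "region": "South"},
    {"store_id": 3, "region": "North"},
]
sales = [
    {"store_id": 1, "weekly_sales": 1200.0},
    {"store_id": 2, "weekly_sales": None},
    {"store_id": 3, "weekly_sales": 1000.0},
]


def merge_on(left, right, key):
    """Inner-join two lists of dicts on a shared key (one row per key in `right`)."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def impute_mean(rows, column):
    """Return new rows with missing values in `column` replaced by the observed mean."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = mean(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


merged = merge_on(stores, sales, "store_id")
prepared = impute_mean(merged, "weekly_sales")
print(prepared)
```

Both helpers return new rows rather than modifying their inputs, which makes each step easy to rerun and test. That property matters once steps like these are chained into a pipeline that has to behave the same way in production.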

By the end of this phase, we will be confident about whether the data supports the use case. If it does, we will be ready to proceed to building a proof of concept (the modeling and validation phases in CRISP-DM). If not, we will give you clear guidance on how to collect the required data.

Mapping this to our commercial process

So how does CRISP-DM look in practice? Over the years, we have developed a clear business process for delivering our AI solutions. The process ensures you retain control and aren’t caught out by unpredictable costs or time overruns. It maps neatly to CRISP-DM as shown in the table below. 

| Stage | What we do | Duration | CRISP-DM |
|---|---|---|---|
| Data workshop | Establish if you are likely to be able to benefit from AI | 2-4 hours | Business understanding |
| Feasibility study | Analyze all your data and establish exactly what we can do with AI | 4-6 weeks | Data understanding, data preparation |
| Proof of concept | Build a production-ready AI application and demonstrate that it delivers required business value | 3-5 months | Modeling, validation |
| Deploy to production | Run your new AI application in either Sonasoft infrastructure or your own backend. This includes our smart monitoring | Monthly fees | Deployment |

If you would like to find out more about how we can help you deliver end-to-end AI applications, reach out to our sales team today. After an initial discussion we can set up a data workshop session and start to explore possibilities.
