Speeding up the ordering process with artificial intelligence


Brainjar/ April 2019

Before going into the technical details, let us first give an overview of the business case. The following numbers are approximations of the actual figures. In Belgium alone, our client receives around 8000 orders per day. Almost half of those orders arrive by email; the rest come in by phone or through the website. Each order requires on average 5 minutes of processing time, which adds up to around 330 man-hours per day for email orders alone. With our order automation application we can reduce the processing time from an average of 5 minutes to an average of 1 minute, a reduction of 80%. That saves around 260 man-hours per day.
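As a rough check of these figures, the arithmetic can be sketched in a few lines of Python. The email share of exactly 50% is an assumption made for simplicity (the text only says "almost half"), and the variable names are purely illustrative.

```python
# Approximate figures from the business case above.
ORDERS_PER_DAY = 8000
EMAIL_SHARE = 0.5      # assumption: "almost half" rounded to exactly half
MINUTES_BEFORE = 5     # average processing time per order, manual
MINUTES_AFTER = 1      # average processing time per order, with the application


def daily_hours(orders: float, minutes_per_order: float) -> float:
    """Return the daily processing time in hours for a number of orders."""
    return orders * minutes_per_order / 60


email_orders = ORDERS_PER_DAY * EMAIL_SHARE
hours_before = daily_hours(email_orders, MINUTES_BEFORE)
hours_after = daily_hours(email_orders, MINUTES_AFTER)
hours_saved = hours_before - hours_after
reduction = 1 - MINUTES_AFTER / MINUTES_BEFORE

print(f"before: {hours_before:.0f} h, after: {hours_after:.0f} h, "
      f"saved: {hours_saved:.0f} h ({reduction:.0%})")
```

With exactly half of the orders arriving by email, this gives about 333 hours before and 267 hours saved. The rounded 330 and 260 quoted above are slightly lower because the real email share is a little under 50%.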

How does Brainjar speed up the order processing?

Brainjar built an end-to-end machine learning application that

  • integrates with the mailbox;
  • classifies which emails can be handled by our software;
  • extracts the information needed to build an order from an email;
  • presents the extracted information in a web interface for review.

To build this application Brainjar used a continuous process of 5 distinct steps:

  • Data analysis: we went on-site at the customer to understand the data and the business process.
  • Research: the field of machine learning advances quickly, so each project requires some additional research.
  • Machine learning: during this stage we train and fine-tune our machine learning models.
  • Integration: we integrate with the client's systems on both the input and the output side.
  • Customer feedback: we periodically go on-site or schedule a call to collect customer feedback.

A bird's-eye view of the application

Classifying emails

The first step was to analyse the workflow: how clients send emails, in which language, and what information the emails contain. We looked not only at the emails used for creating orders but also at the other emails that arrive in the mailbox. The first system we needed was one that can distinguish emails containing orders from those that don't.


Next, we researched the best way to interact with the mailbox, manipulate emails and separate order emails from the rest. To interact with the mailbox we built an API that integrates with both Gmail and Office365. To separate the emails we used a branch of artificial intelligence called Natural Language Processing (NLP). With NLP we can classify emails based on the body of the email. There are many possible NLP approaches to text classification, so research was needed to find the one most suitable for this use case. More on this below.

Extracting information

The next step was to extract information from the emails. We analysed the emails together with their corresponding orders to determine which information is necessary to create an order. To extract that information from an email we used Named Entity Recognition (NER), which locates and classifies specific entities in the text. Here too there are many possible approaches. More on this below.

Web interface

The last step was to decide how to build a user-friendly web interface. TVH already used Angular for its web platform, with its own house style. We used the same technology and layout to make the transition for the employees as intuitive as possible.




Gmail and Office365 API


An important part of this project is manipulating the mailbox in a way that doesn't interrupt the workflow at TVH. This means we needed to be able to move emails around based on their labels or folder structure. That is why we developed an API that integrates with both Gmail and Office365. This API is written in Python and has the following functionality:

  • Move incoming emails to the dedicated label or folder
  • Extract all the information from an email
  • Publish the information for each email to Cloud Pub/Sub
  • Add and remove labels on an email
  • Send and reply to emails
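
To make the "extract and publish" steps more concrete, here is a minimal sketch using only Python's standard `email` and `json` modules. It is not the actual mail API: the field names, the sample message and the helper names `extract_email_info` and `to_pubsub_payload` are assumptions for illustration, and the real service would fetch messages through the Gmail or Office365 APIs and hand the resulting bytes to a Cloud Pub/Sub publisher.

```python
import json
from email import message_from_string
from email.policy import default

# Made-up sample message, used only to exercise the sketch.
RAW_EMAIL = """\
From: customer@example.com
To: orders@example.com
Subject: Order for next week
Message-ID: <abc123@example.com>
Content-Type: text/plain; charset="utf-8"

Hello, please send 4 filters, delivery on 12 May.
"""


def extract_email_info(raw: str) -> dict:
    """Parse a raw email message into a flat dictionary of fields."""
    msg = message_from_string(raw, policy=default)

    def header(name: str):
        value = msg[name]
        return str(value) if value is not None else None

    # Prefer the plain-text body and fall back to HTML if there is none.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content().strip() if body_part is not None else ""
    return {
        "message_id": header("Message-ID"),
        "sender": header("From"),
        "recipient": header("To"),
        "subject": header("Subject"),
        "body": body,
    }


def to_pubsub_payload(info: dict) -> bytes:
    """Serialise the extracted fields to bytes, as a message queue expects."""
    return json.dumps(info).encode("utf-8")


info = extract_email_info(RAW_EMAIL)
payload = to_pubsub_payload(info)
```

Keeping the payload as plain JSON means the backend can consume it without knowing which mail provider the email came from.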


Backend

The backend is the backbone of this project. It interacts with all the other services and makes sure that all emails are processed correctly. This API is written in Python and has the following functionality:

  • Listen on the Cloud Pub/Sub
  • Send the email data to the text classifier
  • Send data to the Named Entity Recognition
  • Send instructions to the mail API
  • Process requests from the web interface
  • Create an order

Text classifier

The text classifier determines whether an email contains an order or not. In the future it can be extended to classify different types of emails, for example order, cancellation, sign-out or quotation. We ended up going for a neural network that is language independent, because TVH offers customer support in 37 languages. Keeping that in mind from the beginning saves time in the long run.

The API works as follows: the body of the email is converted to a sequence of tokens. These tokens are then embedded into vectors. We transform the body into vectors because a neural network cannot process raw text, only numbers. The vectors are then processed by the neural network, which produces an output vector. By adding a classification over the whole text, the API can return whether the email is an order or not.
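
The sequence of steps (tokens, vectors, one decision for the whole text) can be illustrated with a deliberately tiny sketch in plain Python. It is not the model used in the project: the hash-based embeddings, the mean pooling and the single linear layer stand in for the trained neural network, and every name and weight below is a placeholder.

```python
import hashlib
import math
import re

EMBEDDING_DIM = 8  # tiny on purpose; real models use far larger vectors


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (stand-in for a real tokenizer)."""
    return re.findall(r"\w+", text.lower())


def embed(token: str) -> list[float]:
    """Map a token to a deterministic pseudo-embedding with values in [-1, 1]."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [byte / 127.5 - 1.0 for byte in digest[:EMBEDDING_DIM]]


def mean_pool(vectors: list[list[float]]) -> list[float]:
    """Average the token vectors into a single vector for the whole text."""
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify(text: str, weights: list[float], bias: float) -> float:
    """Return a score between 0 and 1 for 'this text is an order'."""
    pooled = mean_pool([embed(token) for token in tokenize(text)])
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


# Untrained placeholder weights: the score is meaningless until trained.
weights = [0.1] * EMBEDDING_DIM
probability = classify("Please send 4 filters by Friday", weights, bias=0.0)
is_order = probability >= 0.5
```

The point of the sketch is the shape of the computation: many token vectors are reduced to one vector, and a single classification is made on that vector.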

Named Entity Recognition

For the Named Entity Recognition we used the same approach as for the text classifier. This means that we can extract entities from emails independently of the language. The difference is that for NER we want to predict a label for each token, not for the whole text. This is done by adding the classification on the output vector of each token, which returns a label per token. With these labels we can extract the necessary entities, such as the delivery date. This API expects a body of text as input and returns all the tokens with their corresponding labels.
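
Once every token has a label, the labels still have to be grouped into entities. The sketch below shows one common way to do that, assuming a BIO tagging scheme (B- starts an entity, I- continues it, O is outside any entity). The scheme, the label names and the function `group_entities` are assumptions for illustration; the article does not specify the actual label format.

```python
def group_entities(tagged_tokens: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Merge consecutive B-/I- tagged tokens into (entity_type, text) pairs."""
    entities: list[tuple[str, str]] = []
    current_type = None
    current_tokens: list[str] = []

    def flush() -> None:
        """Store the entity collected so far, if any, and reset the state."""
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, label in tagged_tokens:
        prefix, _, entity_type = label.partition("-")
        if prefix == "B":
            flush()
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I" and entity_type == current_type:
            current_tokens.append(token)
        else:  # "O" or an inconsistent I- tag ends the current entity
            flush()
    flush()
    return entities


# Hypothetical per-token output of the NER service for one sentence.
tagged = [
    ("Please", "O"), ("send", "O"),
    ("4", "B-QUANTITY"),
    ("filters", "B-PRODUCT"),
    ("on", "O"),
    ("12", "B-DELIVERY_DATE"), ("May", "I-DELIVERY_DATE"),
]
entities = group_entities(tagged)
```

For this input, `entities` holds three pairs: a quantity ("4"), a product ("filters") and a delivery date ("12 May") built from two tokens.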

Web interface

The web interface has two functions. The first is to verify that an order has been created correctly before it is sent to the system. The second is to save the corrections that are made, which helps to improve the text classifier and the Named Entity Recognition. The web interface has a login page, an overview page and an order page.

Result

An email arrives in the mailbox. It is labelled 'In Progress' while all its information is put on the Cloud Pub/Sub. The backend retrieves the information from the Pub/Sub and sends it to the text classifier, which in this example confirms that the email is indeed an order. The backend then sends the body to the Named Entity Recognition, which returns all the entities in the email. The backend uses this information to create an order. Finally, it instructs the mail API to remove the label 'In Progress' and add the label 'Processed'.
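
The flow above can be condensed into a short sketch with in-memory stand-ins for each service. Everything here is hypothetical: the `Mailbox` class replaces the mail API, `is_order` and `extract_entities` are trivial placeholders for the two machine learning services, and the dictionary returned by `process_email` only hints at what a real order would contain.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Mailbox:
    """In-memory stand-in for the mail API: tracks labels per email id."""
    labels: dict[str, set[str]] = field(default_factory=dict)

    def add_label(self, email_id: str, label: str) -> None:
        self.labels.setdefault(email_id, set()).add(label)

    def remove_label(self, email_id: str, label: str) -> None:
        self.labels.setdefault(email_id, set()).discard(label)


def is_order(body: str) -> bool:
    """Placeholder for the text classifier service."""
    return "order" in body.lower()


def extract_entities(body: str) -> dict[str, str]:
    """Placeholder for the NER service; returns fixed, made-up entities."""
    return {"delivery_date": "12 May"}


def process_email(email: dict, mailbox: Mailbox) -> Optional[dict]:
    """Run one email through the flow and return the draft order, if any."""
    if not is_order(email["body"]):
        return None  # handling of non-order emails is not described here
    order = {"source_email": email["id"], **extract_entities(email["body"])}
    mailbox.remove_label(email["id"], "In Progress")
    mailbox.add_label(email["id"], "Processed")
    return order


mailbox = Mailbox()
incoming = {"id": "msg-1", "body": "Order: 4 filters, delivery on 12 May."}
mailbox.add_label(incoming["id"], "In Progress")  # set when the email arrives
order = process_email(incoming, mailbox)
```

After the call, `msg-1` carries only the 'Processed' label and `order` holds the draft that an employee would review in the web interface.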

Now an employee of TVH can log in to the web interface and see a list of emails that need reviewing or have already been reviewed. When the employee opens an order, they see the original email on the left-hand side and the order on the right-hand side. The employee can then make changes as needed and click submit when done. These corrections are used to improve the different AI solutions, reducing the processing time even further.

The future

Brainjar is still working with TVH on improving the application even further. Right now we are working on PDF support. Many TVH customers place their orders as PDFs, and the difficulty is that each customer has its own layout and PDF structure. The next step is to process those emails as well. Another future extension is adding more classes of emails, for example stop orders and change orders.

Experts who turned this story into a success

Niels Debrier - https://www.linkedin.com/in/niels-debrier

Maarten Bloemen - https://www.linkedin.com/in/maarten-bloemen

Adriaan Lemmens - https://www.linkedin.com/in/adriaan-lemmens

Rafaël Mindreau - https://www.linkedin.com/in/rafaël-mindreau-93503957

Kurt Janssens - https://www.linkedin.com/in/koert/

Tom Vermeulen - https://www.linkedin.com/in/tomv