Classifying 100k bank transactions: a hybrid approach

A combination of NLP and expert rules outperformed other strategies for categorising large numbers of bank transactions.

Arjan van der Gaag

Aug 27, 2020

machine-learning

How do you quickly categorise hundreds of thousands of bank transactions in order to determine loan application risk? At Floryn, we have seen great results with a hybrid system of domain rules and machine learning to support the speed and quality of our risk assessments for business loans.

Categorisation by rules

Given a single loan application, domain experts can pore over all the customer’s bank transactions to assess the risk of extending credit, for example by classifying certain transactions as income based on the description, amount or counterparty.

Table 1: categorisation of three example bank transactions using domain rules
Amount	Details	Notes
€2.000,00	Bob’s Transport Company Invoice ABC123	Positive amount referring to “invoice”: category revenue
€-4.103,00	Alice Salary August 2020	“Salary” mentioned: category staff
€5.000,00	Shady Inc Weapon parts	Blacklisted company name: category suspect
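To make the table concrete, here is a minimal Python sketch of how rules like these could be written down. It is an illustration only, not our production rule set: the `Transaction` type, the `BLACKLIST` contents, the rule order and the "unknown" fallback are all assumptions made for this example.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical blacklist; the real list is not part of this post.
BLACKLIST = {"shady inc"}


@dataclass(frozen=True)
class Transaction:
    amount: Decimal
    details: str


def categorise_by_rules(tx: Transaction) -> str:
    """Return the category of the first matching rule, or "unknown"."""
    text = tx.details.lower()
    # Check the blacklist first so a suspect counterparty is never
    # masked by a more innocent-looking keyword.
    if any(name in text for name in BLACKLIST):
        return "suspect"
    if "salary" in text:
        return "staff"
    if tx.amount > 0 and "invoice" in text:
        return "revenue"
    return "unknown"


examples = [
    Transaction(Decimal("2000.00"), "Bob's Transport Company Invoice ABC123"),
    Transaction(Decimal("-4103.00"), "Alice Salary August 2020"),
    Transaction(Decimal("5000.00"), "Shady Inc Weapon parts"),
]

for tx in examples:
    print(tx.amount, tx.details, "->", categorise_by_rules(tx))
```

Even in this toy version the rules depend on exact keywords, so a description that uses a different word for the same thing does not match any of them.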

When you deal with many thousands of customers and hundreds of thousands of bank transactions every month, automation becomes crucial. We had previously devised a system that automatically categorises new transactions based on a set of business rules defined by our domain experts. However, we have found that this approach has two shortcomings:

It requires great investment in time and energy to design and maintain such expert rules.
Static rules do not deal well with the variance in input data, such as misspellings, synonyms, different contexts and new entities.

The latest developments in Natural Language Processing (NLP) seemed to offer a compelling alternative to a rules-based categorisation system. After evaluating different approaches we settled on BERT. BERT is a method of pre-training language representations: a general-purpose “language understanding” model is first trained on a large text corpus (like Wikipedia), and that model is then used for the downstream NLP tasks that we care about (like labelling transactions).

Transfer learning with NLP

Some of the most exciting breakthroughs in NLP, such as ELMo, BERT, ULMFiT and XLNet, are centred around transfer learning: a language model is first pre-trained on extremely large amounts of data, after which it can be applied to different tasks, such as categorising bank transactions. Although this is promising, we faced two challenges.

Domain-specific language: bank transaction descriptions commonly consist of keywords, abbreviations and codes, which are far removed from the inputs the model is pre-trained on. This requires us to re-train the model on a representative dataset of sufficient size.
Absence of ground truth: although we did have a dataset that was categorised with our rules-based system, we knew it contained mistakes. We did not want to train our model on those mistakes, since that would defeat the purpose of improving on the results.

What worked in our favour is that we had an existing dataset labelled both by humans and by the existing rule system. Using an uncertainty model, we extracted an input dataset from our pre-existing expert-labelled transactions by looking up the edge cases where the NLP model has low confidence in the applied label.
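The selection of edge cases can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not our actual uncertainty model: it assumes the model emits one raw score (logit) per category, takes the softmax probability of the applied label as the confidence, and uses a hypothetical threshold of 0.9. The function names and the sample data are made up for the example.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def select_for_review(rows, categories, threshold=0.9):
    """Pick transactions whose applied label the model is unsure about.

    rows: iterable of (transaction_id, applied_label, logits), with one
    logit per entry in `categories`. Returns (transaction_id, confidence)
    pairs, least confident first.
    """
    selected = []
    for tx_id, applied_label, logits in rows:
        probs = softmax(logits)
        confidence = probs[categories.index(applied_label)]
        if confidence < threshold:
            selected.append((tx_id, confidence))
    return sorted(selected, key=lambda item: item[1])


categories = ["revenue", "staff", "suspect"]
rows = [
    ("tx1", "revenue", [4.0, 0.0, 0.0]),  # model agrees strongly with the label
    ("tx2", "revenue", [0.5, 2.0, 0.0]),  # model leans towards "staff" instead
    ("tx3", "staff", [0.0, 1.0, 0.5]),    # model agrees, but only weakly
]

review = select_for_review(rows, categories)
for tx_id, confidence in review:
    print(tx_id, round(confidence, 3))
```

Only the low-confidence transactions end up in the review set, which keeps the amount of manual relabelling small.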

After re-training our pre-trained multilingual DistilBERT model, we compared its balanced and overall accuracy with those of the rules and found improved performance for some, but not all, transactions. We found we could use the uncertainty model again to dynamically choose the better of the two approaches per transaction, which gave us a hybrid approach to categorisation.
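The difference between the two metrics matters when categories are unevenly sized. As a small illustration, the sketch below computes overall accuracy (the share of correct predictions) and balanced accuracy (the mean of the per-category recall) in plain Python. The labels are made up for the example and are not our evaluation data.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Share of all transactions that received the correct category."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-category recall, so rare categories count equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Made-up example: a classifier that always predicts the majority category.
y_true = ["revenue"] * 8 + ["staff"] * 2
y_pred = ["revenue"] * 10

overall = overall_accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
print(f"overall: {overall:.2f}, balanced: {balanced:.2f}")
```

A classifier that ignores a rare category can still score well on overall accuracy, which is why it helps to look at both numbers.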

Table 2: example of how, given high uncertainty, BERT can be used to detect where domain rules fall short.
Amount	Details	Rules	Conf.	BERT	Outcome
€-4.103,00	Alice Salary August 2020	“Salary” mentioned: category staff	95%		staff
€4.103,00	Bob Loon Aug 2020	No keyword: category revenue	80%	“Loon” (Dutch for “salary”) matches “Salary”: category staff	staff

A hybrid approach to categorisation

With the hybrid approach we use an uncertainty boundary to select, per transaction, either the result of the rules-based categorisation or that of the language model. Over 80% of transactions, the “standard” ones, are categorised based on rules, while the remainder, close to 20%, is overruled by the DistilBERT model. The uncertainty model also helps us find mistakes in the business rules: the rules perform about 18% worse on the transactions that fall below the uncertainty boundary of 0.99.
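The selection step itself is simple. The sketch below shows one way to express it, assuming the confidence is a number between 0 and 1 that says how much the uncertainty model trusts the label assigned by the rules. The `Candidate` type, its field names and the example values are hypothetical; only the boundary of 0.99 comes from the text above.

```python
from typing import NamedTuple, Tuple

UNCERTAINTY_BOUNDARY = 0.99  # the boundary mentioned in the text


class Candidate(NamedTuple):
    rules_label: str         # category assigned by the business rules
    rules_confidence: float  # assumed: confidence in that label, 0..1
    model_label: str         # category predicted by the language model


def hybrid_label(
    candidate: Candidate, boundary: float = UNCERTAINTY_BOUNDARY
) -> Tuple[str, str]:
    """Return (category, source), preferring the rules when confidence is high."""
    if candidate.rules_confidence >= boundary:
        return candidate.rules_label, "rules"
    # Below the boundary the language model overrules the rules.
    return candidate.model_label, "model"


confident = Candidate("staff", 0.995, "staff")
uncertain = Candidate("revenue", 0.80, "staff")

print(hybrid_label(confident))
print(hybrid_label(uncertain))
```

Keeping the boundary as a parameter makes it easy to measure how the split between rules and model shifts when the boundary moves.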

The strength of this approach lies in recognising what the business rules already do well and getting creative with the different ways in which you can unlock value from textual data to benefit machine learning models. With only a handful of iterations of relabelling data, we have improved classification performance significantly. With that, we are on the right track to handle ever greater volumes of incoming transactions and to make even better risk assessments for our customers.

Internships at Floryn

This blog post is a summary of the thesis project of one of our interns, Emiel de Heij, and was co-written by him. He completed his graduation project at our company and obtained his master’s degree in Data Science & Entrepreneurship from the Jheronimus Academy of Data Science (JADS). We’re constantly giving interns the opportunity to build exciting new stuff and try out new technologies at Floryn, so if you’re looking for a technical internship, make sure you reach out to us.

Arjan van der Gaag

Lead developer

Arjan has been with Floryn since 2021 and besides his role as lead engineer he is also the self-appointed head of dad jokes. He mostly works remotely from Helmond where he lives with his wife and two daughters.
