
Fighting Fraud at the Speed of LightGBM

What kind of difference can LightGBM make in fighting fraud? In a word: huge. In a number: $150 million saved in potential losses.

One of the world’s most recognizable card networks turned to Feedzai with an interesting challenge. After a lengthy period of trying, their internal teams had finally managed to improve the company’s fraud-fighting performance on their own machine learning model by 20%.

The processor wanted to know whether Feedzai could match their improved performance. They presented this as a challenge to Feedzai via a proof of concept (POC), and we excitedly accepted. Within six weeks, our machine learning model exceeded the processor’s objectives and delivered a 55% performance improvement.

You’re probably wondering: how did we do it?

Ask any data scientist what matters most for good results and they’ll mention three things: data quality, informative features, and choosing the right algorithm. At Feedzai we continually obsess over all three, as well as seamless operationalization, such as the ease of deployment into production. A few months earlier, we’d run another POC for the payment processor. In that test, we used the LightGBM algorithm and produced a 12% improvement over the firm’s existing fraud detection model in a card-not-present (CNP) use case. These POCs demonstrate how LightGBM has the potential to stop more fraud in less time and why it has become one of our preferred algorithms.

What is LightGBM?

LightGBM is a tree-based gradient boosting algorithm. Gradient boosting machines build decision trees sequentially, with each new tree built on the errors left by the trees before it. What sets LightGBM apart is that its trees grow vertically instead of horizontally: they grow leaf-wise instead of level-wise, which allows the algorithm to reduce the loss more efficiently. We first started researching the algorithm in 2016, after it won a number of competitions on a popular data science competition site.
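
The idea that each tree is built on the error left by earlier trees can be sketched in a few lines. This is a toy illustration under loud assumptions: plain gradient boosting with squared error, one-split "stumps" on a single feature, and made-up data. It is not LightGBM's implementation, and the function names (`fit_stump`, `boost`) are hypothetical.

```python
# Toy gradient boosting for squared error: each stump is fit to the
# residuals (errors) left by the ensemble built so far.
# Illustration only; this is not how LightGBM itself is implemented.

def fit_stump(xs, residuals):
    """Find the single threshold on x that best splits the residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Return (base prediction, stumps, training MSE after each round)."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(n_trees):
        # The "previous error": what the current ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
        mse_history.append(
            sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return base, stumps, mse_history


# Made-up data: y is a noisy step function of x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
base, stumps, mse_history = boost(xs, ys)
print(mse_history[0], mse_history[-1])
```

The training error shrinks from round to round because every new stump targets only what the earlier ones missed. LightGBM applies the same principle with full trees grown leaf-wise.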

The advantages of LightGBM

LightGBM has a few characteristics that distinguish it from Random Forest, another powerful algorithm we use. These include:

Efficient model training

We can train a LightGBM model in less than an hour, which is considerably less time than other algorithms require. During the POC with the processor, we were able to build a production-grade model starting from scratch in less than five weeks. That’s roughly half the time the processor had previously taken to build and deploy its own models.

Requires fewer computational resources

LightGBM consumes fewer resources than alternative algorithms. This lets data science teams experiment with more models while consuming less memory in the process, which makes those teams more efficient.

Manages large-scale data

Not only does LightGBM consume fewer resources than alternative algorithms, but it can also handle larger amounts of data more effectively. The algorithm can process greater volumes of data while also generating more accurate results.

Improved accuracy

LightGBM can also enhance the accuracy of detecting fraudulent activity, meaning clients are able to find more fraud with fewer errors. More accurate scoring capabilities give retailers and banks the ability to stop millions of dollars from being lost to fraud that might otherwise have transacted successfully.

Supports parallel and GPU learning

LightGBM supports graphics processing unit (GPU) learning, which can improve application performance and shorten model training times. The algorithm also supports parallel learning, which lets numerous computations run simultaneously and produces more well-rounded analyses for clients.
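
As a rough sketch of how these options surface in practice, the open-source LightGBM library exposes parallel and GPU training through ordinary training parameters. The parameter names below (`num_threads`, `tree_learner`, `device_type`) come from LightGBM's documented parameters. The values are illustrative assumptions, not the configuration used in the POCs described here.

```python
# Illustrative LightGBM training parameters; the values are assumptions,
# not the settings used in the POCs described in this article.
params = {
    "objective": "binary",      # fraud vs. legitimate
    "num_leaves": 31,           # cap on leaves per tree (leaf-wise growth)
    "learning_rate": 0.05,
    "num_threads": 8,           # CPU threads for parallel training on one machine
    "tree_learner": "serial",   # "feature", "data" or "voting" for distributed training
    "device_type": "cpu",       # "gpu" typically needs a GPU-enabled LightGBM build
}

# With the lightgbm package installed, a dict like this would be passed as
# lightgbm.train(params, train_set), where train_set is a lightgbm.Dataset.
```

Switching between CPU, GPU, and distributed training is then mostly a configuration change rather than a code change.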

Explainable AI (XAI)

A long-standing issue with machine learning technology is that it is often a black box that does not allow financial institutions to understand how the model arrived at its decision or analysis. When it comes to anti-money laundering (AML) and account opening, this black box approach causes serious issues. Banks can’t use machine learning technology that doesn’t explain to regulators why and how analysts decided to flag an account or file a SAR.

The LightGBM algorithm, on the other hand, is tree-based and relies on simpler trees (growing leaf-wise lets it reach similar performance with fewer leaves), which makes it well-suited to explaining its decisions.
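
To make the explainability point concrete, here is a minimal sketch of how a single decision tree yields a readable reason for its score: every prediction is the end of a short chain of threshold comparisons. The tree, feature names, and thresholds are invented for illustration. A real boosted model combines many such trees, so this shows only the single-tree building block.

```python
# Hypothetical hand-written tree; feature names, thresholds and leaf
# scores are invented for illustration.
tree = {
    "feature": "amount", "threshold": 500.0,
    "left": {"leaf": 0.02},
    "right": {
        "feature": "minutes_since_last_txn", "threshold": 5.0,
        "left": {"leaf": 0.85},
        "right": {"leaf": 0.10},
    },
}


def explain(node, transaction):
    """Walk the tree and record every comparison that led to the leaf."""
    path = []
    while "leaf" not in node:
        value = transaction[node["feature"]]
        went_left = value <= node["threshold"]
        op = "<=" if went_left else ">"
        path.append(f'{node["feature"]} = {value} {op} {node["threshold"]}')
        node = node["left"] if went_left else node["right"]
    return node["leaf"], path


score, reasons = explain(
    tree, {"amount": 1200.0, "minutes_since_last_txn": 2.0})
print(score)
for reason in reasons:
    print(reason)
```

The fewer leaves a tree has, the shorter these chains of conditions are, which is what makes them easier to present to an analyst or regulator.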

Fraud detection results achieved using LightGBM

Prior to taking on the POC, we had run LightGBM for other large clients with noteworthy results. An investment bank in the U.K., for example, saw a strong increase in the share of total money fraud reported. Once LightGBM was in place, the money recall rate improved by more than 11 percentage points.

Similar results were observed by a pair of EU-based financial services firms. First, a Spanish payment processing provider saw its money recall rate increase by 9 percentage points. A French interbank network, meanwhile, saw its money recall rate increase by nearly 7 percentage points.

In addition to financial services, a global athletic retail brand also saw a significant improvement by shifting to LightGBM. This multinational firm saw its detection rate (the share of fraud detected by the Feedzai system) improve by almost 14 percentage points.

These improvements indicate the difference that LightGBM can make in fraud detection. Comparing the same number of fraud alerts, the firms saw a rise in “useful” alerts, which enabled them to act swiftly on suspicious activity and stop a higher share of fraudulent transactions. At the same time, they saw a reduction in false positive alerts and successfully flagged fraudulent transactions of much greater value. This translates to more money saved from the same number of alerts.
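
The comparison "from the same number of alerts" can be sketched as follows. The detection rate follows the definition given above (the share of fraud detected). The money recall definition used here, the share of fraudulent value that falls inside the alerted transactions, is an assumption about what the metric means. The function name, model scores, and transactions are all made up.

```python
def recall_at_alert_budget(transactions, n_alerts):
    """Score metrics when only the n_alerts highest-scored items are alerted.

    transactions: list of (score, amount, is_fraud) tuples.
    """
    ranked = sorted(transactions, key=lambda t: t[0], reverse=True)
    alerted = ranked[:n_alerts]
    fraud_count = sum(1 for _, _, is_fraud in transactions if is_fraud)
    fraud_money = sum(amount for _, amount, is_fraud in transactions if is_fraud)
    caught_count = sum(1 for _, _, is_fraud in alerted if is_fraud)
    caught_money = sum(amount for _, amount, is_fraud in alerted if is_fraud)
    return {
        "detection_rate": caught_count / fraud_count,
        "money_recall": caught_money / fraud_money,
        "false_positives": len(alerted) - caught_count,
    }


# Made-up ground truth: (amount, is_fraud) for six transactions.
truth = [(900, True), (50, True), (30, True),
         (200, False), (40, False), (75, False)]

# Made-up scores from two hypothetical models for the same transactions.
scores_a = [0.40, 0.90, 0.80, 0.70, 0.10, 0.20]
scores_b = [0.95, 0.90, 0.30, 0.20, 0.10, 0.60]

model_a = [(s, amount, fraud) for s, (amount, fraud) in zip(scores_a, truth)]
model_b = [(s, amount, fraud) for s, (amount, fraud) in zip(scores_b, truth)]

result_a = recall_at_alert_budget(model_a, n_alerts=3)
result_b = recall_at_alert_budget(model_b, n_alerts=3)
print(result_a)
print(result_b)
```

In this toy data both models catch the same number of frauds within three alerts, but the second ranks the high-value fraud first and so recovers far more money. That is why the value-weighted metric is reported alongside the count-based one.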

LightGBM vs. Random Forest

Considering the results achieved with LightGBM, one might think that we’ve stopped using Random Forest. That’s actually not true.

Machine learning algorithms are not a one-size-fits-all proposition. We must consider our clients’ business goals and constraints. Feedzai currently uses both LightGBM and Random Forest depending on the client’s needs. Generally speaking, here are some of the criteria we use to determine which algorithm best fits the project at hand:

[Table comparing use cases for the LightGBM and Random Forest algorithms]

Feedzai’s core mission is to fight fraud and keep commerce safe. Knowing how LightGBM makes it easier to focus on fulfilling that mission, we decided to integrate the algorithm into our platform using the same tools and application programming interfaces (APIs) we provide to our clients. Clients can now point and click to implement LightGBM in any live system and fight fraud more aggressively.

Collaborating with the LightGBM Community

It’s not just our clients who’ve benefited from LightGBM. Feedzai also believes strongly in the future of LightGBM and recently had the opportunity to give back to the LightGBM community and contribute to its future development. We have already contributed a few commits to Microsoft’s LightGBM project that enable faster, low-latency predictions and fix a few bugs. The community around the Microsoft project has been very welcoming of our contributions, and we are eager to continue this collaboration going forward.

Key Takeaways

Upgrading to the LightGBM algorithm provides some businesses with the tool they need to find fraud more quickly and with more accurate results, preventing millions in losses.

Internal data science teams also stand to benefit from LightGBM’s enhanced flexibility. Models can be built and deployed into production at a considerably faster rate. Teams can make adjustments without having to lose time or consume resources as they test and deploy models. These data science teams can also rest assured that they will be using the most current versions in their work. Feedzai will continue to collaborate with the LightGBM community to make sure our clients have the latest upgrades and features in place.

LightGBM has already proven effective in the financial services, banking, and retail sectors, stopping more fraud than ever before. As one payment processor can attest, having one of the most effective algorithms in place is the answer to the $150 million challenge.



Alberto Ferreira

Telmo Marquês, Data Science Manager
