A New Timescale for Fraud-Fighting Data Science, With Feedzai AutoML

At a time when data scientists are more critical to banks than ever – and more scarce – Feedzai AutoML is augmenting fraud-fighting data science teams by combining two things for the first time: automatic machine learning, and a platform purpose-built to fight fraud.

Our goal is to maximize the capacity of data scientists at banks in the midst of a data science arms race, and to remove the limitations that slow them down. With the launch of the OpenML Engine in April, we freed them from the constraints of proprietary third-party frameworks. And today, we’re freeing them from the most repetitive and time-consuming steps of the data science process, with the launch of Feedzai AutoML.

Fraud attacks are evolving faster than organizations can adapt. This is a problem during a time when digital transformation has become a priority. We built Feedzai AutoML to enable banks to rapidly confront all the new risks that are sure to accompany new use cases, channels, and geographies, so these businesses can protect themselves while also functioning as true digital banks.

Feedzai AutoML makes this possible by introducing a new timescale for data science work, replacing weeks of work with one-click models. Our AutoML allows for generating one thousand new features in minutes, not weeks. And it speeds up fraud prevention workflows by as much as 50 times, compared to current time spent on model creation and feature engineering.

Getting here first

Feedzai has always been obsessed with speed. We know that accelerating the machine learning process is critical for data science teams fighting fraud at an enterprise scale, and Feedzai AutoML is only the latest of our many leaps in this direction.

Since Google launched its AutoML in January, organizations have been finding it difficult to operationalize. To perform its thousands of evaluations for neural networks, Google’s AutoML requires GPU capacity that is out of reach for most other organizations. Each of these evaluations alone can require more GPU resources than an organization other than Google can access without high costs.

Feedzai has changed that. We perfected AutoML for fraud by first using it on non-neural network model types like XGBoost and LightGBM, which have comparatively fewer training parameters and faster training times.

And our big idea, the one that sets Feedzai AutoML apart from AutoML in other industries, was to develop an advanced type of semantic-based automatic feature engineering, where the machine recognizes the semantics associated with each field (e.g. geographies, currencies, times, dates, etc.). This means we can automatically build context-aware features that are based on our years of domain expertise. For example, the engine will create features such as counts of declined transactions per card in the last week/day/hour, distance traveled in the last day, time between consecutive events of the same user, and many more.
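To make those example features concrete, here is a minimal sketch in plain Python. It is not Feedzai's engine or API: the record layout, field names, and helper functions are all hypothetical, chosen only to show what "declined count per card in a window", "time since the previous event", and "distance traveled" could look like once the semantics of each field (card, timestamp, location) are known.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

# Hypothetical transaction records; every field name here is illustrative.
transactions = [
    {"card": "A", "ts": datetime(2018, 6, 1, 9, 0), "declined": True, "lat": 38.72, "lon": -9.14},
    {"card": "A", "ts": datetime(2018, 6, 5, 12, 15), "declined": True, "lat": 38.72, "lon": -9.14},
    {"card": "A", "ts": datetime(2018, 6, 5, 12, 30), "declined": False, "lat": 41.15, "lon": -8.61},
    {"card": "B", "ts": datetime(2018, 6, 5, 12, 10), "declined": True, "lat": 40.42, "lon": -3.70},
]


def in_window(event, card, now, window):
    """True if the event belongs to `card` and falls in (now - window, now]."""
    return event["card"] == card and now - window < event["ts"] <= now


def declined_count(history, card, now, window):
    """Number of declined transactions for a card inside the time window."""
    return sum(1 for e in history if e["declined"] and in_window(e, card, now, window))


def seconds_since_previous(history, card, now):
    """Seconds between `now` and the card's most recent earlier event, or None."""
    earlier = [e["ts"] for e in history if e["card"] == card and e["ts"] < now]
    return (now - max(earlier)).total_seconds() if earlier else None


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def distance_traveled_km(history, card, now, window):
    """Sum of distances between consecutive events of a card inside the window."""
    points = sorted((e for e in history if in_window(e, card, now, window)), key=lambda e: e["ts"])
    return sum(
        haversine_km(p["lat"], p["lon"], q["lat"], q["lon"]) for p, q in zip(points, points[1:])
    )


now = datetime(2018, 6, 5, 13, 0)
features = {
    "declined_last_hour": declined_count(transactions, "A", now, timedelta(hours=1)),
    "declined_last_day": declined_count(transactions, "A", now, timedelta(days=1)),
    "declined_last_week": declined_count(transactions, "A", now, timedelta(weeks=1)),
    "seconds_since_previous": seconds_since_previous(transactions, "A", now),
    "km_traveled_last_day": distance_traveled_km(transactions, "A", now, timedelta(days=1)),
}
```

The same window helper is reused across the hour, day, and week variants, which hints at why a feature repository keyed on field semantics can expand into a large number of features quickly.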

This domain specificity is complemented by Feedzai AutoML's model selection capabilities: the most competitive algorithms are automatically trained, optimized, and compared, so the best models are recommended at the end of the process.

So, for the first time, data scientists at banks can use AutoML that was purpose-built to fight fraud. Our customers will start seeing the benefits immediately. Because our platform was purpose-built to fight financial crime, our AutoML helps data scientists quickly generate the most relevant features and models for adapting to fast-evolving fraud schemes and attack vectors.

How it works

Before: humans designing models. After: machines designing models. Feedzai AutoML automates multiple techniques to give data scientists a comprehensive AI development suite that puts the fraudster in its sights.

A data scientist simply gives the machine the schema definition; training, development, and test datasets; the function to optimize (e.g., fraud detection rate at 1% false positive rate); and a few other parameters.
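As an illustration of the kind of objective mentioned above, the following sketch computes the fraud detection rate (recall) at a capped false positive rate. It is a plain-Python assumption of how such a metric could be defined, not Feedzai's implementation; the function name and arguments are hypothetical.

```python
def detection_rate_at_fpr(labels, scores, max_fpr=0.01):
    """Best recall over score thresholds whose false positive rate is <= max_fpr.

    labels: 1 for fraud, 0 for legitimate. scores: higher means more suspicious.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one fraud and one legitimate example")

    # Walk thresholds from the highest score down, treating tied scores together.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = false_pos = 0
    best_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        if false_pos / negatives > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_recall = true_pos / positives
    return best_recall


# Toy example: 4 fraud cases and 100 legitimate ones (made-up scores).
labels = [1, 1, 1, 1] + [0] * 100
scores = [0.9, 0.8, 0.3, 0.1] + [0.5, 0.2] + [0.05] * 98
rate = detection_rate_at_fpr(labels, scores, max_fpr=0.01)
```

In this toy data one false positive is allowed out of 100 legitimate transactions, so the threshold can drop far enough to catch three of the four fraud cases.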

The engine uses AutoML to automate the following tasks:

  1. Automatic feature engineering (or creation) through a semantic-based technique that relies on a dynamic and ever-evolving repository of features. These features are based on our years of domain expertise. Without AutoML, the process of finding the right features frequently takes several weeks or months.
  2. Automatic model training for a wide range of machine learning algorithms (e.g. Random Forests, XGBoost, Gradient Boosting Machines, Deep Learning Networks, etc.).
  3. Hyperparameter optimization, which fine-tunes each algorithm's hyperparameters to find the best-performing set for a specific use case. To do this, Feedzai AutoML relies on hyperparameter selection functions (e.g., random search, LIPO, SMAC, and GP-UCB) during the model training stage.
  4. Automatic model selection through measurement of the performance of the generated models on a validation dataset. The models that optimize the defined KPIs will be recommended. Furthermore, teams can also assess the performance of the models against a hold-out dataset, to check that the model generalizes to new, unseen data.
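Steps 2 to 4 can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Feedzai's engine: the data is synthetic, the "model" is a one-feature logistic regression trained by gradient descent, random search stands in for the other selection functions, accuracy stands in for the KPI, and every function name is hypothetical.

```python
import math
import random


def make_data(rng, n):
    """Synthetic 1-D data where label 1 becomes more likely as x grows."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        p = 1 / (1 + math.exp(-2 * x))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


def train(rows, learning_rate, l2, epochs=20):
    """Fit weight and bias of a logistic regression with stochastic gradient descent."""
    w = b = 0.0
    for _ in range(epochs):
        for x, y in rows:
            p = 1 / (1 + math.exp(-(w * x + b)))
            w -= learning_rate * ((p - y) * x + l2 * w)
            b -= learning_rate * (p - y)
    return w, b


def accuracy(model, rows):
    """Share of rows where the predicted class matches the label."""
    w, b = model
    return sum((w * x + b > 0) == (y == 1) for x, y in rows) / len(rows)


def random_search(train_rows, valid_rows, n_trials, rng):
    """Sample hyperparameters at random, train, and score each model on validation data."""
    trials = []
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, -1),  # log-uniform sampling
            "l2": 10 ** rng.uniform(-4, -1),
        }
        model = train(train_rows, **params)
        trials.append((accuracy(model, valid_rows), params, model))
    best = max(trials, key=lambda trial: trial[0])  # model selection on validation score
    return best, trials


rng = random.Random(0)
train_rows = make_data(rng, 300)
valid_rows = make_data(rng, 150)
holdout_rows = make_data(rng, 150)

(best_score, best_params, best_model), trials = random_search(train_rows, valid_rows, 10, rng)

# The hold-out set is touched only once, after selection, to check generalization.
holdout_score = accuracy(best_model, holdout_rows)
```

Keeping the hold-out data out of the search loop is the design choice that matters here: the validation score is used to choose a model, so only the hold-out score gives an unbiased view of how the chosen model behaves on unseen data.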


Charting a path to transfer learning

Transfer learning is a machine learning research area focused on storing the knowledge gained from solving one problem and applying it to a different but related problem. A model that gains knowledge from recognizing cats can apply it to trying to recognize dogs, or mammals in general.

Feedzai is exploring transfer learning techniques in order to solve another big problem: what happens when you are trying to add a first line of defense to your business, but you have little to no data? Feedzai is tackling this problem by using transfer learning to apply the knowledge obtained from training machine learning algorithms using rich labeled data, such as the model parameters, to onboard new customers or new use cases. By applying transfer learning techniques, Feedzai continues removing all existing barriers when it comes to empowering our customers to protect themselves.
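One simple form of this idea is a warm start: parameters learned on a data-rich problem become the starting point for a model on a data-poor one. The sketch below shows that pattern with a toy logistic regression in plain Python. It is only an assumption about what parameter transfer could look like, not a description of Feedzai's technique, and the data, names, and settings are all made up.

```python
import math
import random


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def sample(rng, n, slope, shift):
    """Synthetic 1-D data; `shift` moves the decision boundary between problems."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        rows.append((x, 1 if rng.random() < sigmoid(slope * (x - shift)) else 0))
    return rows


def fit(rows, init=(0.0, 0.0), learning_rate=0.05, epochs=20):
    """Logistic regression by stochastic gradient descent, starting from `init`."""
    w, b = init
    for _ in range(epochs):
        for x, y in rows:
            error = sigmoid(w * x + b) - y
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


rng = random.Random(1)
source_rows = sample(rng, 2000, slope=2.0, shift=0.0)  # rich labeled data
target_rows = sample(rng, 20, slope=2.0, shift=0.5)    # new use case, few labels

source_model = fit(source_rows)

# Warm start: fine-tune briefly on the small target set from the source parameters.
warm_model = fit(target_rows, init=source_model, epochs=5)

# Cold start, for comparison: same budget, but starting from zero.
cold_model = fit(target_rows, epochs=5)
```

Whether the warm-started model actually outperforms the cold-started one depends on how closely the two problems are related, which is why this remains an area of exploration rather than a guarantee.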

This is the next stage for fraud-fighting data scientists who use Feedzai. We’re opening our system, speeding things up, and mapping the very DNA of financial crime, even as it rapidly evolves.
