Understanding the Unexplainable

Michelle Advaney

Machine learning isn’t just another tool at a data scientist’s disposal for solving any type of problem. Data scientists have to follow an entire process that includes six stages; at Feedzai, we refer to this as the Data Science Loop. To learn more about our data science technology, check out our Ebook on How to Choose a Machine Learning Platform.

Previously, data scientists could not guarantee that the results they saw in a test or sandbox environment would actually match what they saw in real-life deployment, because the performance of models and rules was measured independently. Add to that the unpredictability of results in production compared to the test environment, and the outcome is an unnecessarily frustrating experience for data scientists: time wasted constantly making manual updates.

Although there have been tremendous advances not only in the application of machine learning but also in the accessibility and understanding of its use, data scientists still have an extremely difficult job ensuring that the system’s performance in production is consistent with the results seen when the model was tested. A good machine learning solution lets you seamlessly test and deploy models in a single platform, so there is less variability in the results produced. Testing rules alone and machine learning alone does not give the same result as testing the two combined, yet data scientists don’t have all the tools they need for this end-to-end deployment.

Trusting the Machine

Data scientists need to be able to trust that the output is what they expect: the same result they saw in the sandboxing environment. Models and rules are tested individually, which does not reflect the real impact when both are deployed in combination. Without clear visibility into the impact of a specific rule or model, the optimal combination of rules and models will never be found.
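
To make this concrete, here is a minimal sketch in Python (not Feedzai’s API; the rule, score threshold, field names, and sample transactions are all hypothetical) of why a rule and a model that each look fine in isolation can behave quite differently once they are chained into a single decision pipeline.

```python
# Minimal sketch: a rule and a model evaluated in isolation versus chained
# together in one decision pipeline. All names and values are illustrative.
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float
    model_score: float  # fraud probability from some trained model
    is_fraud: bool      # ground-truth label available in the sandbox

def rule_flags(tx: Transaction) -> bool:
    """Example business rule: flag high-value transactions."""
    return tx.amount > 1_000

def model_flags(tx: Transaction, threshold: float = 0.8) -> bool:
    """Example model decision: flag when the score exceeds a threshold."""
    return tx.model_score > threshold

def combined_flags(tx: Transaction) -> bool:
    """Combined pipeline: decline if either the rule or the model fires."""
    return rule_flags(tx) or model_flags(tx)

def detection_and_false_positive_rate(decide, txs):
    """Compute fraud detection rate and false positive rate for a decision function."""
    flagged = [tx for tx in txs if decide(tx)]
    fraud_count = sum(tx.is_fraud for tx in txs)
    legit_count = len(txs) - fraud_count
    detected = sum(tx.is_fraud for tx in flagged)
    false_pos = sum(not tx.is_fraud for tx in flagged)
    return (detected / fraud_count if fraud_count else 0.0,
            false_pos / legit_count if legit_count else 0.0)

# Evaluating the rule alone, the model alone, and the combination on the same
# sandbox data makes the shift in the detection / false-positive trade-off visible.
sandbox = [
    Transaction(1_500, 0.30, is_fraud=False),
    Transaction(200, 0.95, is_fraud=True),
    Transaction(2_500, 0.90, is_fraud=True),
    Transaction(50, 0.10, is_fraud=False),
]
for name, decide in [("rule only", rule_flags),
                     ("model only", model_flags),
                     ("combined", combined_flags)]:
    dr, fpr = detection_and_false_positive_rate(decide, sandbox)
    print(f"{name}: detection rate={dr:.2f}, false positive rate={fpr:.2f}")
```

On this toy data the rule alone and the model alone each produce different detection and false positive rates than the combined pipeline, which is exactly why measuring them separately does not predict production behavior.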

As anyone can imagine, this lack of predictability makes for a time-consuming experience: data scientists have to go back constantly and make manual updates, iterating both on changes to rules and models and on the order in which they are placed. They repeat this process over and over until the results in production are close enough to what was seen in the testing environment.

The Most Powerful Way to Measure Risk

At Feedzai, we make AI explainable and controllable. With our newest feature, Scenario Optimizer, we’ve created a risk sandboxing environment that not only measures the effect of making changes in a combined environment of rules and models, but also gives data scientists a view of how to prioritize machine learning models alongside business rules. This means fewer iterations and a significant amount of time saved. When trying to prevent fraud, there is a balance to strike between customer experience and risk management: certain rules or algorithms may prevent more fraud, but those same changes may add friction to a customer’s payment experience. Data scientists can look at the resulting ROC curve and prioritize for business KPIs such as fraud detection rate or false positive rate. Scenario Optimizer puts control of the system’s performance in the hands of data scientists rather than the algorithm.
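
As a generic illustration of that trade-off (this is a sketch using scikit-learn’s standard ROC utilities, not Scenario Optimizer itself), one can compute an ROC curve from sandbox scores and pick the operating threshold that maximizes the fraud detection rate subject to a cap on the false positive rate. The labels, scores, and the max_fpr cap below are all assumed for the example.

```python
# Sketch: choose an operating threshold from an ROC curve under a
# false-positive-rate cap. Data and the cap value are illustrative.
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])                      # 1 = confirmed fraud
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.05, 0.7])   # model scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Business constraint: keep customer friction low by capping the false
# positive rate, then take the threshold with the best fraud detection rate.
max_fpr = 0.25
ok = fpr <= max_fpr
best = np.argmax(tpr[ok])
print(f"threshold={thresholds[ok][best]:.2f}, "
      f"detection rate={tpr[ok][best]:.2f}, "
      f"false positive rate={fpr[ok][best]:.2f}")
```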

Feedzai enables data scientists to be confident about machine learning by helping them not only trust the process but also optimize it. Data scientists can reliably replicate in production what they tested in the sandbox, so machine learning is now more predictable and more accurate than ever before.

Learn more about Feedzai’s new feature: Scenario Optimizer.