[Image: a large brain with gears beside it, connected to a hand holding a chipped payment card, illustrating how machine learning latency works]

Latency is a critical factor in the performance of machine learning systems at financial institutions. However, it’s often misunderstood or misrepresented. This post aims to clarify what latency is, how it affects your fraud operations, and what you need to look for when evaluating latency claims from service providers.

What is Latency?

Latency refers to the delay between a request and its corresponding response within a system. In financial transactions, a system’s latency is the time that system takes to process a transaction, measured from the moment it starts processing the transaction until it finishes. Understanding latency is crucial because it directly impacts the speed and efficiency of your operations.

Types of Latency Metrics

Two primary metrics measure latency: average latency and percentile latency.

Average Latency: This is the mean time taken to process a request, computed over all requests in a given period. It is often low because the many fast requests outweigh the few slow ones, which makes it misleading on its own. For example, an average latency of 20 milliseconds (msec) might seem impressive but doesn’t tell the whole story.

Percentile Latency: This measures the latency at specific percentiles, such as 95% or 99%. It represents the time within which a certain percentage of transactions are processed. For example, a 99% latency of 500 msec means that 99% of transactions are processed within 500 msec or less, while 1% of transactions take longer than 500 msec to be processed. In high-volume systems, even that remaining 1% can represent a significant number of transactions, which is why higher percentiles are often reported as well.
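
To make the difference between the two metrics concrete, here is a minimal Python sketch. It assumes the nearest-rank definition of a percentile (other definitions interpolate between samples), and the sample values and the helper name percentile_latency are made up for illustration.

```python
import math
import statistics


def percentile_latency(latencies_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests and 2 slow ones (illustrative only).
samples_ms = [20] * 98 + [500] * 2

average_ms = statistics.mean(samples_ms)     # 29.6
p95_ms = percentile_latency(samples_ms, 95)  # 20
p99_ms = percentile_latency(samples_ms, 99)  # 500
```

In this sample the average and the 95% latency both look healthy, and only the 99% latency exposes the two slow requests.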

Why Percentile Latency Matters

Percentile latency gives a more accurate picture of system performance, especially for those rare but critical moments when transactions take longer than usual. A few slow transactions can significantly affect the overall system performance, as illustrated below:

Imagine a system processing 1,000 transactions per second in parallel, handling 100 million transactions in total. Suppose these transactions are normally processed within 20 msec, but during a brief 50-second period, 50,000 transactions (0.05% of the total) take 500 msec due to a temporary slowdown. The average latency still seems low (around 20.2 msec).

However, the 99.99% latency would now reflect 500 msec, because the slow transactions make up more than the 0.01% that this percentile leaves out. Measured over the whole run, the 99% and 99.9% latencies would still sit at about 20 msec and hide the slowdown entirely. This demonstrates that while the average latency might look good, a small percentage of transactions can experience significantly worse performance. In systems handling millions of transactions, even a fraction of 1% represents a large number of affected transactions. For this reason, it is relevant for such systems to report latency percentiles such as 99.9% or higher, e.g., 99.99%.
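
The scenario above can be checked with a short Python sketch. It assumes that every transaction outside the slowdown takes exactly 20 msec and that percentiles are computed over the full run with the nearest-rank method; the function names are hypothetical. Under those assumptions, the 50,000 slow transactions are 0.05% of the total, so they only become visible at percentiles above 99.95%.

```python
import math

# Assumption: every transaction outside the slowdown takes exactly 20 msec.
# Maps latency in msec to the number of transactions with that latency.
latency_counts = {20: 99_950_000, 500: 50_000}


def weighted_average(counts):
    """Mean latency over (latency -> count) pairs."""
    total = sum(counts.values())
    return sum(ms * n for ms, n in counts.items()) / total


def weighted_percentile(counts, pct):
    """Nearest-rank percentile over (latency -> count) pairs."""
    total = sum(counts.values())
    rank = math.ceil(pct * total / 100)  # 1-based rank of the target sample
    seen = 0
    for ms in sorted(counts):
        seen += counts[ms]
        if seen >= rank:
            return ms
    raise ValueError("pct must be in (0, 100]")


print(weighted_average(latency_counts))            # 20.24
print(weighted_percentile(latency_counts, 99))     # 20
print(weighted_percentile(latency_counts, 99.9))   # 20
print(weighted_percentile(latency_counts, 99.99))  # 500
```

Working from counts rather than 100 million individual samples keeps the sketch cheap to run. If the percentiles were computed over the 50-second slowdown window alone, every one of them would read 500 msec.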

How to Evaluate Latency Claims

When service providers claim low latency, it’s essential to dig deeper:

  • Is it Average or Percentile Latency? Average latency figures are easier to achieve and, for that reason, can be misleading. Always ask for the 99.9% latency or similar high percentile metrics to understand the true performance.
  • Does it Include Network Latency? Some latency measurements focus solely on server latency, missing external factors like network hops, the points where data passes through devices like routers or switches as it travels across the network. This is especially important in distributed or cloud systems where data might pass through several hops, each introducing a potential delay. More importantly, systems cannot reliably measure their own latency. When a system experiences a slowdown or stall, its internal timer may only start after the request has already been waiting, so internal measurements may fail to capture the delay accurately. External latency measurements, taken from the caller’s side, are essential to get a complete view, as they account for these network hops and the overall performance of all components.
  • Local vs. Cloud Server: Verify whether the server processing the transactions is local or in the cloud, because the server’s location affects how far each request has to travel and therefore the latency.
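
The gap between internal and external measurements can be illustrated with a small Python simulation. Everything here is a stand-in: handle_transaction and send_transaction are hypothetical functions, and the sleep calls imitate processing time and pre-handler delay (network hops, queueing, or a stall) rather than real infrastructure.

```python
import time


def handle_transaction(payload):
    """Hypothetical server-side handler that times only its own work."""
    start = time.perf_counter()
    time.sleep(0.005)  # stand-in for the actual scoring work
    internal_ms = (time.perf_counter() - start) * 1000
    return {"approved": True, "internal_ms": internal_ms}


def send_transaction(payload, delay_s=0.05):
    """Hypothetical transport. The delay stands in for network hops and
    for time spent waiting before the handler's timer starts."""
    time.sleep(delay_s)
    return handle_transaction(payload)


def measure_external(payload):
    """Time the whole round trip from the caller's side."""
    start = time.perf_counter()
    response = send_transaction(payload)
    external_ms = (time.perf_counter() - start) * 1000
    return external_ms, response["internal_ms"]


external_ms, internal_ms = measure_external({"amount": 42})
print(f"internal: {internal_ms:.1f} msec, external: {external_ms:.1f} msec")
```

In this simulation the handler reports only its own few milliseconds, while the caller sees the full delay. A latency figure taken from the handler alone would miss most of what the caller experiences.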

Key Takeaways

Understanding and correctly interpreting latency metrics is vital. Always prioritize percentile latency over average latency and ensure that all components of the transaction process, including network latency, are considered in the metrics. By doing so, you can better evaluate your systems’ performance and make informed decisions when selecting service providers.

  • Average Latency: Easy to achieve but can be misleading.
  • Percentile Latency: Provides a more accurate representation of the system performance.
  • Network Latency: Must be included for a complete picture.
  • Server Location: Impacts latency and should be considered.

Understanding these nuances will help you ensure that your systems are robust and capable of handling transactions efficiently, even during peak times or unexpected slowdowns.