OFAC's Flawed Web Search Tool

We often hear of auditors and regulators evaluating compliance tools by comparing their results to OFAC's own web-based search tool. Because that tool lets the user adjust a minimum score threshold, they set a compliance standard that includes a "threshold" or "score". But requiring a sensitivity adjustment can be detrimental to an effective compliance program. To see why, let us look at how OFAC's tool works.

How it Works

For its first pass, OFAC's tool uses the Soundex algorithm. This is a very basic phonetic analysis that classifies words or names using simple rules of English pronunciation. But most foreign names are not pronounced according to these rules, and many reach English only through transliteration, often with several competing spellings. As a result, Soundex matches many names that it should not, and misses many others that should be considered.
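To make that first pass concrete, here is a minimal Python sketch of the classic American Soundex rules. It illustrates the algorithm family only and is not OFAC's actual code; the function name `soundex` and the sample names are our own choices for illustration.

```python
# Classic American Soundex: keep the first letter, encode the remaining
# consonants as digits, collapse adjacent repeats, pad/truncate to 4 chars.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two equal codes
        code = CODES.get(ch, "")  # vowels (and unknown letters) map to ""
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))    # R163 R163
print(soundex("Gaddafi"), soundex("Qaddafi"), soundex("Kadafi"))  # G310 Q310 K310
```

Under these rules, two unrelated English names (Robert and Rupert) collapse to the same code, while three transliterations of a single name (Gaddafi, Qaddafi, Kadafi) get three different codes, because Soundex keeps the first letter verbatim.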

After this, OFAC's tool applies a scoring filter based on what is known as a string distance algorithm. Consider two names, "Robert" and "Roger". String distance algorithms measure how much one word must change to become the other, for example by counting the characters that must be edited or, in Jaro-Winkler's case, by counting the characters the two words share in roughly the same positions and rewarding a common prefix. Each algorithm is slightly different, but they all operate on this principle. The problem in this example is that "Robert" and "Roger" are two distinct, unrelated names that happen to share some spelling. Yet Jaro-Winkler, the algorithm OFAC's tool uses, scores them at 86 out of 100. For most organizations that adopt a scoring-based methodology in OFAC screening, this result would be well above their threshold and would generate an additional false positive.
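The score of 86 for "Robert" and "Roger" can be reproduced with a textbook Jaro-Winkler implementation. The sketch below assumes the standard formulation (prefix scale 0.1, prefix capped at four characters); OFAC's tool may normalize or tokenize names differently, and the function names `jaro` and `jaro_winkler` are ours.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


print(round(jaro_winkler("Robert", "Roger") * 100))  # 86
```

Four of the letters line up (R, o, e, r) and the names share the prefix "Ro", which is all the algorithm needs to produce a high score; nothing in the calculation knows that these are different names.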

The Fundamental Problems

Adjusting the threshold moves the string distance scoring limit up or down, which allows more or fewer false positives through but does not improve the quality of the matches. Furthermore, the OFAC web search tool scores the name as a whole rather than by its components (first name, surname), which returns even more false positives. It also searches across "weak aliases", which OFAC itself discourages using in automated searches due to the "large number of false hits that these names may generate." See SDN Alias Screening Expectations.
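A small sketch shows why moving the threshold cannot fix match quality. The two scores below were worked out by hand with the standard Jaro-Winkler formula (prefix scale 0.1), not taken from OFAC's tool; the pair "Mohammed" / "Muhammad" and the `related` flags are our own illustrative assumptions.

```python
# (query, list entry, Jaro-Winkler score, truly the same name?)
# Scores are hand-computed with the textbook formula, for illustration only.
CANDIDATES = [
    ("Robert", "Roger", 0.858, False),      # unrelated names
    ("Mohammed", "Muhammad", 0.850, True),  # transliteration variants
]


def hits(threshold: float) -> list:
    """Return the pairs a score-threshold filter would flag."""
    return [(q, e) for q, e, score, _related in CANDIDATES if score >= threshold]


for threshold in (0.80, 0.855, 0.90):
    print(threshold, hits(threshold))
```

At 0.80 both pairs are flagged, at 0.90 neither is, and at 0.855 only the unrelated pair survives. Because the unrelated names score higher than the genuine variants, no threshold setting keeps the true match while dropping the false one.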

The bottom line is that, while OFAC's search tool can be helpful for quick, one-off lookups, its methodology is fundamentally flawed for compliance and auditing operations: it lacks fidelity and produces vast numbers of false positives. It should not be relied on for those activities.

EasyOFAC's Solution

By contrast, EasyOFAC uses a layered fuzzy search, consisting of algorithms designed explicitly for name analysis. Without getting too far into the details, the EasyOFAC system returns anything that matches a majority of our layered algorithms (from 1/2 to 2/3 of them). While this may seem like a low threshold, it is essential to remember that EasyOFAC scoring does not directly correlate to OFAC's traditional scoring, and that we use different algorithms than OFAC's web search tool. At these thresholds, and with our tuned algorithms, EasyOFAC will typically return far fewer false positives while remaining exceptionally effective at matching the true positives.
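The idea of returning a result only when a share of independent algorithms agree can be sketched as a simple vote. Everything in this sketch is hypothetical: the three toy matchers, the names `layered_match` and `MATCHERS`, and the sample inputs are placeholders of our own and are not EasyOFAC's actual algorithms, which are not described here.

```python
import re
from fractions import Fraction


def tokens(name: str) -> list:
    # Toy normalization: lowercase ASCII words only (hypothetical).
    return re.findall(r"[a-z]+", name.casefold())


# Three placeholder matchers standing in for real name-analysis layers.
def exact_match(a: str, b: str) -> bool:
    return tokens(a) == tokens(b)


def same_token_set(a: str, b: str) -> bool:
    return set(tokens(a)) == set(tokens(b))


def same_initials(a: str, b: str) -> bool:
    return sorted(t[0] for t in tokens(a)) == sorted(t[0] for t in tokens(b))


MATCHERS = [exact_match, same_token_set, same_initials]


def layered_match(a: str, b: str, min_fraction: Fraction = Fraction(1, 2)) -> bool:
    """Return True when at least min_fraction of the matchers agree."""
    votes = sum(1 for matcher in MATCHERS if matcher(a, b))
    return Fraction(votes, len(MATCHERS)) >= min_fraction


print(layered_match("John Smith", "Smith, John"))  # True: 2 of 3 layers agree
print(layered_match("John Smith", "Jane Stone"))   # False: only 1 of 3 agrees
```

The design point is that a pair has to satisfy several independent tests at once, so one layer's quirk (such as shared initials) is not enough on its own to produce a hit.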

For more information on how OFAC's search works, refer to OFAC FAQ: Search.