Lesson 17 of 25

Name Screening and Matching Logic: Fuzzy Matching and Thresholds

4 min read · CGSS

Open up the screening engine. Understand fuzzy and transliteration matching, the threshold trade-off between false positives and the dangerous false negative, and the above- and below-the-line testing that proves your tuning is safe.

Names are messy

  • Same person, many spellings and transliterations
  • Aliases, abbreviations, missing data
  • Exact-match alone misses real hits
  • Matching logic is the heart of screening

Name screening sounds simple (compare a name to a list) until you realize how messy names are. The same person can appear under multiple spellings, especially when the name is transliterated from Arabic, Cyrillic, or Chinese script into the Latin alphabet. People use aliases, abbreviations, reversed name orders, and partial data, and the lists themselves carry several variations per target.

An exact, character-for-character match would therefore miss many real hits, which is exactly why a sanctioned party may simply tweak a spelling. The engine that decides what counts as a match is the real heart of screening, and the exam tests whether you understand its trade-offs. Let's open it up.

Fuzzy matching and transliteration

  • Fuzzy/probabilistic matching tolerates variations
  • Handles misspellings, transliteration, name order
  • Algorithms score similarity, not just exact equality
  • Catches deliberate spelling tweaks

The solution is fuzzy matching, also called probabilistic or approximate matching. Instead of demanding exact equality, a fuzzy engine scores how similar two names are and flags any pair whose score meets a chosen threshold. It tolerates misspellings, alternate transliterations, swapped name order, missing middle names, and common abbreviations.

This is what catches the evader who tries to slip past an exact-match filter by altering a single letter. Good engines also normalize data before they compare: they standardize case, remove noise, and account for cultural naming conventions. The point for the exam: effective name screening is similarity-based, not exact-match, precisely because real-world names and deliberate evasion produce endless variation.
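
To make the idea concrete, here is a minimal sketch of normalize-then-score matching. It is an illustration under assumptions, not how any particular screening product works: the function names `normalize` and `similarity` are hypothetical, and Python's standard-library `difflib.SequenceMatcher` stands in for the more specialized algorithms a real engine would likely use.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep letters, digits, and spaces; treat everything else as a separator.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_marks.lower())
    # Sorting tokens makes "Ivanov Sergei" and "Sergei Ivanov" compare equal.
    return " ".join(sorted(cleaned.split()))


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 (illustrative scorer only)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    print(similarity("Sergei Ivanov", "IVANOV, Sergei"))            # reordered name
    print(similarity("Mohammed Al-Rashid", "Muhammad Al Rashid"))   # transliteration variant
    print(similarity("Sergei Ivanov", "Maria Gonzalez"))            # unrelated name
```

With this sketch, a reordered or differently punctuated name scores a perfect match after normalization, a transliteration variant scores high but below 1.0, and an unrelated name scores low. An exact-match comparison would have treated all three pairs the same way: as non-matches.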

Thresholds and the two errors

  • Threshold = how close a match must be to alert
  • Loose threshold → many false positives
  • Tight threshold → risk of false negatives
  • False negative is the dangerous error

Every fuzzy engine has a threshold: the similarity score at which it raises an alert. This setting forces a trade-off the exam loves. Set the threshold loose (low), and the engine alerts on weak similarities, generating lots of false positives (alerts that turn out not to be real matches) and burying your team in work.

Set it tight (high), and you cut the noise but risk false negatives: real sanctioned matches that the engine lets through silently. Of the two, the false negative is the dangerous one, because it means a sanctioned party slipped past undetected. So thresholds aren't set to minimize work; they're set to manage risk, with enough sensitivity that true hits surface even at the cost of more false positives.
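
The trade-off can be sketched by counting both error types at several thresholds. The scores and labels below are invented purely for illustration, and `count_errors` is a hypothetical helper, not part of any screening tool.

```python
from typing import List, Tuple

# Hypothetical reviewed sample: (similarity score, is this truly the listed party?)
SCORED_PAIRS: List[Tuple[float, bool]] = [
    (0.97, True), (0.91, True), (0.84, True), (0.78, True),
    (0.88, False), (0.81, False), (0.76, False), (0.72, False), (0.69, False),
]


def count_errors(pairs: List[Tuple[float, bool]], threshold: float) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) at a given alert threshold."""
    # False positive: an alert fires (score >= threshold) on a non-match.
    false_positives = sum(1 for score, is_true in pairs if score >= threshold and not is_true)
    # False negative: a true match scores below the threshold and never alerts.
    false_negatives = sum(1 for score, is_true in pairs if score < threshold and is_true)
    return false_positives, false_negatives


if __name__ == "__main__":
    for threshold in (0.70, 0.80, 0.90):
        print(threshold, count_errors(SCORED_PAIRS, threshold))
    # 0.7 (4, 0)
    # 0.8 (2, 1)
    # 0.9 (0, 2)
```

On this invented sample, each step up in the threshold removes false positives and adds false negatives. The quietest setting is also the one that lets two true matches through.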

Tuning the engine

  • Tune thresholds and rules to balance the errors
  • Above-the-line / below-the-line testing
  • Check whether a lower threshold would have caught real hits
  • Document the rationale and approvals

Because of that trade-off, screening engines must be tuned and the tuning tested. Tuning means adjusting thresholds, match rules, and the fields and lists used, to balance catching true hits against drowning in false positives. The key testing technique is above-the-line and below-the-line testing.

Above-the-line testing samples the alerts the engine did generate to confirm they're being handled correctly. Below-the-line testing is the crucial one: you sample the near-misses that fell just below your threshold to make sure you aren't quietly missing real matches. If you find true hits below the line, your threshold is too tight. All of this must be documented, with a clear rationale and appropriate approvals, because regulators expect to see why your settings are calibrated as they are.
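
Below-the-line sampling can be sketched as follows. Everything here is an assumption made for illustration: the function name `below_the_line_sample`, the width of the band under the threshold, and the sample size are hypothetical, and a real program would choose them as part of its documented tuning methodology.

```python
import random
from typing import List, Tuple


def below_the_line_sample(
    scored: List[Tuple[str, float]],
    threshold: float,
    band: float = 0.10,
    sample_size: int = 25,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Sample candidates that scored just below the alert threshold.

    scored holds (record_id, similarity_score) pairs. Candidates whose score
    falls in [threshold - band, threshold) never alerted; a reviewer then
    checks whether any of the sampled ones is a true match.
    """
    near_misses = [(rid, s) for rid, s in scored if threshold - band <= s < threshold]
    rng = random.Random(seed)  # fixed seed so the same sample can be reproduced for audit
    k = min(sample_size, len(near_misses))
    return rng.sample(near_misses, k)


if __name__ == "__main__":
    scored = [("a", 0.95), ("b", 0.84), ("c", 0.80), ("d", 0.60), ("e", 0.78)]
    for record_id, score in below_the_line_sample(scored, threshold=0.85):
        print(record_id, score)  # records b, c, and e, in sampled order
```

If a reviewer finds a true match among the sampled near-misses, that is evidence the threshold is too tight.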

Beyond the name

  • Use secondary identifiers (DOB, ID numbers) to confirm/clear
  • Watch 'weak aliases' and low-quality list data
  • Screen owners too — the 50% Rule again

A few refinements separate a strong program from a checkbox one. Use secondary identifiers, such as dates of birth, identification numbers, and places, to confirm or clear a potential match rather than deciding on the name alone; a shared common name with a different date of birth is likely a false positive. Be careful with weak aliases and low-quality list entries, which generate noise and need judgment.
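
A minimal sketch of using secondary identifiers on a name alert follows. The `Party` record, its fields, and the disposition wording are hypothetical, and the output is only a suggestion for a human analyst, not an automated clearing rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Party:
    name: str
    date_of_birth: Optional[str] = None  # ISO date string, e.g. "1970-01-31"
    id_number: Optional[str] = None      # passport or national ID, if known


def disposition(customer: Party, listed: Party) -> str:
    """Suggest a disposition for a name alert using secondary identifiers."""
    # A matching identification number is the strongest confirmation here.
    if customer.id_number and listed.id_number and customer.id_number == listed.id_number:
        return "escalate: identifier matches"
    # Compare dates of birth only when both sides actually have one.
    if customer.date_of_birth and listed.date_of_birth:
        if customer.date_of_birth == listed.date_of_birth:
            return "escalate: date of birth matches"
        return "likely false positive: date of birth differs"
    # With nothing to compare, the name alone cannot clear the alert.
    return "needs review: no secondary identifier to compare"


if __name__ == "__main__":
    listed = Party("John Smith", date_of_birth="1965-03-12")
    print(disposition(Party("John Smith", date_of_birth="1988-07-01"), listed))
    print(disposition(Party("John Smith"), listed))
```

The design choice worth noticing is the last branch: missing data does not clear an alert. It sends the alert to review.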

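And remember the 50 Percent Rule: an entity owned 50 percent or more, directly or indirectly, individually or in the aggregate, by one or more blocked persons is itself treated as blocked, even if it appears on no list. Name screening the customer isn't enough if you don't also screen the beneficial owners, since the blocked party may be the unnamed owner behind a clean name.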
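
The aggregation behind the 50 Percent Rule can be sketched as a small ownership calculation. This is a simplified reading for illustration only: the entity names and stakes are invented, `blocked_by_ownership` is a hypothetical helper, and real ownership analysis depends on data quality and on the applicable guidance.

```python
from typing import Dict, Set


def blocked_by_ownership(owners: Dict[str, Dict[str, float]], listed: Set[str]) -> Set[str]:
    """Return unlisted entities that are blocked through ownership.

    owners maps each entity to {owner_name: percent_stake}. An entity is
    treated as blocked when blocked owners hold 50 percent or more of it in
    aggregate.
    """
    blocked = set(listed)
    changed = True
    while changed:  # repeat so blocking propagates down ownership chains
        changed = False
        for entity, stakes in owners.items():
            if entity in blocked:
                continue
            blocked_stake = sum(pct for owner, pct in stakes.items() if owner in blocked)
            if blocked_stake >= 50:
                blocked.add(entity)
                changed = True
    return blocked - set(listed)


if __name__ == "__main__":
    owners = {
        "HoldCo": {"Listed-1": 30, "Listed-2": 25, "Clean Investor": 45},
        "OpCo": {"HoldCo": 51, "Clean Investor": 49},
        "TradeCo": {"Listed-1": 30, "Clean Investor": 70},
    }
    print(sorted(blocked_by_ownership(owners, {"Listed-1", "Listed-2"})))
    # ['HoldCo', 'OpCo']
```

In this invented example neither listed party holds a majority of HoldCo alone, but together they hold 55 percent, so HoldCo is treated as blocked, and OpCo follows because HoldCo owns 51 percent of it. TradeCo, with a single 30 percent blocked stake, is not blocked on ownership grounds.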

The exam's favorite threshold question

  • Cutting false positives by tightening = risk of false negatives
  • Reducing alert volume is not the goal; managing risk is
  • Below-the-line testing protects against over-tightening
  • Sets up payment screening next

Let's pre-empt a question pattern the exam reuses constantly. A scenario describes a team overwhelmed by false-positive alerts, and the tempting answer is to raise the threshold so the noise drops. Be careful: tightening the threshold to cut workload directly increases the risk of false negatives, meaning real matches that now fall silently below the line.

The goal of tuning is never simply fewer alerts; it's the right balance of risk, with enough sensitivity that true hits still surface. So the better answers usually involve smarter matching, better data, secondary-identifier checks, and validated tuning, not just turning the sensitivity down to make the queue shorter. And whenever you do tighten, below-the-line testing is what proves you didn't start dropping true matches.

If you internalize that a quieter queue can hide a more dangerous engine, you'll avoid the trap built into these questions. Next, we apply screening to money in motion: payment and transaction screening, where real-time decisions and wire stripping come back into focus.

Sources

  • OFAC guidance on sanctions list screening and name matching
  • Wolfsberg Group guidance on sanctions screening (matching, fuzzy logic, and tuning)
  • OFAC SDN List data (aliases/AKAs, weak aliases, transliteration)
  • OFAC 50 Percent Rule (August 13, 2014)

Test your knowledge

A few CGSS questions on this material:

  1. A customer is identified as a Politically Exposed Person (PEP) but does not appear on any OFAC or other sanctions list. What is the sanctions compliance obligation?

  2. Company A is 40% owned by SDN-1 and 35% owned by SDN-2. Company B is 60% owned by Company A. Is Company B blocked under the 50 Percent Rule?

  3. When screening vessels in a trade-finance transaction, which identifier is most reliable for confirming vessel identity beyond the vessel name?

  4. During an investigation of a potential SDN match in a wire transfer, the analyst cannot resolve the alert before the payment's same-day processing deadline. What is the appropriate action?