Lesson 27 of 39

Advanced Analytics: AI/ML, Link Analysis & Entity Resolution *(OUTLINE + BULLET BODY)*

4 min read · CAMS

Explain how AI/ML augments AML detection — including supervised vs. unsupervised learning and the role of human oversight. Describe network/link analysis and entity resolution and what each one reveals. Explain why advanced analytics depends on data quality and explainability.

Cold open / hook *(0:00–0:30)* — [scripted]

A criminal network spreads its money across forty accounts, in twelve names, at one bank — and every single transaction stays under every threshold. A rules engine sees forty unremarkable customers. But draw the connections between them, and a different picture appears: one organization, hiding in plain sight. That's what advanced analytics does — it sees the *network* a rules engine can't. By the end of this lecture, you'll understand how AI, link analysis, and entity resolution extend detection beyond fixed rules — and why none of it works without clean data and an explanation a regulator will accept.

Body — [bullet teaching outline; expand to ~150 wpm prose when recording]

AI/ML in AML — augment, don't replace

- **Machine learning** finds patterns in large datasets that fixed rules miss, and can **prioritize** alerts by likelihood of being genuinely suspicious, reducing the false-positive burden rather than just adding more rules.
- **Supervised learning** trains on **labeled historical outcomes** (e.g., past alerts marked "filed a SAR" vs. "closed") to predict whether new activity resembles confirmed suspicious cases. **Unsupervised learning** finds **clusters and anomalies** with **no labels**, which is useful for surfacing *novel* typologies no one has seen yet.
- Common applications: **alert scoring / triage** (rank alerts so analysts work the riskiest first), **false-positive reduction**, **segmentation**, and **anomaly detection**.
- **Human-in-the-loop is non-negotiable.** Regulators and the Wolfsberg Group stress that AI **supports** human decision-making: the final SAR decision rests with people, with appropriate oversight, not the algorithm. Wolfsberg's principles for responsible AI use emphasize fairness, accountability, transparency, and oversight.
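A minimal sketch of supervised alert triage, assuming a deliberately simple learner: it estimates a smoothed SAR rate per alert type from labeled history, then ranks open alerts by that rate. The alert types, history, and function names are hypothetical, and a production model would use far richer features. The score only orders the queue; an analyst still makes the decision on every alert.

```python
from collections import defaultdict

# Hypothetical labeled history: (alert_type, sar_filed).
HISTORY = [
    ("structuring", True), ("structuring", True), ("structuring", False),
    ("rapid_movement", True), ("rapid_movement", False), ("rapid_movement", False),
    ("round_amounts", False), ("round_amounts", False), ("round_amounts", False),
]

def train_sar_rates(history, prior=1.0):
    """Learn a smoothed SAR rate per alert type from labeled outcomes."""
    filed = defaultdict(float)
    total = defaultdict(float)
    for alert_type, sar_filed in history:
        total[alert_type] += 1
        if sar_filed:
            filed[alert_type] += 1
    # Laplace smoothing keeps rarely seen alert types away from a hard 0 or 1.
    return {t: (filed[t] + prior) / (total[t] + 2 * prior) for t in total}

def triage(alerts, rates, default=0.5):
    """Rank open alerts so the highest-scoring ones are reviewed first."""
    return sorted(alerts, key=lambda a: rates.get(a["type"], default), reverse=True)

rates = train_sar_rates(HISTORY)
open_alerts = [
    {"id": "A1", "type": "round_amounts"},
    {"id": "A2", "type": "structuring"},
    {"id": "A3", "type": "rapid_movement"},
]
queue = triage(open_alerts, rates)
print([a["id"] for a in queue])  # ['A2', 'A3', 'A1']
```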

Link / network analysis

- **Link analysis** maps **relationships** between entities (shared addresses, phone numbers, devices, beneficial owners, counterparties, IPs) to reveal **hidden connections** a transaction-by-transaction view can't see.
- It exposes structures like **funnel accounts, mule networks, and clusters of related shell companies**, along with the **central nodes** that connect them, which is where investigators focus.
- Network analysis turns *isolated* alerts into a *case*: the forty under-threshold accounts in the cold open look unrelated until link analysis shows they share one phone number and one beneficial owner.
- Output is typically a **visual graph** (nodes = entities, edges = relationships), which also makes the rationale **easier to document and explain** to investigators and regulators.
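A minimal sketch of the idea, assuming accounts are linked whenever they share an identifier such as a phone number or beneficial owner. The account data and identifier labels are hypothetical. Each connected component is a candidate network, and identifiers used by more than one account are the hubs an investigator would look at first.

```python
from collections import defaultdict, deque

# Hypothetical records: account -> identifiers it was opened or used with.
ACCOUNTS = {
    "acct_01": {"phone:555-0100", "owner:O-1"},
    "acct_02": {"phone:555-0100", "owner:O-2"},
    "acct_03": {"owner:O-2", "device:D-9"},
    "acct_04": {"phone:555-0199", "owner:O-7"},
}

def build_graph(accounts):
    """Bipartite graph: an edge joins each account to each identifier it uses."""
    graph = defaultdict(set)
    for account, identifiers in accounts.items():
        for identifier in identifiers:
            graph[account].add(identifier)
            graph[identifier].add(account)
    return graph

def connected_components(graph):
    """Breadth-first search; each component is one candidate network."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        components.append(component)
    return components

graph = build_graph(ACCOUNTS)
networks = [
    sorted(n for n in component if n in ACCOUNTS)
    for component in connected_components(graph)
]
# Identifiers shared by more than one account are the connecting hubs.
hubs = sorted(n for n in graph if n not in ACCOUNTS and len(graph[n]) > 1)
print(networks)  # [['acct_01', 'acct_02', 'acct_03'], ['acct_04']]
print(hubs)      # ['owner:O-2', 'phone:555-0100']
```

Note that acct_01 and acct_03 share nothing directly; they end up in the same network only through acct_02, which is the kind of indirect link a transaction-by-transaction view misses.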

Entity resolution

- **Entity resolution** is the process of determining when **different records refer to the same real-world entity**: "Robert Smith," "Bob Smith," and "R. Smith Jr." resolved to one person despite name/address variants and typos.
- Why it matters: without it, the **same customer (or sanctioned party) appears as several separate parties**, defeating aggregation, screening, and risk scoring. With it, the institution sees a **single, complete view** of a customer across products and systems.
- Entity resolution is the **foundation under** both screening (don't miss a match because of a name variant) and link analysis (you can't connect a network if you can't tell who's who).
- It is **probabilistic**: like fuzzy matching, it must balance over-merging (two real people collapsed into one) against under-merging (one person seen as many).
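A minimal sketch of probabilistic matching, assuming a score that blends name similarity (Python's `difflib.SequenceMatcher`) with a date-of-birth match. The records, the nickname table, the 0.6/0.4 weights, and the thresholds are all hypothetical. Running the same records at three thresholds shows the over-merging versus under-merging trade-off directly.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

NICKNAMES = {"bob": "robert"}  # hypothetical, deliberately tiny lookup table
SUFFIXES = {"jr", "sr"}

RECORDS = [
    {"id": "C1", "name": "Robert Smith", "dob": "1980-02-14"},
    {"id": "C2", "name": "Bob Smith", "dob": "1980-02-14"},
    {"id": "C3", "name": "R. Smith Jr.", "dob": "1980-02-14"},
    {"id": "C4", "name": "Roberta Smyth", "dob": "1975-09-30"},
]

def normalize(name):
    """Lowercase, strip punctuation and suffixes, expand known nicknames."""
    tokens = re.sub(r"[^a-z ]", "", name.lower()).split()
    return " ".join(NICKNAMES.get(t, t) for t in tokens if t not in SUFFIXES)

def match_score(a, b):
    """Blend two pieces of evidence into one score between 0 and 1."""
    name_sim = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    dob_match = 1.0 if a["dob"] == b["dob"] else 0.0
    return 0.6 * name_sim + 0.4 * dob_match

def resolve(records, threshold):
    """Merge every pair scoring at or above the threshold (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if match_score(a, b) >= threshold:
            parent[find(b["id"])] = find(a["id"])
    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values())

print(resolve(RECORDS, 0.80))  # [['C1', 'C2', 'C3'], ['C4']]
print(resolve(RECORDS, 0.90))  # under-merge: [['C1', 'C2'], ['C3'], ['C4']]
print(resolve(RECORDS, 0.45))  # over-merge:  [['C1', 'C2', 'C3', 'C4']]
```

On name similarity alone, "Roberta Smyth" scores closer to "Robert Smith" than "R. Smith Jr." does, which is why the sketch needs a second piece of evidence and why the threshold is a judgment call rather than a fixed constant.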

Anomaly detection

- **Anomaly detection** flags activity that **deviates from an established baseline** (the customer's own history or their peer group), surfacing patterns no pre-written rule anticipated.
- It is especially valuable against **emerging/novel typologies**, where there's no historical rule to encode yet.
- Trade-off: anomalies are not automatically suspicious. They require **human investigation**, and the technique can be **harder to explain** than a simple rule.
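A minimal sketch of baseline deviation, assuming the baseline is the customer's own recent monthly totals and the measure is a plain z-score. The figures and the threshold of 3 standard deviations are hypothetical. A flag here means "worth a look," not "suspicious."

```python
from statistics import mean, stdev

def anomaly_score(history, amount):
    """Standard deviations between a new amount and the customer's own baseline."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is maximally unusual.
        return 0.0 if amount == mu else float("inf")
    return (amount - mu) / sigma

def is_anomalous(history, amount, threshold=3.0):
    """Flag amounts far from the baseline in either direction."""
    return abs(anomaly_score(history, amount)) >= threshold

baseline = [1200, 1150, 1300, 1250, 1100, 1200]  # hypothetical monthly totals

print(is_anomalous(baseline, 1260))  # False: within normal variation
print(is_anomalous(baseline, 9500))  # True: route to an analyst for review
```

The same function works against a peer-group baseline by passing the peer group's values as `history` instead of the customer's own.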

Data quality & explainability — the two strings attached

- **Data quality is the dependency.** Every advanced technique is **"garbage in, garbage out"**: incomplete, inconsistent, or inaccurate data corrupts ML models, breaks entity resolution, and produces phantom or missed links. FATF guidance on new technologies stresses **data governance and quality** as prerequisites, not afterthoughts.
- **Explainability / the "black box" problem.** Complex ML models can produce decisions that are **hard to interpret.** Regulators expect institutions to be able to **explain why** a model flagged (or didn't flag) activity, for SAR defensibility, examiner scrutiny, fairness, and to avoid embedding **bias**. Wolfsberg's responsible-AI principles call out transparency and explainability directly.
- **Model governance still applies.** ML systems are **models** under SR 11-7. They require **validation, ongoing monitoring, and documentation**, just like rules-based engines (covered last lecture). New technology does **not** relax model-risk obligations.
- The honest bottom line: advanced analytics is a **force multiplier**, not a silver bullet. It sharpens detection but inherits the data and explainability problems of whatever it's built on.
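A minimal sketch of what "explain why" can look like in practice, assuming a simple linear alert score where each feature's contribution is its weight times its value. The feature names and weights are hypothetical. Complex models need dedicated attribution techniques, which this sketch does not attempt to cover.

```python
# Hypothetical weights for a linear alert score.
WEIGHTS = {"cash_ratio": 2.0, "new_counterparties": 0.5, "high_risk_geo": 1.5}

def explain(features):
    """Return the total score and the per-feature contributions, largest first."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    score = sum(contributions.values())
    reasons = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return score, reasons

score, reasons = explain({"cash_ratio": 0.9, "new_counterparties": 6, "high_risk_geo": 0})
for name, contribution in reasons:
    print(f"{name}: {contribution:+.2f}")
```

Reason codes like these give the analyst something concrete to record in the case file alongside the score itself.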

Recap & next — [scripted]

So advanced analytics extends detection in three directions. AI and machine learning — supervised on labeled outcomes, unsupervised for the unknown — prioritize alerts and cut false positives, but a human always makes the final call. Link analysis turns scattered alerts into a network by mapping shared relationships, exposing the mule rings and shell clusters a single-transaction view misses. And entity resolution stitches "Bob," "Robert," and "R. Smith" into one customer, which is the foundation under both screening and link analysis. The catch is always the same two strings: data quality, because garbage in is garbage out, and explainability, because a regulator won't accept "the black box said so." Next, we go to the data frontier — blockchain analytics for crypto, the RegTech landscape, and information-sharing utilities like 314(b) — plus the real limits of each.

Sources

  • Wolfsberg Group Principles for Using Artificial Intelligence & Machine Learning in Financial Crime Compliance
  • FATF "Opportunities and Challenges of New Technologies for AML/CFT" (2021)
  • SR 11-7 / OCC 2011-12 (model risk management)
  • FFIEC BSA/AML Examination Manual (suspicious activity monitoring)
