May 2025: AI Agents as adversaries for building fraud models
The same technology that can be misused for fraud can also be used to harden & protect financial systems
A repeated pattern burned into my memory from Cash App was:
We’d ship a new payment feature
Immediately, fraudsters would show up and “pen test” everything: stolen cards, stolen creds, stolen identities, payment limits, social engineering, software vulnerabilities, etc., in ever more creative ways
Fraudsters would quickly coalesce around any weak points or vulnerabilities and rapidly extract real dollars programmatically from the system
The risk and fraud data teams would go into overdrive patching up the vulnerabilities
Fraudsters would move on (I assume to the next weakest point)
In practice, there’s always some fraud in financial services. But payments products are weakest just after launch, particularly on more complex platforms (i.e., the more distinct instruments and types of money movement a platform supports, the more unexpected combinations are available for a fraudster to exploit).
Fraudsters swarm to test new holes and drain funds while the team is still getting its feet under it. I can’t say this for sure, but it certainly feels like fraudsters collaborate and communicate with each other when a vulnerability is discovered, and all pile in to exploit it.
As instant payments gain adoption around the world, and at a time when LLMs make more types of automation, social engineering, and impersonation easier, I only expect this dynamic to accelerate. There is already a trickle of examples of voice agents or deepfakes being used to defraud companies to dramatic effect, so I’d be surprised if this isn’t happening at scale. As browser agents and frameworks like MCP get better, expect more software surfaces to become vulnerable. I’m not as experienced in other financial domains (e.g., lending, insurance), but I suspect a similar dynamic holds there as well.
Medium term, fraud detection and prevention is an active domain that requires financial services teams to continuously engage: detect new patterns, model them, and deploy those models to protect the company/financial services provider (a minimal sketch of one iteration of that loop is below). It’s inevitable that some money will be lost: fraudsters are human, humans are creative in unbounded ways, and successfully stealing money is extraordinarily lucrative.
In addition, losses are not always due to the company or product being bad or weak. For example, most stolen debit cards are breached from merchants, not issuers, but issuers are ultimately on the hook due to Reg E (and this is as it should be).
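To make the detect / model / deploy loop above concrete, here’s a minimal sketch of what one iteration could look like, assuming a labeled transaction log. The feature names, model choice, and block threshold are all illustrative, not a prescription.

```python
# Minimal sketch of one iteration of the detect -> model -> deploy loop.
# Assumes a labeled transaction log; features and threshold are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

FEATURES = ["amount", "account_age_days", "device_seen_before", "velocity_1h"]

def retrain(transactions: pd.DataFrame) -> GradientBoostingClassifier:
    """Fit a fresh fraud model on the latest labeled transactions."""
    X, y = transactions[FEATURES], transactions["is_fraud"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
    model = GradientBoostingClassifier().fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"precision={precision_score(y_test, preds):.2f} "
          f"recall={recall_score(y_test, preds):.2f}")
    return model

def score(model: GradientBoostingClassifier, txn: dict, block_threshold: float = 0.9) -> str:
    """Score a single incoming transaction and decide an action."""
    p_fraud = model.predict_proba(pd.DataFrame([txn])[FEATURES])[0, 1]
    return "block" if p_fraud >= block_threshold else "allow"
```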
The opportunity
There’s an opportunity to use AI agents as adversaries in a red team/blue team model. Here’s how it could work (a rough code sketch follows the list):
Teams of human and AI agents are let loose inside the system
This could be in staging/post-QA, prior to GA (this is probably best practice)
In prod, agents should be able to exfiltrate actual dollars into an external, off-platform bank account (controlled by the company)
This can be continuous, covering both pre-launch products that are already moving money and products that are already live to customers
The process would require teams to continuously monitor patterns of exfiltration and either:
Adapt models to detect and prevent them
Close product gaps that make exfiltration possible
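Here’s a rough sketch of what the adversarial side of that loop could look like. Everything here is an assumption for illustration: the staging API endpoints, the tactic list, and the company-controlled sink account are made up. The point is just that every attempt gets logged, so the blue team has a pattern stream to mine.

```python
# Hypothetical red-team loop: an agent repeatedly tries to move money off the
# platform into a company-controlled external account, and logs every attempt
# so the blue team can mine the log for new exfiltration patterns.
# The API endpoints, tactic list, and sink account are illustrative assumptions.
import itertools
import json
import time
import requests

STAGING_API = "https://staging.payments.example.com"  # assumed test environment
SINK_ACCOUNT = "external-acct-controlled-by-company"  # never a real third party

TACTICS = [
    {"name": "limit_probe", "amount_cents": 500_000, "split_into": 1},
    {"name": "structuring", "amount_cents": 500_000, "split_into": 50},
    {"name": "instrument_hop", "amount_cents": 10_000, "split_into": 1},
]

def attempt(tactic: dict, session: requests.Session) -> dict:
    """Try one exfiltration tactic and return a structured record of what happened."""
    per_txn = tactic["amount_cents"] // tactic["split_into"]
    statuses = []
    for _ in range(tactic["split_into"]):
        resp = session.post(f"{STAGING_API}/v1/transfers", json={
            "amount_cents": per_txn,
            "destination": SINK_ACCOUNT,
            "memo": f"redteam:{tactic['name']}",
        })
        statuses.append(resp.status_code)
    return {"tactic": tactic["name"], "statuses": statuses,
            "succeeded": sum(1 for s in statuses if s == 201), "ts": time.time()}

def run(max_rounds: int = 100) -> None:
    session = requests.Session()
    with open("exfiltration_log.jsonl", "a") as log:
        for tactic in itertools.islice(itertools.cycle(TACTICS), max_rounds):
            log.write(json.dumps(attempt(tactic, session)) + "\n")  # blue team consumes this log

if __name__ == "__main__":
    run()
```

Logging failed attempts alongside successes matters here; the blue team learns as much from what was blocked as from what got through.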
If you do this, you’re ultimately constrained by GPU capacity, but it could drive step-function improvements in risk losses, which at the scale of a large fintech could have a meaningful impact on earnings
As off-the-shelf agents (like Operator), frameworks like MCP, and auth get better, this should become easier and easier to implement
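As a toy example, here’s what exposing a couple of staging payment operations as MCP tools could look like, so an off-the-shelf agent can drive them programmatically. This uses FastMCP from the MCP Python SDK; the staging endpoints and tool semantics are assumptions, not a real service.

```python
# Hedged sketch: expose a couple of payment operations as MCP tools so an
# off-the-shelf agent (the red team) can drive them programmatically.
# The staging API endpoints are illustrative assumptions.
import requests
from mcp.server.fastmcp import FastMCP

STAGING_API = "https://staging.payments.example.com"  # assumed test environment

mcp = FastMCP("staging-payments")

@mcp.tool()
def create_transfer(amount_cents: int, destination: str, memo: str = "") -> dict:
    """Move money to a destination account in the staging environment."""
    resp = requests.post(f"{STAGING_API}/v1/transfers", json={
        "amount_cents": amount_cents, "destination": destination, "memo": memo})
    return {"status": resp.status_code, "body": resp.json() if resp.ok else None}

@mcp.tool()
def get_limits(account_id: str) -> dict:
    """Look up current payment limits so the agent can probe around them."""
    resp = requests.get(f"{STAGING_API}/v1/accounts/{account_id}/limits")
    return {"status": resp.status_code, "body": resp.json() if resp.ok else None}

if __name__ == "__main__":
    mcp.run()
```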
Open questions
Using stolen credentials and stolen instruments is a large vector for fraud, and is also literally illegal. (A long time ago I thought it would be a good idea to actually buy stolen card data when it hits the dark web and preemptively add those cards to a block list. My compliance team was not a fan.) Given that stolen payment instruments and stolen/synthetic identities are two of the largest and most active vectors for financial fraud, it feels like a missed opportunity to exclude them from this kind of adversarial testing.
I’ve written before about authentication. I think in an internal enterprise context this is reasonably manageable, but having programmatic ways to handle and orchestrate credentials will help scale something like this.
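One way this could look in practice: keep each red-team agent’s test credentials in a secrets manager and mint short-lived sessions on demand instead of hard-coding anything. A minimal sketch, assuming HashiCorp Vault via the hvac client; the secret paths, session endpoint, and credential schema are illustrative.

```python
# Minimal sketch, assuming test credentials live in HashiCorp Vault (via hvac)
# and the staging API issues short-lived session tokens. Secret paths, the
# /v1/sessions endpoint, and the credential schema are illustrative assumptions.
import os
import hvac
import requests

STAGING_API = "https://staging.payments.example.com"  # assumed test environment

def fetch_test_credentials(agent_id: str) -> dict:
    """Pull a red-team agent's test credentials from Vault instead of hard-coding them."""
    client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])
    secret = client.secrets.kv.v2.read_secret_version(path=f"redteam/{agent_id}")
    return secret["data"]["data"]  # e.g. {"username": ..., "password": ...}

def open_session(agent_id: str) -> requests.Session:
    """Exchange stored credentials for a short-lived session the agent can use."""
    creds = fetch_test_credentials(agent_id)
    resp = requests.post(f"{STAGING_API}/v1/sessions", json=creds)
    resp.raise_for_status()
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {resp.json()['token']}"
    return session
```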
In a world where an increasingly large share of financial activity moves onchain, adding this kind of system to test smart contracts will be super helpful. If you believe the noise about AI agents relying heavily on stablecoins to make and receive payments, this is a natural evolution.
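For illustration, the same probing loop pointed at a smart contract on a local testnet might look something like this; the contract address, ABI fragment, and the assumption that a bare withdraw function is the weak point are all made up.

```python
# Hedged sketch: the adversarial loop pointed at a smart contract on a local
# testnet. The contract address, ABI fragment, and the idea that a plain
# "withdraw" function is the weak point are illustrative assumptions.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))  # assumed local dev node

CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"  # placeholder
ABI = [{"name": "withdraw", "type": "function", "stateMutability": "nonpayable",
        "inputs": [{"name": "amount", "type": "uint256"}], "outputs": []}]

contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=ABI)
attacker = w3.eth.accounts[0]  # unlocked account on the dev node

# Probe edge-case amounts; .call() simulates the transaction, so a "success"
# here means the contract would have let the withdrawal through.
for amount in [0, 1, 10**18, 2**256 - 1]:
    try:
        contract.functions.withdraw(amount).call({"from": attacker})
        print(f"withdraw({amount}) would succeed: investigate")
    except Exception as exc:
        print(f"withdraw({amount}) reverted: {exc}")
```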