Fraud and security departments converge on battle against ‘synthetic identities’

Secretary of the Treasury Janet Yellen, who also served as chair of the Federal Reserve under the Obama administration, speaks during a daily news briefing May 7, in Washington, DC. The Fed recently produced a formal definition for synthetic identity fraud. (Photo by Alex Wong/Getty Images)

The Federal Reserve recently produced a formal definition for synthetic identity fraud, a process that involved a committee of a dozen outside experts convening over nine months.

So what is it? By the Fed’s definition, synthetic identity fraud is the use of artificially constructed identities, fake or mismatching personally identifiable information to scam organizations. While the Fed is primarily concerned with the financial sector, the approach can be used against a variety of industries, from insurance to health care.

Greg Woolf, founder of FiVerity, which developed a machine-learning approach to rooting out synthetic identities, says that understanding synthetic identities is not just an issue for an institution’s fraud department, but for chief information security officers as well.

SC talked to Woolf, who is also a member of the definition committee, about the fraud and what security leaders should keep in mind.

To most CISOs, synthetic identities might seem like an issue for the fraud department to handle on its own. Why is it that CISOs should be getting involved?

There’s an overarching convergence of cyber and fraud. Digital transformation has created new opportunities for fraudsters. The fraud department and cyber [team] need to work together on these kinds of sophisticated hacks.

They are not just bad underwriting. Leaving them in the loan origination department and the fraud department and chalking it up to bad loans is incorrect and has far-reaching consequences not just on the financial [industry], but even from a national security perspective.

I was on a panel the other day and somebody asked me, “So are we getting ahead of the fraudsters?” And our response was, “No. The best we can do is track them because they’re continuously evolving to what we’re doing.” It’s almost like our AI is fighting their AI. They’re using their automation and they’re changing over time.

So, what exactly is synthetic identity fraud and why is it so important that the Fed came up with a formal definition?

In short, the way synthetics work is hackers mine the dark web, they combine multiple elements of compromised identities — so, your name, my date of birth, “Robert’s” social — and they create this fake persona. And these fabricated accounts are essentially used to defraud banks of billions of dollars. And while these accounts obviously do tremendous damage from a financial perspective, they can also be used for other nefarious activity because it creates a front for an individual to undertake activities like human trafficking and money laundering.

So, the reason why the Fed came up with this definition is because it’s a sophisticated cybercrime: cyber fraud. And according to the Fed, more than 85% of synthetics slipped through the cracks based on traditional rules-based systems. Part of the problem was there was no standard definition as to what it was. Another part was that it doesn’t have a victim besides the bank. If somebody steals your credit card, you’ll know about it as soon as you get your next statement, but not with a fake person. The goal of the Fed is to create a standardized, efficient definition to get everybody on the same page so that the industry could start to get to that W. Edwards Deming, “If you can measure it, you can manage it,” point.

Take us through the process of developing that definition.

The Fed convened 12 industry experts, some folks from technology like myself, some from the credit reporting agencies, some from big financial institutions, some from the large credit card companies. The goal was to come up with a standardized definition. What we really discovered during the process was, without a standard definition, a lot of banks were quantifying this type of fraud as just bad underwriting, as credit losses. And while that’s financially harmful to them, it also underplays the significance and importance of the fact that it’s actually a criminal activity.

So the first goal was to define what it is. The second objective was to define where it gets used.

Where does it get used?

It gets used for a number of different purposes. It gets used for credit repair, where people have bad credit and they use a synthetic identity to improve on that. There are folks who come in as illegal immigrants; they’re just trying to get by and live with a legitimate identity. And then of course the big one is funding for criminal activity, which is the majority of the motivation.

You said 85% of synthetic identities do not get caught. Why is that so high?

The challenge of course is that fraudsters use a ton of automation to generate these accounts, and they look very real. They create these profiles that look very realistic, and they start off borrowing a little bit of credit, and then over time, they pay back the debt, building up their credit in the system. So ostensibly they look like the perfect customer. Fraudsters know to what extent they can ramp up those balances with the financial institution before they “bust out,” and they know at what point it gets too suspicious.

They’re playing the long game. They can take six to 12 months to build up these profiles, and they look like great customers. The reason why the existing technology doesn’t catch them is because most fraud detection solutions are rules-based, and they just don’t have the sophistication and the adaptability to be able to pick up on the patterns that the fraudsters are using to generate and create these fake identities.
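To make the "long game" concrete, here is a minimal sketch of how a fixed set of rules can pass a patiently built profile and only trip once the bust-out has already happened. Everything in it is an assumption for illustration: the `MonthlySnapshot` fields, the two thresholds, and the invented account history are hypothetical and do not describe any particular bank's or vendor's system.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    balance: float       # outstanding balance at month end
    credit_limit: float  # limit granted by the lender that month
    paid_on_time: bool   # whether the payment due was made


# Hypothetical fixed thresholds of the kind a rules-based check might use.
MAX_UTILIZATION = 0.90
MAX_MISSED_PAYMENTS = 1


def passes_static_rules(history):
    """Return True when no month breaks the fixed thresholds."""
    missed = sum(1 for month in history if not month.paid_on_time)
    over_limit = any(
        month.balance / month.credit_limit > MAX_UTILIZATION for month in history
    )
    return missed <= MAX_MISSED_PAYMENTS and not over_limit


# Invented 12-month build-up: small balances, always repaid, limit raised over time.
build_up = [
    MonthlySnapshot(balance=100.0 * (i + 1), credit_limit=500.0 * (i + 1), paid_on_time=True)
    for i in range(12)
]

# Invented bust-out month: the full limit is drawn and never repaid.
bust_out = build_up + [MonthlySnapshot(balance=6000.0, credit_limit=6000.0, paid_on_time=False)]

print(passes_static_rules(build_up))   # the account looks like a good customer
print(passes_static_rules(bust_out))   # the rule fires only after the loss
```

In this toy history the account stays at 20% utilization for a year, so the static check has nothing to object to until the month in which the money is already gone.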

We look at 25 to 30 different data points. We started off with a 20% efficacy rate, which means that for one out of five of the loans [we were notified] “hey, this loan looks suspicious.” Our banking customers said this was something that would have slipped through the cracks. In Q1 of 2021, we found it was a 50% efficacy rate. What that means to me, first, is the algorithm is improving; but second, the problem is accelerating.
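The arithmetic behind those two figures can be sketched as follows. This assumes one possible reading of "efficacy rate", namely the share of reviewed loans that were flagged as suspicious; the function name and the loan counts are hypothetical, chosen only to reproduce the 20% and 50% quoted above.

```python
def efficacy_rate(flagged: int, reviewed: int) -> float:
    """Share of reviewed loans that were flagged as suspicious."""
    if reviewed <= 0:
        raise ValueError("reviewed must be positive")
    return flagged / reviewed


# Illustrative counts only: one in five at the start, one in two in Q1 2021.
early = efficacy_rate(flagged=1, reviewed=5)
later = efficacy_rate(flagged=5, reviewed=10)

print(f"early: {early:.0%}, later: {later:.0%}")
```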