Using New Technologies to Mitigate Fraud in Online Research

By Mathijs de Jong, CEO, P2Sample

Fraudsters continue to cash in by duping the market research industry. And they’re not just sitting at a desk punching through surveys anymore. New technologies are giving them new opportunities to engage in further disingenuous behavior. Ironically, these technologies are some of the same ones we are using to create efficiencies in the industry, even for fraud mitigation.

Automation and artificial intelligence (AI) are among the most talked-about technologies poised to change things like data integration and analysis. At the same time, they are depersonalizing the market research process, making it easy for fraudsters to commit fraud with no human interaction whatsoever. And they can do it faster and more efficiently than ever before.

As fraud advances, common techniques to eliminate it will no longer work on their own. Coupling traditional methods with more advanced solutions has the potential for higher success rates. By increasing barriers, we can start to detect and defeat the new wave of fraudsters.

What Are Today’s Fraudsters Doing Anyway?
Often the “bot” or automated fraudulent behavior happens at the registration stage, one of the most difficult places for detection. The accounts usually look quite “clean” from a device/geo point of view because that part is scripted. Third-party fingerprinting and fraud detection tools do not detect this type of fraud, nor do most “home-made” solutions.

We see a range of “bot” behavior that goes beyond the same individual simply creating multiple accounts and filling out the same information behind a dynamic IP address. Sneakier fraud manifests as scripted creation of automated email accounts whose addresses look random but are generated by smart algorithms. Both can be difficult to detect.
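One way to surface scripted account creation of the kind described above is to look for structural patterns shared across a batch of registration email addresses. The sketch below is a hypothetical heuristic, not a description of any vendor's actual detection logic: it reduces each address to a "skeleton" (runs of letters and digits collapsed) and flags skeletons that dominate a batch.

```python
import re
from collections import Counter

def email_skeleton(email: str) -> str:
    """Reduce an address to a structural template: letter runs -> 'a',
    digit runs -> '0'. Scripted signups often reuse one template with
    random fillers, so many addresses collapse to the same skeleton."""
    local = email.split("@")[0].lower()
    skeleton = re.sub(r"[a-z]+", "a", local)
    skeleton = re.sub(r"[0-9]+", "0", skeleton)
    return skeleton

def flag_templated_batches(emails, threshold=0.5):
    """Flag skeletons that account for more than `threshold` of a
    registration batch (and occur more than once)."""
    counts = Counter(email_skeleton(e) for e in emails)
    total = len(emails)
    return {s: n for s, n in counts.items() if n / total >= threshold and n > 1}

batch = ["mark1992@mail.com", "anna2004@mail.com",
         "joe881@mail.com", "real.person@mail.com"]
print(flag_templated_batches(batch))  # {'a0': 3}
```

A real system would combine a signal like this with device, geo, and velocity data rather than act on it alone; the threshold here is an illustrative assumption.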

Traditional Techniques Are Still Important
Some traditional techniques still have their place in the fraud detection continuum:

  • Captcha: We’ve all seen the little box where we have to enter the right characters or solve a puzzle to prove to a site that we are human. This method introduces artificial blocks which require human intervention. However, this typically only works to ‘kill’ a fully automated process. Humans can solve the Captcha request easily and then hand over the rest of the process to a machine. By inserting these requests at random, Captcha can have greater success in finding fraud
  • Honeypots: This method acts as bait for things like malicious scripts, using code as a trap that machines find irresistible but humans never notice. If a “bot” falls for the trap, we can stop it. This approach presupposes non-human automation
  • Open-End Questions: This method can help find statistically unrealistic results, caused by the same individual giving the same answer over and over. Unlike the ease of solving Captcha requests, it is difficult for most people to create realistic, genuine, non-spammy open-ended answers at scale. Detection through open-end questions can therefore work well, especially if there is good, solid pattern recognition behind it
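The open-end technique above rests on simple pattern recognition: genuine respondents rarely produce identical free-text answers, so repeats across many completes are a red flag. A minimal sketch of that idea (illustrative only; production systems would add fuzzy matching and semantic similarity) normalizes each answer and counts exact repeats:

```python
from collections import Counter

def normalize(answer: str) -> str:
    # Lowercase and collapse whitespace so trivial variations still match.
    return " ".join(answer.lower().split())

def flag_repeated_open_ends(answers, min_repeats=3):
    """Return normalized open-end answers that recur suspiciously often
    across supposedly independent respondents."""
    counts = Counter(normalize(a) for a in answers)
    return {a: n for a, n in counts.items() if n >= min_repeats}

responses = ["I like the product", "i  like the PRODUCT",
             "I like the product ", "It tastes great"]
print(flag_repeated_open_ends(responses))  # {'i like the product': 3}
```

The `min_repeats` cutoff is an assumption for the example; in practice it would be tuned against the survey's sample size.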

Bringing in the AI Layer
Marrying traditional solutions with brand-new techniques, like AI, can help us battle new types of fraud. AI goes beyond algorithmic decisions: AI is self-learning and will continue to find new patterns. Algorithms are typically more static in nature and require constant human intervention.

An AI approach requires:

  1. A method: the decision-making, heuristic or algorithmic, to determine whether a transaction is real or fraudulent. Using AI, and machine learning specifically, we can study patterns in real-time. By comparing pattern behavior with a survey’s unique set of criteria, AI can analyze billions of data components very quickly and detect anomalies, such as large surges of users with specific demographics
  2. Historical data: a good fraud detection model requires large amounts of data to help allow accurate classification. More data going into the model will help the machine learn and become better at classifying a survey complete as good or fraudulent
  3. Domain experience: traditionally overlooked, this simply means the experience that a vendor or supplier has in identifying, understanding and managing fraud. This underlying knowledge gives fraud detection techniques a solid foundation. Machine learning is not simply programming a model and slapping it on top of data. Ninety percent of machine learning is data science, and ten percent is the actual machine learning implementation. Knowing which data to use, as well as which data represents fraudulent behavior, is essential for the algorithm to work efficiently.
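The anomaly detection described in step 1, flagging large surges of users with specific demographics, can be illustrated with a deliberately simple statistical sketch. This is not the author's method, just one common baseline: compare each time bucket's signup count against the batch mean and standard deviation, and flag outliers.

```python
from statistics import mean, stdev

def surge_anomalies(counts, z_threshold=2.0):
    """Flag indices where a count sits more than `z_threshold` standard
    deviations above the mean -- a crude surge detector. The threshold
    is an illustrative assumption; the outlier itself inflates the
    sample stdev, so production systems use robust statistics instead."""
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > z_threshold]

# Hourly signups for one demographic segment; hour 6 is a bot surge.
hourly_signups = [12, 15, 11, 14, 13, 12, 160, 14]
print(surge_anomalies(hourly_signups))  # [6]
```

An ML-based system would replace this fixed rule with a model trained on the historical data described in step 2, so the notion of "normal" adapts over time.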

Keeping fraudsters at bay in the market research industry has become more challenging as these individuals become more savvy and utilize new technologies. Suppliers must start employing several layers of deterrents and marrying traditional and new techniques for fraud mitigation. These security measures, coupled with deep expertise, can start to significantly reduce fraud and have a positive impact on data integrity and quality.
