Homing in on suspicious insurance claims with D2K

Insurance is intended as a safety net, something to help when an accident, illness, or act of nature strikes. Some people try to abuse the system, however, hoping to gain from false claims. "Abusive and exaggerated claims are endemic in the insurance industry," says Stephen D'Arcy, a professor of finance at the University of Illinois at Urbana-Champaign. As a recipient of a 2004-2005 NCSA/UIUC Faculty Fellowship, D'Arcy was able to leverage NCSA technology and collaborate with NCSA staff to further his research.

Insurance companies rely heavily on information supplied by claimants, and don't always ask for verification of the facts their customers submit. That's because investigating claims is a data-intensive, time-consuming process; investigating each and every claim in detail would not only be prohibitively expensive, it would also cause delays and headaches for honest customers. So insurers try to separate the claims that can be quickly approved from those that require thorough investigation.

Through his faculty fellowship at NCSA, D'Arcy has been exploring the use of data mining techniques to help insurers identify situations in which further investigation of a claim is likely to lead to a positive outcome. For example, when an insurer requests an independent medical exam to verify claims of bodily injury, the investigation leads to a reduction in the claim 60 percent of the time. By using data mining techniques, the company might be able to better target the claims requiring investigation, increasing the percentage of successful investigations and avoiding investigations in cases where an expensive exam won't benefit the company.

D'Arcy applied the Data to Knowledge (D2K) data mining system developed at NCSA to the more than half a million bodily injury claims in the database of the Automobile Insurers Bureau of Massachusetts.
This wealth of data has been compiled by the many smaller insurance companies operating in Massachusetts, which regulates insurance rates.

The goal of the project is twofold. First, D'Arcy aims to generate predictive models to enhance insurance claim investigation practices. Second, D'Arcy hopes that the dissemination of results from his research will advance the use of data mining in the insurance industry in a way that private, proprietary studies can't. By bringing more data and analysis into the public domain, he hopes to foster the exchange of ideas, eventually making companies more willing to adopt data mining technology. "It's surprising that the insurance industry is not using data mining more effectively," he says.

Using D2K, D'Arcy examined the data to establish which factors can be used to identify claims for which investigation is likely to generate cost savings by detecting fraudulent, improper, or exaggerated claims. For example, mining the data revealed that claims filed in October, November, and December are more likely to be reduced or denied after an investigation. The same can be said of claims involving sprains and strains and cases in which the claimant seeks chiropractic care. These patterns can be used to develop models that allow insurers to more easily sift the problematic cases from those that warrant an immediate green light.

D'Arcy used a particular case to illustrate how data mining could save insurers money. A man filed a claim with his insurance company, stating that he had been involved in a collision and that his car required extensive repairs. The claim was about to be approved when an investigator, sensing something was amiss, decided to call local towing companies. He discovered that the man's car had been towed recently after it was swamped with water during a drive down a flooded street.
Incorrectly believing the water damage would not be covered, the man had asked someone he knew to deliberately crash into his car so he could file a collision claim. Detecting the fraudulent claim rested on the ingenuity and tenacity of an individual investigator. The claim could easily have slipped past him and cost the company money.

Automated data mining, however, could routinely look at towing records and flag the claim for investigation without requiring human inspiration and intervention. Data mining techniques could also catch patterns that would be challenging for human workers to spot, such as claimants traveling long distances to see particular doctors or lawyers, or a pattern of cases involving the same doctors and attorneys. "The patterns are out there and the information is out there," D'Arcy says. "Insurers just need to develop systems to automatically look for these patterns."
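
As a rough illustration of the last pattern mentioned above (cases that repeatedly involve the same doctors and attorneys), the check could be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not D2K and not D'Arcy's actual method: the records, the field names (`claim_id`, `doctor`, `attorney`), and the `min_count` threshold are all hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only.
# They are not drawn from any real claims database.
claims = [
    {"claim_id": 1, "doctor": "Dr. A", "attorney": "Atty. X"},
    {"claim_id": 2, "doctor": "Dr. A", "attorney": "Atty. X"},
    {"claim_id": 3, "doctor": "Dr. B", "attorney": "Atty. Y"},
    {"claim_id": 4, "doctor": "Dr. A", "attorney": "Atty. X"},
    {"claim_id": 5, "doctor": "Dr. C", "attorney": "Atty. X"},
    {"claim_id": 6, "doctor": "Dr. B", "attorney": "Atty. Z"},
]


def recurring_pairs(claims, min_count=3):
    """Return the (doctor, attorney) pairs seen in at least min_count claims."""
    counts = Counter((c["doctor"], c["attorney"]) for c in claims)
    return {pair: n for pair, n in counts.items() if n >= min_count}


def flag_claims(claims, min_count=3):
    """Return the IDs of claims whose doctor/attorney pair recurs often.

    A flag only means "worth a closer look"; it is not evidence of fraud.
    """
    pairs = recurring_pairs(claims, min_count)
    return [c["claim_id"] for c in claims
            if (c["doctor"], c["attorney"]) in pairs]


if __name__ == "__main__":
    print(recurring_pairs(claims))
    print(flag_claims(claims))
```

In practice the threshold would need to be tuned against historical investigation outcomes, since a busy doctor and a busy attorney in the same town may legitimately appear together many times.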