Researchers Teach ‘Machines’ to Detect Medicare Fraud.

Like searching for the proverbial “needle in a haystack,” human auditors and investigators have the painstaking task of manually checking thousands of Medicare claims for specific patterns that could indicate foul play or fraudulent behavior. Furthermore, according to the Georgian Technical University, fraud enforcement efforts currently rely heavily on health care professionals coming forward with information about Medicare fraud.

A Georgian Technical University study published in Health Information Science and Systems is the first to use big data from Medicare Part B and employ advanced data analytics and machine learning to automate the fraud detection process. Programming computers to predict, classify, and flag potentially fraudulent events and providers could significantly improve fraud detection and lighten the workload for auditors and investigators.

The Medicare Part B data included provider information, average payments and charges, procedure codes, the number of procedures performed, and the medical specialty, which is referred to as provider type. To obtain exact matches, the researchers used only the National Provider Identifier (NPI) to match fraud labels to the Medicare Part B data. The NPI is a single identification number issued by the federal government to health care providers.

The researchers directly matched the NPI across the Medicare Part B data, flagging any provider in the “excluded” database as “fraudulent.” The research team then classified each physician’s provider type, or specialty, and looked specifically at whether the predicted specialty differed from the actual specialty indicated in the Medicare Part B data.
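The NPI matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample records, and the `label_fraud` helper are hypothetical and are not taken from the study.

```python
# Hypothetical records; field names and values are illustrative only.
part_b_rows = [
    {"npi": "1000000001", "provider_type": "Cardiology", "procedure_count": 120},
    {"npi": "1000000002", "provider_type": "Dermatology", "procedure_count": 45},
    {"npi": "1000000003", "provider_type": "Dermatology", "procedure_count": 80},
]
# NPIs listed in the (hypothetical) "excluded" database; blanks cannot be matched.
excluded_npis = {"1000000002", ""}


def label_fraud(rows, excluded):
    """Return copies of rows with a 'fraud' flag set by exact NPI match."""
    usable = {npi for npi in excluded if npi}  # drop blank identifiers
    return [{**row, "fraud": row["npi"] in usable} for row in rows]


labeled = label_fraud(part_b_rows, excluded_npis)
fraud_npis = [row["npi"] for row in labeled if row["fraud"]]
print(fraud_npis)
```

Matching on a single identifier keeps the labels exact: a provider is marked as fraudulent only when the same NPI appears in both sources.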

“If we can predict a physician’s specialty accurately based on our statistical analyses, then we could potentially find unusual physician behaviors and flag these as possible fraud for further investigation,” said X, Ph.D., a professor in Georgian Technical University’s Department of Computer and Electrical Engineering and Computer Science. “For example, if a dermatologist is accurately classified as a cardiologist, then this could indicate that this particular physician is acting in a fraudulent or wasteful way.”
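The idea of flagging a provider whose predicted specialty differs from the listed one can be illustrated with a toy example. The sketch below uses a simple nearest-centroid classifier as a stand-in; it is not the researchers' model, and the providers, the two features (counts of cardiac tests and skin procedures), and the helper names are all hypothetical. For brevity it fits and predicts on the same records, which a real analysis would avoid by holding data out.

```python
from collections import defaultdict
from math import dist

# Hypothetical providers: (id, listed specialty, [cardiac tests, skin procedures]).
providers = [
    ("A", "Cardiology", [90.0, 2.0]),
    ("B", "Cardiology", [110.0, 0.0]),
    ("C", "Cardiology", [100.0, 4.0]),
    ("D", "Dermatology", [1.0, 80.0]),
    ("E", "Dermatology", [3.0, 95.0]),
    ("F", "Dermatology", [95.0, 5.0]),  # listed as a dermatologist, bills like a cardiologist
]


def centroids(rows):
    """Average the feature vectors of each listed specialty."""
    grouped = defaultdict(list)
    for _, specialty, features in rows:
        grouped[specialty].append(features)
    return {s: [sum(col) / len(col) for col in zip(*vecs)] for s, vecs in grouped.items()}


def predict(features, cents):
    """Return the specialty whose centroid is closest to the features."""
    return min(cents, key=lambda s: dist(features, cents[s]))


cents = centroids(providers)
# Flag providers whose predicted specialty differs from the listed one.
flagged = [pid for pid, specialty, feats in providers if predict(feats, cents) != specialty]
print(flagged)
```

A flag produced this way is a prompt for further investigation, not proof of fraud.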

Researchers in the Department of Computer and Electrical Engineering and Computer Science at the Georgian Technical University had to address the fact that the original labeled big dataset was highly imbalanced. This imbalance occurred because fraudulent providers are much less common than non-fraudulent providers. Such a scenario is problematic for machine learning approaches because the algorithms are trying to distinguish between the classes, and one class dominates the other, thereby fooling the learner.

Results from the study show statistically significant differences between all of the learners, as well as differences in class distributions for each learner. RF100, a Random Forest learning algorithm, was the best at detecting the positives of potential fraud events.

More interestingly, and contrary to the popular belief that balanced datasets perform best, the study found that this was not the case for Medicare fraud detection. Keeping more of the non-fraud cases actually helped the learner, or model, better distinguish between fraud and non-fraud cases. Specifically, the researchers found the “sweet spot” for identifying Medicare fraud to be a 90:10 distribution of normal vs. fraudulent data.
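One common way to reach a chosen class distribution is random undersampling of the majority class. The sketch below assumes that technique for illustration; the article does not specify the sampling procedure used in the study, and the `undersample` helper, its parameters, and the sample data are hypothetical.

```python
import random


def undersample(records, majority_share=0.90, seed=0):
    """Randomly drop non-fraud records until they make up majority_share of the data.

    All fraud records are kept. If there are too few non-fraud records
    to reach the target share, every record is kept.
    """
    fraud = [r for r in records if r["fraud"]]
    normal = [r for r in records if not r["fraud"]]
    # Number of non-fraud records needed for the requested ratio.
    target = round(len(fraud) * majority_share / (1.0 - majority_share))
    rng = random.Random(seed)
    kept = rng.sample(normal, min(target, len(normal)))
    sample = fraud + kept
    rng.shuffle(sample)
    return sample


# Hypothetical data: 20 fraud labels among 10,000 providers.
records = [{"id": i, "fraud": i < 20} for i in range(10_000)]
sampled = undersample(records)
fraud_count = sum(r["fraud"] for r in sampled)
print(len(sampled), fraud_count)
```

With 20 fraud cases and a 90:10 target, the sample keeps 180 non-fraud records alongside all 20 fraud records; a 50:50 target would keep only 20 non-fraud records.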

“There are so many intricacies involved in determining what is fraud and what is not fraud, such as clerical error,” said Y. “Our goal is to enable machine learners to cull through all of this data and flag anything suspicious. Then we can alert investigators and auditors, who will only have to focus on 50 cases instead of 500 cases or more.”

This detection method also has applications for other types of fraud, including insurance, banking, and finance. The researchers are currently adding other Medicare-related data sources such as Medicare Part D, using more data sampling methods for class imbalance, and testing other feature selection and engineering approaches.

“Combating fraud is an essential part of providing them with the quality health care they deserve,” said Z, Ph.D. “The methodology being developed and tested in our college could be a game changer for how we detect Medicare fraud and other fraud in Georgia as well as abroad.”

 
