Detecting Healthcare Fraud using Machine Learning

As the elderly population grows, so do the medical costs of caring for it. Medicare provides insurance to those 65 and older to help with the financial burden of healthcare. Medicare costs about $588 billion and is expected to increase by 18% in the next decade. The National Health Care Anti-Fraud Association (NHCAA) estimates that healthcare fraud accounts for as much as 10% of the nation’s total healthcare spend; applied to Medicare, that is $58.8 billion. Fraudulent claims include patient abuse or neglect as well as billing for services that were not received. Applying machine learning to publicly available claims data can help detect fraud in the Medicare system and reduce the cost to taxpayers.

Machine learning is a subset of artificial intelligence that can find a fraudulent needle in the haystack by applying continuous learning algorithms. With each instance that the algorithm is right about a fraudulent transaction, that information goes back into the equation, making it smarter. The same happens when the algorithm is wrong.

Using unsupervised machine learning on publicly available datasets is a growing trend with great potential. The publicly available Medicare claims data has 37 million cases. For supervised models, labeling is an essential part of the process because it affects both the quality of the data and the performance of the model. Different researchers have created fraud and non-fraud labels by mapping the claims data to other publicly available resources, such as the National Provider Identifier (NPI) and the List of Excluded Individuals and Entities (LEIE) database. The 37 million cases can then be reduced to under 4 million that are run through the machine learning algorithm to help identify fraudulent providers.
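
The labeling step described above can be sketched in a few lines of Python. This is only an illustration: the record fields, the NPI values, and the exclusion set are made-up stand-ins, not the actual schema of the Medicare claims data or the exclusion database.

```python
# Hypothetical provider-level claim summaries keyed by National Provider Identifier (NPI).
claims = [
    {"npi": "1000000001", "provider_type": "Cardiology", "total_paid": 125000.0},
    {"npi": "1000000002", "provider_type": "Dermatology", "total_paid": 48000.0},
    {"npi": "1000000003", "provider_type": "Cardiology", "total_paid": 910000.0},
]

# Hypothetical set of NPIs that appear on an exclusion list.
excluded_npis = {"1000000003"}


def label_claims(claims, excluded_npis):
    """Return copies of the claim records with a binary 'fraud' label added."""
    labeled = []
    for record in claims:
        row = dict(record)  # copy so the input records are left untouched
        row["fraud"] = 1 if row["npi"] in excluded_npis else 0
        labeled.append(row)
    return labeled


labeled = label_claims(claims, excluded_npis)
fraud_count = sum(row["fraud"] for row in labeled)
print(f"{fraud_count} of {len(labeled)} providers labeled as fraud")
```

A provider that matches the exclusion list gets the fraud label; everyone else is treated as non-fraud, which is why the quality of the mapping matters so much for the model.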

For example, unsupervised machine learning has been used successfully on Florida’s Medicare data to detect anomalies in Medicare payments using regression techniques and Bayesian modeling. Decision tree and logistic regression models trained on class distributions produced by random undersampling have also provided some promising results. Initial results indicate that keeping more non-fraud cases helps the model learn better and distinguish more accurately between fraud and non-fraud cases.
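
Random undersampling itself is simple to sketch. The function below is a minimal illustration, assuming rows are dictionaries with a binary label; the function name, the `majority_ratio` parameter, and the toy data are assumptions made for this example.

```python
import random


def random_undersample(rows, label_key, majority_ratio, seed=0):
    """Keep every minority (label 1) row and a random subset of majority (label 0) rows.

    majority_ratio is the number of majority rows kept per minority row.
    """
    rng = random.Random(seed)
    minority = [r for r in rows if r[label_key] == 1]
    majority = [r for r in rows if r[label_key] == 0]
    keep = min(len(majority), int(len(minority) * majority_ratio))
    sampled = rng.sample(majority, keep)  # sampling without replacement
    combined = minority + sampled
    rng.shuffle(combined)
    return combined


# Hypothetical imbalanced data: 5 fraud rows among 1,000.
rows = [{"id": i, "fraud": 1 if i < 5 else 0} for i in range(1000)]

balanced = random_undersample(rows, "fraud", majority_ratio=1)  # 50:50 fraud to non-fraud
skewed = random_undersample(rows, "fraud", majority_ratio=9)    # 10:90 fraud to non-fraud
print(len(balanced), len(skewed))
```

Varying the ratio is one way to compare class distributions and see how many non-fraud cases a model needs to learn well.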

Using machine learning to detect fraud is game-changing. Machine learning allows humans to be notified early in a fraud attempt, stopping losses sooner. Continuously monitoring publicly available data can go a long way in helping minimize fraudulent claims and shorten the time it takes to prosecute criminals.

#BigData #MachineLearning #AI #Healthcare

Data Brokers Pay for Your Healthcare Information

A multi-billion dollar industry has grown up around the buying and selling of your healthcare data. Certain state exceptions under federal privacy rules allow hospital data to be sold to data brokers. Private companies seek access to your medical records to advance their mission, but sometimes also to make a quick buck.

The right of businesses to profit from health information without patient permission has previously been upheld by the United States Supreme Court. For example, in the 1990s, a data broker was selling data to some big pharmaceutical companies on what individual providers were prescribing to patients. These pharmaceutical companies then used that information to target marketing at prescribers in order to increase drug sales. Once patients started to understand the practice and voice their complaints, a couple of states passed legislation to limit the trade of prescriber-specific information. The data broker objected, the case went to the Supreme Court, and the data broker won on the grounds of free speech.

The practice of buying and selling medical data is technically acceptable under the Health Insurance Portability and Accountability Act (HIPAA) because the data is supposed to be anonymous. One of the challenges with the increasing number of these deals, however, is that patient privacy is at risk: it is now easier to re-identify de-identified records by piecing them together with unstructured data sources like Facebook, Twitter, and other social media platforms.

However, it is also important to note that not all data brokers have misguided intent. There are many organizations in this space with honorable missions. For example, Sloan Kettering made a deal to sell pathology samples to Paige.AI to develop artificial intelligence to help in finding a cure for cancer. In the case of curing cancer, the patient’s medical data is being used to increase the quality of care. However, data brokers do not currently have any fiduciary responsibilities to patients.

There are some considerations that health systems can put in place to help reinforce ethical best practices:

1.  Only enter into a data transfer deal if it benefits patients

2.  Have a separate agreement form from the consent form that patients complete for their normal healthcare

3.  Asking the patient for permission to sell their data should be done by the third-party vendor to ensure that there is no misunderstanding or abuse of the patient/provider relationship

4.  Any default consent options should be that patients do not elect to have their data sold

5.  Consent language should be worded in an easy-to-understand fashion, and potentially provided in video form, so that patients can clearly understand usage, risks, and their options

6.  Transparency should be provided to patients and healthcare staff on how the records are being used and who owns the data, especially if there is a financial gain for the health system

Last year GlaxoSmithKline, a large pharmaceutical company, came under global scrutiny when it tried to invest $300 million in 23andMe, due to concerns about a lack of transparency around what data was being shared and a lack of choice for patients about whether to participate.

Researchers predict that healthcare data will grow faster than data in manufacturing, financial services, or media, with a compound annual growth rate of 36 percent through 2025. Given that growth, these issues are likely to continue to surface for governing bodies as well as public policy influencers.
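
To get a feel for what a 36 percent compound annual growth rate means, here is a small Python sketch. The 7-year horizon is an assumption chosen for illustration, since the starting year of the forecast is not stated here.

```python
def compound_growth(start_value, annual_rate, years):
    """Value after compounding annual_rate once per year for the given number of years."""
    return start_value * (1 + annual_rate) ** years


# Growth multiple for a 36 percent rate over an assumed 7-year horizon.
multiple = compound_growth(1.0, 0.36, 7)
print(f"Data volume multiple after 7 years: {multiple:.1f}x")
```

Even over a few years, compounding at that rate multiplies the volume of data that privacy rules have to cover.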

What has been your experience with data brokers? How do you think this will play out in the future?

#AI #BigData #BioEthics #Healthcare

Data Breaches Cost Healthcare $408 per Record: How to Prevent the Pain

According to the federal government, healthcare data breaches reported in June 2019 exposed the data of 3.5 million people. The majority of that exposure came from Dominion National, which said the incident may have started as early as April 2010. The data accessed included enrollment and demographic data, along with associated dental and vision information. Similarly, LabCorp and Quest Diagnostics reported in June 2019 that an unauthorized user had accessed a vendor payment system, affecting nearly 8 million and 12 million patients, respectively. These alarming numbers do not even include encrypted data lost by organizations, since HIPAA does not consider the loss of encrypted data a breach. The United States healthcare system as a whole lost $6.2 billion in 2016 from data breaches, with the average data breach costing a company $2.2 million. Research from IBM Security found that in 2018 the cost to healthcare organizations was $408 per record, up from $380 per record in 2017.

According to a HIMSS 2019 Cybersecurity Survey, 59 percent of all data breaches in the past 12 months started with phishing, in which an attacker masquerades as a reputable person in an email or other communication. Cybercriminals also often change their approach and are now increasingly using techniques powered by artificial intelligence. In response, healthcare organizations are actively deploying artificial intelligence solutions to combat suspicious activities, as well as increasing employee education and cloud-based security.

There are some basic techniques that healthcare organizations should be deploying in addition to conducting risk assessments and providing employee education. For example, healthcare organizations should:

  • Take time to understand cloud service-level agreements, retain ownership of data that can be accessed in the event of a crash, and ensure service-level agreements comply with state privacy laws
  • Establish subnet wireless networks for guests and other public types of activity
  • Use multi-factor authentication on employee devices
  • Use business associate agreements to help distribute risk and clarify vendor reporting requirements
  • Have a “bring your own device” policy based on current best practices, such as complex password requirements and policies that can be enforced
  • Plan for the unexpected in thinking about how long the healthcare organization can function in different areas without data, while also having an emergency solution for back-up information and data restoration
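
As one illustration of an enforceable policy, a password rule can be expressed as a short check. The sketch below is hypothetical: the minimum length and the required character classes are assumptions for the example, not a recommended standard.

```python
import string


def password_issues(password, min_length=12):
    """Return a list of policy violations; an empty list means the password passes."""
    issues = []
    if len(password) < min_length:
        issues.append(f"shorter than {min_length} characters")
    if not any(c.islower() for c in password):
        issues.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        issues.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        issues.append("no digit")
    if not any(c in string.punctuation for c in password):
        issues.append("no symbol")
    return issues


print(password_issues("password"))
print(password_issues("Correct-Horse-7-Battery"))
```

Writing the rule down as code makes it something a device management tool can enforce rather than something employees are merely asked to follow.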

These tips can be incorporated into the organization’s cybersecurity framework. Thinking through these strategies before they are mandated helps build an effective cyber-defense program that protects both patients and the organization.

#Cybersecurity #AI #BigData #Healthcare

The Future of Open Infrastructure: OpenStack Cloud Computing Platform

OpenStack is an open-source cloud operating system that is relatively simple to install and provides massive scalability in helping organizations move towards enterprise-wide interdepartmental operations. Providing a stable foundation for both public and private clouds, OpenStack offers plug-and-play components with “at a glance” visualizations of how different parts work together. Its dashboard gives control to administrators while allowing users to provision resources through a web interface. OpenStack’s platform enables the deployment of container resources on a single network. It is one of the fastest growing solutions for building and managing cloud computing platforms, with over 500 customers like Target, T-Mobile, Workday, American Express, GAP, Nike, and American Airlines.

While there can be additional costs for specific versions, it is free to sign up for a public cloud trial.

DevStack, a set of scripts that brings up a complete OpenStack environment, can be used to better understand dashboard functionality and gives contributors a complete local environment to test against.

Free training on OpenStack is also available to help people master and adopt OpenStack technology.

While self-service is possible, should you choose to use a vendor for OpenStack management, a few key questions to ask potential vendors include:

  • Can you be specific on how you can help my company support an OpenStack deployment?
  • What kinds of workloads has your OpenStack distribution supported in the past?
  • What kind of flexibility is incorporated in your OpenStack solution?
  • What kind of cost reductions should be anticipated from deploying an OpenStack infrastructure?

Do you have experience with OpenStack? If so, please share your experience with me via DM or in the comments.

#OpenStack #CloudInfrastructure #BigData

Who Runs the World? Amazon Web Services

If you think that most of Amazon’s operating income comes from those packages delivered so fast to your doorstep after the click of a button, you’d be wrong. Amazon earns billions from its cloud platform, Amazon Web Services (AWS), which has benefited from a more interconnected world where transactions are exponentially increasing in volume.

With a growing need to better store, verify, and secure transactions, AWS allows businesses to run web and application servers in the cloud, securely store files in the cloud, use managed databases like MySQL, Oracle, and SQL Server to store information, and deliver files quickly using a content delivery network. In short, AWS is core to Amazon’s business model and helps with database storage, content delivery, and computation power. It has been around for 13 years, offers 165 fully featured services across 21 geographic regions, and is used by over 1 million customers like Netflix, Airbnb, Johnson & Johnson, Lyft, CapitalOne, and General Electric.

For developers who may not have prior experience with things like machine learning, artificial intelligence, the Internet of Things, and augmented reality, AWS provides an easy solution. For example, Amazon Personalize lets developers add custom machine learning models for use cases including product recommendations, search results, and direct marketing. Amazon Personalize uses algorithms that are also used in Amazon’s own retail business.

Some of the benefits of AWS include low-cost services, ease of use, versatile storage, and reliability. On the downside, there are a few security limitations and technical support fees, and the product faces general issues associated with cloud computing such as limited control, downtime, and backup protection. However, many of these disadvantages can be overcome or mitigated, making Amazon Web Services a leader in cloud platforms.

For those wanting to try Amazon Web Services, a free tier is available.

Amazon also offers several free training courses:

AWS Cloud Practitioner Essentials

AWS Machine Learning Services

AWS Analytics Services Overview

Have you used Amazon Web Services?  What has been your experience?

#AWS #CloudPlatform #MachineLearning #ArtificialIntelligence

A Refresher on Board Governance

Master Yoda shares, “Always pass on what you have learned.” While we may hope the boardroom is full of Yodas, the reality is that the boardroom is always changing, and best practices around boardroom governance exist for a reason. Governing bodies face a constant balance between pursuing business opportunities and maintaining accountability and ethical integrity. The 2007-2008 global financial crisis put a heightened sense of urgency on the need for improved ethical frameworks and governance for businesses.

Good governance is the heart of any successful company. Enterprise governance needs to balance economic and social pressures and take into consideration the viewpoints of different stakeholders, from individuals to collective groups. A governance framework supports the efficient use of resources and formalizes accountability for the stewardship of those resources. The goal of enterprise governance is to help align the interests of individuals, businesses, and society in achieving business objectives. Ethical considerations are important for enterprises not only because of negative pressures from situations like the 2007-2008 global financial crisis, but also because ethical behavior and corporate social responsibility can bring significant benefits to organizations.

Three examples show how governance can impact organizations:

•   The Passenger Rail Agency of South Africa had a situation where the acting CEO was fired by the board, and then the Minister of Transport dissolved the board. The reports said that this was an issue where the board was undermined and not accountable to the shareholders. 

•   In another example, Innovations Theatre, which has been in operation for two decades, had a very large board that was focused on board development and future visioning. The board consisted of “white-skin and white-collar” board members representing lots of corporate sponsors. Parallel to this governance board, there was another corporate board that represented even more businesses. 

•   A third example is the Foster Dance Troupe, which teaches dance in the inner city. The Dance Troupe’s founder was in charge for two decades but died a few years ago. The board was faced with more responsibility, and the current structure included an emphasis on committee reports.

In the first example, there was a political issue where the shareholders did not seem to be involved in the governance process. In the second example, the board’s lack of diversity may raise some eyebrows as it relates to community support. Also, the board was too large, with over-dependence on one leader. In the last example with the Dance Troupe, the board was in the early stages of development after losing the founder, which refocused the mission as well as the structure of the organization. This is a situation where the board had an opportunity to define clearer roles and responsibilities, as well as the distinction between board and staff.

These examples share common themes that are essential to board effectiveness, including a strong board chair, clear roles and responsibilities for board members, a CEO who acts as and is treated like a partner, and a board that can confront big questions. It is important for organizations to have strong governance systems because they increase accountability, help avoid disasters before they happen, and move businesses towards their mission while maintaining critical legal and ethical standing.

Have you been involved in any similar experiences? How did you deal with the complex situation? What do you think is critical for good governance?

AI vs. IoT: What’s the Difference?

While Artificial Intelligence (AI) and the Internet of Things (IoT) are both hot topics, they are not the same. They have differences but at the same time are connected and related. Artificial intelligence is a type of science that works to imitate intelligent behavior in computers. The Internet of Things is the internetworking of devices like homes, sensors, cars, and home appliances that can communicate with each other and often with the external environment, such as other cars, devices, and human beings.

Some of the differences between AI and IoT include interaction with cloud computing, scalability, cost, and ability to learn from data. For example, with cloud computing, IoT generates significant amounts of data and cloud computing provides a pathway for that data. On the other hand, AI intersects with cloud computing in the sense that it allows the devices to act and react in a way more similar to the human experience. 

In terms of learning from data, in IoT there can be multiple sensors, each with some set of processes where identical information is shared on the internet, but in AI the system actually learns from the activities or errors that occur and tries to evolve into a better version of itself. As it relates to cost, IoT generally costs much less than $50K USD with all components involved from hardware to infrastructure, whereas with AI the charges are typically calculated for each case and can vary substantially based on complexity and industry.

IoT focuses on connecting machines and making use of the collected data, while AI is about mimicking intelligent behavior in machines. As the number of devices powered by IoT continues to grow, AI can help by making sense of the big data they produce. That being said, IoT can exist without AI, and AI can exist without IoT. But data is only useful to humans if it creates insights that can be acted upon. Using IoT and AI together creates connected intelligence.

A use case of IoT and AI working together is Tesla’s self-driving cars. In this example, the car is the “thing,” and the power of AI is used to predict the behavior of a car in a variety of environments. The Tesla cars operate as a network, meaning that when one car learns something, all the cars can learn it.

Several data scientists believe the future of IoT is in AI. Undoubtedly, when the two are combined, the value delivered can increase for the customer as well as the organization.

#BigData #IoT #AIA

Cure Cancer: AI and Machine Learning

There are several ways that machine learning tools can be used on existing data sets to potentially discover a cure for cancer. First, anybody can download the tools for free nearly anywhere in the world with a consistent internet connection. One of my favorite programs is R, which works on both Windows and Mac machines and installs in a matter of minutes. I particularly like R because of the machine learning libraries that can be leveraged in programming. While I previously shared some general machine learning algorithms, in this post I am going to go a little deeper for those who have a technical background and want to expand their toolkit and experiment with some of these machine learning techniques.

The first step is understanding what variables you might have access to as they relate to cancer, and the nature of those variables. A variety of both structured and unstructured data can be combined in frameworks like Hadoop to prepare the data for analysis. If you want to leverage different machine learning techniques, it is useful to understand how trees work, because decision trees do not assume linearity, which is helpful when trying to glean insights from non-linear data.

Classification trees help separate data into the classes of the response variable. If the target variable has more than two categories, different variants of the algorithm are leveraged, but overall classification trees are useful when the target variable is categorical (like yes/no). On the other hand, regression trees, or prediction trees, are useful when the response variable is numeric or continuous. The type of target variable determines whether to use a classification tree or a regression tree. Conditional logistic regression can be useful in tackling sparse data issues.
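
The difference between the two kinds of trees shows up in how a candidate split is scored. The Python sketch below is a simplified illustration on made-up data: it scores one split with Gini impurity, a common criterion for categorical targets, and with variance, a common criterion for numeric targets. The function names and toy values are assumptions for this example.

```python
def gini(labels):
    """Gini impurity of a list of class labels (a classification criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def variance(values):
    """Mean squared deviation from the mean (a regression criterion)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def split_score(x, y, threshold, impurity):
    """Weighted impurity of the two groups produced by splitting on x < threshold."""
    left = [yi for xi, yi in zip(x, y) if xi < threshold]
    right = [yi for xi, yi in zip(x, y) if xi >= threshold]
    n = len(y)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Hypothetical toy data: one predictor with a categorical and a numeric response.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y_class = ["no", "no", "no", "yes", "yes", "yes"]
y_numeric = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

print(split_score(x, y_class, 1.75, gini))        # classification criterion
print(split_score(x, y_numeric, 1.75, variance))  # regression criterion
```

In both cases the tree-growing algorithm looks for the split with the lowest score; only the definition of "impurity" changes with the type of target variable.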

The advantages of decision trees include fast computation, invariance under monotone transformations of variables, easy extension to categorical outcomes, resistance to irrelevant variables, a single tuning parameter, the ability to handle missing data, and outputs that can be easily understood by non-technical audiences. The disadvantages can include accuracy, since the underlying function may involve higher-order interactions that a tree approximates poorly, and variance, since each split depends on previous splits and small changes in the data can cause big changes in the decision tree. Some important definitions to understand include:

  • Root is the topmost node of the tree
  • Edge is the link between two nodes
  • Child is a node that has a parent node
  • Parent is a node that has an edge to a child node
  • Leaf is a node that does not have a child node in the tree
  • Height is the length of the longest path from the root to a leaf
  • Depth of a node is the length of the path from that node to the root
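
These definitions map directly onto a small data structure. The Python sketch below is an illustration only; the `Node` class and the three-level example tree are made up for this purpose, and path length is counted in edges.

```python
class Node:
    """A tree node that knows its parent and its children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # creates the edge from parent to child

    def is_leaf(self):
        """A leaf is a node without children."""
        return not self.children

    def depth(self):
        """Number of edges on the path from this node up to the root."""
        return 0 if self.parent is None else 1 + self.parent.depth()

    def height(self):
        """Number of edges on the longest path from this node down to a leaf."""
        if self.is_leaf():
            return 0
        return 1 + max(child.height() for child in self.children)


# Hypothetical three-level tree.
root = Node("root")
left = Node("left", parent=root)
right = Node("right", parent=root)
left_leaf = Node("left_leaf", parent=left)

print(root.height(), left_leaf.depth(), right.is_leaf())
```

The height of the whole tree is the height of its root, and a leaf always has height zero.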

Let’s start by considering the existing prostate cancer data set available in R. The data represents a population of 97 males. This is a good data set to illustrate how easily different tree growth algorithms and classification techniques can be used to predict tumor spread in males. In this specific example, the measures for prediction are PSA, the size of the prostate, benign prostatic hyperplasia, Gleason score, and capsular penetration. Therefore, to better understand and predict the tumor spread (seminal vesicle invasion, svi), the following variables were used for the tree growth algorithms: log of benign prostatic hyperplasia amount (lbph), log of prostate-specific antigen (lpsa), Gleason score (gleason), log of capsular penetration (lcp), and log of cancer volume (lcavol).

Here is a quick program that I wrote in R to better understand this data set:

R Script

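# Loading the proper libraries to conduct this analysis on the prostate cancer dataset in R
library(rpart)   # classification and regression trees
library(party)   # conditional inference trees (ctree)
library(lasso2)  # assumption: the Prostate data frame is taken from this package
data(Prostate)
# Setting up the classification tree
classification <- rpart(svi ~ lbph + lpsa + gleason + lcp + lcavol, data = Prostate, method = "class")
# Lets look at the results
printcp(classification)
# Plotting the results
# Making the plot tree
plot(classification, uniform = TRUE, main = "Classification tree for prostate cancer")
text(classification, use.n = TRUE, all = TRUE, cex = 0.8)
# Making the tree
regression <- rpart(svi ~ lbph + lpsa + gleason + lcp + lcavol, data = Prostate, method = "anova")
# Looking at the results
printcp(regression)
plot(regression, uniform = TRUE, main = "Regression tree for prostate cancer")
text(regression, use.n = TRUE, all = TRUE, cex = 0.8)
# Now doing the conditional inference tree
conditional <- ctree(svi ~ lbph + lpsa + gleason + lcp + lcavol, data = Prostate)
# Lets look at the results
conditional
# Plotting the results
plot(conditional, main = "Conditional inference tree for prostate cancer")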

This script resulted in the following information:

> printcp(classification)

Classification tree:
rpart(formula = svi ~ lbph + lpsa + gleason + lcp + lcavol, data = Prostate, 
    method = "class")

Variables actually used in tree construction:
[1] lcp

Root node error: 21/97 = 0.21649

n= 97 

       CP nsplit rel error  xerror    xstd
1 0.52381      0   1.00000 1.00000 0.19316
2 0.01000      1   0.47619 0.80952 0.17831

> printcp(regression)

Regression tree:

rpart(formula = svi ~ lbph + lpsa + gleason + lcp + lcavol, data = Prostate, 
    method = "anova")

Variables actually used in tree construction:
[1] lcp  lpsa

Root node error: 16.454/97 = 0.16962

n= 97 

       CP nsplit rel error  xerror    xstd
1 0.45551      0   1.00000 1.00780 0.14079
2 0.21489      1   0.54449 0.68052 0.15327
3 0.01000      2   0.32960 0.53091 0.11726

> conditional

Conditional inference tree with 3 terminal nodes

Response:  svi 
Inputs:  lbph, lpsa, gleason, lcp, lcavol 
Number of observations:  97 

1) lcp <= 1.7492; criterion = 1, statistic = 43.496
  2) lpsa <= 2.972975; criterion = 1, statistic = 20.148
    3)*  weights = 66 
  2) lpsa > 2.972975
    4)*  weights = 18 
1) lcp > 1.7492
  5)*  weights = 13 

Note that the head node is the seminal vesicle invasion, which shows the tumor spread. For the classification tree, the cross-validation results show there is only one split in the tree, with a cross-validated relative error (xerror) of 0.80952 after the split and a standard error (xstd) of 0.17831. The tree was split on the log of capsular penetration at < 1.791. The regression tree has three leaf nodes because the data set was split twice. The cross-validated relative error was 0.68052 with a standard error of 0.15327 after the first split, and 0.53091 with a standard error of 0.11726 after the second split. The tree was split first on the log of capsular penetration at < 1.791 and then on the log of prostate-specific antigen at < 2.973. The conditional inference tree algorithm produced a split at <= 1.749 of the log of capsular penetration at the 0.001 significance level and at <= 2.973 for the log of prostate-specific antigen, also at the 0.001 significance level.
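
The relative errors that printcp reports are scaled so that the root node has an error of 1. Multiplying them by the root node error converts them into absolute error rates. A short Python sketch of that arithmetic for the classification tree, using only the numbers from the output above (the variable names are my own):

```python
root_errors = 21     # observations misclassified at the root node
n = 97               # observations in the data set
root_node_error = root_errors / n

rel_error = 0.47619  # rel error after one split
xerror = 0.80952     # cross-validated error after one split

training_rate = rel_error * root_node_error  # error rate on the training data
cv_rate = xerror * root_node_error           # cross-validated error rate

print(f"root: {root_node_error:.3f}, training: {training_rate:.3f}, cross-validated: {cv_rate:.3f}")
```

The gap between the training rate and the cross-validated rate is a reminder that the error measured on the data used to grow the tree is optimistic.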

In this particular example, the conditional inference tree algorithm produced more useful information than the classification and regression tree algorithms. That being said, while the language of machine learning is sometimes complicated to understand, it really just comes down to using the right variables as input and testing different machine learning algorithms relative to the problem being solved. Testing different machine learning algorithms boils down to running a few lines of code in R, Python, or your favorite programming language.

Clinical data on pathology-related detail, tumor evolution, and cell-level information is being generated at exponentially increasing levels. Many of these data sets are starting to be available online for analysis. The type of algorithms used in this example could be used on these big data sets to accelerate the discovery of a cure for cancer. But it is not going to happen without individuals who are willing to embrace these types of tools for analysis.

#BigData #AI #Oncology #MachineLearning

Ensembles and Random Forest Analysis: How it Works

Ensemble methods combine multiple machine learning algorithms to improve predictive performance. An ensemble is essentially about combining methods to produce better predictions. For example, in ensemble classification, if the first classifier is a base classifier and the second is a corrector classifier, then the first does the initial classification, and its predicted class is fed in as a feature of the second classifier. The second classifier can either return a classification identical to the first or correct the prediction where it finds the base classifier was likely wrong. The base classifier provides the initial prediction of the target class. The corrector classifier attempts to correct errors in that prediction by focusing on the decision boundary of the base classifier. For example, the base classifier could be logistic regression, a parametric discriminative classifier. The corrector could be k-nearest neighbors, a non-parametric classifier that makes its prediction from the k nearest training points, by averaging them or taking a majority vote.
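
The base-plus-corrector idea can be sketched in plain Python. This is a minimal illustration, not a production implementation: the gradient-descent settings, the function names, and the XOR-like toy data are all assumptions made for the example.

```python
import math


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(n_features):
                grad_w[j] += (p - yi) * xi[j]
            grad_b += p - yi
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_logistic(model, x):
    """Base classifier: class 1 when the linear score is non-negative."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def predict_knn(train_X, train_y, x, k):
    """Corrector: majority vote of the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def fit_ensemble(X, y):
    """Train the base classifier and build the corrector's training set."""
    base = train_logistic(X, y)
    # The corrector sees the original features plus the base prediction.
    augmented = [xi + [predict_logistic(base, xi)] for xi in X]
    return base, augmented, list(y)


def predict_ensemble(ensemble, x, k=3):
    base, augmented, labels = ensemble
    base_pred = predict_logistic(base, x)
    return predict_knn(augmented, labels, x + [base_pred], k)


# Hypothetical XOR-like data that a single linear boundary cannot separate.
X = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.9],
     [0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ensemble = fit_ensemble(X, y)
base_hits = sum(predict_logistic(ensemble[0], xi) == yi for xi, yi in zip(X, y))
ens_hits = sum(predict_ensemble(ensemble, xi, k=1) == yi for xi, yi in zip(X, y))
print(f"base: {base_hits}/{len(X)}, base + corrector: {ens_hits}/{len(X)}")
```

Note that the comparison here is scored on the training data with k=1, where the corrector trivially recovers every label. A real evaluation would score both classifiers on held-out data.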

Random Forest is a type of ensemble method that performs both regression and classification using multiple decision trees. The technique builds on Bootstrap Aggregation, often called bagging. Bootstrap Aggregation involves training each decision tree on a different random sample of the training data. The sampling in this instance is done with replacement.
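
Bootstrap aggregation can be sketched with one-split trees (stumps) in plain Python. This illustrates bagging only: a full random forest also considers a random subset of features at each split, which is left out here because the made-up data has a single feature. All names and values are assumptions for the example.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Fit a one-split tree: predict 1 when x >= threshold, choosing the best threshold."""
    candidates = sorted({x for x, _ in rows})

    def errors(threshold):
        return sum((1 if x >= threshold else 0) != label for x, label in rows)

    return min(candidates, key=errors)


def predict_bagged(thresholds, x):
    """Majority vote across all stumps."""
    votes = sum(1 if x >= t else 0 for t in thresholds)
    return 1 if votes * 2 > len(thresholds) else 0


rng = random.Random(42)
# Hypothetical data: the label is 1 for the larger x values.
rows = [(x, 1 if x >= 5 else 0) for x in range(10)]

# Each stump is trained on its own bootstrap sample of the rows.
thresholds = [fit_stump(bootstrap_sample(rows, rng)) for _ in range(25)]
print(predict_bagged(thresholds, 1), predict_bagged(thresholds, 8))
```

Because each stump sees a slightly different sample, the individual thresholds vary, and the vote averages out that variation.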

AI versus Big Data: What’s the Difference?

Artificial intelligence is fueled by computers, big data, and algorithms. Big data is the input for business intelligence capabilities. Big data represents the large volume of data that often needs to go through a data quality process of cleansing before it can be turned into business insights. Artificial intelligence, on the other hand, occurs when computers act on that data input. Artificial intelligence changes behavior based on findings and then modifies the approach. Big data analytics is more about looking for a given piece of data to produce insight than about having the computer act on the results that are found. Big data analytics produces insights by identifying patterns through techniques such as sequential analysis, leveraging technologies like Hadoop that can analyze both structured and unstructured data. While artificial intelligence can also be based on structured and unstructured data, with artificial intelligence the computer learns from that big data, keeps collecting it, and then acts upon it.

Industry examples of how big data is being leveraged in artificial intelligence range from consumer goods to the creative arts to media. For example, in consumer goods, Hello Barbie runs on machine learning: the microphone on Barbie’s necklace records what the child says, and the recording is analyzed to determine a fitting response. The server gets the response back to Barbie in under a second. In the creative arts, music-generating algorithms draw on sources such as newspapers and speeches to create themes for new lyrics and help musicians better understand target audiences to increase record sales. In media, the BBC project Talking with Machines lets listeners engage in conversation with their smart devices and insert their perspective to become part of the story creation.

Artificial intelligence influences big data analytics and vice versa. Artificial intelligence uses big data to run algorithms, like machine learning algorithms. In machine learning, training and test datasets are used for the analysis. Big data analytics can be useful in preparing those training and test datasets. Also, access to big data allows artificial intelligence to continue to learn from additional data sources. Machine learning algorithms can reproduce behaviors based on the big data fed to them, which they work through by a trial-and-error type of process.
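
Splitting cleaned data into training and test sets takes only a few lines. The Python sketch below is a minimal illustration; the function name, the 80/20 split, and the stand-in records are assumptions for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into training and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(100))  # stand-in for cleaned records
train, test = train_test_split(records, test_fraction=0.2)
print(len(train), len(test))
```

The model learns from the training set and is then scored on the test set, which it has never seen.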

Essentially, big data is what can teach artificial intelligence, and the rise of artificial intelligence is complementary to the exponential growth of big data. Understanding the basics of how big data and artificial intelligence intersect is important, as they are both here to stay and have the potential to boost not only revenue but also the innovative and creative capabilities of businesses.

#AI #BigData