
Machine learning applications: Mitigating the risks

Machine learning applications are beneficial to enterprises, but there are also several risks involved. Expert Judith Myerson explains five ways to combat them.

Machine learning is gaining widespread adoption among enterprises large and small. Executives want employees to use it for three primary reasons: cost savings, faster processing of massive data sets and faster discovery of new vulnerabilities.

Large retailers use machine learning applications to find fraudulent e-commerce transactions without blocking legitimate ones. They also use it to analyze customer sentiment about products across multiple websites and to weed out hackers impersonating long-time customers.

Financial institutions use machine learning applications to predict loan defaults, fraud and money laundering. Hospitals can avoid mergers and closures by having machine learning predict avoidable emergency room wait times, preventable strokes and seizures, and wasteful hospital readmissions. In large legal firms, machine learning helps lawyers decide more quickly which cases to take and which to pass on; Legal Robot, for example, is trained to determine whether a business contract contains all the necessary clauses.

Other uses for machine learning applications range from predicting health outcomes in medicine and stock prices in finance, to electric load and solar power forecasting.

Risks in machine learning

Even the best machine learning models come with risks, including high false positive rates caused by poorly trained learning algorithms that hackers can exploit. Another unwanted guest to the model is contaminated or compromised data from a recently hacked host. And the absence of false positives doesn't mean there are no risks: hackers can exploit loopholes in the system running the machine learning platform.

One risk is that a hacker can use fake biometrics -- fingerprints, iris scans or facial characteristics -- to impersonate a legitimate user. Another is that a hacker can fool a machine learning model into classifying malicious samples as legitimate at test or execution time. This can cause the model's behavior to differ significantly from its expected outputs.
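To make the second risk concrete, here is a minimal sketch -- standard-library Python only, with invented data -- of training-data poisoning: a toy 1-nearest-neighbor classifier changes its answer once an attacker slips mislabeled points into the training set.

```python
# Hypothetical illustration of training-data poisoning (stdlib only,
# invented data): a 1-nearest-neighbor classifier is fooled when the
# attacker injects malicious-looking samples labeled as legitimate.

def predict(samples, point):
    # samples: list of ((x, y), label); classify by the nearest point's label
    nearest = min(samples, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], point)))
    return nearest[1]

clean = [((0.0, 0.0), "legit"), ((0.1, 0.2), "legit"),
         ((5.0, 5.0), "malicious"), ((4.8, 5.2), "malicious")]

# The attacker adds one mislabeled point inside the malicious cluster
poisoned = clean + [((4.9, 5.02), "legit")]

probe = (4.9, 5.0)  # a point deep inside the malicious cluster
print(predict(clean, probe))     # malicious
print(predict(poisoned, probe))  # legit -- the poisoned model is fooled
```

A single planted point is enough here because nearest-neighbor models memorize their training data; cleaning the training set, as discussed below, is the direct defense.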

Machine learning risk management

Here are five ways a human can combat risks in machine learning applications.

1. Perform ethical hacking

An ethical hacker is a trusted security professional who breaks into a system to discover machine learning vulnerabilities overlooked by a firewall, an intrusion detection system or other security tools. In a simple access-gaining scenario, the ethical hacker uses a fake finger reconstructed from the fingerprint a legitimate user left behind on a dirty device. Once in the system, the ethical hacker sneaks into a fingerprint database, retrieves a biometric template belonging to another legitimate user and reconstructs another fake finger. To combat these risks, the device reader must be cleaned of dirt, grease and moisture after each use, and the database should be encrypted.

2. Encrypt security logs

A system administrator has superuser privileges to analyze machine learning log files. The reasons for doing so include checking for compliance with security policies, troubleshooting the system and conducting forensics. Encrypting log files is one way of protecting them from being tampered with: the keys needed to change log contents are never revealed to a malicious hacker. If such a hacker tries to delete a log file recording hacking activity, the administrator should get an immediate, audible alert on an always-on desktop computer.
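Encryption pairs naturally with tamper evidence. The sketch below -- standard-library Python only, with an invented key and invented log entries -- chains an HMAC over each entry so that editing or deleting any line breaks verification of everything after it; a real deployment would also encrypt the files with a dedicated library such as cryptography rather than rely on this alone.

```python
import hashlib
import hmac

# Hypothetical key; in practice this lives in a secrets manager,
# out of the attacker's reach.
SECRET_KEY = b"example-key-not-for-production"

def sign_entry(prev_mac, message):
    # Chain each entry's MAC over the previous MAC plus the message,
    # so altering or removing any entry invalidates every later MAC.
    return hmac.new(SECRET_KEY, prev_mac.encode() + message.encode(),
                    hashlib.sha256).hexdigest()

def build_log(messages):
    log, prev = [], ""
    for msg in messages:
        mac = sign_entry(prev, msg)
        log.append((msg, mac))
        prev = mac
    return log

def verify_log(log):
    prev = ""
    for msg, mac in log:
        if not hmac.compare_digest(mac, sign_entry(prev, msg)):
            return False
        prev = mac
    return True

entries = build_log(["login admin", "read /etc/passwd", "logout"])
print(verify_log(entries))                     # True
entries[1] = ("read nothing", entries[1][1])   # attacker edits an entry
print(verify_log(entries))                     # False -- tampering detected
```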

3. Clean out training data

A machine learning model behaves well when it is fed good training data. The model developer must know where the data comes from, and the data must be clean and free of biases, anomalies and poisoned samples. Data from a source host that has been attacked must be avoided; bad data can cause the model to behave differently and ultimately shut down the system. When using multiple learning tools to assess data for a particular task, the model developer should restructure all the data into a common format.
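One rough way to screen incoming values for anomalies is a median absolute deviation (MAD) filter, which -- unlike a plain mean and standard deviation -- is not itself skewed by the outliers it is hunting. The sketch below uses only the standard library and invented sensor readings; the cutoff is an illustrative assumption, not a universal constant.

```python
# Hedged sketch of anomaly screening with the median absolute deviation
# (stdlib only, invented readings, illustrative cutoff).
from statistics import median

def remove_outliers(values, cutoff=5.0):
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)  # no spread to measure against
    # Keep only values whose deviation from the median is modest
    return [v for v in values if abs(v - med) / mad <= cutoff]

readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 500.0]  # 500.0 looks poisoned
print(remove_outliers(readings))  # [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
```

A filter like this only flags statistical oddities; it cannot catch subtly poisoned samples, which is why knowing the data's provenance remains the first defense.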

4. Apply DevOps to model lifecycle

Hackers can take advantage of false positives from a machine learning platform. One way of combating this and other risks in machine learning is to apply DevOps to the learning model lifecycle. DevOps lets the development and training, quality assurance and production teams collaborate with one another.

DevOps starts with the development and training phase, then proceeds to the quality assurance phase to see how well the model has been trained. Unsatisfactory test results mean returning to the development phase to retrain the model with better data. Otherwise, the lifecycle moves forward to the production phase, where the model is applied to real-world data. If the results don't meet expected outcomes, the DevOps cycle should be repeated from the development or quality assurance phase.
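The loop above can be sketched as a simple quality gate. Everything here is a hypothetical stand-in -- train(), evaluate(), the threshold and the fake scores are invented, not a specific pipeline's API -- but the control flow mirrors the lifecycle described: retrain until QA passes, then promote to production.

```python
# Illustrative train -> QA -> production gate (all names and numbers invented).
QA_THRESHOLD = 0.90   # assumed accuracy bar a model must clear to be promoted
MAX_ROUNDS = 5        # give up and escalate after this many retraining rounds

def train(round_number):
    # Placeholder for the development and training phase.
    return {"round": round_number}

def evaluate(model):
    # Placeholder QA score; pretend each retraining round improves the model.
    # A real pipeline would test against held-out data.
    return 0.70 + 0.08 * model["round"]

def run_lifecycle():
    for round_number in range(1, MAX_ROUNDS + 1):
        model = train(round_number)
        score = evaluate(model)
        if score >= QA_THRESHOLD:
            return model, score  # passed QA: promote to production
    return None, None            # never passed QA: escalate to humans

model, score = run_lifecycle()
print(model["round"], round(score, 2))  # 3 0.94
```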

5. Implement a security policy

A security policy should spell out the course of action to take for machine learning risk management. In a simple scenario, the policy consists of five sections: purpose, scope, background, actions and constraints. The scope section puts a fence around what's covered: the machine learning model type, the training data and the data mining algorithm -- regression, clustering or neural networks. The background section looks at the reasons behind the policy, the actions section covers how the risks can be combated using DevOps and the constraints section considers machine learning limitations and the availability of test data.
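As a hedged sketch, the five sections could be captured in a machine-checkable structure. The section names come from the scenario above; the contents and the validation helper are invented for illustration.

```python
# Illustrative policy skeleton (section names from the scenario above;
# contents and helper are hypothetical).
REQUIRED_SECTIONS = ["purpose", "scope", "background", "actions", "constraints"]

policy = {
    "purpose": "Manage risks in machine learning applications",
    "scope": {"model_type": "classifier",
              "training_data": "transaction records",
              "algorithm": "clustering"},
    "background": "Hackers can exploit false positives and poisoned data",
    "actions": ["ethical hacking", "encrypted logs", "clean training data",
                "DevOps applied to the model lifecycle"],
    "constraints": ["model limitations", "availability of test data"],
}

def validate_policy(doc):
    # A policy is complete only if every required section is present and non-empty.
    return [s for s in REQUIRED_SECTIONS if not doc.get(s)]

print(validate_policy(policy))  # [] -- all five sections present
```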

Next Steps

Read more about how security machine learning techniques need to adapt

Learn how to implement machine learning effectively

Discover how big data is benefiting machine learning

Leveraging the cloud to build deep learning apps

This was last published in April 2016
