Why Machine Learning Algorithms Go Wrong
It is fair to say that in the near future, algorithms will be responsible for running a significant portion of our lives. Whether at work, at home, socially, online or in the structures of society itself, algorithms are becoming prevalent in every sphere.
The advancement of cognitive technologies and data analytics has led to an unprecedented explosion of algorithms across the globe. These complex mathematical models can have a profound and lasting effect on society: they shape the medical treatment we receive, the jobs we are offered, the loans that are approved, changes in the judicial system, and just about everything in between.
The Future of Machine Learning
Based on current spending and compound growth, spending on cognitive technologies is expected to reach the $47 billion mark within the next five years, paving the way for even more advanced machine learning algorithms. As these algorithms become more complex, they seem to be gaining an aura of infallibility. While they give industry new ways to navigate a complex regulatory environment, it is important to remember that algorithms are susceptible to risks ranging from flawed inputs to biases and even fraud, sometimes with catastrophic results.
As more and more algorithms are introduced into the working world, we are seeing a sharp rise in the number of cases where algorithms are misused or go wrong. At their most fundamental level, algorithms are simply programs that map inputs to outputs. As such, there are three key areas where an algorithm can go awry: the input data, the design and coding of the algorithm itself, and the interpretation of its output.
Algorithms are only as good as the data fed into them, and plenty can go wrong with input data. For starters, the data may be outdated, incomplete or simply irrelevant to the task at hand. Because humans collect and prepare the data, it can also be tainted by bias. As the saying goes: rubbish in, rubbish out. There is also the possibility that the data used to train an algorithm does not match the data it encounters in normal operation, producing unexpected results.
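A toy sketch of that training-versus-production mismatch, with every number invented for illustration: a decision threshold learned from one income distribution behaves very differently once the live data has shifted.

```python
import random

random.seed(0)

# Hypothetical credit-approval rule "trained" on historical incomes:
# the learned decision threshold is simply the training-set mean.
training_incomes = [random.gauss(40_000, 5_000) for _ in range(1_000)]
threshold = sum(training_incomes) / len(training_incomes)

def approve(income):
    return income > threshold

# Production data is drawn from a shifted distribution (incomes have risen),
# so the old threshold now approves almost everyone.
production_incomes = [random.gauss(55_000, 5_000) for _ in range(1_000)]

train_rate = sum(approve(x) for x in training_incomes) / len(training_incomes)
prod_rate = sum(approve(x) for x in production_incomes) / len(production_incomes)
print(f"approval rate on training-like data: {train_rate:.0%}")
print(f"approval rate on production data:    {prod_rate:.0%}")
```

The model itself never changed; only the world it operates in did, which is exactly why a model that tested well can still misbehave in service.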
The second area where machine learning algorithms can go wrong is the design of the algorithm itself. Again, humans design the algorithm in the first place, so the design can be affected by biased logic, incorrect modelling techniques or flawed assumptions. Even if all the input data is correct and the underlying model is sound, a coding error can still slip in; it may not be immediately noticeable in the output, but over time its effect can become quite significant. Only once an algorithm has correct input data, a sound design and correct code can its output be collected and meaningfully interpreted.
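As a small illustration of how such a coding error compounds (the running-mean update rule here is a hypothetical example, not any particular production system): a single wrong operator, floor division `//` in place of true division `/`, looks harmless on the first few values but drags the result steadily away from the truth as more data arrives.

```python
def update_mean_buggy(mean, new_value, n):
    # Bug: floor division truncates the correction term toward
    # negative infinity, so the running mean drifts low as n grows.
    return mean + (new_value - mean) // n

def update_mean_correct(mean, new_value, n):
    # Standard incremental mean update.
    return mean + (new_value - mean) / n

buggy = correct = 0.0
values = [7, 3, 9, 5, 11, 2, 8, 6] * 100  # true mean is 6.375
for i, v in enumerate(values, start=1):
    buggy = update_mean_buggy(buggy, v, i)
    correct = update_mean_correct(correct, v, i)

print(f"buggy running mean:   {buggy}")
print(f"correct running mean: {correct}")
```

After the first handful of values the two versions agree closely, which is precisely why such a bug can pass casual inspection and only reveal itself once enough data has flowed through.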
The final reason algorithms go wrong comes down to how we choose to read their results. All too often, output data is interpreted without regard to the original underlying assumptions. There is also the risk of acting on incorrect output, or of using correct output inappropriately.
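One classic interpretation trap, shown here with invented numbers, is reading a model's accuracy without the base rate in mind: a fraud detector that is 99% accurate on both classes still produces mostly false alarms when fraud itself is rare.

```python
# Hypothetical figures for illustration only.
prevalence = 0.001   # 0.1% of transactions are actually fraudulent
sensitivity = 0.99   # fraction of fraud the model correctly flags
specificity = 0.99   # fraction of legitimate traffic correctly passed

true_pos = prevalence * sensitivity
false_pos = (1 - prevalence) * (1 - specificity)

# Precision: of everything flagged, how much is really fraud?
precision = true_pos / (true_pos + false_pos)
print(f"share of flagged transactions that are actually fraud: {precision:.1%}")
# → roughly 9%: over 90% of flags are false alarms
```

Nothing in the model's output is wrong here; the error lies entirely in the reader who equates "99% accurate" with "flags are 99% reliable".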
Managing Algorithmic Risk
An incorrect design or output can lead to potentially catastrophic decisions across a range of functions and areas, including information technology, sales and marketing, human resources, operations and even risk management itself. The resulting damage can be long lasting, affecting a company's finances, its reputation and even its operational systems.
As a result, it is now essential that companies and government agencies manage algorithmic risk the same way they manage any other risk to their business or operations. That means continuously monitoring and testing data inputs, the algorithm's inner workings and its outputs across the board. As algorithms become increasingly commonplace, those who create them must ensure that they are used correctly, fed the right data and regularly updated to reflect any environmental or circumstantial changes.
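As a minimal sketch of what monitoring data inputs could look like in practice (the field names and ranges below are hypothetical): each incoming record is checked against the value ranges observed at training time, and anything outside those ranges is flagged for human review rather than silently scored.

```python
# Hypothetical per-field ranges recorded when the model was trained.
TRAINING_PROFILE = {
    "age": (18, 75),
    "income": (10_000, 250_000),
}

def validate_record(record, profile=TRAINING_PROFILE):
    """Return the fields whose values are missing or fall outside
    the ranges seen in the training data."""
    alerts = []
    for field, (lo, hi) in profile.items():
        value = record.get(field)
        if value is None or not (lo <= value <= hi):
            alerts.append(field)
    return alerts

print(validate_record({"age": 42, "income": 85_000}))   # → []
print(validate_record({"age": 130, "income": 85_000}))  # → ['age']
```

Simple range checks like this will not catch every problem, but they turn silent input drift into a visible, reviewable event, which is the essence of treating algorithms as a managed risk.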