Machine Learning: Today and Tomorrow

By Benjamin Williams, Vadim Filimonov and Jason Rodriguez | February 25, 2020

P/C carrier executives have higher expectations about adopting machine learning techniques in 2021 than they had in 2019, according to a recent Willis Towers Watson survey. But their expectations were pretty high in 2019, too, and actual use is far short of predictions.

It is difficult to open an insurance industry newsletter these days without seeing some reference to machine learning or its cousin artificial intelligence and how they will revolutionize the industry. Yet according to Willis Towers Watson’s recently released 2019/2020 P&C Insurance Advanced Analytics Survey results, fewer companies have adopted machine learning and artificial intelligence than had planned to do so just two years ago (see the graphic below).

Top applications insurers plan to use two years from now for AI and machine learning
| Application | Actual for 2017 | Expected for 2019 (in 2017) | Actual for 2019 | Expected for 2021 |
| --- | --- | --- | --- | --- |
| Build risk models for better decision making | 13% | 44% | 26% | 60% |
| Reduce time spent by humans | 11% | 49% | 22% | 60% |
| Better understand risk drivers | 21% | 44% | 20% | 56% |
| Identify cases that pose higher risk | 11% | 46% | 14% | 50% |
| Augment human-performed underwriting | 7% | 37% | 7% | 47% |
| Identify patterns of fraudulent claims | 9% | 39% | 17% | 47% |
| Identify bottlenecks in claim processes/Process claims more efficiently | 3% | 30% | 7% | 43% |

Source: Willis Towers Watson 2019/2020 P&C Insurance Advanced Analytics Survey
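
The gap between what carriers expected for 2019 and what they actually reported can be read straight off the table. A minimal Python sketch, using only the survey percentages above (the variable names are illustrative):

```python
# Survey percentages from the table above:
# (expected for 2019 as stated in 2017, actual for 2019)
survey = {
    "Build risk models for better decision making": (44, 26),
    "Reduce time spent by humans": (49, 22),
    "Better understand risk drivers": (44, 20),
    "Identify cases that pose higher risk": (46, 14),
    "Augment human-performed underwriting": (37, 7),
    "Identify patterns of fraudulent claims": (39, 17),
    "Identify bottlenecks in claim processes/Process claims more efficiently": (30, 7),
}

# Shortfall in percentage points: expected adoption minus actual adoption.
shortfall = {name: expected - actual for name, (expected, actual) in survey.items()}

# List the applications from largest to smallest shortfall.
for name, gap in sorted(shortfall.items(), key=lambda item: -item[1]):
    print(f"{gap:>3} points short: {name}")
```

Every application fell short of its 2017 forecast, by between 18 and 32 percentage points.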

In the context of insurance, we’re not talking about self-driving cars (though these may have important implications for insurance) or chess-playing computers. We’re talking about predicting the outcome of comparatively simple future events: Who will buy what product, which clients are more likely to have what kind of claim, which claim will become complex according to some definition.

The better insurers can estimate the outcomes of these future events, the better they can plan for them and achieve more positive results. The accuracy of their estimates relative to those of their competitors is particularly important: An insurer that prices more accurately than its rivals attracts better risks and avoids worse ones. Analytics have applications across the insurance value chain, from marketing, client acquisition and retention to underwriting, pricing and claims management, as insurers look to squeeze more signal out of their data.

Life insurers have been estimating the outcome of future events for centuries using mortality tables. More recently, but still several decades ago, property/casualty insurance pricing was revolutionized by the use of predictive models, most commonly generalized linear models (GLMs), to find signals in historical data to infer the outcome of those future events. Machine learning is a set of techniques that can do a better job than previous techniques at finding patterns in historical information. Examples include gradient boosting machines, random forests, neural networks and support vector machines. In general, these techniques require little human judgment (though human judgment is still required to frame the question, and judicious use of human judgment can lead to even better predictions).

Why is machine learning receiving so much attention now? These methods are computationally intensive, and the necessary resources (hardware, software and people skilled in their use) are becoming increasingly available. More information is available, and traditional techniques can struggle to provide insight.

Historically, insurers used the information they gathered about their clients for analytical purposes. They then discovered that external data sources such as credit score and geodemographic variables (population density etc.) could provide additional insight.

Recent years have seen an explosion in sources of external data, ranging from pharmaceutical and medical records to telematics data, aerial imagery and even social media. As a side note, a data source is useful only if it can be accessed at the right time. A cautionary tale is provided by the U.K. insurer Admiral, which announced that it was going to use clients’ Facebook posts to inform auto pricing, only to have Facebook close down its data feed. Facebook was concerned that its users would censor themselves and regard such use of their profiles as an invasion of privacy. The broader public is becoming increasingly aware of how personal data is being used and, in some markets, restrictive legislation has already been introduced, for example, the GDPR in the European Union.

Further, new techniques have opened up kinds of data that were long believed to contain useful insights but were previously impractical to analyze. These include documents such as underwriting and claim adjuster notes, and images of risks to be insured or that have suffered damage. For example, topic modeling identifies combinations of words that commonly occur together across a series of documents; these topics can then be used as inputs to predictive models.
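
As a rough illustration of the idea, the sketch below counts which pairs of words appear together across a handful of invented claim notes. This is plain co-occurrence counting, a far cruder technique than topic modeling proper (which fits a statistical model such as latent Dirichlet allocation). The notes, the stop-word list and the function name are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Invented adjuster notes, for illustration only.
notes = [
    "water damage in basement after pipe burst",
    "pipe burst caused water damage to kitchen floor",
    "rear collision at intersection with whiplash injury",
    "whiplash injury reported after rear collision",
]

# A tiny, hand-picked stop-word list (an assumption, not a standard one).
STOP_WORDS = {"in", "after", "to", "at", "with", "caused", "reported"}

def co_occurring_pairs(documents, min_docs=2):
    """Return word pairs that appear together in at least `min_docs` documents."""
    pair_counts = Counter()
    for doc in documents:
        # Each distinct word counts once per document; sorting keeps pair order stable.
        words = sorted(set(doc.lower().split()) - STOP_WORDS)
        pair_counts.update(combinations(words, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_docs}

pairs = co_occurring_pairs(notes)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

Two clusters emerge here, one around water damage and one around rear-end collisions. An indicator for each cluster could then enter a predictive model as an additional variable.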

Given that machine learning techniques can allow better predictions than previously used methods and can provide insights gleaned from the increasing volumes of data that insurers have access to, we might expect to see them being used everywhere. However, there are important reasons that can limit their usefulness, which we’ll illustrate in the context of P/C insurance pricing.

An Executive’s Guide to Machine Learning Techniques

Generalized linear models have been used for some time by P/C insurers seeking to find signals and patterns in historical data. GLMs are a generalization of least squares regression to non-normally distributed data and non-additive relationships between variables.
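
In symbols, using standard GLM notation that is not part of the original article: Least squares regression assumes a normally distributed response whose mean is a sum of effects. A GLM lets the response follow another distribution (for example, Poisson for claim counts or gamma for claim amounts) and connects its mean to that sum through a link function g. The choice of a log link in the last line is an illustrative assumption, though it is a common one in pricing.

```latex
% Least squares regression: additive effects, normally distributed response
\mathbb{E}[y] = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% GLM: a link function g is applied to the mean of the response
g\bigl(\mathbb{E}[y]\bigr) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% With a log link, g(\mu) = \log \mu, the effects become multiplicative
\mathbb{E}[y] = e^{\beta_0} \prod_{j=1}^{p} e^{\beta_j x_j}
```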

Machine learning techniques can do a better job of finding those signals and patterns, and they require less human intervention. These techniques include:

  • Random Forests, which average many simple tree-based models.
  • Gradient Boosting Machines (GBMs), which also combine many simple tree-based models, but each model is built on the residuals of the last. (Residuals measure the gap between observed and estimated values, for example, their difference or, in a multiplicative model, their ratio.)
  • Support Vector Machines, which apply complex, high-dimensional transformations to classify data into separable groups.
  • Neural Networks, which aim to learn from data using processes that mimic those of biological neural networks.

While each of these can be applied to similar problems with varying degrees of success, neural networks have been applied to image recognition more than the other techniques.
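
To make the "built on the residuals of the last" idea concrete, here is a minimal sketch of gradient boosting in plain Python. It makes several simplifying assumptions that are not from the article: a single rating variable, one-split trees ("stumps"), squared-error loss (so the residual is observed minus predicted), and invented data. Production GBM libraries are far more elaborate.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides of the split non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the overall mean, then repeatedly fit a stump to the residuals."""
    base = mean(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # what is still unexplained
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict

# Invented data: driver age and average claims cost.
ages = [18, 22, 25, 30, 35, 40, 50, 60, 70]
costs = [900, 850, 700, 520, 480, 450, 430, 470, 560]

predict = fit_gbm(ages, costs)

def mse(predictions):
    return mean((y - p) ** 2 for y, p in zip(costs, predictions))

baseline_mse = mse([mean(costs)] * len(costs))
gbm_mse = mse([predict(a) for a in ages])
print(f"baseline MSE: {baseline_mse:.0f}, boosted MSE: {gbm_mse:.0f}")
```

Each round removes a little more of the error left by the previous rounds, so the boosted model fits the training data better than the overall mean does. A random forest would instead fit its trees independently and average them.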

P/C pricing is highly regulated. For many products, insurers have to justify their rates to regulators. Rates informed by GLMs are relatively easy to explain, and regulators have become accustomed to interpreting these models.

The effect of each rating factor, alone or in combination with others, is isolated. By looking at the results of a GLM, it can be said unequivocally that, for example, the higher an individual’s credit score, the lower their expected claims costs. Many machine learning techniques result in black-box models for which it is not possible to make such definitive statements. It might be possible to say that, in general, high credit scores are associated with lower claims costs, but the model may highlight segments that behave differently. Explaining why these segments behave differently, and even identifying them, can be highly problematic.
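
The sketch below illustrates what "isolated" means for a GLM. It assumes a log-link model with no interaction terms and uses made-up coefficients (none of these numbers come from the article or from real data). Under those assumptions, moving from a credit score of 600 to 700 scales expected claims cost by the same factor for every policyholder.

```python
import math

# Hypothetical log-link GLM coefficients (not fitted to any real data).
COEFFICIENTS = {
    "intercept": 6.5,
    "credit_score": -0.002,  # per point of credit score
    "driver_age": -0.01,     # per year of age
    "urban": 0.25,           # 1 if urban territory, else 0
}

def expected_claims_cost(credit_score, driver_age, urban):
    """Expected cost = exp(linear predictor), as in a log-link GLM."""
    linear_predictor = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["credit_score"] * credit_score
        + COEFFICIENTS["driver_age"] * driver_age
        + COEFFICIENTS["urban"] * urban
    )
    return math.exp(linear_predictor)

def credit_effect(driver_age, urban, low=600, high=700):
    """Ratio of expected cost at the higher credit score to that at the lower one."""
    return (expected_claims_cost(high, driver_age, urban)
            / expected_claims_cost(low, driver_age, urban))

young_urban = credit_effect(driver_age=22, urban=1)
older_rural = credit_effect(driver_age=60, urban=0)
print(f"young urban: {young_urban:.4f}, older rural: {older_rural:.4f}")
```

Both ratios equal exp(-0.002 × 100), roughly 0.82, whatever the other rating factors are. A black-box model offers no such guarantee: the same change in credit score can have different effects in different segments.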

A related issue is reputational risk; rates based on black-box models may change at renewal in ways that defy explanation, alienating clients or agents.

Even if an insurer were able to get sufficiently comfortable with rates based on machine learning techniques and managed to convince a regulator to accept a rate filing based on such a technique, it would then have to implement the underlying algorithm. Legacy rating engines are table-based and will not, in general, support the more sophisticated algorithms provided by machine learning techniques. Modern policy administration systems and rating engines can overcome this issue.
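
A table-based rating engine can be pictured as follows. This is a minimal sketch with a hypothetical base rate, factor names and relativities, not a description of any particular system.

```python
# Hypothetical rating tables of the kind a legacy rating engine stores.
BASE_RATE = 500.0

RATING_TABLES = {
    "age_band": {"16-24": 1.60, "25-64": 1.00, "65+": 1.15},
    "territory": {"urban": 1.25, "suburban": 1.00, "rural": 0.90},
    "credit_tier": {"low": 1.30, "medium": 1.00, "high": 0.85},
}

def rate_policy(risk):
    """Premium = base rate times one relativity looked up per rating factor."""
    premium = BASE_RATE
    for factor, table in RATING_TABLES.items():
        premium *= table[risk[factor]]
    return round(premium, 2)

quote = rate_policy({"age_band": "16-24", "territory": "urban", "credit_tier": "high"})
print(quote)  # 500 x 1.60 x 1.25 x 0.85
```

A GLM with a log link translates naturally into tables like these, one relativity per factor level. An ensemble of hundreds of trees does not, in general, reduce to a product of per-factor lookups, which is why deploying it calls for a more capable rating engine.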

For these reasons, use of machine learning techniques in P/C insurance pricing has so far been limited mainly to improving traditional models: for example, identifying combinations of variables predictive of risk that were hard to detect using previous techniques, providing better vehicle or territorial classifications, or prioritizing variables for analysis. We do expect insurers to look for ways to use them more directly in pricing, but this will require carriers to become more comfortable with these black-box models, perhaps through tools that allow them to better understand or constrain them. Further, regulators will need to accept more complex filings, and carriers will need access to tools that allow these models to be deployed. All of this could take time.

We conclude by returning to the survey results mentioned at the beginning of this article. While fewer companies may have adopted machine learning and artificial intelligence than had planned to do so two years ago, expectations remain high. In every area of use that participants were asked about, companies expect higher adoption in 2021 than they had expected for 2019. Time will tell whether they are being optimistic and overlooking difficulties like those described above, or whether real progress will be made.

This article was originally published by Carrier Management, Jan/Feb 2020. (Redistributed with Permission)

Authors

Benjamin Williams
Senior Consultant


Data Science Lead
