Dipping a toe into the deep waters of machine learning in insurance

Insurance Consulting and Technology

By Michael Chen | March 11, 2021

Machine learning can improve pricing and underwriting accuracy in the insurance industry.

The use of machine learning in insurance has grown steadily in recent years as more insurers have dipped their toes into its deep waters. Much of the growth is attributable to advances in both data storage options and data processing capacity.

But an important question remains for the majority working in the industry, or at least for those outside the actuarial and data science spheres: What do these techniques actually involve, and what do they enable insurers to do when pricing risk and supporting underwriters?

A couple of colleagues and I will attempt to skim the surface of that question in a workshop at this year’s Casualty Actuarial Society (CAS) Ratemaking, Product and Modeling (RPM) seminar. The fact that the workshop lasts four hours should give some indication of just how vast an ocean of a topic this is.

A key promise of machine learning is that it enables organizations to bring new sources of unstructured data, including images, audio, video and other high-dimensional, machine-generated data, into their data-driven decisions.

Data is where it all starts 

To make our paddle (and that’s all it can be) into machine learning more tangible and accessible, the workshop will use a modeling data set based on a sample of real, anonymized private auto insurance data that combines policy characteristics, claims and external vehicle characteristics. While the goal is often to improve model accuracy and predictive power, much of the time spent on machine learning, as with most modeling exercises, goes into preparing and processing data. Data quality will always be a prerequisite, particularly when integrating value-adding data from different sources or time periods.

That entails data cleansing and validation, including data processing, data reconciliation, exploring missing or inconsistent values, and handling incomplete data. Depending on the number of features to be modeled, some dimensionality-reduction techniques, such as combining or pruning variables, might also be needed to create a more manageable data set.
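To give a flavor of what those checks can look like, here is a minimal Python sketch. The records, field names and the two "business rules" are all hypothetical and chosen only for illustration; they are not taken from the workshop data set.

```python
from collections import Counter

# Hypothetical policy records; None marks a missing value.
records = [
    {"policy_id": "P1", "driver_age": 34, "vehicle_group": "B", "claim_count": 0},
    {"policy_id": "P2", "driver_age": None, "vehicle_group": "b", "claim_count": 1},
    {"policy_id": "P3", "driver_age": 17, "vehicle_group": "C", "claim_count": -1},
    {"policy_id": "P4", "driver_age": 52, "vehicle_group": None, "claim_count": 0},
]


def missing_counts(rows):
    """Count missing (None) values per field."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None:
                counts[field] += 1
    return dict(counts)


def inconsistent_rows(rows):
    """Flag rows that break simple, assumed consistency rules."""
    flagged = []
    for row in rows:
        problems = []
        group = row["vehicle_group"]
        # Assumed rule: vehicle group codes are stored in upper case.
        if group is not None and group != group.upper():
            problems.append("vehicle_group not upper case")
        # Assumed rule: claim counts can never be negative.
        if row["claim_count"] is not None and row["claim_count"] < 0:
            problems.append("negative claim_count")
        if problems:
            flagged.append((row["policy_id"], problems))
    return flagged


missing = missing_counts(records)
flagged = inconsistent_rows(records)
print("Missing values per field:", missing)
print("Rows needing review:", flagged)
```

In practice the rules would come from the data dictionary and from reconciling policy and claims extracts against each other; the point is simply that these checks are explicit and repeatable.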

So, using the data set we’ve prepared, we’ll dive in, keeping in mind that the implementation and use of machine learning often must balance improvements in accuracy against interpretability.

We might add some regression techniques. Almost inevitably, making a model more accurate also makes it more complex and harder to interpret. For example, if we focus on 10 variables in our auto data, of which three are significant on their own, machine learning techniques can help us identify which of the other seven improve the model when combined with those three main variables.
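As a rough illustration of what "in combination" means, the sketch below compares the observed claim frequency in each cell of two rating factors with the frequency implied by multiplying their one-way relativities. It is a crude screen rather than a machine learning method, and the factors, exposures and claim counts are invented (with equal exposure per cell so the one-way relativities are not distorted). A ratio well away from 1 hints at an interaction worth testing.

```python
from collections import defaultdict

# Hypothetical aggregated data: (age_band, vehicle_type) -> (exposure, claims).
cells = {
    ("young", "sports"): (100.0, 30.0),
    ("young", "standard"): (100.0, 10.0),
    ("older", "sports"): (100.0, 10.0),
    ("older", "standard"): (100.0, 10.0),
}


def interaction_ratios(cells):
    """Observed cell frequency divided by the frequency implied by
    multiplying the two one-way relativities (the no-interaction case)."""
    total_exposure = sum(e for e, _ in cells.values())
    total_claims = sum(c for _, c in cells.values())
    base = total_claims / total_exposure

    # One-way exposure and claim totals for each level of each factor.
    exp_a, clm_a = defaultdict(float), defaultdict(float)
    exp_b, clm_b = defaultdict(float), defaultdict(float)
    for (a, b), (e, c) in cells.items():
        exp_a[a] += e
        clm_a[a] += c
        exp_b[b] += e
        clm_b[b] += c

    ratios = {}
    for (a, b), (e, c) in cells.items():
        rel_a = (clm_a[a] / exp_a[a]) / base
        rel_b = (clm_b[b] / exp_b[b]) / base
        expected = base * rel_a * rel_b
        ratios[(a, b)] = (c / e) / expected
    return ratios


ratios = interaction_ratios(cells)
for cell, ratio in sorted(ratios.items()):
    print(cell, round(ratio, 3))
```

Tree-based methods effectively search for this kind of pattern across many variable pairs at once, which is what makes them useful for shortlisting candidate interactions.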

We can train our machine learning model and use it as a benchmark. Most auto insurers in the U.S. use generalized linear models (GLMs). Benchmarking against machine learning techniques that look at multiple interactions between rating variables can inform what’s best to include in the GLM used to price the risk.
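One simple way to frame such a benchmark for claim counts is to compare the Poisson deviance of each model's predictions on the same holdout data. The sketch below assumes that metric and uses made-up claim counts and predictions; it does not fit either model, and the workshop may well use different measures.

```python
import math


def poisson_deviance(actual, predicted):
    """Total Poisson deviance; lower means predictions sit closer to actuals."""
    total = 0.0
    for y, mu in zip(actual, predicted):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        # By convention y * log(y / mu) is taken as 0 when y == 0.
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total


# Hypothetical holdout claim counts and two sets of predicted claim counts.
actual = [0, 1, 0, 2, 0, 1]
glm_pred = [0.3, 0.6, 0.3, 0.9, 0.4, 0.5]
ml_pred = [0.2, 0.8, 0.2, 1.4, 0.3, 0.7]

glm_dev = poisson_deviance(actual, glm_pred)
ml_dev = poisson_deviance(actual, ml_pred)
print(f"GLM deviance: {glm_dev:.3f}")
print(f"Benchmark deviance: {ml_dev:.3f}")
```

A material gap in favor of the benchmark suggests the GLM is missing structure, such as an interaction, that may be worth adding explicitly.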

At some point, we may expose our data to ensemble model techniques, since combining predictions from a handful of different models can result in better predictions.
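The simplest ensemble is a plain average of several models' predictions. The sketch below uses invented claim severities and predictions from three hypothetical models; real ensembles are often weighted or stacked rather than averaged equally.

```python
def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def average_ensemble(prediction_sets):
    """Average several models' predictions record by record."""
    n_models = len(prediction_sets)
    return [sum(preds) / n_models for preds in zip(*prediction_sets)]


# Hypothetical actual claim severities and three models' predictions.
actual = [100.0, 250.0, 80.0, 400.0]
model_preds = [
    [120.0, 230.0, 60.0, 430.0],
    [90.0, 280.0, 95.0, 370.0],
    [110.0, 240.0, 85.0, 410.0],
]

ensemble = average_ensemble(model_preds)
individual_mse = [mse(actual, preds) for preds in model_preds]
ensemble_mse = mse(actual, ensemble)

print("Individual MSEs:", [round(m, 1) for m in individual_mse])
print("Ensemble MSE:", round(ensemble_mse, 1))
```

Under squared error, the averaged prediction is never worse than the average of the individual models' errors, because errors in opposite directions partly cancel. Whether it beats the single best model depends on how different the models' errors are.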

We’ll also illustrate the use of machine learning techniques to create additional variables — a score, for example, that we can put back into our data set and run through a GLM.
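As a sketch of that last step, the code below takes a score produced by some machine learning model and groups it into bands that can be added to the data set as one more categorical rating variable. The scores, the column names and the choice of five equal-sized bands are assumptions made for illustration.

```python
def score_bands(scores, n_bands=5):
    """Assign each score a band from 1 (lowest) to n_bands (highest),
    splitting the ranked scores into roughly equal-sized groups."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bands = [0] * len(scores)
    for rank, idx in enumerate(order):
        bands[idx] = rank * n_bands // len(scores) + 1
    return bands


# Hypothetical machine learning scores, one per policy.
scores = [0.12, 0.55, 0.31, 0.08, 0.77, 0.42, 0.19, 0.66, 0.25, 0.90]
bands = score_bands(scores)

# Attach the band to each record so it can enter a GLM like any other factor.
records = [
    {"policy_id": f"P{i + 1}", "ml_score": s, "score_band": b}
    for i, (s, b) in enumerate(zip(scores, bands))
]
for record in records[:3]:
    print(record)
```

Banding keeps the new variable easy to explain and review, at the cost of some of the detail in the raw score.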

This is just a taste of the workshop. Even four hours will be only a drop in the ocean of the potential that machine learning techniques, in the right hands, can bring to the accuracy of pricing and underwriting in the insurance industry.

The machine learning workshop will take place at the CAS RPM seminar on Monday, March 15, from 8:00 a.m. to 12:00 p.m. ET.


Associate Director
