How an ML model for an insurtech project guarantees the accuracy of disaster prediction

Unicsoft developed a machine learning system that helps insurance companies analyze natural disasters and mitigate investment risks.

About the client 

Forrestry ML, a machine learning strategy consultancy for early-stage startups and legacy enterprises, was looking to build an AI-driven organization. The company needed a contractor that could build out a high-priority project for one of its clients — an insurtech company operating an insurance platform that digitizes and automates the processes involved in insurance services.

Business challenges

Many regions of developed countries, such as the United States and Japan, are subject to frequent natural disasters. The real estate insurance market has traditionally accounted for these risks by reflecting them in the price of insurance services for each client.

With the advent of artificial intelligence and machine learning technologies, it has become possible to assess the risks far more accurately. Many of the benefits are immediately obvious: 

  • Custom pricing and proposals, which account for the high probability of disasters in certain regions, increase the profitability of the insurance company.
  • Custom offerings deliver a competitive advantage: clients prefer to work with partners that provide them with personalized offers.

The biggest challenge for the natural disaster insurance industry is accurate data collection, analysis, and modeling. At the time, a couple of major data analysts supplied the data, and they provided it in a non-unified form. This greatly complicated the machine learning task because the data was so unorganized.

With all that said, the main challenges Unicsoft’s client faced were as follows: 

  • Names of districts and locations are not unified in the main datasets — the client could not create a disaster forecasting model simply by comparing two datasets.
  • Data in these datasets could not be compared with data on other regions and districts that are already in the insurance platform (different names and codes for the same areas).
  • The client’s data sources were missing location information for multiple locations.
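One common way to reconcile differently spelled district names is to normalize them and then apply fuzzy string matching. The sketch below is only an illustration of that idea, not the approach Unicsoft actually used; the county names, the `normalize` and `match_district` helpers, and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def match_district(raw_name, canonical_names, cutoff=0.8):
    """Return the closest canonical name, or None if nothing is similar enough.

    The cutoff is a hypothetical value; a real project would tune it on
    manually verified pairs.
    """
    lookup = {normalize(c): c for c in canonical_names}
    hits = difflib.get_close_matches(
        normalize(raw_name), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


# Hypothetical canonical names as they might be stored in the insurance platform.
canonical = ["Miami-Dade County", "Broward County", "Palm Beach County"]

for raw in ["MIAMI DADE COUNTY", "Broward Cnty.", "Unknown Place"]:
    print(raw, "->", match_district(raw, canonical))
```

Names that fall below the cutoff come back as `None`, so they can be routed to manual review instead of being silently mismatched.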

Unicsoft was tasked with resolving a number of issues so that the ML technology could better organize and analyze the data:

  • Build a proof of concept ML model based on a dataset procured by the customer as a representative sample of the complete set.
  • Automatically filter data to distinguish what is accurate and inaccurate. The databases contained a variety of events that provide information on what can happen in a given region. Unicsoft’s task was to single out the risky events that are potentially important to the insurance provider.
  • Provide categorical predictions (recognitions) on the data for Construction and Occupancy codes. The insurance company stores all information about locations and regions in the form of codes, but the source databases present this data in a different form, so it had to be unified.
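To make the idea of categorical prediction concrete, here is a deliberately tiny sketch that maps free-text building descriptions to construction codes by counting token overlap with labeled examples. It is not the model built in the project: the code labels (FRAME, MASONRY, CONCRETE), the training phrases, and the `train` and `predict` helpers are hypothetical and serve only to show the shape of the task.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into words, treating hyphens as spaces."""
    return text.lower().replace("-", " ").split()


def train(examples):
    """Count how often each token appears under each code."""
    token_counts = defaultdict(Counter)
    for text, code in examples:
        token_counts[code].update(tokenize(text))
    return token_counts


def predict(token_counts, text):
    """Pick the code whose training tokens overlap most with the text.

    Returns None when no token of the text was seen during training.
    """
    tokens = tokenize(text)
    scores = {
        code: sum(counts[t] for t in tokens)
        for code, counts in token_counts.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical labeled descriptions; real codes would come from the platform.
examples = [
    ("wood frame house", "FRAME"),
    ("timber frame dwelling", "FRAME"),
    ("brick masonry building", "MASONRY"),
    ("reinforced concrete tower", "CONCRETE"),
]
model = train(examples)

print(predict(model, "Wood-frame dwelling"))
print(predict(model, "steel warehouse"))
```

A production system would replace the overlap count with a trained classifier, but the input and output stay the same: unstructured text in, one platform code out.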

The solution

Unicsoft developed the project following the cross-industry standard process for machine learning and data science projects, which most ML projects follow. In total, the project went through six stages of development and implementation:

  1. Business Understanding and Data Understanding. Forrestry had completed these two steps beforehand. Before starting work, Unicsoft had a complete understanding of the business requirements and had identified the databases and their potential problems.
  2. Data preparation. We started with the following steps:
     • Clean up junk data
     • Compose a training set and match SoV to Results in a unified set
     • Run a statistical analysis on the structured dataset
  3. Data science and ML modeling. The customer database relied on two major external databases that categorized regions and predicted disasters. However, these external databases held unstructured information that didn’t match the data in the customer’s database. Unicsoft had to develop a model that could read information from these databases and simulate human behavior in order to identify the correspondence between the results of disaster modeling in the external datasets and the data on the regions in the customer’s platform.
  4. Model assessment. Unicsoft implemented new techniques to model the data and verify that the model could achieve the desired result. The outcome was impressive: the accuracy of the ML model was measured at 93%.
  5. In-product integration. Although integrating the model into the client’s product was not part of the original task, Unicsoft carried out all the preparatory activities so that the model could be integrated and running within a few days.
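For the model assessment step, accuracy is simply the share of predictions that match a manually verified reference label. The sketch below shows that calculation overall and per class. The labels and predictions are invented for illustration and are not the project's data, and the `accuracy` and `per_class_accuracy` helpers are hypothetical names.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each reference label."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Invented reference labels and model predictions for a held-out sample.
reference = ["FRAME", "MASONRY", "FRAME", "CONCRETE"]
predicted = ["FRAME", "MASONRY", "CONCRETE", "CONCRETE"]

print(accuracy(reference, predicted))
print(per_class_accuracy(reference, predicted))
```

Looking at per-class figures next to the overall number helps reveal whether a high headline accuracy hides a weak spot in a rare category.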

The results

Forrestry and their client were able to get a fully working ML model ready for integration with the client’s product ahead of schedule and under budget.

The client noted the proactivity and professionalism of the Unicsoft team and was surprised at how little they needed to get involved in the work.

  • The accuracy of the ML model reached 93%, a high result for an ML system.
  • Forrestry avoided having to hire expensive in-house engineers. Unicsoft provided a dedicated team of specialists with experience from dozens of ML projects. It is almost impossible to assemble this type of team on your own, given the competition on the market. The Unicsoft ML developers rank in the top 3% of ML engineers worldwide on Kaggle.
  • Forrestry’s client was impressed with the speed and quality of the work, which had a positive effect on their business relationship.