Data-driven insurance: Overview of the hot topic “Data Science” and prelude to the Data Science article series – Article 1

“Data Science”, “Big Data” and “Data Analytics” are currently among the most discussed topics in every industry sector. From manufacturing to insurance, they occupy management boards across industries. In the long term, data science and artificial intelligence are core elements of the ongoing process of digitalisation.

The big players in the insurance industry identified digitalisation as an enabler a couple of years ago and have started to invest. In Europe alone, the insurance industry's investments in digitalisation are predicted to reach €11bn by 2020[1]. Newly appointed board members of European insurance carriers with a dedicated focus on the digital agenda point out that this prediction might even be too conservative.

Figure 1: Investments in digitalisation by selected insurers

Insurance companies collect vast amounts of data every single day, and most of this data is already sufficient to gain a multitude of new insights through data analytics. Better still, when combined with additional data, the resulting data lakes offer opportunities for improvement along the whole customer journey. Together with the new customer touchpoints that ongoing digitalisation creates for the insurance company, this results in a potentially huge gain in profits. And as it turns out, customers are willing to share their personal data, but only if there is an advantage for them[2].

Figure 2: Reasons to share personal data

So the key factor does not seem to be the availability of data. Rather, it is first to make sure that the available data is of sufficient quality to derive actions from it, and second to choose the correct way to process and interpret it. This is where data science starts to get interesting for insurance companies.

So, what actually is data science, and in which context are we looking at this topic? Data science is the use of data analysis tools such as machine learning algorithms, data mining processes or business intelligence methods. These methods are applied to vast amounts of structured or unstructured data in order to gain insights and extract knowledge. Such massive amounts of data are often referred to as big data. Big data can consist of classic business data and sensor data gathered from different types of sources, as used in biometrics or telematics, but also of metadata and personal data. Data science is not to be confused with artificial intelligence: data science is merely an enabler or incubator for artificial intelligence.

Figure 3: Definition of Data science


Research shows that only approximately 0.5% of all available data is currently analysed[3]. Moreover, this percentage is expected to decline further, since with increasing digitalisation the volume of available and newly produced data, especially personal data and metadata, will grow exponentially[4].

Figure 4: Volume of available data

With that much data, one has to make sure that it is of sufficient quality before it can be put to use. Data quality management is therefore at least as important as the underlying data. As stated in numerous publications, a typical data scientist spends up to 80% of her time cleaning and organising data[5]. This much time is spent on the task not because it is the most fulfilling part of the job (it is not) but because it is critically important: without it, the findings are useless.
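To make the idea of data quality management a little more concrete, the following minimal sketch counts three common problems in a handful of records: missing values, duplicate keys and implausible values. The records, the field names (`policy_id`, `birth_year`, `postcode`) and the plausibility range are purely hypothetical and only serve as an illustration; a real quality framework would cover far more rules.

```python
from collections import Counter

# Hypothetical policy records; field names and values are illustrative only.
records = [
    {"policy_id": "P-001", "birth_year": 1980, "postcode": "1010"},
    {"policy_id": "P-002", "birth_year": None, "postcode": "80331"},
    {"policy_id": "P-002", "birth_year": None, "postcode": "80331"},  # duplicate key
    {"policy_id": "P-003", "birth_year": 1875, "postcode": ""},       # implausible year, empty postcode
]


def quality_report(rows, key="policy_id", min_year=1900, max_year=2025):
    """Count three common data quality problems in a list of dict records."""
    # A value counts as missing if it is None or an empty string.
    missing = sum(
        1 for row in rows for value in row.values() if value is None or value == ""
    )
    # Every occurrence of a key beyond the first one counts as a duplicate.
    key_counts = Counter(row[key] for row in rows)
    duplicates = sum(count - 1 for count in key_counts.values())
    # Only integer birth years are range-checked; missing ones are counted above.
    implausible = sum(
        1
        for row in rows
        if isinstance(row.get("birth_year"), int)
        and not (min_year <= row["birth_year"] <= max_year)
    )
    return {
        "missing_values": missing,
        "duplicate_keys": duplicates,
        "implausible_birth_years": implausible,
    }


report = quality_report(records)
print(report)
```

Even on four records the checks flag several issues, which hints at why cleaning takes up so much of the working time on real data sets.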

These numbers give a first indication that choosing the correct way to analyse data will not suffice on its own: the sheer amount of data, which on top of everything else is doubling every two years, cannot be handled with the techniques currently in use. New ways to store structured and unstructured data efficiently, proper data quality management, and self-learning and self-improving algorithms will be necessary to cope with the resulting challenges.
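The effect of these figures can be illustrated with a small back-of-the-envelope calculation. The sketch below starts from the 0.5% share of analysed data and the two-year doubling period mentioned above; the yearly growth factor of the analysed volume (`analysis_growth`) is an assumption introduced here for illustration and is not taken from the cited sources.

```python
def analysed_share(years, initial_share=0.005, doubling_period=2.0, analysis_growth=1.0):
    """Share of all data that is analysed after `years`.

    initial_share:   starting share (0.5% as quoted in the text)
    doubling_period: years until the total data volume doubles (2 as quoted in the text)
    analysis_growth: assumed yearly growth factor of the analysed volume
                     (1.0 means analysis capacity stays flat; an assumption)
    """
    data_factor = 2 ** (years / doubling_period)   # growth of total data volume
    analysed_factor = analysis_growth ** years     # growth of analysed data volume
    return initial_share * analysed_factor / data_factor


for years in (0, 2, 4, 10):
    print(f"after {years:2d} years: {analysed_share(years):.5%}")
```

Under the flat-capacity assumption the share halves every two years. It only stays constant if the analysed volume grows by a factor of about 1.41 (the square root of 2) per year, which is one way to read the call for new storage techniques and self-improving algorithms.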

Another necessary step in meeting these challenges is to continually verify and challenge new techniques and methods in terms of their actual impact on and benefit for the customers of insurance companies. It is vital that all data science activities are approached from the customer's perspective. This constraint applies to many new endeavours within the insurance business, but it holds especially true in the field of data science. If customers do not see or understand the advantage they gain from the proposed new methods and models, no one will buy or use the new products and tools, and customers will not share their personal data for further improvement of the models. The big investments will then very quickly become lost investments.

We will address these upcoming challenges in a series of six articles, in which we will provide a deeper understanding of the impact of data science within insurance companies with regard to these major topics:

  • How will overall processes within an insurance company and future customer journeys be affected by data science?
  • What possibilities and opportunities does data science provide for marketing and sales?
  • How can data science improve new risk identification and risk optimisation?
  • Which adaptations of the current IT infrastructure are necessary to enable data science?

With answers to these questions, insurance companies as well as insurance intermediaries will be well prepared for the opportunities that come with data availability and will know how to turn them into an advantage for customers. The emphasis is on the competence to identify use cases, to interlink different disciplines and to choose the right algorithms and methods to extract new insights from the given data. The next article in this series will focus on the impact of data science on the overall processes of an insurance carrier.

[1] zeb.research, public statements of insurance companies

[2] Fujitsu, “The Fujitsu European Financial Service Survey”, 2016 – Interviews of 1005 insurance customers

[3] Regalado, A., “The data made me do it”, MIT Technology Review, May Issue, 2013

[4] insideBigData, “A Guide to the use of Big Data on an Industrial Scale”, 2017

[5] CrowdFlower, “Data Science Report”, 2016

Alexander Riesner

Manager Office Vienna

Tobias Holler

Analyst Office Munich

