SCORING MODEL FOR DATA BREACH FORECASTING AND PREVENTION
Machine Learning for prediction accuracy
We live in the “era of data”. Every day, companies handle an enormous number of interactions with data sources, and the more data is produced, the greater the risk that it will be used maliciously. Our Client, a European start-up, had the idea of developing a solution that forecasts spear-phishing attacks on personal data for a wide range of organizations located in various countries. Solutions of this type are in high demand nowadays, especially when combined with an ML approach to improve prediction accuracy. Knowing Unicsoft's expertise in this domain and having a successful track record with us, the Client chose us to develop the solution.
First, we defined the main data sources. The primary one was VCDB (the VERIS Community Database), a catalog of data security incidents recorded using the VERIS framework. The second source was the FT500 data set for 2016. After merging these two data sets, we obtained a data set for predicting the probability of a data breach for a particular company. The key ML models were a GLM and a Random Forest (RF), with RF preferred because of its higher precision. In addition, by applying Monte Carlo simulation methods, we added a prediction of incident density within a given time frame. As an additional feature, we predicted the expected loss in USD in the event of an attack for various industries.
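To make the Monte Carlo step more concrete, here is a minimal sketch of how incident counts within a time frame and the expected loss could be simulated. It is not the project's actual implementation. The function name simulate_incidents, the Poisson-process assumption for incident arrivals, the lognormal distribution for loss per incident, and all parameter values are hypothetical placeholders chosen for illustration only.

```python
import random
import statistics


def simulate_incidents(rate_per_year, years, loss_mu, loss_sigma,
                       n_runs=10_000, seed=0):
    """Monte Carlo sketch: incident counts and total loss over a time window.

    All parameters are hypothetical placeholders, not values from the project.
    """
    rng = random.Random(seed)
    counts, losses = [], []
    for _ in range(n_runs):
        # Draw exponential gaps between incidents until the window is exceeded;
        # this is equivalent to a Poisson process with the given yearly rate.
        t = rng.expovariate(rate_per_year)
        n, total = 0, 0.0
        while t <= years:
            n += 1
            # Assumed lognormal loss per incident, in USD.
            total += rng.lognormvariate(loss_mu, loss_sigma)
            t += rng.expovariate(rate_per_year)
        counts.append(n)
        losses.append(total)
    return counts, losses


counts, losses = simulate_incidents(rate_per_year=0.4, years=3,
                                    loss_mu=13.0, loss_sigma=1.0)
print("mean incidents in window:", statistics.mean(counts))
print("P(at least one incident):", sum(c > 0 for c in counts) / len(counts))
print("expected loss, USD:", round(statistics.mean(losses)))
```

In a real pipeline, the yearly incident rate would presumably come from the fitted model's breach probability for a given company, and the loss parameters from industry-level incident data rather than from fixed constants.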
The solution was well received by various investors right after its demonstration, and the team was encouraged to develop it further into a more complex product with extended functionality. With a model precision of 83% and Monte Carlo simulation techniques in place, the solution shows considerable potential and versatility.