As part of the IBM #PartyCloud event, the Data Science Milan community is invited to participate in our September meetup. In this talk we will present a not-so-common application of deep learning: predicting failures, churn and customer lifetime using recurrent neural networks.
The notebook and documentation of the original tutorial are available at https://github.com/gm-spacagna/deep-ttf.
Deep Time-to-Failure: predicting failures, churns and customer lifetime using recurrent neural networks.
Machines and customers are among the most valuable assets for many businesses. A common trait of these assets is that sooner or later they will fail or, in the case of customers, churn.
To anticipate those failure events, we would ideally consider the whole history of information available for each machine or customer and learn smart representations of the system status over time.
Traditional machine learning and statistical models approach the prediction of time-to-failure, a.k.a. expected lifetime, as a supervised regression problem using handcrafted features.
Training those models is hard for three main reasons:
The complexity of extracting predictive features from time-series without overfitting.
The difficulty of modeling uncertainty and confidence levels in the predictions.
The scarcity of labeled data: failure events are by definition rare, which results in highly unbalanced training datasets.
The first issue can be addressed by adopting recurrent neural architectures.
A solution to the last two problems could be to exploit censored data and to build survival regression models.
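To make the idea of censored data concrete, here is a minimal sketch of how training targets could be built from one recorded sequence. The function name and the convention that the event falls one step after the last record are assumptions made for illustration, not necessarily what the tutorial notebook does.

```python
from typing import List, Tuple


def time_to_event_targets(n_steps: int, failed: bool) -> List[Tuple[int, int]]:
    """Return one (time_remaining, observed) pair per recorded time step.

    n_steps: number of time steps recorded for one machine or customer.
    failed:  True if the sequence ends with the failure (or churn) event,
             False if observation simply stopped (right-censored).
    """
    observed = 1 if failed else 0
    # Assumed convention: the event (or the end of observation) falls one step
    # after the last record, so the remaining time is always at least 1.
    # For a censored sequence this value is only a lower bound on the true time.
    return [(n_steps - t, observed) for t in range(n_steps)]


print(time_to_event_targets(4, failed=True))   # [(4, 1), (3, 1), (2, 1), (1, 1)]
print(time_to_event_targets(3, failed=False))  # [(3, 0), (2, 0), (1, 0)]
```

Sequences that have not failed yet are kept rather than discarded: the flag tells the model that the true remaining time is at least the recorded value.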
In this talk we will present a novel technique based on recurrent neural networks that can turn any variable-length sequence of data into a probability distribution representing the estimated remaining time to the failure event. The network will be trained in the presence of ground truth as well as with right-censored data.
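One way to train on both kinds of samples is to maximise a censored log-likelihood. The sketch below uses the standard continuous Weibull log-likelihood; the exact loss used in the tutorial is not stated here and may differ (for example, a discrete variant), and the function names are illustrative.

```python
import math


def weibull_log_likelihood(t: float, observed: int, alpha: float, beta: float) -> float:
    """Log-likelihood of one sample under a continuous Weibull(alpha, beta).

    t:        time to the event, or time to the end of observation (t > 0).
    observed: 1 if the event was seen, 0 if the sample is right-censored.
    alpha:    scale parameter (> 0); beta: shape parameter (> 0).
    """
    cumulative_hazard = (t / alpha) ** beta
    # Log of the hazard rate: log(beta / alpha) + (beta - 1) * log(t / alpha).
    log_hazard = math.log(beta / alpha) + (beta - 1.0) * math.log(t / alpha)
    # Observed events contribute log f(t) = log_hazard - cumulative_hazard;
    # censored samples contribute only log S(t) = -cumulative_hazard.
    return observed * log_hazard - cumulative_hazard


def negative_log_likelihood(samples, alpha: float, beta: float) -> float:
    """Mean negative log-likelihood over (t, observed) pairs, i.e. the loss."""
    total = sum(weibull_log_likelihood(t, u, alpha, beta) for t, u in samples)
    return -total / len(samples)
```

In a network, alpha and beta would be predicted per time step instead of being shared constants; the loss itself keeps the same form.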
We will demonstrate it using a case study on the simulated degradation of 100 jet engines, provided by NASA.
During the tutorial you will learn:
What is Survival Analysis and what are the most popular Survival Regression techniques.
How a Weibull distribution can be used as a generic distribution for modeling Time-to-Failure events.
How to build a deep learning algorithm in Keras leveraging recurrent units (LSTM or GRU) that can map raw time-series of covariates into Weibull probability distributions.
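As a small illustration of the last two points, the sketch below shows how two unconstrained outputs of a recurrent layer could be mapped to Weibull parameters, and how those parameters give a survival curve and a median time-to-failure. The choice of exp and softplus as output transformations, and all function names, are assumptions for illustration and not necessarily those used in the tutorial.

```python
import math


def weibull_survival(t: float, alpha: float, beta: float) -> float:
    """P(T > t): probability that the failure happens after time t."""
    return math.exp(-((t / alpha) ** beta))


def weibull_quantile(p: float, alpha: float, beta: float) -> float:
    """Time by which a fraction p of failures is expected to have occurred."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)


def to_weibull_params(raw_alpha: float, raw_beta: float):
    """Map two unconstrained network outputs to positive Weibull parameters.

    Assumption for illustration: exp for the scale, softplus for the shape.
    """
    alpha = math.exp(raw_alpha)
    beta = math.log1p(math.exp(raw_beta))  # softplus keeps beta > 0
    return alpha, beta


alpha, beta = to_weibull_params(raw_alpha=math.log(100.0), raw_beta=1.0)
median = weibull_quantile(0.5, alpha, beta)
# By definition, half of the failures happen after the median time.
print(round(weibull_survival(median, alpha, beta), 6))  # 0.5
```

A shape parameter above 1 means the failure risk grows with age (wear-out), below 1 means it decreases, and exactly 1 reduces to the exponential distribution.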
The tutorial will also cover a few common pitfalls, visualizations and evaluation tools useful for testing and adapting this approach to generic use cases.
You are free to bring your laptop if you would like to do some live coding and experiment yourself. In this case we strongly encourage you to check that you have all of the requirements installed on your machine.
More details on the required packages can be found in the GitHub repository gm-spacagna/deep-ttf.
Gianmario is the Chief Scientist and leader of AI research at CubeYou.
His mission, and that of his team, is to build the next generation of social algorithms and models of human decision-making with careful attention to their potential and effects on society.
He is co-author of the book ‘Python Deep Learning’, co-author of the ‘Professional Manifesto for Data Science’ and founder of the ‘Data Science Milan’ community.
Gianmario holds a master’s degree in Telematics (Polytechnic of Turin) and Software Engineering of Distributed Systems (KTH of Stockholm).
His experience covers a diverse portfolio of machine learning algorithms and data products across different industries. Prior to CubeYou, he worked as a Data Scientist in IoT Automotive (Pirelli Cyber Technology), Retail and Business Banking (Barclays Analytics Centre of Excellence), Threat Intelligence (Cisco Talos) and Predictive Marketing (AgilOne), plus some occasional freelancing.
19:00 Doors opening
19:00 Data Science Milan community introduction