A colleague shared this blog post from AWS. They are using machine learning approaches to predict HVAC failures. The problem is framed as a classification problem. They do not say how many failures their large sample contains. It seems they may be taking moving windows of 60 days so that they can predict whether a failure occurs within each window.
I’ve seen many of these approaches applied to problems that scream censoring, and my first reaction is frustration that they ignore 70 years of rich survival analysis literature.
My background is in statistics, which surely biases my views. I wonder what people in this forum think of these approaches, specifically when applied to survival data.
I know @f2harrell has spoken and written a lot about the problems of classification, but I’m more interested in applications like the one in the AWS link: are they badly biased? Do they end up underestimating survival? What are the drawbacks, and are classical survival approaches the better answer?
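To make the bias question concrete, here is a minimal simulation sketch. It does not reproduce the AWS pipeline; the exponential failure times, the uniform follow-up, and the helper `kaplan_meier_failure_prob` are all hypothetical choices of mine for illustration. It compares a naive "failed within 60 days" label, where censored units count as non-failures, with a Kaplan-Meier estimate of the same probability.

```python
import numpy as np

def kaplan_meier_failure_prob(time, event, horizon):
    """Kaplan-Meier estimate of P(T <= horizon), assuming no tied times."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    n = len(time)
    # Number of units still under observation just before each sorted time
    at_risk = n - np.arange(n)
    # Each observed failure up to the horizon multiplies survival by
    # (1 - 1/at_risk); censored rows and later rows contribute a factor of 1
    factors = np.where(event & (time <= horizon), 1.0 - 1.0 / at_risk, 1.0)
    return 1.0 - factors.prod()

rng = np.random.default_rng(0)
n = 20_000
horizon = 60.0        # days, matching the 60-day window in the post
mean_life = 120.0     # assumed mean time to failure (hypothetical)

failure_time = rng.exponential(scale=mean_life, size=n)
followup = rng.uniform(10.0, 120.0, size=n)  # assumed follow-up per unit (hypothetical)

time = np.minimum(failure_time, followup)    # what is actually observed
event = failure_time <= followup             # True if the failure was observed

true_prob = 1.0 - np.exp(-horizon / mean_life)
# Naive label: units censored before day 60 are treated as "no failure"
naive_prob = np.mean(event & (time <= horizon))
km_prob = kaplan_meier_failure_prob(time, event, horizon)

print(f"true P(fail by day 60):  {true_prob:.3f}")
print(f"naive classification:    {naive_prob:.3f}")
print(f"Kaplan-Meier:            {km_prob:.3f}")
```

Under these assumptions the naive proportion should fall below the true value, because units censored before day 60 are counted as survivors, while the Kaplan-Meier estimate should land close to it. How large the gap is in a real application depends on how much censoring occurs inside the window, which is exactly what the blog post does not report.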