Author: Josh Patterson
Date: Apr 15th, 2022
In the last post we built a series of models with our minimum model performance and stretch goals in mind.
In this post we revisit the business scenario and then examine how our model is projected to impact the operating profit margin of ACME Tool Co.
The business team set the following business and model performance goals.
Identify the 18 machines that are likely to fail the next day
Minimum goal: 11 out of 18 correct (61% out of top 18) 95% of the time
Stretch goal: 14 out of 18 predictions correct (78% out of top 18) 95% of the time
With the business goals in mind, let's now take a look at how our data science team fared with the modeling efforts.
In the chart below we can see the performance of the top 3 models from the data science team's workflow (along with the baseline model, logistic regression, for comparison).
The red line represents the minimum performance for the model to hit the most conservative business goal (at least 11 out of 18 correct [61% out of top 18] 95% of the time).
The blue line represents the ideal outcome for the business in this context (14 out of 18 predictions correct [78% out of top 18] 95% of the time).
The ranges ("whiskers" on the bars) on the model performance chart represent the 95% confidence intervals (or variability) in the average performance of each model.
We have to calculate and take into account the 95% confidence intervals because the model is built around the statistics of the training data. We don't train models on the entire population of data (all the data that could possibly exist) because we simply do not have it. We train models on a sample that (hopefully) represents the true population. The more closely our training data represents the full population, the lower the standard deviation and the tighter our confidence intervals will be. We therefore calculate the confidence intervals for the model to reflect how well the training data represents the population.
To be conservative, we consider only the lower bound of the confidence interval so that our business case holds 95% of the time. That means we want a model whose lower bound clears the goal line (the red or blue line). If the average performance of the model is above the goal line but the lower bound of the confidence interval doesn't clear it, then the model would only meet our performance goal "some of the time".
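As a rough illustration of this check, the sketch below computes a 95% confidence interval for a model's average top-18 hit rate and compares the lower bound against the two goal lines. The per-run counts are made-up placeholders, not results from this series, and the normal-approximation interval (mean plus or minus 1.96 standard errors) is an assumption; the analysis behind the chart may have used a different method.

```python
import math
import statistics

TOP_K = 18
MIN_GOAL = 11 / TOP_K      # red line: at least 11 of 18 correct
STRETCH_GOAL = 14 / TOP_K  # blue line: 14 of 18 correct


def confidence_interval_95(scores):
    """Normal-approximation 95% CI for the mean of the scores (an assumption)."""
    mean = statistics.mean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(len(scores))
    margin = 1.96 * standard_error
    return mean - margin, mean + margin


def goals_met(scores):
    """A goal counts as met only if the lower bound of the CI clears it."""
    lower, _ = confidence_interval_95(scores)
    return {"minimum": lower >= MIN_GOAL, "stretch": lower >= STRETCH_GOAL}


# Hypothetical counts of correct predictions (out of 18) across 10 evaluation runs.
hypothetical_correct = [15, 13, 14, 15, 13, 15, 14, 15, 13, 15]
hypothetical_scores = [count / TOP_K for count in hypothetical_correct]

average = statistics.mean(hypothetical_scores)
result = goals_met(hypothetical_scores)
```

With these placeholder numbers the average sits above the blue line, but the lower bound does not, so only the minimum goal counts as met. That is the "some of the time" case described above.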
The modeling team was able to produce 3 models that would achieve performance better than the red line 95% of the time.
Even better, one of the models found was able to achieve performance better than the blue line 95% of the time. In the next section, we'll analyze how these modeling results translated into financial impact on the line of business.
We'll start with a quick overview of what is projected to happen post-price drop without any changes to the maintenance processes.
This drop in operating profit margin would create a large drop in shareholder equity for ACME Holding Company. It would also make it harder to raise capital to compete and grow in the market. The board would like to avoid both of these scenarios.
Now let's take a quick look at what these same numbers will look like once a predictive maintenance program is applied.
If you'd like to see more detailed numbers on how the model performance affected ACME's income statement, download our pdf report: ACME Tool Company Predictive Maintenance Business Impact Analysis.
Any model the data science team produces that hits the baseline goal (at least 11 out of 18 correct [61% out of top 18] 95% of the time) yields an operating profit margin of 9.63%.
This operating profit margin is inclusive of the yearly cost ($129,600) of the predictive maintenance program.
While this is lower than the 2021 (current) operating profit margin, it's still considerably higher than what could be the 2022 operating profit margin if nothing changes.
The best model produced gets ACME Tool Co’s operating profit margin back to 11.34% (within 0.66 percentage points of the original 12.00%). This is considered a positive outcome by the whole team (and a pleasant surprise, given the situation).
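To make the arithmetic behind these margins concrete, here is a minimal sketch of how an operating profit margin that includes the program cost could be computed. Only the $129,600 program cost and the 12.00% and 11.34% margins come from this post; the revenue and operating cost figures are hypothetical placeholders, not ACME's actual income statement, and the function name is our own.

```python
PROGRAM_COST = 129_600  # yearly cost of the predictive maintenance program (from this post)


def operating_margin(revenue, operating_costs, program_cost=0.0):
    """Operating profit margin = operating income / revenue."""
    operating_income = revenue - operating_costs - program_cost
    return operating_income / revenue


# Hypothetical figures, for illustration only.
hypothetical_revenue = 10_000_000
hypothetical_operating_costs = 9_000_000

margin_without_program_cost = operating_margin(
    hypothetical_revenue, hypothetical_operating_costs
)
margin_with_program_cost = operating_margin(
    hypothetical_revenue, hypothetical_operating_costs, PROGRAM_COST
)

# Gap between the original margin and the best model's margin, in percentage points.
gap_in_points = round((0.1200 - 0.1134) * 100, 2)
```

In a real analysis, the savings from avoided failures would show up as lower operating costs, which is what lets the margin recover even after paying for the program.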
Further:
The news of the best model produced by the data science team is welcomed by everyone at ACME Tool Co. This gets the executive team off their heels and gives them the flexibility to compete or sell at a better price.
Because the market is commoditized and no other clear points of differentiation are expected with the new $18.50 version, customers will likely switch on price.
However, ACME Tool can leverage predictive maintenance to possibly drop price further and put pressure back on competitors, defending their market from new entrants.
ACME Tool can also put more resources into further improving the model, further improving their income statement and sharpening their competitive edge.
The business team has now confirmed that the model has met the goal criteria and we can now move the model into a pilot production state.
We don't have to re-train the model every day, as long as the distribution of the data stays the same.
However, there is wear on our tools each day we are operational, so we need to re-scan our tool sensor data each day to build a queue of the most likely tools to fail. The evening maintenance team will use the rebuilt queue each day, provided to them as a daily report that shows them the 18 machines to perform the nightly maintenance on. Snowflake is a great place to store these predictions, and in our next article (Part 6: Going to Production with Snowpark) we will deliver this report based on our best model.
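As a preview of what rebuilding the queue involves, the sketch below takes a predicted failure probability for each machine and keeps the 18 highest. The machine IDs and probabilities are made-up placeholders, and `build_maintenance_queue` is a hypothetical helper; in practice the probabilities would come from scoring the day's sensor data with the best model.

```python
import heapq

QUEUE_SIZE = 18  # machines the evening maintenance team can service each night


def build_maintenance_queue(failure_probabilities, queue_size=QUEUE_SIZE):
    """Return (machine_id, probability) pairs for the machines most likely to fail.

    failure_probabilities maps a machine ID to the model's predicted
    probability that the machine fails the next day.
    """
    return heapq.nlargest(
        queue_size, failure_probabilities.items(), key=lambda item: item[1]
    )


# Hypothetical scores for 100 machines, for illustration only.
hypothetical_scores = {
    f"machine-{i:03d}": (i * 37 % 100) / 100 for i in range(100)
}

queue = build_maintenance_queue(hypothetical_scores)
for machine_id, probability in queue:
    print(f"{machine_id}: {probability:.2f}")
```

Because the model itself is unchanged from day to day, only the scoring and ranking step has to run daily.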