Something is rotten in the holi-dates of models

Let’s get the obvious out of the way. First, ML models are built on the premise that the historical data we train on accurately reflects the data we will see in production. Second, “special” days, such as Thanksgiving or, more specifically, the online shopping bonanzas of the last decade, follow different patterns than the rest of the year. Obviously. Third, because we know about these peaks, we can prepare ourselves and our models for the accompanying mess of less accurate predictions and a lot more volume. Or can we?

To train or not to train?

Understanding your problem is half the solution. As data scientists, though, it’s tempting to view these peaks as an issue to be resolved by training our model to adapt to these times of the year and their fluctuations, or perhaps even going a step further and engineering a feature to address these specific events. While this may be the correct answer for some use cases and businesses, it will not always be the most efficient solution, or the only one.
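For a sense of what “engineering a feature” for these events could look like, here is a minimal sketch. The event calendar, the function name `event_features`, and the seven-day horizon are all illustrative assumptions, not a recommended design.

```python
from datetime import date

# Hypothetical event calendar; swap in the dates that matter to your business.
PEAK_EVENTS = {
    "singles_day": date(2023, 11, 11),
    "black_friday": date(2023, 11, 24),
}


def event_features(day: date, horizon_days: int = 7) -> dict:
    """Return simple calendar features describing how close `day` is to a peak."""
    # Signed gap in days to every known event (positive = event is still ahead).
    gaps = {name: (event - day).days for name, event in PEAK_EVENTS.items()}
    nearest = min(gaps, key=lambda name: abs(gaps[name]))
    gap = gaps[nearest]
    # Outside the horizon the event is treated as irrelevant for that day.
    in_horizon = abs(gap) <= horizon_days
    return {
        "is_peak_event": gap == 0,
        "nearest_event": nearest if in_horizon else None,
        "days_to_nearest_event": gap if in_horizon else None,
    }
```

Features like these only help if the training data contains enough past occurrences of each event, which is exactly the trade-off the training strategies below run into.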

There are a few holiday season model strategies and tactical mixes that can be used to ensure that model issues don’t impact business. 

Train 1: global

Assuming that this is a known fluctuation, you theoretically have enough time to ensure it is represented in your training data. The problem here is twofold. To accommodate this “noisy” data, the model will either settle on a suboptimal function that may underperform on normal days, or fit the peaks at the cost of a significantly more complex model function.

Train 2: custom

Another training route you can take is custom model training: looking at one or more particular fluctuations and building a custom model for those specific time frames only. The potential downsides are that it may be challenging to gather enough data to train a model at such a specific resolution, and that if the real world changes (case in point, COVID), the historical data you trained on won’t predict the real world the next time around.

Models without safety nets never to heaven go 

Training aside, other tactics can be applied here. The key to them all is that they do not attempt to learn these events but accept them as inherently unpredictable anomalies. We know that they will occur, but not precisely how or to what extent. Here our goal is just to detect these anomalies and trigger manual action via the safety nets we set up.

Don’t train 1: monitor

Model monitoring is a pretty obvious must here, and it should be regardless of any specific event or day of the year. That said, configurations that are realistic for “normal” days tend to go off the rails and blast everyone with alerts galore when they have to contend with events like Chinese Singles Day. So if alert fatigue is a no, then observability is the answer: we want to loosen, to a degree, the way we monitor our models for known issues. This can be achieved, for example, by increasing segment resolutions, changing feature sensitivity thresholds, or even ignoring specific time-coupled features altogether for a given period.
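As a rough illustration of changing sensitivity thresholds and ignoring time-coupled features for a given period, consider the sketch below. Everything in it is an assumption made for the example: the `EventWindow` structure, the base threshold of 0.2, the multiplier, and the feature names. It is not tied to any particular monitoring product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EventWindow:
    name: str
    start: date
    end: date
    threshold_multiplier: float            # how much to loosen the drift threshold
    muted_features: frozenset = frozenset()  # time-coupled features to ignore


BASE_DRIFT_THRESHOLD = 0.2  # hypothetical "normal day" sensitivity

# Hypothetical configuration for one known peak.
EVENT_WINDOWS = [
    EventWindow("singles_day", date(2023, 11, 10), date(2023, 11, 12),
                threshold_multiplier=2.5,
                muted_features=frozenset({"hour_of_day"})),
]


def active_window(day: date) -> Optional[EventWindow]:
    """Return the event window covering `day`, if any."""
    for window in EVENT_WINDOWS:
        if window.start <= day <= window.end:
            return window
    return None


def should_alert(feature: str, drift_score: float, day: date) -> bool:
    """Decide whether a drift score is worth an alert on the given day."""
    window = active_window(day)
    if window is None:
        return drift_score > BASE_DRIFT_THRESHOLD
    if feature in window.muted_features:
        return False  # deliberately ignored during the event
    return drift_score > BASE_DRIFT_THRESHOLD * window.threshold_multiplier
```

With this configuration, a drift score of 0.3 on `basket_value` raises an alert on an ordinary October day but stays quiet on November 11, while a score of 0.6 still gets through.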

Don’t train 2: go back to building rules

At its core, we use models because they provide a better representation of the real world, detect complex patterns hidden from the human eye, and come up with predictions much faster than we mere humans ever could. Unfortunately, the cost of all that is their black-box nature. Given that models are essentially driving the business on “special” days, “heuristics” is not necessarily a dirty word. Yes, rules are slower to reach optimal answers, but they are simpler to understand and, more importantly, much more flexible: they can be adapted immediately to reflect the specific domain knowledge you have about your business.
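One way to picture this is a thin layer of rules that gets the first say on peak days and hands everything else to the model. The sketch below uses a made-up fraud-screening scenario; the `Order` fields, both rules, and every cutoff are hypothetical placeholders for your own domain knowledge.

```python
from typing import Callable, NamedTuple, Optional


class Order(NamedTuple):
    amount: float
    returning_customer: bool
    model_fraud_score: float  # score from the existing model, between 0 and 1


# A rule returns a verdict, or None when it has no opinion.
Rule = Callable[[Order], Optional[str]]


def very_large_order(order: Order) -> Optional[str]:
    # Hypothetical rule: always send unusually large peak-day orders to review.
    return "review" if order.amount > 5000 else None


def trusted_small_basket(order: Order) -> Optional[str]:
    # Hypothetical rule: wave through small baskets from returning customers.
    if order.returning_customer and order.amount < 50:
        return "approve"
    return None


# Order matters: the first rule with an opinion wins.
PEAK_RULES = [very_large_order, trusted_small_basket]


def decide(order: Order, peak_day: bool, score_cutoff: float = 0.7) -> str:
    """Apply peak-day rules first, then fall back to the model score."""
    if peak_day:
        for rule in PEAK_RULES:
            verdict = rule(order)
            if verdict is not None:
                return verdict
    return "decline" if order.model_fraud_score >= score_cutoff else "approve"
```

Because the rules are plain functions in a list, adding, removing, or reordering them mid-event is a one-line change, which is the flexibility the paragraph above is pointing at.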

Don’t train 3: it’s time for humans in the loop

All of the tactics listed up till now also call for humans in the loop, but as the next step in an escalation process. Here we’re going to discuss humans in the loop as a tactic in itself and cases where you may decide to double down on it.

Model confidence:

Even on normal days, certain cases are escalated to analysts when prediction confidence is low enough to trigger an escalation. On non-normal days, two additional questions need to be considered. First, do you want to maintain the same criteria but with different thresholds? Second, do you now need to trigger escalations on specific mixes of criteria that fall outside the norm or cluster in a different pattern than usual?
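Both questions can be sketched in a few lines. The confidence cutoffs, the segment names, and the tolerance below are assumptions chosen for illustration; the second half simply compares each segment’s escalation rate against a baseline you would have measured on normal days.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical cutoffs: be stricter about confidence on peak days.
NORMAL_MIN_CONFIDENCE = 0.60
PEAK_MIN_CONFIDENCE = 0.75


def needs_escalation(confidence: float, peak_day: bool) -> bool:
    """Same criterion (confidence), different threshold on peak days."""
    cutoff = PEAK_MIN_CONFIDENCE if peak_day else NORMAL_MIN_CONFIDENCE
    return confidence < cutoff


def segment_escalation_rates(cases: Iterable[Tuple[str, float]],
                             peak_day: bool) -> Dict[str, float]:
    """Share of escalated cases per segment; `cases` holds (segment, confidence)."""
    totals, escalated = Counter(), Counter()
    for segment, confidence in cases:
        totals[segment] += 1
        if needs_escalation(confidence, peak_day):
            escalated[segment] += 1
    return {segment: escalated[segment] / totals[segment] for segment in totals}


def unusual_segments(rates: Dict[str, float],
                     baseline_rates: Dict[str, float],
                     tolerance: float = 0.10) -> List[str]:
    """Segments whose escalation rate exceeds their baseline by more than `tolerance`."""
    return sorted(
        segment for segment, rate in rates.items()
        if rate - baseline_rates.get(segment, 0.0) > tolerance
    )
```

A segment that shows up in `unusual_segments` is a hint that low-confidence cases are clustering somewhere new, which may be worth routing to an analyst even if no individual case looks alarming.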

Industry and/or use case compliance sensitivity:

This one is pretty clear-cut, at least in the current regulatory space in which AI operates; in the future, AI regulation will become more complex and nuanced as it advances. Some industries and use cases, such as banking and lending as opposed to AdTech and click-through-rate optimization, have more tangible compliance requirements, and even fines, associated with them. Because these industries are more closely tied to regulators, organizations in them should consider instituting additional lines of defense to audit themselves ahead of any potential regulatory examination.

Adverse media and adversarial attacks:

Up until now, we’ve mainly talked about peak events in the context of holidays and e-commerce fiestas, so let’s take a right turn and talk about elections, social media, fake news, and influencing public opinion. With that string of nouns alone, we’ve illustrated a case where the sensitivity to adversarial attacks, and the potential adverse media fallout that results, justify humans in the loop.

To thine own self be true 

Your use case, industry, and internal policies are going to greatly influence what tactical mix you set up this holiday season, and in all honesty, there is no single truth as to what you should choose from this list or even beyond it. What is true? One tactic alone is unlikely to protect your business during peak events, and strategically it’s never a good idea to put all your eggs in one basket. That said (and food for thought), holidays, Black Friday, Chinese Singles Day, and so forth are, to an extent, easy, because we know about them. It’s the peaks and dips, seasonality, and anomalies that you don’t know about that are the real challenge of model observability.
