How to Modernize Legacy Applications with Machine Learning
The three ingredients that enable “In-the-Moment” decisions
by Monte Zweben and Syed Mahmood
We all make tens of thousands of snap decisions every day: where to go to dinner tonight, which route to take home from the office, and so on. Some of these decisions are small or routine, while others have the potential to alter the course of our lives. We use the information available to us to predict possible outcomes and make a spur-of-the-moment decision. Let’s call these “in-the-moment” decisions.
Now, how can enterprises make in-the-moment decisions? With the rise of artificial intelligence and machine learning, businesses are now able to make hundreds of thousands or even millions of intelligent, in-the-moment decisions every day — even in their legacy applications. Online product recommendations, ad placements, and decisions to accept or deny a credit card application are just a fraction of the in-the-moment decisions that businesses now routinely make.
But isn’t “in-the-moment decisions” just a fancy term for the real-time decisions we have been talking about for years? Some in-the-moment decisions are made in real time, but not all of them. For example, when an insurance company decides whether to pay a claim or route it to an analyst for further investigation, there is a discrete time window in which the insurer must make a pay/no-pay decision after the customer provides a First Notice of Loss. In this case, the decision is not made in real time. It is made within a service window driven by local laws, the need to maximize customer satisfaction, and the company’s bottom line.
Time also affects many of the variables that drive a company’s in-the-moment decisions. For example, it’s critical that a mobile ad company display a relevant ad as soon as the page is rendered. Factors like context, location, device, behavior, purchases, and weather also play a vital role in determining whether the ad will resonate with the audience and convert into clicks. All of this data changes over time. Being in-the-moment requires an accurate snapshot at a point in time.
Moments Drive Intelligent Actions
CEOs and boards are intoxicated with the potential of artificial intelligence and machine learning to make in-the-moment decisions that drive intelligent actions. Luckily, there is a lot of data stored behind enterprise firewalls and in data warehouses and lakes. There are also petabytes of new data from IoT devices, server logs, and media. So using artificial intelligence to make decisions is viable because the raw inputs exist. And it’s not just a luxury. It’s an imperative for success, especially for the legacy applications that have made the business successful in the first place.
Three Ingredients for Making In-the-Moment Decisions
The question on every executive’s mind is, what do I need to have in place before my enterprise can use these moments to make intelligent decisions? We believe that there are three ingredients you need to successfully modernize your legacy applications:
- Recency: The ability to access the most recently available data in an ML model so you can take action while it still matters.
- Continuous Training: The ability to continuously retrain ML models on the most recent data because market conditions change constantly and your model must adjust accordingly.
- Experimentation: The ability for data scientists to constantly and easily experiment with new features, see how they affect the model, and push the best version into production.
Let us explore the role of each of these in making in-the-moment decisions.
If you are going to build a model that predicts a future event, you must build it using data that contains all or most of the attributes that describe that event (the feature vector, in data science terminology). The more up-to-date the data at your disposal, the higher the likelihood that it contains the features that are predictive of the “moment”.
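As a concrete sketch, a feature vector for an ad-placement moment might be assembled like this. The field names and signals below are illustrative, not a real schema:

```python
from datetime import datetime, timezone

def build_feature_vector(user, context):
    # Assemble a point-in-time snapshot for a hypothetical ad-click model.
    # Recency is the point: the freshest signals are often the most predictive.
    return [
        context["hour_of_day"],        # time context at render time
        context["device_is_mobile"],   # 1 if mobile, else 0
        context["temp_celsius"],       # e.g. weather at the user's location
        user["purchases_last_7d"],     # recent purchase behavior
        user["clicks_last_hour"],      # the freshest signal available
    ]

# One "moment": the data as it stands right now.
user = {"purchases_last_7d": 2, "clicks_last_hour": 1}
context = {
    "hour_of_day": datetime.now(timezone.utc).hour,
    "device_is_mobile": 1,
    "temp_celsius": 18.5,
}
vector = build_feature_vector(user, context)
```

If any of these values arrive late or stale, the vector no longer describes the moment the decision is about, which is exactly the recency problem.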
In the past, due to the cost of storage and computation, businesses could only afford to use a slice of the data available to them to build their models, and in most cases that slice contained old information. Even though storage restrictions have largely been lifted by declining storage prices, the current data infrastructure in most enterprises still does not make the most up-to-date data available to the data scientists who build the models. This infrastructure typically consists of an operational (OLTP) database and a decision-support (OLAP) data warehouse that have been duct-taped together. Others have implemented so-called Lambda architectures on Hadoop. Both of these architectures usually require large amounts of data to be moved repeatedly. As a result, the data that reaches data scientists for model building is often stale, and the data the models use to make decisions is old. To make matters worse, introducing machine learning usually requires completely rewriting legacy applications before they can make intelligent decisions in-the-moment.
Why does it matter if your model uses stale data that reaches the application too late? Because the model is more likely to make bad decisions, and to make them more often. Millions of dollars in fraudulent claims can be paid out with little chance of clawing those payouts back. Predictable outages in oil rigs and utility grids can be missed, costing millions of dollars a day. Conversion rates on marketing campaigns can suffer, hurting revenue. Patients can be discharged from a hospital too soon, leading to costly readmissions, not to mention the safety risks to the patient. The business ramifications of poor models are serious, and data latency is one of the worst culprits.
Having access to the most recent available data to build your model is just one part of the puzzle. The ability to deploy the model at the point of decision and take action in-the-moment is the other.
The second ingredient for making in-the-moment decisions is the ability to continuously train your machine learning models. If a model operates on rapidly changing data, it must be retrained much more frequently. For example, if you have built a machine learning model to detect cyber attacks on your network, chances are that the hackers will change their strategy frequently to stay one step ahead of you. If you do not continuously train your model on new attack patterns, it will sooner or later miss an important signal, and the network will be compromised. Bad actors adapt, customers change their likes and dislikes, equipment changes its behavior, and many other factors shift dynamically, all of which requires continuously training models on recent data.
The latency built into current enterprise data infrastructures makes continuous training of your models difficult or impossible. The fresh data required for constant retraining is simply not available at the frequency required by the model. This forces the model to operate at a suboptimal level and, in extreme cases, make bad predictions.
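To make the idea concrete, here is a toy sketch of continuous training. The “model” is just a decision threshold refit on a sliding window of the most recent labeled examples; in a real system, the retraining step would kick off an actual ML training job on fresh data:

```python
from collections import deque

class ContinuouslyTrainedModel:
    """Toy stand-in for a retraining pipeline: the 'model' is a decision
    threshold refit on a sliding window of recent labeled examples."""

    def __init__(self, window=100):
        self.recent = deque(maxlen=window)  # keeps only the freshest data
        self.threshold = 0.5

    def observe(self, score, label):
        # New labeled data arrives continuously (e.g. confirmed attacks).
        self.recent.append((score, label))

    def retrain(self):
        # Refit the threshold as the midpoint between the mean scores of
        # recent positive and negative examples.
        pos = [s for s, y in self.recent if y == 1]
        neg = [s for s, y in self.recent if y == 0]
        if pos and neg:
            self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def predict(self, score):
        return 1 if score >= self.threshold else 0

model = ContinuouslyTrainedModel(window=50)
# As attackers change tactics, the score distribution drifts; only the
# recent window reflects the new pattern.
for s, y in [(0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1)] * 10:
    model.observe(s, y)
model.retrain()  # in production this would run on a schedule or on drift
```

The sliding window is the key design choice: because old examples age out, each retraining pass fits the model to current behavior rather than to history.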
When data scientists build a machine learning model, they start with a large number of attributes, or features, that they believe can predict the event they are interested in. In a sense, they are defining a moment. To build a robust model, data scientists must continuously experiment with different algorithms, features, and parameters. For example, they might try different combinations of features, swapping out a less predictive feature for one they believe will improve the model. Only through continuous experimentation can a data scientist build a model that is accurate enough to put into production, maintains that accuracy as the world changes, and continuously improves over time. This notion of experimentation in a “feature factory” is also discussed in our blog here.
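A minimal sketch of that experimentation loop might look like the following, where `evaluate` stands in for a real training-and-validation run, and the feature names and scores are purely illustrative:

```python
def evaluate(feature_set):
    # Stand-in for a real training-and-validation run; returns a
    # hypothetical holdout accuracy for the given feature subset.
    scores = {  # illustrative numbers only
        ("location", "device"): 0.71,
        ("location", "device", "weather"): 0.74,
        ("location", "device", "recent_clicks"): 0.79,
    }
    return scores[tuple(feature_set)]

candidates = [
    ["location", "device"],
    ["location", "device", "weather"],
    ["location", "device", "recent_clicks"],  # swap in a fresher signal
]

# Try each candidate feature set, keep the one with the best validation
# score, and promote that version of the model to production.
best = max(candidates, key=evaluate)
```

In a real feature factory, this loop runs continuously: new candidate features are evaluated as fresh data arrives, and the winning combination is pushed into production.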
Modernize Legacy Applications to Make Decisions in the Moment
Once you have identified the in-the-moment decisions that matter for your organization, how do you make them part of your business process? You inject the models that make those decisions into your custom legacy applications so that they add even more value. The benefits are limited only by your imagination. Maybe your call center application can benefit from a live recommendation engine that delivers a 5% uplift in sales because it now uses streaming location and weather data. Maybe your claims management application leverages social, sensor, and traffic data to paint a more accurate picture of what actually happened at the scene. Whatever the end goal of your purpose-built application, it can now make better decisions in the moment. Usually, all this entails is introducing a scoring function that embeds the models built on the three ingredients above.
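As a sketch of what that injection can look like, here is a hypothetical scoring function for a claims application. The model, features, and threshold are illustrative stand-ins, not a real fraud model:

```python
def score_claim(model, claim):
    # Scoring function embedded at the decision point of a (hypothetical)
    # claims application: returns "pay" or "investigate".
    features = [
        claim["amount"],
        claim["prior_claims"],
        claim["days_since_policy_start"],
    ]
    fraud_probability = model(features)
    return "investigate" if fraud_probability > 0.5 else "pay"

def toy_model(features):
    # Stand-in for a trained model; a real deployment would load one
    # trained on recent data (ingredient 1), retrained continuously
    # (ingredient 2), and refined through experimentation (ingredient 3).
    amount, prior_claims, _ = features
    return min(1.0, (amount / 10000) * (1 + prior_claims) / 5)

decision = score_claim(toy_model, {
    "amount": 12000,
    "prior_claims": 4,
    "days_since_policy_start": 30,
})
```

The legacy application itself barely changes: at the point where it previously routed every claim the same way, it now calls the scoring function and acts on the result.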
Don’t Rewrite. Modernize!
We meet with a lot of executives who are convinced that they must replace their legacy applications with brand-new intelligent applications to optimize their digital transformation initiatives. But there is no need to rebuild them when you can modernize them. Here’s a commercial announcement: with Splice Machine’s Distributed SQL engine, you can power your existing legacy application with SQL, enrich it with analytical sources of data, and supercharge it with live, in-the-moment machine learning, all using the same SQL code you used before.
If artificial intelligence and machine learning are the oil of the Information Age, our pipeline and valve infrastructure is lacking. Latency routinely degrades or cripples AI and ML efforts. It doesn’t have to be like this. If you can combine your SQL application workloads, your analytical data, and an AI/ML engine onto one platform, you can use it to power your legacy applications with a fresher and richer source of data and intelligence to make in-the-moment decisions.
If you would like to learn more about how to inject predictive and ML models into your purpose-built applications, sign up for this on-demand webinar.