How Generative AI Is Impacting Analytics

Text-based generative AI tools like ChatGPT have been hailed, and rightly so, for their lucrative potential in business settings, from generating personalized sales scripts to powering onboarding chatbots. What isn’t discussed enough is how generative AI can enhance analytics.

Generative AI techniques have the potential to enrich all three types of enterprise analytics (descriptive, predictive, and prescriptive) by generating synthetic data, including for variables that were not previously measurable, by improving model performance, and by supporting the decision-making process.

Here’s how generative AI can enhance each type of enterprise analytics:

Descriptive Analytics

Data Augmentation 

Generative AI can produce synthetic data that augments existing datasets, allowing for a more comprehensive analysis. This is particularly valuable when data is difficult to measure or collect. Obstacles to observing valuable data include physical limitations on sensors; privacy, legal, and ethical concerns; cost constraints; and small sample sizes. In any of these cases, generative models can produce synthetic data that mimics the real data without the challenges that come with collecting it.

Generative models can even generate data for variables that are currently impossible to measure: imagine having a “virtual sensor” for heavy machinery, or inferring an insurance client’s risky behavior that could otherwise not be observed.

In our work with Andretti Autosport, we generated data for a racecar variable that would be impossible to measure in the heat of a race, but that would be valuable for the team’s race strategy. We experimentally validated that the generated data closely matched the real data, with a mean squared error (MSE) of 0.0863 (a perfect match would have an MSE of zero).
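For readers unfamiliar with the metric, MSE is the average of the squared differences between paired real and generated values. The snippet below is a minimal sketch of that calculation; the function name and the numbers are illustrative assumptions, not the Andretti data or the validation code used in that work.

```python
def mean_squared_error(real, generated):
    """Average squared difference between paired real and generated values."""
    if len(real) != len(generated):
        raise ValueError("real and generated must have the same length")
    if not real:
        raise ValueError("inputs must not be empty")
    return sum((r - g) ** 2 for r, g in zip(real, generated)) / len(real)


# Illustrative values only; these are not measurements from the racecar work.
real_values = [1.0, 2.0, 3.0, 4.0]
generated_values = [1.5, 2.0, 2.5, 4.0]

mse = mean_squared_error(real_values, generated_values)
print(mse)  # 0.125
```

Because errors are squared, MSE is expressed in the squared units of the variable, so whether a given value counts as small depends on the scale of the data being compared.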

Visualization and Interpretation 

Generative models can assist in creating visual representations of data patterns and distributions, making it easier to interpret and understand complex data. This could be used to create dashboards to inform business decisions, or to support filling in paperwork, for example for FDA applications or customs forms in the logistics industry. 

For such use cases, it may be necessary to fine-tune foundation models by training them on your enterprise’s proprietary data. Foundation models, or general-purpose, one-size-fits-all models like GPT, have noted weaknesses when it comes to mathematical accuracy. This can be mitigated by integrating subroutines for algorithms that excel at these mathematical tasks. Models can then be fine-tuned to recognize when to leverage these subroutines to process the data as requested.
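The idea of handing mathematical work to a dedicated subroutine can be sketched roughly as follows. This is only an illustration under stated assumptions: the routing decision here is a simple regular expression standing in for what a fine-tuned model would learn, and `call_language_model` is a hypothetical placeholder, not a real API.

```python
import ast
import operator
import re

# Deterministic arithmetic "subroutine": supports +, -, *, / on numbers only.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

# Requests made up solely of digits, whitespace, and basic operators.
_ARITHMETIC_ONLY = re.compile(r"[\d\s.+\-*/()]+")


def evaluate_arithmetic(expression):
    """Evaluate a basic arithmetic expression by walking its syntax tree."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression.strip(), mode="eval"))


def call_language_model(request):
    """Hypothetical placeholder for a call to a fine-tuned model."""
    return f"[model response to: {request}]"


def answer(request):
    """Route arithmetic to the subroutine and everything else to the model."""
    # Assumption: a regex stands in for the model's learned routing decision.
    if _ARITHMETIC_ONLY.fullmatch(request):
        try:
            return evaluate_arithmetic(request)
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # fall back to the model if the subroutine cannot handle it
    return call_language_model(request)


print(answer("(1200 + 300) * 4"))  # 6000
print(answer("Summarize Q3 sales by region"))
```

The design point is that the numeric result comes from deterministic code rather than from generated text, so its accuracy does not depend on the model.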

Predictive Analytics

More Realistic Synthetic Data

Synthetic data created by generative models also has value for predictive analytics. Synthetic data isn’t necessarily new, but advances in model performance have made this data more accurate and realistic, particularly in situations where training data is limited. In fact, recent research has suggested that quantum generative models running on classical hardware may have an advantage over traditional generative models in generalizing from limited data. In other words, they may generate more realistic synthetic data when there’s little training data to base it on. Consequently, augmenting the training dataset with more realistic synthetic data can improve the accuracy of predictive models.


One example of a predictive analytics task with limited training data is found in our work with Andretti. A yellow flag is flown when an accident on the track forces all cars to cap their speed. Yellow flags are extremely unpredictable, and relative to the infinite number of ways a yellow flag could happen, the dataset available to train a yellow flag prediction model is vanishingly small, even when considering the entire history of IndyCar racing.

Generating synthetic data for yellow flag incidents can enhance predictions for these events, but the limited training data makes it difficult to generate synthetic data. Given the limited training data, the research suggests that a quantum generative model could generate more realistic synthetic data for yellow flag incidents than a traditional generative model.

Anomaly Prediction 

The same technique used for yellow flag prediction could be applied to any anomaly prediction use case, such as predicting fraudulent activities or equipment failures. As in the yellow flag example, there can be infinitely many ways for an anomaly to deviate from the norm. Generative models can augment datasets of anomalies with synthetic anomalies, which can then be used to train a predictive model. In the case of predictive maintenance, this could be used to predict equipment failures more proactively by accounting for more possible failure scenarios.
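The augmentation step described above might look something like the sketch below. To keep it self-contained, independent per-feature Gaussians stand in for a real generative model, which is a deliberate simplification and not the technique used in the work described here. The anomaly records and function names are made up for illustration.

```python
import random
import statistics


def fit_gaussian_per_feature(samples):
    """Estimate (mean, stdev) for each feature column of the samples."""
    columns = list(zip(*samples))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def generate_synthetic(params, count, rng):
    """Draw `count` synthetic rows from independent per-feature Gaussians."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(count)]


# Made-up anomaly records (two hypothetical sensor readings); not real data.
real_anomalies = [[9.1, 80.2], [8.7, 85.0], [9.8, 78.9], [10.2, 83.3]]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
params = fit_gaussian_per_feature(real_anomalies)
synthetic_anomalies = generate_synthetic(params, count=20, rng=rng)

# The augmented set would then be used to train a downstream predictive model.
augmented_anomalies = real_anomalies + synthetic_anomalies
print(len(augmented_anomalies))  # 24
```

A simple model like this ignores correlations between features, which is one reason a more expressive generative model may produce more realistic synthetic anomalies.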

Prescriptive Analytics

Optimization

Generative models can also be used to enhance optimization algorithms to find optimal solutions for complex problems. In this case, the generative model learns from the best existing solutions and then generates new solutions. By generating new candidate solutions, generative models can help guide the optimization process to better results.
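The loop of learning from the best existing solutions and then generating new candidates can be sketched as follows. This is a toy illustration under stated assumptions: a per-position probability model stands in for the generative model, and the objective (count of zeros in a bit string) is a made-up placeholder rather than a real scheduling cost. All names and parameter values are hypothetical.

```python
import random


def cost(solution):
    """Toy objective (an assumption): number of zeros, so all ones is optimal."""
    return solution.count(0)


def fit_model(elite):
    """Learn the per-position probability of a 1 from the best solutions."""
    n = len(elite[0])
    return [sum(s[i] for s in elite) / len(elite) for i in range(n)]


def sample(model, rng):
    """Generate a new candidate solution from the learned probabilities."""
    return [1 if rng.random() < p else 0 for p in model]


def generative_search(n_bits=12, population=40, elite_size=10, rounds=15, seed=0):
    """Alternate between learning from the best solutions and generating new ones."""
    rng = random.Random(seed)
    candidates = [
        [rng.randint(0, 1) for _ in range(n_bits)] for _ in range(population)
    ]
    for _ in range(rounds):
        candidates.sort(key=cost)
        elite = candidates[:elite_size]
        model = fit_model(elite)
        # Keep the elite so the best cost found so far never gets worse.
        new_candidates = [sample(model, rng) for _ in range(population - elite_size)]
        candidates = elite + new_candidates
    return min(candidates, key=cost)


best = generative_search()
print(best, cost(best))
```

In practice the generated candidates could also be handed to a traditional optimizer as starting points, so the two approaches need not be mutually exclusive.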


We’ve recently demonstrated this in work with BMW, where we optimized their assembly line scheduling to minimize downtime while still meeting production targets. We found that the generated solutions tied or outperformed the solutions from the traditional optimization algorithms in 71% of problem configurations. For more details, check out our recent blog post, How We Harness Industrial Generative AI for Optimization.

Conclusion 

Overall, generative AI techniques provide additional capabilities to enterprise analytics by enriching the data landscape, improving prediction accuracy, and proposing new solutions to complex problems. However, it’s important to note that the successful application of generative AI in enterprise analytics requires careful consideration of data quality and model reliability. Oftentimes, open-source or online generative AI tools like ChatGPT will not provide the reliability needed for industrial use cases, so it’s important to fine-tune these models using your own data and integrate them into your existing infrastructure.

To explore how we can help you build Industrial Generative AI applications for your hardest problems, get in touch.