How can you translate huge amounts of data crunching and model output into a usable, powerful product?
By integrating artificial intelligence and machine learning, that’s how.
In their webinar, AWS’s Matt Thomson and Clemens Mewald of Databricks explored how data processing is getting far easier thanks to AI/ML models. You can catch the replay of this session on-demand right here.
Of course, the PLA community were ready with their questions, and here are some of the best answers from the session...
Q: If a company has already collected data, how do you apply AI/ML to extract actionable intelligence from the data? Do you need to start with a problem statement to figure out how to apply AI/ML?
A: This depends on what is meant by "actionable intelligence". It can be descriptive insights about the data or prescriptive predictions.
In both cases, the first step is EDA: exploratory data analysis. There are different tools for this, but the goal is to get a better understanding of your data and to see whether it has predictive power (e.g. are any of the columns correlated with what you are trying to predict?).
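To make the "are any of the columns correlated" check concrete, here is a minimal sketch in plain Python. The column names, the toy numbers, and the helper functions `pearson` and `rank_columns` are all hypothetical and not from the webinar. It only measures linear correlation, which is one of several signals you would look at during EDA.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_columns(columns, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data: which column relates to churn?
columns = {
    "sessions_per_week": [1, 2, 3, 4, 5, 6],
    "account_age_days": [10, 300, 45, 200, 90, 120],
}
churned = [1, 1, 1, 0, 0, 0]

ranking = rank_columns(columns, churned)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

A column near the top of this ranking is a candidate feature worth a closer look; a weak correlation does not rule a column out, since the relationship may be non-linear.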
For the prescriptive part, you really need to train ML models... the recipe for this is usually to start simple: train a linear or tree-based model and see if it works based on evaluation metrics. Then, if the evaluation metrics look good, you probably want to test the model on real data (i.e. run an experiment or an A/B test). Often there is a difference between offline and online performance.
Once those tests are positive, you can move on to building the model.
Only after that first iteration would I move on to fancier ML approaches like deep learning... it is usually good to first get a full iteration in with a basic ML approach before you move on to more advanced ones.
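The "start simple" step can be sketched as follows, assuming a single numeric feature and a binary label. The model is a one-split decision tree (a "stump"), the smallest possible tree-based model, written in plain Python so it runs without any libraries. The data and function names are hypothetical. The point is the comparison at the end: a model is only interesting if it beats the trivial majority-class baseline on data it was not trained on.

```python
def majority_label(labels):
    """Most common label: the trivial baseline any model has to beat."""
    return max(set(labels), key=labels.count)

def fit_stump(xs, ys):
    """Fit a one-split decision tree on a single numeric feature.

    Tries every midpoint between sorted distinct values as a threshold, with
    both label orientations, and keeps the first one with the best training
    accuracy. Returns (threshold, label_if_at_or_below, label_if_above).
    """
    values = sorted(set(xs))
    best = None
    for threshold in ((a + b) / 2 for a, b in zip(values, values[1:])):
        for low, high in ((0, 1), (1, 0)):
            correct = sum(
                (low if x <= threshold else high) == y for x, y in zip(xs, ys)
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, low, high)
    if best is None:  # feature is constant, so fall back to the majority label
        label = majority_label(ys)
        return (values[0], label, label)
    return best[1:]

def predict(stump, x):
    threshold, low, high = stump
    return low if x <= threshold else high

def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical data: x = sessions per week, y = 1 if the user churned.
train_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
train_y = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
test_x = [2, 3, 4, 6, 8, 9]
test_y = [1, 1, 1, 0, 0, 0]

stump = fit_stump(train_x, train_y)
model_acc = accuracy([predict(stump, x) for x in test_x], test_y)
baseline_acc = accuracy([majority_label(train_y)] * len(test_y), test_y)
print(f"stump: {model_acc:.2f}  majority baseline: {baseline_acc:.2f}")
```

This covers only the offline evaluation. The online step described above (an experiment or A/B test) still has to happen with real users before the result can be trusted.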
Q: As we know, data is the basis of AI/ML. So how do you deal with bias in data sets when building products?
A: The first step is to be aware of these biases. EDA (exploratory data analysis) should help with this. You need to look at your data to see if there are imbalances, e.g. whether you have more data for some segments than for others.
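A quick way to look for this kind of imbalance is to count rows per segment. The sketch below assumes each row carries a segment label; the segment names, the counts, and the 10% cut-off are illustrative assumptions, not recommendations from the session.

```python
from collections import Counter

def segment_shares(segments):
    """Fraction of all rows that falls into each segment."""
    counts = Counter(segments)
    total = len(segments)
    return {segment: n / total for segment, n in counts.items()}

def underrepresented(segments, min_share=0.10):
    """Segments whose share of the data is below min_share (an arbitrary cut-off)."""
    shares = segment_shares(segments)
    return sorted(s for s, share in shares.items() if share < min_share)

# Hypothetical customer segments attached to 100 rows of data.
rows = ["enterprise"] * 70 + ["smb"] * 25 + ["education"] * 5

shares = segment_shares(rows)
flagged = underrepresented(rows)
for segment, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{segment}: {share:.0%}")
print("underrepresented:", flagged)
```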
If you find that you do have biases in your data then there are different techniques to address this. You can collect more data for the underrepresented segments to balance it out, or you can rebalance your data.
The takeaway is to start with the data, not the model. If your data are biased, the model will be too.
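As one illustration of rebalancing, the sketch below uses random oversampling: rows from smaller segments are duplicated at random until every segment is as large as the biggest one. This is just one possible technique, chosen here because it is easy to show; the row layout and the `oversample` helper are hypothetical. Keep in mind that oversampling repeats existing rows and adds no new information, which is why collecting more data for the underrepresented segments is listed first.

```python
import random
from collections import Counter, defaultdict

def oversample(rows, segment_of, seed=0):
    """Duplicate rows at random until every segment matches the largest one."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[segment_of(row)].append(row)
    target = max(len(group) for group in by_segment.values())
    balanced = []
    for group in by_segment.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced

# Hypothetical rows: 70 enterprise, 25 smb, 5 education.
rows = (
    [{"segment": "enterprise", "id": i} for i in range(70)]
    + [{"segment": "smb", "id": i} for i in range(70, 95)]
    + [{"segment": "education", "id": i} for i in range(95, 100)]
)

balanced = oversample(rows, segment_of=lambda row: row["segment"])
counts = Counter(row["segment"] for row in balanced)
print(dict(counts))
```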
Q: I would love to know if there’s a general cheat sheet on how AI could help improve the product I’m working on? Would AI/ML always be applicable or could there be instances where this model is not suited?
A: ML definitely doesn't always work. It really depends on the use case, whether you have data available, etc.
Unfortunately, there is no easy cheat sheet to know whether AI could improve the product. The approach I would take is to start with the problem statement (what is it about your product that you want to improve or solve?) and then consult someone who has a lot of experience with AI.
They will usually be able to ask the right questions, such as whether you already have the right data to even think about AI, and to tell whether there is an obvious way to apply AI.
Q: My other observation is that maybe most of the PMs who can benefit from AI/ML don't know how to begin. So what do they need to do to be successful on this long journey?
A: I've written a little bit about this topic (here) and there are also other resources.
In short, PMs shouldn't actually be too technical or in the weeds, because then they run the risk of having a big hammer and looking for nails. Instead, they should have some amount of knowledge about which types of problems AI can solve, how that applies to their product/business, and how to engage with data scientists / ML engineers to test their hypotheses.
If the R&D team also doesn't know much it's a little harder... at least at this point in time AI still requires very specific skills. So in that case I'd either consider building these skills within the company or finding a partner (maybe a consulting company).