DornerWorks

Machine Learning for Product Managers

Posted on August 10, 2022 by Matthew Russell

Machine Learning has become an enabling technology in nearly every industry on the planet.

It’s also become a ubiquitous buzzword for complex technologies that may have nothing to do with neural networks at all.

There is a lot of confusion over what machine learning can be applied to, whether that’s to improve a product or to create something new. According to DornerWorks engineer David Norwood, the path from machine learning idea to proof of concept, in the simplest terms, starts with a dataset.

“A machine learning platform is going to help someone look at that dataset and find value from within it, and then apply that value to whatever their problem may be,” Norwood says.

Take the example of a voice assistant. The human voice is one type of input that can interact with data on an ML platform, as it does with Amazon Alexa or Google Assistant devices. Taking that audible input and writing code by hand to fit each context, however, can be a lengthy and inefficient process.

“It’s much easier to upload that waveform into a platform that can visualize it, learn from it, and then deploy the output to a device,” Norwood says.

An effective way to implement machine learning algorithms involves the following steps:

  1. Unstructured data is uploaded
  2. The dataset is reformatted to make it compatible with an ML framework/platform
  3. The dataset is visualized using the framework and algorithm
  4. The model is trained on the dataset (if you’re using a neural network)
  5. The system parameters are optimized
  6. The result is deployed to an end device
  7. The system is tested

Further,

  • If any outcome isn’t acceptable, the data (steps 1 and 2) will always be questioned
  • Steps 3-5 would typically iterate some number of times before having something for step 6
  • A bad result in steps 6 and/or 7 could force the designer to return to steps 3-5
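
The seven steps above can be sketched end to end in a few lines of Python. This is only an illustrative toy, not a description of any particular ML platform: the data, the "quiet"/"loud" labels, and every function name are hypothetical, and the "model" is a single threshold rather than a neural network.

```python
import json
import statistics

# Step 1: raw, loosely structured input (hypothetical sensor log lines).
RAW = ["0.9,quiet", "1.1,quiet", "1.0,quiet", "3.0,loud", "3.2,loud", "2.9,loud"]

def reformat(lines):
    """Step 2: parse raw lines into (feature, label) pairs."""
    rows = []
    for line in lines:
        value, label = line.split(",")
        rows.append((float(value), label))
    return rows

def summarize(rows):
    """Step 3: a text stand-in for visualization, the mean value per label."""
    labels = sorted({label for _, label in rows})
    return {lab: statistics.mean(v for v, l in rows if l == lab) for lab in labels}

def train(rows):
    """Step 4: 'training' here just places a threshold between the class means."""
    means = summarize(rows)
    return {"threshold": (means["quiet"] + means["loud"]) / 2}

def predict(model, value):
    return "loud" if value > model["threshold"] else "quiet"

def accuracy(model, rows):
    return sum(predict(model, v) == label for v, label in rows) / len(rows)

def optimize(model, rows, step=0.05, rounds=20):
    """Step 5: nudge the threshold and keep any change that improves accuracy."""
    best = dict(model)
    for _ in range(rounds):
        for delta in (-step, step):
            candidate = {"threshold": best["threshold"] + delta}
            if accuracy(candidate, rows) > accuracy(best, rows):
                best = candidate
    return best

def deploy(model):
    """Step 6: serialize to a portable format an end device could load."""
    return json.dumps(model)

rows = reformat(RAW)
model = optimize(train(rows), rows)
artifact = deploy(model)

# Step 7: test the deployed artifact, not the in-memory model.
deployed = json.loads(artifact)
test_accuracy = accuracy(deployed, [(0.8, "quiet"), (3.1, "loud")])
print(summarize(rows), deployed, test_accuracy)
```

Note that the final test runs against the deserialized artifact rather than the object that was trained, mirroring the point that a bad result after deployment can send the designer back to earlier steps.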

DornerWorks engineers like Norwood specialize in guiding customers through steps 5 through 7:

  5. Optimization
  6. Deployment
  7. Testing

“To me, being able to easily get through that entire machine learning process, that’s the sign of a good quality machine learning platform,” Norwood says.

Machine learning is still a relatively new technology, new enough that most algorithm developers and embedded engineers do not have decades of experience working together to draw from. The models themselves, meanwhile, are typically academic.

DornerWorks Machine Learning Expert David Norwood

Putting one of those models on a device with different processors and different memory sizes can quickly complicate the path to a single goal, especially when there are different methods and entry points in play. Another common challenge is that a trained model does not account for hardware constraints such as processor type, memory size, and latency, which can ultimately limit the accuracy of the system once deployed.
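
One concrete way a deployment target can cost accuracy is reduced numeric precision. The sketch below is an illustrative assumption rather than a description of any specific toolchain: it takes made-up floating-point weights for a small linear model, rounds them to 8-bit integers as a memory-constrained device might require, and measures how far the output drifts.

```python
# Hypothetical weights of a small linear model trained in floating point,
# plus one hypothetical input sample.
weights = [0.731, -1.204, 0.058, 2.417]
sample = [1.0, 0.5, -2.0, 0.25]

def quantize(values, bits=8):
    """Symmetric linear quantization: map floats onto signed integers."""
    levels = 2 ** (bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(v) for v in values) / levels
    return [round(v / scale) for v in values], scale

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q_weights, scale = quantize(weights)
restored = [q * scale for q in q_weights]   # values the device effectively uses

full_precision = dot(weights, sample)
on_device = dot(restored, sample)
error = abs(full_precision - on_device)
print(f"float: {full_precision:.4f}  int8: {on_device:.4f}  error: {error:.4f}")
```

The error is small for this toy model, but it is not zero, and it is introduced entirely after training. That is why the model has to be evaluated again on the target hardware.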

“It’s easy to think of machine learning as a magic black box that will solve all your problems. But there are non-technical things that can get in the way,” Norwood says. “Just because you can make some inference doesn’t mean that that’s going to solve the problem or make things better.”

Machine learning can be considered a sub-field of Artificial Intelligence (AI), but it’s also an essential part of other fields like image recognition, language processing, recommender systems, robotics, medical diagnostics, autonomous vehicles, and more. Quality machine learning platforms can take many forms, but they typically face the same challenges. Here’s how to design a robust platform that addresses these challenges from the ground up:

Data access
Data science and ML applications require easy access to data, and lots of it. More data can help paint a bigger picture and provide additional insight. Common barriers include proprietary data formats, data bandwidth constraints, sub-sampling and governance misalignment.

If the data pipeline is limited in any way, the system will not be able to support the complexity of the model required to solve the problem.

Data quality
We’ve all heard the old adage “garbage in, garbage out.” The same applies to machine learning: the quality of a machine learning system depends on the quality of its data. If data collection is not reliable, it will be difficult to create machine learning models that are general and predictive. Incoming data must be reviewed carefully to ensure that proper insights can be gleaned from the data coming out.
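
Part of that review can be automated. The following is a minimal sketch, assuming hypothetical temperature records and validity limits chosen only for illustration. It separates usable rows from rejects and records the reason for each rejection so the data source can be questioned later.

```python
# Hypothetical incoming records; None marks a missing reading.
records = [
    {"sensor": "a", "temp_c": 21.5},
    {"sensor": "b", "temp_c": None},
    {"sensor": "c", "temp_c": 900.0},   # implausible for this assumed sensor
    {"sensor": "d", "temp_c": 19.8},
]

VALID_RANGE = (-40.0, 125.0)  # assumed limits, chosen only for illustration

def review(rows, valid_range=VALID_RANGE):
    """Split rows into usable data and rejects, noting why each was rejected."""
    low, high = valid_range
    clean, rejected = [], []
    for row in rows:
        value = row["temp_c"]
        if value is None:
            rejected.append((row["sensor"], "missing value"))
        elif not low <= value <= high:
            rejected.append((row["sensor"], "out of range"))
        else:
            clean.append(row)
    return clean, rejected

clean, rejected = review(records)
print(f"kept {len(clean)} of {len(records)} records; rejected: {rejected}")
```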

Fuzzy parameters
The parameters of a method developed by data scientists are not always easy for engineers to understand. With modern machine learning algorithms, there are always some parameters that need tweaking. Documenting the reasoning and experience that went into a system goes a long way toward making its parameters clear to new users.

Algorithm precision
Keep the problem in mind, and use the algorithm most suitable for the characteristics of the data. A good match helps produce the most accurate results.

Feature overload
The success of an ML platform depends on its ability to quickly and accurately categorize data. Using too many features can make it difficult for a method to find valuable separations in a dataset. Consider starting with a minimal set of features and expanding them when needed.
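
Starting small and expanding only when needed can be expressed as greedy forward selection. The sketch below is one possible approach under stated assumptions, not a prescribed method: the dataset, the feature names, and the nearest-centroid scoring function are all hypothetical, and accuracy is measured on the training data purely to keep the example short.

```python
# Hypothetical dataset: each row is (features, label). "size" separates the
# classes well; "weight" and "noise" do not.
FEATURES = ["size", "weight", "noise"]
DATA = [
    ((1.0, 2.0, 5.0), "a"), ((1.2, 2.1, 1.0), "a"), ((0.9, 1.8, 9.0), "a"),
    ((3.0, 2.2, 4.0), "b"), ((3.1, 2.4, 8.0), "b"), ((2.8, 2.0, 2.0), "b"),
]

def centroid_accuracy(data, columns):
    """Accuracy of a nearest-centroid classifier restricted to `columns`."""
    centroids = {}
    for label in {lab for _, lab in data}:
        members = [x for x, lab in data if lab == label]
        centroids[label] = [sum(m[c] for m in members) / len(members)
                            for c in columns]

    def classify(x):
        return min(centroids, key=lambda lab: sum(
            (x[c] - mu) ** 2 for c, mu in zip(columns, centroids[lab])))

    return sum(classify(x) == lab for x, lab in data) / len(data)

def forward_select(data, n_features):
    """Greedily add the feature that helps most; stop when nothing improves."""
    chosen, best = [], 0.0
    while len(chosen) < n_features:
        scores = {c: centroid_accuracy(data, chosen + [c])
                  for c in range(n_features) if c not in chosen}
        candidate = max(scores, key=scores.get)
        if scores[candidate] <= best:
            break
        chosen.append(candidate)
        best = scores[candidate]
    return chosen, best

chosen, best = forward_select(DATA, len(FEATURES))
print([FEATURES[c] for c in chosen], best)
```

In a real project the score would come from held-out data rather than the training set, but the stopping rule is the same: a feature is added only if it measurably helps.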

Objective creep
ML systems can be incredibly effective at solving problems, so long as they are focused on those problems without spending resources elsewhere. The success of an ML application also depends on the suitability of the objective to begin with. Therefore, it’s important for ML developers to understand when other objectives may be creeping into a solution, because the same dataset and model may not be sufficient to solve the new objective.

Organizational flexibility
Organizations and technology will change. Data sizes will grow; team skill sets and goals will evolve; and technologies will develop and be replaced over time. An obvious, but common, strategic error is not planning for scale. Another common but more subtle error is selecting non-portable technologies for data, logic and models.

Communication challenges
A data platform must simplify collaboration between engineering and data science teams. Common barriers are caused by these two groups using disconnected platforms for compute and deployment, data processing and governance.

Machine learning is still a relatively new technology, with great potential in many industries.

“Communications, I think, is the biggest obstacle to overcome when taking a model and porting it to a device,” Norwood says, offering an analogy for creating a satisfying meal. “If Jamie Oliver had a recipe for me to replicate, I’d have to let him know the limited supply of ingredients (and even utensils) that I have in my kitchen, which might dictate what the recipe calls for.”

The different ingredients, utensils, and dietary restrictions in the recipe map nicely to data inputs, platform tools and parameters in an ML system. The end result in both cases solves a problem, though it may simply be “what’s for dinner?”

Why is Machine Learning important?

Edge devices are becoming increasingly common in factories, stores, gas stations, and a variety of IoT use cases where locally deployed machine learning models can reduce latency and make decisions without internet connectivity. Using these technologies, companies can enable decision making in real time.

The best machine learning platforms support the machine learning lifecycle from ingesting unstructured data through testing. These platforms enable automation, increase productivity, support machine learning software and frameworks, and can help you scale your business more efficiently.

If you are considering adding machine learning to your existing products, or building a system from the ground up, schedule a meeting with our team and we will help you turn your ideas into reality.
