Infusion of Machine Learning Operations with Internet of Things
By: Shikhar Kwatra, Utpal Mangla, Luca Marchi
With advances in deep tech, operationalizing machine learning and deep learning models has become a fast-growing concern. In a typical organization with a machine learning or deep learning business case, the data science and IT teams need to collaborate extensively to scale and push multiple machine learning models to production faster, through continuous training, continuous validation, continuous deployment and continuous integration with governance. Machine Learning Operations (MLOps) has carved out a new era of the DevOps paradigm in the machine learning/artificial intelligence realm by automating end-to-end workflows.
As we optimize models and bring data processing and analysis closer to the edge, data scientists and ML engineers are continuously finding new ways to handle the complications of operationalizing models on IoT (Internet of Things) edge devices.
A LinkedIn publication projected that global AI spend will reach $232 billion by 2025 and $5 trillion by 2050. According to Cognilytica, the global MLOps market will be worth $4 billion by 2025, up from $350 million in 2019. [1]
Models running on IoT edge devices need to be retrained very frequently because environmental parameters vary: continuous data drift and limited access to such IoT edge solutions may degrade model performance over time. The target platforms on which ML models are deployed can also vary, from IoT edge devices to specialized hardware such as FPGAs, which leads to a high level of complexity and customization for MLOps on such platforms.
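To make the idea of data drift concrete, here is a minimal sketch of one possible check, assuming a single numeric sensor feature. The function names, the sample readings and the threshold of 2.0 are illustrative assumptions, not part of any particular MLOps product; production systems typically use richer statistical tests.

```python
from statistics import mean, stdev


def drift_score(reference, recent):
    """Shift of the recent mean from the reference mean,
    measured in reference standard deviations."""
    ref_std = stdev(reference)
    shift = abs(mean(recent) - mean(reference))
    if ref_std == 0:
        # A constant reference: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / ref_std


def has_drifted(reference, recent, threshold=2.0):
    """Flag drift when the score exceeds a (hypothetical) threshold."""
    return drift_score(reference, recent) > threshold


# Hypothetical temperature readings seen at training time vs. on the device.
training_readings = [20.0, 21.0, 19.0, 20.5, 19.5]
device_readings = [25.0, 26.0, 24.0]
needs_attention = has_drifted(training_readings, device_readings)
```

A check like this could run on the device itself, so that only the drift flag, rather than the raw data, has to travel over a limited connection.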
Once a model has been profiled to determine the cores, CPU and memory settings it needs on the target IoT platform, it can be packaged into a Docker image for deployment. Such IoT devices also have multiple dependencies that must be packaged and deployed with the model so that it executes seamlessly on the platform. Containers make this packaging straightforward, as they can span both cloud and IoT edge platforms.
When running on IoT edge platforms with device-specific dependencies, a decision needs to be made about which containerized machine learning models must be available offline because of limited connectivity. An access script that loads the model, invokes the endpoint and scores requests arriving at the edge device needs to be operational in order to return the corresponding probabilistic output.
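The access script described above might look roughly like the following sketch. It assumes, purely for illustration, that the model is a logistic regression stored as a local JSON file with "bias" and "weights" keys, so that it can be scored offline; the file layout and the function names are hypothetical.

```python
import json
import math


def load_model(path):
    """Load model parameters from a local file so scoring works offline."""
    with open(path) as f:
        return json.load(f)


def score(model, features):
    """Return a probability for one request using a logistic model."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


def handle_request(model, request_body):
    """Score a JSON request of the form {"features": [...]}
    and return a JSON response with the probability."""
    features = json.loads(request_body)["features"]
    return json.dumps({"probability": score(model, features)})
```

In a containerized deployment, handle_request would sit behind whatever endpoint the container exposes; the scoring logic itself needs no network access.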
Continuous monitoring and retraining of models deployed on IoT devices need to be handled properly, using the model artifact repository and model versioning features of the MLOps framework. Different images of the deployed models are stored in the shared device repository so that the right image can be fetched quickly and deployed to the IoT device at the right time.
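As a rough illustration of model versioning, the sketch below keeps an in-memory list of image references per model name. The ModelRegistry class and the image reference strings are hypothetical; a real artifact repository would persist this information and store the images themselves.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Toy model artifact repository: model name -> ordered image references."""
    _images: dict = field(default_factory=dict)

    def register(self, name, image_ref):
        """Store a new image for a model and return its 1-based version."""
        versions = self._images.setdefault(name, [])
        versions.append(image_ref)
        return len(versions)

    def get(self, name, version=None):
        """Fetch a specific version, or the latest one when version is None."""
        versions = self._images[name]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


registry = ModelRegistry()
registry.register("anomaly-detector", "registry.example/anomaly:1")
registry.register("anomaly-detector", "registry.example/anomaly:2")
latest_image = registry.get("anomaly-detector")
rollback_image = registry.get("anomaly-detector", version=1)
```

Keeping earlier versions addressable is what makes a quick rollback possible when a newly deployed model underperforms on the device.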
Model retraining can be triggered by a job scheduler running on the edge device, or when new data arrives and invokes the REST endpoint of the machine learning model. Continuous model retraining, versioning and model evaluation become an integral part of the pipeline.
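The two triggers mentioned above, a schedule and the arrival of new data, can be combined in a few lines. This is only a sketch: the RetrainTrigger class, its default limits and the use of plain seconds for time are assumptions made for illustration.

```python
class RetrainTrigger:
    """Decide when to retrain: on a schedule, or once enough new data arrives."""

    def __init__(self, max_age_seconds=86400.0, min_new_samples=500, now=0.0):
        self.max_age_seconds = max_age_seconds
        self.min_new_samples = min_new_samples
        self.last_trained = now
        self.pending_samples = 0

    def record_samples(self, count):
        """Call whenever new data reaches the device."""
        self.pending_samples += count

    def should_retrain(self, now):
        """True if the model is too old or enough new samples have accumulated."""
        too_old = now - self.last_trained >= self.max_age_seconds
        enough_data = self.pending_samples >= self.min_new_samples
        return too_old or enough_data

    def mark_trained(self, now):
        """Reset both conditions after a retraining run completes."""
        self.last_trained = now
        self.pending_samples = 0
```

On a device, should_retrain could be polled with time.time() from a scheduled job, while record_samples would be called from the code path that receives new data.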
When the data changes frequently, as can be the case with such IoT edge devices or platforms, models need to be versioned and refreshed more often to keep up with data drift. This gives the MLOps engineer persona reason to automate the model retraining process, freeing data scientists to spend their time on other aspects of feature engineering and model development.
The number of such devices continuously collaborating over the internet and integrating with MLOps capabilities is poised to grow over the years. Multi-factored optimizations will continue to make the work of data scientists and ML engineers easier and more focused from a model operationalization standpoint, via an end-to-end automated approach.
Reference:
[1] https://askwonder.com/research/historical-global-ai-ml-ops-spend-ulshsljuf