i4Q AI Models Distribution to the Edge¶
General Description¶
i4QAI addresses a multi-tier infrastructure designed for the management of AI-based models in a hybrid cloud-edge manufacturing environment. Scale is one of the main concerns in such environments, so operations are automated as much as possible through techniques such as policy- and label-based distribution and deployment. Tight coordination is required between this task and the AI workload distribution mechanism (implemented in the solution i4QEW: AI Workload Placement and Deployment), so that the AI workload and model meet and collaborate on the correct edge target nodes. This solution resides at the back-end infrastructure level, providing capabilities to be used by other solutions and pilots. It supports different kinds of model wrappers, for instance a native binary to be loaded by the corresponding workload, or a wrapping HTTP server whose interfaces can be invoked by the corresponding AI workload. In addition, the model to be deployed can be prepared in several flavors, such as a standalone container or wrapped within a Helm chart.
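To make the HTTP-server flavor concrete, below is a minimal sketch of how such a model could be exposed as a standalone containerized microservice. This is an illustration only: the image name, labels, and port are hypothetical and not part of the i4Q deliverables.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      containers:
      - name: model-server
        image: example.org/models/sktime-server:latest  # hypothetical image wrapping the model behind a REST API
        ports:
        - containerPort: 8080       # the AI workload invokes the model over this HTTP interface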
Features¶
Distribute AI models to the edge where they are expected to be used by locally running workloads. The AI model distribution is coordinated with the workload distribution mechanism to ensure that the right set of AI models is made available for the workloads that use them.
Help deploy AI models at the edge at scale. To enable this, a policy-based deployment pattern can be used by associating labels with the potentially different target nodes.
Manage the lifecycle of AI models, from creation in the cloud, through initial deployment at the edge, to re-deployment when revised models become available. Finally, when a model is no longer needed, its deletion is supported as well.
Support a policy-based placement mechanism for AI models that eases the administrator's task by enabling the specification of rules for eligible targets in a simplified manner (see the placement sketch after this list).
Support a GitOps-based mode of operation. The connection between the model creator and the model deployer is made via Git. New resources that are pushed to Git are retrieved and deployed on the target nodes associated with the requested labels.
Support the deployment of AI models with Red Hat Advanced Cluster Management for Kubernetes (RHACM). ACM serves as the basis for enabling deployment operations at scale.
Support AI models wrapped in containers. In this mode, the models are ready to be loaded by the corresponding AI workload to be used internally.
Support AI models wrapped in an HTTP server. The model is deployed as a microservice exposing a REST interface that can be invoked by additional components such as the AI workload.
Operate seamlessly across the small to very large multi-site infrastructures common in smart manufacturing environments.
Support lightweight orchestration engines such as K3s. Thus, the solution can be deployed and made operational over a variety of host devices and architectures, from high- to low-footprint hardware.
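As a concrete illustration of the label-based placement referenced above, the following is a minimal sketch, not taken from the i4Q repository, of an RHACM PlacementRule that selects managed clusters carrying a given label. The name and namespace are illustrative; the env=lrt label matches the one used in the User Manual below.

apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: lrt-placement          # illustrative name
  namespace: i4q-lrt
spec:
  clusterSelector:
    matchLabels:
      env: lrt                 # only clusters labeled env=lrt receive the model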
Screenshots¶
Commercial Information¶
License¶
TBD
Pricing¶
TBD
Associated i4Q Solutions¶
Required¶
Can operate without the need for another i4Q solution
Optional¶
i4Q Edge Workloads Placement and Deployment
Installation Guidelines¶
System Requirements¶
Access to a RHACM instance
Access to a VM that can run K3s
Attaching a K3s cluster to RHACM¶
These steps are taken from importing-a-target-managed-cluster-to-the-hub-cluster. The K3s cluster must be able to connect to the RHACM instance in order to be imported.
Open the ACM UI
Open Infrastructure -> Clusters (path ACM-URL/multicloud/clusters)
Click Import Cluster
Enter a cluster name (it is recommended to choose a name that is descriptive of the cluster)
Click “Save import and generate code”
Copy the command generated and run it on your K3s cluster
The imported cluster should appear as “Ready” under Infrastructure -> Clusters within minutes
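The import can also be verified from the hub cluster's command line, assuming kubectl access to the hub:

kubectl get managedcluster <cluster-name>
# an imported cluster should report as accepted by the hub and available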
Deploying an application on the RHACM instance¶
If your application must pull container images from private repositories, then a docker-configuration access secret must be deployed and referenced at the deployment or service-account level.
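For example, the secret can be attached at the service-account level so that every pod in the namespace uses it for image pulls. A sketch, assuming the i4q-lrt namespace and the secret name i4qregistry used later in this guide:

kubectl patch serviceaccount default -n i4q-lrt \
  -p '{"imagePullSecrets": [{"name": "i4qregistry"}]}'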
User Manual¶
This section describes the steps necessary to deploy a model on an ACM instance and to have ACM propagate and deploy the model to the connected managed cluster.
1. Deploy a Docker authentication secret on your edge K3s cluster:
Configure:
GitLab user:
export GITLAB_USER=...
Docker-config access token for the private container registry:
export DOCKER_READ_TOKEN=...
Secret name (referenced in the deployment, e.g., i4qregistry):
export SECRET_NAME=...
Create secret:
kubectl create secret docker-registry $SECRET_NAME --docker-server=registry.gitlab.com --docker-username=$GITLAB_USER --docker-password=$DOCKER_READ_TOKEN -n i4q-lrt --dry-run=client -o yaml | kubectl apply -f -
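To verify that the secret holds the expected registry credentials, it can be decoded and inspected:

kubectl get secret $SECRET_NAME -n i4q-lrt -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d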
2. Create the application through the RHACM UI:
Visit Applications UI (/applications)
Click on “Create Application” then “Subscription”
Enter Info:
Name: <i4q-lrt>
Namespace: <i4q-lrt>
Repository type: select Git
URL: <https://gitlab.com/i4q/LRT>
Username: your GitLab user
Access token: a GitLab token with read API permissions
Branch: <acm-dev>
Path: <charts/sktime>
Scroll down to “Deploy application resources only on clusters matching specified labels”:
Label name: <env>
Label value: <lrt>
Click “Save”
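For reference, the wizard creates ACM Channel and Subscription resources behind the scenes. The following is a rough, hand-written sketch of roughly equivalent manifests, assuming the placement rule sketched in the Features section; the Git credentials secret referenced by the channel is omitted for brevity.

apiVersion: apps.open-cluster-management.io/v1
kind: Channel
metadata:
  name: i4q-lrt-channel        # illustrative name
  namespace: i4q-lrt
spec:
  type: Git
  pathname: https://gitlab.com/i4q/LRT
---
apiVersion: apps.open-cluster-management.io/v1
kind: Subscription
metadata:
  name: i4q-lrt-subscription   # illustrative name
  namespace: i4q-lrt
  annotations:
    apps.open-cluster-management.io/git-branch: acm-dev
    apps.open-cluster-management.io/git-path: charts/sktime
spec:
  channel: i4q-lrt/i4q-lrt-channel
  placement:
    placementRef:
      kind: PlacementRule
      name: lrt-placement      # the label-based rule from the Features sketch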
Getting the application deployed on a Managed-Cluster¶
The application above is bound to a placement rule that selects managed clusters with the label env=lrt. To add the label to your managed clusters, visit Infrastructure/Clusters in the ACM UI, or run:
kubectl label managedcluster <cluster-name> env=lrt
Accessing the application¶
If port 80 is accessible on your K3s cluster’s VM, you should be able to access the application through http://CLUSTER_PUBLIC_IP_OR_DNS/api/ui
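A quick reachability check from the command line (substitute your cluster's address):

curl -i http://CLUSTER_PUBLIC_IP_OR_DNS/api/ui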