i4Q AI Models Distribution to the Edge

General Description

i4QAI provides a multi-tier infrastructure designed for the management of AI-based models in a hybrid cloud-edge manufacturing environment. Scale is one of the main concerns in such environments, so operations are automated as much as possible by employing techniques such as policy- and label-based distribution and deployment. Tight coordination is required between this task and the AI workload distribution mechanism (implemented in the solution i4QEW: AI Workload Placement and Deployment), so that the AI workload and model meet and collaborate on the correct target edge nodes. This solution resides at the back-end infrastructure level, providing capabilities to be used by other solutions and pilots. It supports different kinds of model wrappers to be deployed, for instance a native binary to be loaded by the corresponding workload, or a wrapping HTTP server whose interfaces can be invoked by the corresponding AI workload. In addition, the model to be deployed can be prepared in several flavors, such as a standalone container or wrapped within a Helm chart.

Features

  1. Distribute AI models to the edge where they are expected to be used by locally running workloads. The AI model distribution is coordinated with the workload distribution mechanism to ensure that the right set of AI models is made available for the workloads that use them.

  2. Help deploy AI models at the edge at scale. To enable that, a policy-based deployment pattern can be used by associating labels with the potentially different target nodes.

  3. Manage the lifecycle of AI models, from creation in the cloud, through initial deployment at the edge, to re-deployment when revised models become available. Finally, when a model is no longer needed, its deletion is supported as well.

  4. Support a policy-based placement mechanism for AI models that eases the administrator's task by enabling the specification of rules for eligible targets in a simplified manner.

  5. Support a GitOps-based mode of operation. The connection point between the model creator and the model deployer is git: new resources pushed into git are retrieved and deployed on the target nodes associated with the requested labels (see the sketch after this list).

  6. Support the deployment of AI models with Red Hat Advanced Cluster Management for Kubernetes (RHACM). ACM serves as the basis for enabling deployment operations at scale.

  7. Support AI models wrapped in containers. In this mode, the models are ready to be loaded by the corresponding AI workload to be used internally.

  8. Support AI models wrapped in an HTTP server. The model is deployed as a microservice exposing a REST interface that can be invoked by other components such as the AI workload.

  9. Operate seamlessly on small to very large multi-site infrastructures common in smart manufacturing environments.

  10. Support lightweight orchestration engines such as K3s. Thus, the solution can be deployed and made operational over a variety of host devices and architectures, from high- to low-footprint artefacts.
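
As an illustration of the policy-based, GitOps-driven flow in features 4-6, the sketch below shows one way such a deployment could be expressed with the RHACM application APIs. This is a minimal sketch, not the solution's own manifests: the resource names, the i4q-models namespace, and the label value are illustrative assumptions, and the Git URL is the example repository used later in this document.

kubectl apply -f - <<EOF
# Channel: points RHACM at the Git repository holding the packaged models
apiVersion: apps.open-cluster-management.io/v1
kind: Channel
metadata:
  name: i4q-models-channel          # hypothetical name
  namespace: i4q-models             # hypothetical namespace
spec:
  type: Git
  pathname: https://gitlab.com/i4q/LRT
---
# PlacementRule: selects eligible edge clusters by label (feature 4)
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: i4q-models-placement
  namespace: i4q-models
spec:
  clusterSelector:
    matchLabels:
      env: lrt                      # only clusters labelled env=lrt receive the models
---
# Subscription: ties the channel to the placement rule (features 5 and 6)
apiVersion: apps.open-cluster-management.io/v1
kind: Subscription
metadata:
  name: i4q-models-subscription
  namespace: i4q-models
  annotations:
    apps.open-cluster-management.io/git-branch: acm-dev
    apps.open-cluster-management.io/git-path: charts/sktime
spec:
  channel: i4q-models/i4q-models-channel
  placement:
    placementRef:
      kind: PlacementRule
      name: i4q-models-placement
EOF

When new resources are pushed to the repository, RHACM re-synchronizes the subscription, which is what enables the re-deployment of revised models described in feature 3.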

Screenshots

i4q-lrt example

Commercial Information

Authors

Company: IBM

Website: https://research.ibm.com/labs/haifa/

Logo: IBM

License

TBD

Pricing

TBD

Associated i4Q Solutions

Required

  1. Can operate without the need for another i4Q solution

Optional

  1. i4Q Edge Workloads Placement and Deployment

Installation Guidelines

System Requirements

  1. Access to a RHACM instance

  2. Access to a VM that can run K3s

Attaching K3s cluster to RHACM

Steps adapted from importing-a-target-managed-cluster-to-the-hub-cluster. The K3s cluster must be able to connect to the RHACM instance to be imported.

  1. Open the ACM UI

  2. Open Infrastructure -> Clusters (path ACM-URL/multicloud/clusters)

  3. Click Import Cluster

  4. Enter a cluster name (it is recommended that your choice is expressive of the cluster)

  5. Click “Save import and generate code”

  6. Copy the generated command and run it on your K3s cluster

  7. The imported cluster should appear as “Ready” under Infrastructure -> Clusters within minutes
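
As a quick sanity check, assuming kubectl access to the RHACM hub cluster, the imported cluster can also be listed from the command line:

kubectl get managedclusters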

Deploying an application on the RHACM instance

If your application must pull container images from private repositories, then a docker-configuration access secret must be deployed and referenced at the deployment or service-account level, as sketched below.
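
A minimal sketch of the service-account option, assuming the secret is named i4qregistry (an illustrative name matching the example later in this document) and lives in the i4q-lrt namespace; attaching it to the default service account makes pods in that namespace pull with it automatically:

kubectl patch serviceaccount default -n i4q-lrt \
  -p '{"imagePullSecrets": [{"name": "i4qregistry"}]}'

Alternatively, the same secret name can be listed under imagePullSecrets in the pod template of the deployment itself.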

User Manual

This section describes the steps necessary to deploy a model on an ACM instance and have ACM propagate and deploy the model to the connected managed clusters. First, deploy the Docker authentication secret on your edge K3s cluster:

  1. Configure:

    • GitLab user:

    export GITLAB_USER=...
    
    • Docker-config access token for the private containers repository:

    export DOCKER_READ_TOKEN=...
    
    • Secret name (referenced in deployment, e.g., i4qregistry)

    export SECRET_NAME=...
    
  2. Create secret:

kubectl create secret docker-registry $SECRET_NAME \
  --docker-server=registry.gitlab.com \
  --docker-username=$GITLAB_USER \
  --docker-password=$DOCKER_READ_TOKEN \
  -n i4q-lrt --dry-run=client -o yaml | kubectl apply -f -
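
To confirm the secret landed in the expected namespace (an optional check, not part of the original flow):

kubectl get secret $SECRET_NAME -n i4q-lrt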
Then, deploy the application through the RHACM UI:

  1. Visit Applications UI (/applications)

  2. Click on “Create Application” then “Subscription”

  3. Enter Info:

    • Name: <i4q-lrt>

    • Namespace: <i4q-lrt>

    • Repository type: select Git

    • URL: <https://gitlab.com/i4q/LRT>

    • Username: your GitLab user

    • Access token: a GitLab token with read API permissions

    • Branch: <acm-dev>

    • Path: <charts/sktime>

    • Scroll down to “Deploy application resources only on clusters matching specified labels”:
      • Label name: <env>

      • Label value: <lrt>

  4. Click “Save”
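
Once saved, RHACM creates subscription resources in the application namespace; assuming kubectl access to the hub cluster, the subscription status can be checked with:

kubectl get subscriptions.apps.open-cluster-management.io -n i4q-lrt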

Getting the application deployed on a Managed-Cluster

The above application is bound to a placement rule that selects managed clusters with the label env=lrt. To add the label to your managed clusters, visit Infrastructure/Clusters in the ACM UI, or run:

kubectl label managedcluster <cluster-name> env=lrt
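
To verify that the label was applied (an optional check):

kubectl get managedcluster <cluster-name> --show-labels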

Accessing the application

If port 80 is accessible on your K3s cluster’s VM, you should be able to access the application through http://CLUSTER_PUBLIC_IP_OR_DNS/api/ui
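
For a quick connectivity check from any machine that can reach the VM, substituting your cluster's address:

curl -i http://CLUSTER_PUBLIC_IP_OR_DNS/api/ui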