User Guide

Using mlinfra

mlinfra is used as a CLI. Every command takes --stack-config-path as an argument, which is the path of the stack file that describes your MLOps stack and its deployment type. The commands listed below can be run against a stack file.

Cloud Credentials

mlinfra relies on cloud credentials being configured before any of these commands are run.

  • estimate-cost: This command generates a cost breakdown of the cloud components defined in the stack config file. To use this feature, infracost needs to be installed on your system. An example is as follows:
    mlinfra estimate-cost --stack-config-path=aws-lakefs-k8s.yaml
    
  • generate-terraform-config: This command generates the *.tf.json configuration for the stack file and lets you inspect the parameters before anything is deployed. An example is as follows:
    mlinfra generate-terraform-config --stack-config-path=aws-lakefs-k8s.yaml
    
  • terraform: This command is used together with the --action option, which takes one of the following values:

    • plan: used to plan the stack config
    • apply: used to apply the stack config
    • destroy: used to destroy / delete the stack config
    Examples of these commands are as follows:

    # To plan the changes in a stack config
    mlinfra terraform --action=plan --stack-config-path=aws-lakefs-k8s.yaml
    
    # To apply the stack config components
    mlinfra terraform --action=apply --stack-config-path=aws-lakefs-k8s.yaml
    
    # To delete the stack config components
    mlinfra terraform --action=destroy --stack-config-path=aws-lakefs-k8s.yaml
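
When these commands are driven from a script, the action value can be checked before mlinfra is invoked. The following is a minimal Python sketch, not part of mlinfra itself: the helper names build_terraform_command and run_mlinfra are hypothetical, and only the terraform sub-command, the --action values, and the --stack-config-path option shown above are assumed.

```python
from __future__ import annotations

import shutil
import subprocess

# Action values taken from the list above; anything else is rejected
# before mlinfra is invoked.
VALID_ACTIONS = ("plan", "apply", "destroy")


def build_terraform_command(action: str, stack_config_path: str) -> list[str]:
    """Return the argv list for `mlinfra terraform` (hypothetical helper)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action {action!r}; expected one of {VALID_ACTIONS}")
    return [
        "mlinfra",
        "terraform",
        f"--action={action}",
        f"--stack-config-path={stack_config_path}",
    ]


def run_mlinfra(action: str, stack_config_path: str) -> int:
    """Run the command and return its exit code (hypothetical helper)."""
    if shutil.which("mlinfra") is None:
        raise RuntimeError("mlinfra was not found on PATH")
    command = build_terraform_command(action, stack_config_path)
    return subprocess.run(command).returncode


if __name__ == "__main__":
    # Only prints the command; nothing is planned or deployed here.
    print(" ".join(build_terraform_command("plan", "aws-lakefs-k8s.yaml")))
```

Building the argument list separately from running it keeps the check testable without cloud credentials or an installed mlinfra.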
    

Info

As the tool is under active development, more commands may be added to mlinfra based on user requests to make operations easier.

Deploying a Stack

  • A sample MLOps stack for deployment on Cloud IaaS looks as follows:
This stack config deploys the stack components on EC2 instances:
name: aws-mlops-stack-complete
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: cloud_vm
stack:
  - data_versioning:
      name: lakefs
  - experiment_tracking:
      name: mlflow
  - orchestrator:
      name: prefect
The same stack config can be customized further through the optional config and params settings:
name: aws-mlops-stack-complete-advanced
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: cloud_vm
  config:
    vpc:
      create_database_subnets: true
stack:
  - data_versioning:
      name: lakefs
      params:
        remote_tracking: true
        database_type: "dynamodb"
        lakefs_data_bucket_name: "lakefs-repository-data-bucket"
        dynamodb_table_name: "lakefs_kvstore"
  - experiment_tracking:
      name: mlflow
      params:
        remote_tracking: true
        mlflow_artifacts_bucket_name: "artifacts-storage-bucket"
  - orchestrator:
      name: prefect
      params:
        remote_tracking: true
        ec2_application_port: 9500
  • A sample MLOps stack for deployment on Cloud PaaS looks as follows:
This stack config deploys an EKS cluster with LakeFS
name: aws-lakefs-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
stack:
  - data_versioning:
      name: lakefs
The same stack config can be customized further through the optional config and params settings:
name: aws-lakefs-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
  config:
    vpc:
      create_database_subnets: true
      enable_nat_gateway: true
      one_nat_gateway_per_az: false
    kubernetes:
      k8s_version: "1.28"
      cluster_endpoint_public_access: true
      spot_instance: false
      tags:
        data_versioning: "lakefs"
    node_groups:
      - name: lakefs-node-group
        instance_types:
          - t3.medium
        desired_size: 1
        min_size: 1
        max_size: 3
        disk_size: 20
stack:
  - data_versioning:
      name: lakefs
      params:
        remote_tracking: true
        database_type: "postgres"
        tags:
          database_type: "postgres"
          data_versioning: "lakefs"
          remote_tracking: true
  • The Terraform plan for this configuration can be inspected using the mlinfra CLI command:

    mlinfra terraform --action=plan --stack-config-path=aws-lakefs-k8s.yaml
    

  • The MLOps stack configuration can be deployed using the mlinfra CLI command:

    mlinfra terraform --action=apply --stack-config-path=aws-lakefs-k8s.yaml
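
Before planning or applying, it can help to sanity-check a stack config for the keys that appear in every sample above. The following is a rough Python sketch under stated assumptions: the required keys are inferred from the sample configs on this page and are not mlinfra's own schema, check_stack_config is a hypothetical helper, and the config is passed in as an already parsed dictionary.

```python
from __future__ import annotations

# Keys inferred from the sample configs on this page (an assumption,
# not mlinfra's actual validation rules).
REQUIRED_TOP_LEVEL_KEYS = ("name", "provider", "deployment", "stack")
REQUIRED_PROVIDER_KEYS = ("name", "account_id", "region")


def check_stack_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL_KEYS:
        if key not in config:
            problems.append(f"missing top-level key: {key}")

    provider = config.get("provider") or {}
    for key in REQUIRED_PROVIDER_KEYS:
        if key not in provider:
            problems.append(f"missing provider key: {key}")
    # The samples quote account_id, so it is expected to be a string here.
    if "account_id" in provider and not isinstance(provider["account_id"], str):
        problems.append("provider.account_id should be a quoted string")

    deployment = config.get("deployment") or {}
    if "type" not in deployment:
        problems.append("missing deployment key: type")

    # Node group sizes, when all three are present, should be ordered.
    node_groups = (deployment.get("config") or {}).get("node_groups") or []
    for group in node_groups:
        low = group.get("min_size")
        want = group.get("desired_size")
        high = group.get("max_size")
        if None not in (low, want, high) and not low <= want <= high:
            problems.append(
                f"node group {group.get('name')}: "
                "expected min_size <= desired_size <= max_size"
            )

    # Each stack entry maps a layer (e.g. data_versioning) to its settings.
    for component in config.get("stack") or []:
        for layer, settings in component.items():
            if not isinstance(settings, dict) or "name" not in settings:
                problems.append(f"stack entry {layer} has no name")
    return problems


sample = {
    "name": "aws-lakefs-k8s",
    "provider": {"name": "aws", "account_id": "793009824629", "region": "eu-central-1"},
    "deployment": {
        "type": "kubernetes",
        "config": {
            "node_groups": [
                {"name": "lakefs-node-group", "desired_size": 1, "min_size": 1, "max_size": 3}
            ]
        },
    },
    "stack": [{"data_versioning": {"name": "lakefs"}}],
}
print(check_stack_config(sample))  # prints []
```

A check like this only catches structural slips early; the plan command remains the way to inspect what would actually be deployed.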