
# MLFlow

If you want to boot up an MLflow project with a one-liner, this repo is for you.

The only requirement is Docker installed on your system; the commands below assume Bash on Linux/Windows.
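Booting the stack itself presumably comes down to Docker Compose; a sketch, assuming the repo ships a `docker-compose.yml` at its root:

```bash
# Build the images and start MLflow + MinIO in the background
docker-compose up -d --build
```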

## AWS S3

Based on this article.

1. Configure the `.env` file to your liking (a sample sketch follows this list).

2. Create the `mlflow` bucket. You can do it using either the AWS CLI or the Python API.
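The later steps `source` this file, so presumably it defines the same credentials and endpoints used throughout this guide. A minimal sketch (the exact contents are an assumption - check the repo's `.env`):

```bash
# Hypothetical .env contents - names taken from the exports used later in this README
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
export MLFLOW_TRACKING_URI=http://localhost:5000
```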

**AWS CLI**

1. Install the AWS CLI. Yes, I know you don't have an Amazon Web Services subscription - don't worry! It won't be needed!
2. Configure the AWS CLI - enter the same credentials from the `.env` file:

```bash
aws configure
```

```
AWS Access Key ID [****************123]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [****************123]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [us-west-2]: us-east-1
Default output format [json]:
```

3. Run:

```bash
aws --endpoint-url=http://localhost:9000 s3 mb s3://mlflow
```
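To confirm the bucket was created, you can list buckets against the same endpoint:

```bash
# Should list the newly created mlflow bucket
aws --endpoint-url=http://localhost:9000 s3 ls
```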
**Python API**

1. Install Minio:

```bash
pip install Minio
```

2. Run this to create a bucket:

```python
from minio import Minio
from minio.error import ResponseError

s3Client = Minio(
    'localhost:9000',
    access_key='AKIAIOSFODNN7EXAMPLE',  # copy from .env file
    secret_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',  # copy from .env file
    secure=False  # the local MinIO container serves plain HTTP
)

# make_bucket raises if the bucket already exists, so catch that on re-runs
try:
    s3Client.make_bucket('mlflow')
except ResponseError as err:
    print(err)
```
3. Open up http://localhost:5000/#/ for MLflow, and http://localhost:9000/minio/mlflow/ for the S3 bucket (your artifacts), using the credentials from the `.env` file.

4. Configure your client side.

To run MLflow projects you need the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables present on the client side.

You also need to specify the addresses of your S3 server (MinIO) and the MLflow tracking server:

```bash
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
export MLFLOW_TRACKING_URI=http://localhost:5000
```
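As an optional sanity check (assuming `curl` is installed), both services should answer over HTTP once the stack is up; MinIO exposes a standard liveness endpoint:

```bash
# Expect HTTP 200 from both services
curl -s -o /dev/null -w "%{http_code}\n" "$MLFLOW_TRACKING_URI"
curl -s -o /dev/null -w "%{http_code}\n" "$MLFLOW_S3_ENDPOINT_URL/minio/health/live"
```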

You can load them from the `.env` file like so:

```bash
source .env
```

or add them to your `~/.bashrc` and then run:

```bash
source ~/.bashrc
```
5. Test the pipeline with the command below; it uses conda, so if you don't have conda installed, add `--no-conda` (see the variant after this block):

```bash
mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```
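For reference, the conda-free variant is the same command with the flag added (assuming the example's dependencies are already installed in your current environment):

```bash
mlflow run --no-conda git@github.com:databricks/mlflow-example.git -P alpha=0.5
```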

Optionally, you can run:

```bash
python ./quickstart/mlflow_tracking.py
```
6. (Optional) If you are constantly switching environments, you can set the variables inline for a single command:

```bash
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 MLFLOW_TRACKING_URI=http://localhost:5000 mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```
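If you switch environments a lot, a small shell function (my own sketch, not part of the repo) keeps the prefix in one place:

```bash
# Hypothetical helper: run any command against the local MLflow/MinIO stack
mlflow-local() {
  MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 \
  MLFLOW_TRACKING_URI=http://localhost:5000 \
  "$@"
}

mlflow-local mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```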