# MLflow Docker Setup
If you want to boot up an MLflow project with a one-liner, this repo is for you.

The only requirement is Docker installed on your system; we are going to use Bash on Linux/Windows.
## Step-by-step guide
- Configure the `.env` file to your liking. Anything you put there will be used to configure the services.

- Run the infrastructure with this one-liner:
```
$ docker-compose up -d
Creating network "mlflow-basis_A" with driver "bridge"
Creating mlflow_db      ... done
Creating tracker_mlflow ... done
Creating aws-s3         ... done
```
Your `mlflow_db` service is slowly getting ready - it might take up to one minute. To be sure that all of the services are running fine, just re-run `docker-compose up -d` until you see all services up-to-date:
```
$ docker-compose up -d
mlflow_db is up-to-date
aws-s3 is up-to-date
tracker_mlflow is up-to-date
```
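If you prefer not to re-run the command by hand, you can poll the tracking server from Python until it responds. A minimal sketch, assuming the default port 5000 used by this setup:

```python
import time
import urllib.request
from urllib.error import URLError

# Retry for up to ~60 seconds until the MLflow UI answers on port 5000.
for _ in range(30):
    try:
        with urllib.request.urlopen("http://localhost:5000", timeout=2):
            print("MLflow tracking server is up")
            break
    except (URLError, OSError):
        time.sleep(2)
else:
    raise RuntimeError("MLflow tracking server did not come up in time")
```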
- Create the `mlflow` bucket. You can do it either with the AWS CLI or with the Python API; you don't need an AWS subscription, because MinIO serves the S3 API locally.
### AWS CLI
- Install the AWS CLI. Yes, I know you don't have an Amazon Web Services subscription - don't worry, it won't be needed!
- Configure the AWS CLI - enter the same credentials as in the `.env` file:
```
aws configure
AWS Access Key ID [****************123]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [****************123]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [us-west-2]: us-east-1
Default output format [json]:
```
- Run:

```
aws --endpoint-url=http://localhost:9000 s3 mb s3://mlflow
```
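To confirm the bucket exists, you can list the buckets against the same endpoint: `aws --endpoint-url=http://localhost:9000 s3 ls`.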
### Python API
- Install the MinIO Python client:

```
pip install minio
```
- Run this to create the bucket:

```python
from minio import Minio

# Endpoint and credentials must match the values in your .env file
s3Client = Minio(
    'localhost:9000',
    access_key='AKIAIOSFODNN7EXAMPLE',  # copy from .env file
    secret_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',  # copy from .env file
    secure=False  # local MinIO runs over plain HTTP
)

# Create the bucket only if it does not already exist
if not s3Client.bucket_exists('mlflow'):
    s3Client.make_bucket('mlflow')
```
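Running the snippet twice is safe: the `bucket_exists` check prevents `make_bucket` from raising when the bucket is already there.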
- Open up http://localhost:5000/#/ for MLflow, and http://localhost:9000/minio/mlflow/ for the S3 bucket (your artifacts); log in with the credentials from the `.env` file.

- Configure your client side.
 
To run MLflow projects you need the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables present on the client side. You also need to specify the addresses of your S3 server (MinIO) and of the MLflow tracking server:
```
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
export MLFLOW_TRACKING_URI=http://localhost:5000
```
You can load them from the `.env` file. Create a `.env` file inside this repo folder and paste:
```
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
AWS_REGION=us-east-1
AWS_BUCKET_NAME=mlflow
MYSQL_DATABASE=mlflow
MYSQL_USER=mlflow_user
MYSQL_PASSWORD=mlflow_password
MYSQL_ROOT_PASSWORD=toor
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
MLFLOW_TRACKING_URI=http://localhost:5000
```
Then run:

```
source .env
```

Note that sourcing plain `KEY=VALUE` lines sets shell variables without exporting them to child processes such as `mlflow`; to export everything the file defines, use `set -a; source .env; set +a`. Alternatively, add the variables as `export X=Y` lines to your `~/.bashrc` and then run:

```
source ~/.bashrc
```
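If your client code is in Python, another option (an assumption on my part; `python-dotenv` is not part of this repo) is to load the file programmatically:

```python
from dotenv import load_dotenv  # pip install python-dotenv

# Reads KEY=VALUE pairs from .env and injects them into os.environ,
# so child processes like `mlflow` inherit them.
load_dotenv(".env")
```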
- Test the pipeline with the command below. It uses conda by default; if you don't have conda installed, run with the `--no-conda` flag:

```
mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```
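If you don't have SSH access to GitHub configured, the HTTPS form of the same URL (`https://github.com/databricks/mlflow-example.git`) works with `mlflow run` as well.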
Optionally you can run:

```
python ./quickstart/mlflow_tracking.py
```
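For reference, a minimal sketch of what such a tracking script does (the parameter, metric, and file names below are illustrative, not the repo's actual script):

```python
import mlflow

# MLFLOW_TRACKING_URI and the S3/AWS variables are picked up from the
# environment set above, so no extra configuration is needed here.
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)   # example parameter
    mlflow.log_metric("rmse", 0.78)  # example metric

    # Artifacts end up in the MinIO "mlflow" bucket.
    with open("hello.txt", "w") as f:
        f.write("hello from mlflow")
    mlflow.log_artifact("hello.txt")
```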
- (Optional) If you are constantly switching environments, you can set the variables inline for a single command:

```
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 MLFLOW_TRACKING_URI=http://localhost:5000 mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```
## Improvements needed
- The database is very slow to boot up, and `tracker_mlflow` crashes because the DB is not ready yet. We need a wait script so that MLflow starts only after the database is up; see the sketch below.
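A possible shape for that wait script, sketched in Python (hypothetical; `mlflow_db` and port 3306 are the MySQL service name from this setup and MySQL's default port):

```python
import socket
import time

def wait_for(host: str, port: int, timeout_s: int = 120) -> None:
    """Block until a TCP connection to host:port succeeds, or raise on timeout."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return  # the database is accepting connections
        except OSError:
            time.sleep(1)
    raise TimeoutError(f"{host}:{port} did not become reachable in {timeout_s}s")

# Run inside the tracker_mlflow container before starting the server.
wait_for("mlflow_db", 3306)
```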