11 Commits

Author          SHA1         Message                                               Date
Tomasz Dłuski   b129e9a7cc   Update README.md                                      2021-12-03 22:22:39 +01:00
Tomasz Dłuski   57e981da9a   Update README.md                                      2021-12-03 22:21:40 +01:00
Tomasz Dłuski   b605078792   update readme                                         2021-12-03 22:14:24 +01:00
Tomasz Dłuski   6e798644da   s3: automatically create a bucket on startup         2021-12-03 22:11:10 +01:00
Tomasz Dłuski   4c4449110e   Update README.md                                      2021-12-03 21:42:52 +01:00
Tomasz Dłuski   5255f67780   Update README.md                                      2021-12-03 21:40:14 +01:00
Tomasz Dłuski   8bba55703c   Update LICENSE                                        2021-12-03 21:40:14 +01:00
Tomasz Dłuski   01e8abe89a   add named volumes instead of mapped local directory  2021-12-03 21:34:32 +01:00
Tomasz Dłuski   d0a5dfbde0   add restart: unless-stopped                           2021-12-03 21:32:42 +01:00
Tomasz Dłuski   f92d4ec230   update minio to the newest version                    2021-12-03 21:29:36 +01:00
Tomasz Dłuski   2ad2c983db   dont expose database to public network closes #12     2021-12-03 21:21:51 +01:00
8 changed files with 95 additions and 163 deletions

4
.env

@@ -1,5 +1,5 @@
-AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
-AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
+AWS_ACCESS_KEY_ID=admin
+AWS_SECRET_ACCESS_KEY=sample_key
 AWS_REGION=us-east-1
 AWS_BUCKET_NAME=mlflow
 MYSQL_DATABASE=mlflow
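These same values are what an mlflow client needs at run time; the README's `./bashrc_install.sh` / `./bashrc_generate.sh` scripts emit them for bash. A minimal Python-side sketch, assuming the new `.env` defaults above and the ports published in `docker-compose.yml`:

```python
# Hypothetical client-side equivalent of the generated bashrc exports.
import os

os.environ.update({
    "AWS_ACCESS_KEY_ID": "admin",                       # from .env
    "AWS_SECRET_ACCESS_KEY": "sample_key",              # from .env
    "MLFLOW_S3_ENDPOINT_URL": "http://localhost:9000",  # minio S3 API
    "MLFLOW_TRACKING_URI": "http://localhost:5000",     # mlflow server
})
```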

Caddyfile

@@ -1,22 +0,0 @@
-# Minio Console
-s3.localhost:9001 {
-    handle_path /* {
-        reverse_proxy s3:9001
-    }
-}
-
-# Minio API
-s3.localhost:9000 {
-    handle_path /* {
-        reverse_proxy s3:9000
-    }
-}
-
-mlflow.localhost {
-    basicauth /* {
-        root JDJhJDEwJEVCNmdaNEg2Ti5iejRMYkF3MFZhZ3VtV3E1SzBWZEZ5Q3VWc0tzOEJwZE9TaFlZdEVkZDhX # root hiccup
-    }
-    handle_path /* {
-        reverse_proxy mlflow:5000
-    }
-}
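The long basicauth token in the removed file is a base64-encoded bcrypt digest of the password recorded in its trailing comment (user `root`, password `hiccup`); Caddy generates these with its `caddy hash-password` command. A hypothetical reproduction in Python, assuming the third-party `bcrypt` package:

```python
# Sketch only: rebuild a Caddy v2 basicauth token (base64 of a bcrypt hash).
import base64

import bcrypt  # assumption: pip install bcrypt

digest = bcrypt.hashpw(b"hiccup", bcrypt.gensalt(rounds=10, prefix=b"2a"))
print(base64.b64encode(digest).decode())  # begins with JDJhJDEw..., like the token above
```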

LICENSE

@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2020 Tomasz Dłuski
+Copyright (c) 2021 Tomasz Dłuski
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

104
README.md

@@ -1,86 +1,32 @@
 # MLFlow Docker Setup [![Actions Status](https://github.com/Toumash/mlflow-docker/workflows/VerifyDockerCompose/badge.svg)](https://github.com/Toumash/mlflow-docker/actions)
-If you want to boot up mlflow project with one-liner - this repo is for you.
-The only requirement is docker installed on your system and we are going to use Bash on linux/windows.
-[![Youtube tutorial](https://img.youtube.com/vi/ma5lA19IJRA/0.jpg)](https://www.youtube.com/watch?v=ma5lA19IJRA)
+> If you want to boot up an mlflow project with a one-liner - this repo is for you.
+> The only requirement is docker installed on your system, and we are going to use Bash on linux/windows.
+
+# 🚀 1-2-3! Setup guide
+
+1. Configure the `.env` file to your liking. You can put there anything you like; it will be used to configure your services
+2. Run `docker compose up`
+3. Open up http://localhost:5000 for MlFlow, and http://localhost:9001/ to browse your files in the S3 artifact store
+
+**👇 Video tutorial on how to set it up on Microsoft Azure 👇**
+
+[![Youtube tutorial](https://user-images.githubusercontent.com/9840635/144674240-f1ede224-410a-4b77-a7b8-450f45cc79ba.png)](https://www.youtube.com/watch?v=ma5lA19IJRA)
+
 # Features
-- Setup by one file (.env)
-- Production-ready docker volumes
-- Separate artifacts and data containers
-- [Artifacts GUI](https://min.io/)
-- Ready bash scripts to copy and paste for colleagues to use your server!
+- One-file setup (.env)
+- Minio S3 artifact store with GUI
+- MySql mlflow storage
+- Ready-to-use bash scripts for python development!
+- Automatically created s3 buckets
-## Simple setup guide
-1. Configure `.env` file for your choice. You can put there anything you like, it will be used to configure your services
-2. Run the Infrastructure by this one line:
-```shell
-$ docker-compose up -d
-Creating network "mlflow-basis_A" with driver "bridge"
-Creating mlflow_db ... done
-Creating tracker_mlflow ... done
-Creating aws-s3 ... done
-```
-3. Create mlflow bucket. You can use my bundled script.
-Just run
-```shell
-bash ./run_create_bucket.sh
-```
-You can also do it **either using AWS CLI or Python Api**.
-<details><summary>AWS CLI</summary>
-1. [Install AWS cli](https://aws.amazon.com/cli/) **Yes, i know that you dont have an Amazon Web Services Subscription - dont worry! It wont be needed!**
-2. Configure AWS CLI - enter the same credentials from the `.env` file
-```shell
-aws configure
-```
-> AWS Access Key ID [****************123]: AKIAIOSFODNN7EXAMPLE
-> AWS Secret Access Key [****************123]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
-> Default region name [us-west-2]: us-east-1
-> Default output format [json]: <ENTER>
-3. Run
-```shell
-aws --endpoint-url=http://localhost:9000 s3 mb s3://mlflow
-```
-</details>
-<details><summary>Python API</summary>
-1. Install Minio
-```shell
-pip install Minio
-```
-2. Run this to create a bucket
-```python
-from minio import Minio
-from minio.error import ResponseError
-
-s3Client = Minio(
-    'localhost:9000',
-    access_key='<YOUR_AWS_ACCESSS_ID>',  # copy from .env file
-    secret_key='<YOUR_AWS_SECRET_ACCESS_KEY>',  # copy from .env file
-    secure=False
-)
-s3Client.make_bucket('mlflow')
-```
-</details>
----
-4. Open up http://localhost:5000 for MlFlow, and http://localhost:9000/minio/mlflow/ for the S3 bucket (your artifacts) with credentials from the `.env` file
-5. Configure your client-side
+## How to use in ML development in python
+
+<details>
+<summary>Click to show</summary>
+
+1. Configure your client-side
 
 For running mlflow files you need various environment variables set on the client side. To generate them, use the convenience script `./bashrc_install.sh`, which installs it on your system, or `./bashrc_generate.sh`, which just displays the config to copy & paste.
@@ -89,7 +35,7 @@ For running mlflow files you need various environment variables set on the client side.
 The script installs these variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, MLFLOW_S3_ENDPOINT_URL, MLFLOW_TRACKING_URI. All of them are needed to use mlflow from the client-side.
-6. Test the pipeline with the command below, using conda. If you don't have conda installed, run with `--no-conda`
+2. Test the pipeline with the command below, using conda. If you don't have conda installed, run with `--no-conda`
 ```shell
 mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
@@ -97,8 +43,16 @@ mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
 python ./quickstart/mlflow_tracking.py
 ```
-7. *(Optional)* If you are constantly switching your environment you can use this environment variable syntax
+3. *(Optional)* If you are constantly switching your environment, you can use this environment variable syntax
 ```shell
 MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 MLFLOW_TRACKING_URI=http://localhost:5000 mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
 ```
+
+</details>
+
+## Licensing
+
+Copyright (c) 2021 Tomasz Dłuski
+
+Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file [LICENSE](./LICENSE) in the repository.
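A side note on the Python snippet deleted above: it imports `ResponseError`, which the `minio` package removed in 7.0 in favour of `S3Error`. A minimal updated sketch, assuming the new `.env` defaults (`admin` / `sample_key`):

```python
# Sketch for minio >= 7.0, where ResponseError no longer exists.
from minio import Minio
from minio.error import S3Error

client = Minio(
    "localhost:9000",
    access_key="admin",       # copy from .env
    secret_key="sample_key",  # copy from .env
    secure=False,
)
try:
    if not client.bucket_exists("mlflow"):
        client.make_bucket("mlflow")
except S3Error as err:
    print(f"bucket creation failed: {err}")
```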

docker-compose.yml

@@ -1,38 +1,23 @@
-version: '3.2'
+version: "3.9"
 services:
-  caddy:
-    image: caddy:2-alpine
-    container_name: caddy
-    volumes:
-      - ./Caddyfile:/etc/caddy/Caddyfile
-      - /caddy/data:/data
-      - /caddy/config:/config
-    ports:
-      - 80:80
-      - 443:443
-      - 9000:9000
-      - 9001:9001
-    restart: unless-stopped
   s3:
-    restart: always
-    image: minio/minio:latest
-    container_name: aws-s3
+    image: minio/minio:RELEASE.2021-11-24T23-19-33Z
+    restart: unless-stopped
     ports:
-      - 9000
-      - 9001
+      - "9000:9000"
+      - "9001:9001"
     environment:
       - MINIO_ROOT_USER=${AWS_ACCESS_KEY_ID}
       - MINIO_ROOT_PASSWORD=${AWS_SECRET_ACCESS_KEY}
-    command:
-      server /date --console-address ":9001"
-    volumes:
-      - ./s3:/date
+    command: server /data --console-address ":9001"
     networks:
-      - default
-      - proxy-net
+      - internal
+      - public
+    volumes:
+      - minio_volume:/data
   db:
-    restart: always
     image: mysql/mysql-server:5.7.28
+    restart: unless-stopped
     container_name: mlflow_db
     expose:
       - "3306"
@@ -42,13 +27,13 @@ services:
       - MYSQL_PASSWORD=${MYSQL_PASSWORD}
       - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
     volumes:
-      - ./dbdata:/var/lib/mysql
+      - db_volume:/var/lib/mysql
     networks:
-      - default
+      - internal
   mlflow:
-    restart: always
     container_name: tracker_mlflow
     image: tracker_ml
+    restart: unless-stopped
     build:
       context: ./mlflow
       dockerfile: Dockerfile
@@ -59,11 +44,26 @@ services:
       - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
       - AWS_DEFAULT_REGION=${AWS_REGION}
       - MLFLOW_S3_ENDPOINT_URL=http://s3:9000
-    entrypoint: mlflow server --backend-store-uri mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@db:3306/${MYSQL_DATABASE} --default-artifact-root s3://${AWS_BUCKET_NAME}/ -h 0.0.0.0
     networks:
-      - proxy-net
-      - default
+      - public
+      - internal
+    entrypoint: bash ./wait-for-it.sh db:3306 -t 90 -- mlflow server --backend-store-uri mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@db:3306/${MYSQL_DATABASE} --default-artifact-root s3://${AWS_BUCKET_NAME}/ -h 0.0.0.0
+  create_s3_buckets:
+    image: minio/mc
+    depends_on:
+      - "s3"
+    entrypoint: >
+      /bin/sh -c "
+      until (/usr/bin/mc alias set minio http://s3:9000 '${AWS_ACCESS_KEY_ID}' '${AWS_SECRET_ACCESS_KEY}') do echo '...waiting...' && sleep 1; done;
+      /usr/bin/mc mb minio/mlflow;
+      exit 0;
+      "
+    networks:
+      - internal
 networks:
-  default:
-  proxy-net:
-    driver: bridge
+  internal:
+  public:
+volumes:
+  db_volume:
+  minio_volume:
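With the one-shot `create_s3_buckets` container above, the `mlflow` bucket should exist shortly after `docker compose up`. A hypothetical smoke test with `boto3` against the published minio port, using the default `.env` credentials:

```python
# Sketch: confirm the auto-created bucket is visible on the local minio endpoint.
import boto3  # assumption: pip install boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="admin",
    aws_secret_access_key="sample_key",
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])  # expect: ['mlflow']
```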

mlflow/Dockerfile

@@ -1,10 +1,10 @@
 FROM continuumio/miniconda3:latest
+RUN pip install mlflow boto3 pymysql
 ADD . /app
 WORKDIR /app
 COPY wait-for-it.sh wait-for-it.sh
 RUN chmod +x wait-for-it.sh
-RUN pip install mlflow boto3 pymysql

quickstart/mlflow_tracking.py

@@ -5,7 +5,7 @@ from mlflow import mlflow,log_metric, log_param, log_artifacts
 if __name__ == "__main__":
     with mlflow.start_run() as run:
-        mlflow.set_tracking_uri('https://mlflow.localhost')
+        mlflow.set_tracking_uri('http://localhost:5000')
         print("Running mlflow_tracking.py")
         log_param("param1", randint(0, 100))
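Beyond the URL swap, note that `set_tracking_uri` is still called inside the `with mlflow.start_run()` block, after the run has already been created against whatever URI was active. A minimal sketch of the conventional ordering, using the same calls as the quickstart script:

```python
# Sketch: configure the tracking URI before the run starts, not inside it.
from random import randint

import mlflow
from mlflow import log_metric, log_param

mlflow.set_tracking_uri("http://localhost:5000")  # point at the dockerized tracker first

if __name__ == "__main__":
    with mlflow.start_run():
        print("Running mlflow_tracking.py")
        log_param("param1", randint(0, 100))
        log_metric("foo", randint(0, 100))
```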