# MLflow All-In-One PoC

The AWS S3 setup is based on [this article](https://dev.to/goodidea/how-to-fake-aws-locally-with-localstack-27me).

1. [Install the AWS CLI](https://aws.amazon.com/cli/). **You do not need an Amazon Web Services subscription for this setup.**
2. Configure the `.env` file to your liking.
3. Configure the AWS CLI, entering the same credentials as in the `.env` file:
```shell
aws configure
```
> AWS Access Key ID [****************123]: AKIAIOSFODNN7EXAMPLE
> AWS Secret Access Key [****************123]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
> Default region name [us-west-2]: us-east-1
> Default output format [json]: `<ENTER>`

4. Create the `mlflow` bucket:
```shell
aws --endpoint-url=http://localhost:9000 s3 mb s3://mlflow
```
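Before moving on, it can help to confirm that the bucket was actually created. The following is a minimal sketch, assuming the AWS CLI from step 1 is on your `PATH` and the S3-compatible service listens on `http://localhost:9000` as above. The function name `bucket_exists` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: succeeds (exit status 0) if the bucket is reachable.
# Usage: bucket_exists <bucket> [endpoint-url]
bucket_exists() {
    # `s3api head-bucket` exits non-zero when the bucket is missing or
    # inaccessible; the endpoint defaults to the local service (assumption).
    aws --endpoint-url="${2:-http://localhost:9000}" \
        s3api head-bucket --bucket "$1" >/dev/null 2>&1
}

# Example call (needs the local S3 service to be running):
#   bucket_exists mlflow && echo "bucket mlflow is ready"
```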
5. Open http://localhost:5000/#/ for MLflow and http://localhost:9000/minio/mlflow/ for the S3 bucket (your artifacts), logging in with the credentials from the `.env` file.

6. Configure the S3 keys.

To run MLflow projects, the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables must be present on the client side:
```shell
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```
You can load them from the `.env` file like so:
```shell
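# set -a exports every variable assigned while sourcing, in case .env omits `export`
set -a; source .env; set +a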
```
or add them to your `~/.bashrc` file and then run:
```shell
source ~/.bashrc
```
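Whichever way the keys are loaded, they only reach `mlflow` if they are exported. A small sketch of a check you could run first is shown here; the helper name `require_env` is hypothetical and it assumes `printenv` is available, as it is on most Linux and macOS systems.

```shell
# Hypothetical helper: report any named variable that is missing from the
# exported environment. Returns non-zero if at least one is missing.
require_env() {
    missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what child
        # processes such as `mlflow` will inherit.
        if ! printenv "$name" >/dev/null; then
            echo "missing or not exported: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call:
#   require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || echo "fix your environment first"
```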
7. Test the pipeline with the command below, which uses conda. If you don't have conda installed, add `--no-conda` (recent MLflow releases replace this flag with `--env-manager=local`):
```shell
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 MLFLOW_TRACKING_URI=http://localhost:5000 mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
```
Optionally, you can also run the quickstart tracking script:
```shell
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000 MLFLOW_TRACKING_URI=http://localhost:5000 python ./quickstart/mlflow_tracking.py
```
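Both commands above prefix the same two variables, which apply to that single command only. If you run several commands this way, a small wrapper can avoid the repetition. This is only a sketch: the name `with_local_mlflow` is hypothetical, and the URLs are the ones used in this README.

```shell
# Hypothetical wrapper: run any command with the two MLflow settings applied
# to that command only, leaving the calling shell untouched.
with_local_mlflow() {
    env MLFLOW_S3_ENDPOINT_URL="http://localhost:9000" \
        MLFLOW_TRACKING_URI="http://localhost:5000" \
        "$@"
}

# Example call:
#   with_local_mlflow python ./quickstart/mlflow_tracking.py
```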
8. To make the settings permanent, add `MLFLOW_S3_ENDPOINT_URL` and `MLFLOW_TRACKING_URI` to your `~/.bashrc`:
```bash
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
export MLFLOW_TRACKING_URI=http://localhost:5000
```
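If you prefer to script this step, one possible approach is to append each line only when it is not already present, so that running the setup twice does not duplicate entries. The helper name `append_once` is hypothetical; the sketch assumes a `grep` that supports `-q`, `-x`, and `-F`, which POSIX specifies.

```shell
# Hypothetical helper: append a line to a file unless an identical line exists.
# Usage: append_once <file> <line>
append_once() {
    # -q: quiet, -x: match whole lines only, -F: treat the pattern as a fixed string
    if ! grep -qxF -e "$2" "$1" 2>/dev/null; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Example calls:
#   append_once ~/.bashrc 'export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000'
#   append_once ~/.bashrc 'export MLFLOW_TRACKING_URI=http://localhost:5000'
```

Run `source ~/.bashrc` afterwards, or open a new terminal, for the change to take effect.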