MLflow#

MLflow is a popular tool for tracking machine learning models. These pages focus on aspects of using MLflow.

Learn more:

Tracking server overview on the official MLflow site;
MLflow tracking quickstart on the official MLflow site;
Official MLflow Docker image.

Start tracking server#

To record runs, MLflow is usually pointed at a tracking server. Here we describe how to start one locally with Docker. This setup is used for most of the examples in this section.

Pull the Docker image.

!docker pull ghcr.io/mlflow/mlflow > /dev/null 2>&1

The following command starts the MLflow server in Docker; the same mlflow server command can also be run directly on the host.

%%bash
docker run -p 5000:5000 -dt --name my_server --rm \
    ghcr.io/mlflow/mlflow \
    bash -c "mlflow server --host 0.0.0.0 --port 5000"

b8975190736e71d255d5f3d05f05ad7a38178f045569faa06b592b899ce75644
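Because the container starts in detached mode, the server may need a moment before it accepts connections. The following is a minimal sketch of a readiness check that uses only the Python standard library. The helper name wait_for_server and its parameters are hypothetical and not part of MLflow; it simply polls a URL until it gets an HTTP 200 response or gives up.

```python
import http.client
import time
import urllib.request


def wait_for_server(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timed out, or reset: the server is not ready yet.
            pass
        time.sleep(interval)
    return False
```

Calling wait_for_server("http://localhost:5000") after docker run would return True once the UI responds, which can be a sturdier alternative to a fixed pause.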

Don't forget to stop the server when you are done with it.

!docker stop my_server

my_server

While the server is running, you can open the MLflow UI in a browser at http://localhost:5000. Let's take a look at it using a Selenium screenshot.

from time import sleep
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service
from PIL import Image
import io

# Set up Firefox options
options = Options()
options.binary_location = "/usr/bin/firefox"

driver = webdriver.Firefox(
    # pass the configured options rather than a fresh Options() instance
    options=options,
    service=Service(
        # path to the geckodriver on my computer
        executable_path="/snap/bin/geckodriver",
        log_output="/dev/null"
    )
)
driver.get("http://localhost:5000")

sleep(3)

screenshot = driver.get_screenshot_as_png()
image = Image.open(io.BytesIO(screenshot))
display(image)

driver.quit()

[Screenshot: MLflow UI at http://localhost:5000]

Connecting#

Use mlflow.set_tracking_uri to tell the Python API which tracking server URI to use.

Environment variables#

You can specify environment variables to connect to MLflow:

MLFLOW_TRACKING_URI: specifies the URI used to connect to the MLflow server.
MLFLOW_TRACKING_USERNAME: specifies the username used to log in to MLflow.
MLFLOW_TRACKING_PASSWORD: specifies the password used to log in to MLflow.

Once these variables are set in the environment of the Python process, MLflow will find and use them automatically.

Creating run#

A run in MLflow is a single execution of a machine learning experiment or a piece of code that you want to track and record. Each run has associated metadata and recorded metrics, parameters, artifacts, and tags.

You can start a run with mlflow.start_run() and end it with mlflow.end_run(). Pass the run_name parameter to mlflow.start_run() to set the name of the run. The following cell shows an example:

import mlflow
mlflow.set_tracking_uri("http://localhost:5000/")

mlflow.start_run(run_name="my_run_name")
mlflow.end_run()

Note that it is fine to have several runs with the same name: each run gets its own run ID, so you can think of them as different versions of the same run.