Josh Peng

Full-Stack Data Scientist

mlflow - Naming Your Experiments
May 12, 2019

One tool we’ve recently been evaluating for our data science team here at Clutter is mlflow. We are particularly interested in its model tracking component, which looks incredibly useful for keeping journal-esque logs of runs that all of our data scientists can share. Yay for reproducibility. Yay for collaboration.

One thing we struggled with early on, though, was naming and organizing our experiments inside mlflow. It seemed that experiments were given integer IDs and got their human-readable names when they were created. The mlflow UI offers no way to edit those names afterwards. We first figured out that we could edit the meta.yaml file in the local file store on the server, but that felt very inconvenient. Upon further investigation, we found that mlflow can now run on a SQL backend, with significant performance benefits over the default file store. We hopped on this immediately and restarted our server with a Postgres backend, and now we can edit experiment names directly in SQL! Easy!
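As a rough illustration of what "editing experiment names directly in SQL" can look like, here is a minimal sketch. It assumes the backend stores experiments in a table named experiments with experiment_id and name columns; check the schema of your own mlflow version before running anything like this. To stay self-contained it uses an in-memory SQLite table as a stand-in for Postgres, and rename_experiment is a hypothetical helper, not part of mlflow.

```python
import sqlite3

# Stand-in for the tracking database: an in-memory SQLite table holding only
# the columns this sketch needs. The real mlflow schema has more columns and
# may differ between versions, so inspect it before running an UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE experiments ("
    "experiment_id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)
conn.execute("INSERT INTO experiments (experiment_id, name) VALUES (1, 'test')")


def rename_experiment(conn, experiment_id, new_name):
    """Rename one experiment by ID and return the number of rows changed."""
    cur = conn.execute(
        "UPDATE experiments SET name = ? WHERE experiment_id = ?",
        (new_name, experiment_id),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical rename: give experiment 1 a more descriptive name.
rename_experiment(conn, 1, "churn-model-baseline")
```

Against a real Postgres backend, the same UPDATE statement could be run from psql or through a driver such as psycopg2, which uses %s rather than ? as its parameter placeholder.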

How to launch mlflow server with Postgres

Note: you’ll need to use mlflow 0.9.0 (not 0.9.1) for now due to this bug

mlflow server \
    --backend-store-uri postgresql://<user>:<password>@<host>:<port>/<database_name> \
    --default-artifact-root s3://<my-mlflow-bucket>/ \
    --host 0.0.0.0
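One practical snag with the URI above: if the database password contains characters such as @ or /, they generally need to be percent-encoded before they go into the connection string. The sketch below shows one way to build the value for --backend-store-uri; backend_store_uri is a hypothetical helper, and the user, password, host, and database values are made up for illustration.

```python
from urllib.parse import quote_plus


def backend_store_uri(user, password, host, port, database):
    """Build a postgresql:// URI, percent-encoding the user and password."""
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# Made-up credentials; '@' and '/' in the password are encoded as %40 and %2F.
uri = backend_store_uri("mlflow", "p@ss/word", "db.internal", 5432, "mlflow")
print(uri)  # postgresql://mlflow:p%40ss%2Fword@db.internal:5432/mlflow
```

The printed string can then be pasted into the mlflow server command in place of the placeholder URI.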

Tags: #tooling
