It was started by former Docker employees.

Analogous to a Dockerfile, there’s a Modelfile. With a Modelfile, you can layer and compose models with different settings, such as the temperature and system instructions.
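To make the layering idea concrete, here is a minimal sketch that renders a Modelfile from a base model, a temperature, and a set of instructions. It assumes the tool is Ollama and uses its `FROM`, `PARAMETER`, and `SYSTEM` instructions; the helper name `render_modelfile`, the base model `llama2`, and the prompt text are illustrative choices, not anything from the original post.

```python
def render_modelfile(base_model: str, temperature: float, system_prompt: str) -> str:
    """Return Modelfile text that layers settings on top of a base model.

    Hypothetical helper for illustration; the instruction names are
    assumed to follow Ollama's Modelfile syntax.
    """
    lines = [
        f"FROM {base_model}",                    # base model to build on
        f"PARAMETER temperature {temperature}",  # sampling temperature
        f'SYSTEM """{system_prompt}"""',         # instructions for the model
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are placeholders, not recommendations.
    print(render_modelfile("llama2", 0.7, "You are a concise assistant."))
```

Saving that output as `Modelfile` and running something like `ollama create my-model -f Modelfile` should register the customized model, where `my-model` is a placeholder name.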

It runs a simple HTTP server to take traffic and also exposes a CLI. It can also stream responses token by token.
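As a rough sketch of what consuming the token-by-token output could look like, the code below reads a stream of newline-delimited JSON objects and joins the text fragments. It assumes the tool is Ollama, that the server listens on `http://localhost:11434`, and that `/api/generate` emits objects with `response` and `done` fields; the function names and the model name `llama2` are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        yield chunk.get("response", "")
        if chunk.get("done"):
            break  # the server marks the final object with done=true


def stream_generate(prompt: str, model: str = "llama2",
                    url: str = "http://localhost:11434/api/generate") -> Iterator[str]:
    """POST a prompt and yield fragments as they arrive (URL and model are assumptions)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # Iterating over the response yields one line of bytes at a time.
        yield from iter_tokens(response)


if __name__ == "__main__":
    # Canned data so the sketch runs without a server.
    canned = [
        b'{"response": "Hel", "done": false}\n',
        b'{"response": "lo", "done": false}\n',
        b'{"response": "", "done": true}\n',
    ]
    print("".join(iter_tokens(canned)))
```

With a server running locally, `for piece in stream_generate("Why is the sky blue?"): print(piece, end="", flush=True)` would print fragments as they arrive.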

The nice part is that it makes the productionization process quite simple for non-ML developers.