Using Ollama

Ollama is a tool for running large language models locally. It has been tested running on a single node. Versions are currently available on both the Koko and Athene clusters and can be loaded using the following command.

module load ollama

The Ollama server can be started with the following command. Please note that you must run these commands through Slurm using srun or sbatch. Also make sure you allocate a substantial number of cores, GPUs, and memory to the job. We recommend a minimum of 128 GB of memory and 64 cores.

ollama-linux-amd64 serve &
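
The steps above can be combined into a single batch job. The sketch below writes a job script to ollama_job.sh (a hypothetical file name). The core and memory values follow the minimum recommended on this page; the job name, time limit, GPU request syntax and the fixed start-up wait are assumptions, not site settings, so adjust them to the cluster's Slurm configuration.

```shell
#!/bin/bash
# Write a Slurm job script to ollama_job.sh (hypothetical file name).
# The quoted 'EOF' keeps $! and $SERVER_PID literal inside the job script.
cat > ollama_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=ollama
#SBATCH --nodes=1
#SBATCH --cpus-per-task=64
#SBATCH --mem=128G
# Assumption: the GPU request syntax and the time limit depend on the cluster.
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

module load ollama

# Start the Ollama server in the background and remember its process ID.
ollama-linux-amd64 serve &
SERVER_PID=$!

# Assumption: a fixed wait is enough for the server to start listening.
sleep 10

# Confirm the server responds by listing the locally available models.
ollama-linux-amd64 list

# Stop the server before the job ends.
kill "$SERVER_PID"
EOF

# Submit the job with: sbatch ollama_job.sh
```

Writing the script with a here-document keeps the whole example in one place; the same lines could equally be saved by hand in an editor and submitted with sbatch.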

The ollama command line is used as follows.

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information
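
As an illustration of how the list, pull and run commands fit together, the following sketch defines a small helper function. The function name ask_model, the OLLAMA_BIN variable and the model name in the example call are hypothetical; it assumes the server has already been started with serve inside the same job.

```shell
#!/bin/bash
# Binary name as used on this page; set OLLAMA_BIN beforehand if it differs.
OLLAMA_BIN="${OLLAMA_BIN:-ollama-linux-amd64}"

# ask_model MODEL PROMPT
# Pulls MODEL if its name does not appear in the output of "list",
# then runs a single prompt against it.
ask_model() {
  local model="$1" prompt="$2"
  if ! "$OLLAMA_BIN" list | grep -qF -- "$model"; then
    "$OLLAMA_BIN" pull "$model"
  fi
  "$OLLAMA_BIN" run "$model" "$prompt"
}

# Example call (the model name is a placeholder, not a recommendation):
# ask_model llama3 "Summarise what Slurm does in one sentence."
```

Checking list before pull avoids contacting the registry when the model is already present on disk.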
