Ollama is a tool for running large language models on local hardware, and it has been tested running on a single node. It is currently available on both the Koko and Athene clusters and can be loaded using the following command.
module load ollama
Once loaded, it may be executed using the commands below. Please note that these commands must be run through the scheduler via srun or sbatch, and that you should allocate a substantial number of cores, GPUs, and memory to the job; we recommend a minimum of 128 GB of memory and 64 cores. The first command starts the Ollama server in the background; running the binary with no arguments prints the usage summary shown below. A sketch of a suitable batch script follows the usage listing.
ollama-linux-amd64 serve &
ollama-linux-amd64
Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information
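
For example, a typical session started with srun first launches the server in the background, then pulls and runs a model. This is a minimal sketch; the model name (llama3) is an example, and any model from the Ollama registry can be substituted:

ollama-linux-amd64 serve &
ollama-linux-amd64 pull llama3
ollama-linux-amd64 run llama3 "Briefly explain what a Slurm batch script does."

When run is given a prompt as an argument, it prints the model's response and exits; without one, it opens an interactive chat session.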
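
For batch work, the same steps can be wrapped in an sbatch script. The sketch below reflects the resource recommendations above; the GPU request syntax, time limit, model name, and startup delay are assumptions and should be adjusted for your cluster:

#!/bin/bash
#SBATCH --job-name=ollama
#SBATCH --cpus-per-task=64
#SBATCH --mem=128G
#SBATCH --gres=gpu:1          # GPU request syntax may differ on your cluster
#SBATCH --time=01:00:00

module load ollama

# Start the Ollama server in the background and give it a moment to come up.
ollama-linux-amd64 serve &
sleep 10

# Pull a model and run a single prompt against it (llama3 is an example name).
ollama-linux-amd64 pull llama3
ollama-linux-amd64 run llama3 "Briefly explain what a Slurm batch script does." > response.txt

Save the script (for example as ollama_job.sh) and submit it with sbatch ollama_job.sh.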