AI server
The Computational Literacies Lab operates a shared AI server for use by lab members and collaborators. It gives you access to powerful open AI models for chat, audio transcription, image generation, and more, without sending your data to outside services like OpenAI or Google. All processing happens within UB's network.
The server lives in Chris's office and is built around a single machine with 700 GB of RAM. That's enough to run large, capable models that would be impossible to run on a laptop; think of it as a very powerful shared computer dedicated to AI work.
To use the server you need to be on the UB campus network or VPN, and you need an API key. Email Chris Proctor to request one, with a brief description of what you'd like to do.
Getting started
There are three main ways to use the server. All of them require your API key. The easiest way to supply it is to set it once as an environment variable in your terminal; then you never have to include it in your code:
export GSEAI_API_TOKEN=your-api-key
1. The gseai command-line tool
The simplest way to use the server, no Python required. Install it once:
uv tool install gseai
Then talk to a model directly from your terminal:
gseai chat gemma-4-e2b-it "What is machine learning?"
For an interactive back-and-forth conversation:
gseai chat gemma-4-e2b-it -i
Run gseai --help to see all available commands, or see the
full CLI reference.
2. The gseai Python package
If you want to call the server from a Python script or notebook, the gseai
package provides a simple interface. Add it to your project:
uv add gseai
Then use it in your code:
from gseai import GSEAIServer
server = GSEAIServer("your-api-key")
response = server.chat("gemma-4-e2b-it", "What is machine learning?")
print(response["choices"][0]["message"]["content"])
If you have GSEAI_API_TOKEN set in your environment, you can read it from there
instead of writing the key in your code:
import os
from gseai import GSEAIServer
server = GSEAIServer(os.environ["GSEAI_API_TOKEN"])
3. The OpenAI Python package
If you already have code that uses the OpenAI Python package, it will work with the GSEAI server with two small changes: point the client at the GSEAI server URL, and swap in your GSEAI API key.
from openai import OpenAI
client = OpenAI(
api_key="your-api-key",
base_url="http://gseai.gse.buffalo.edu:11434/v1",
)
response = client.chat.completions.create(
model="gemma-4-e2b-it",
messages=[{"role": "user", "content": "What is machine learning?"}],
)
print(response.choices[0].message.content)
Capabilities
Run gseai models to see what models are currently available. The examples below use specific model names; substitute whichever model you want to use.
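You can also check the model list from Python. The sketch below uses only the standard library and assumes the server's GET /v1/models endpoint follows the standard OpenAI response shape (a JSON object whose "data" field is a list of models, each with an "id"); the fetch_model_ids helper is illustrative, not part of gseai:

```python
import json
import urllib.request

def model_ids(payload):
    """Pull the model ids out of a /v1/models response body."""
    return [m["id"] for m in payload["data"]]

def fetch_model_ids(base_url, token):
    """Fetch and parse the model list from the server."""
    req = urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return model_ids(json.load(resp))

# With a live connection (requires the UB network or VPN):
# print(fetch_model_ids("http://gseai.gse.buffalo.edu:11434", "your-api-key"))
```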
Chat
Ask a model a question or start a conversation. The system prompt (system_prompt in Python, --system on the CLI) sets the model's persona or gives it standing instructions.
server = GSEAIServer(os.environ["GSEAI_API_TOKEN"])
response = server.chat(
"gemma-4-e2b-it",
"Summarize the main argument of this paper in two sentences.",
system_prompt="You are a helpful research assistant.",
)
print(response["choices"][0]["message"]["content"])
From the CLI:
gseai chat gemma-4-e2b-it "Summarize the main argument of this paper in two sentences." \
--system "You are a helpful research assistant."
Embeddings
An embedding is a numerical representation of a piece of text that captures its meaning. Embeddings are useful for tasks like finding similar documents, clustering interview excerpts, or building semantic search over a corpus.
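To compare two texts, embed each one and take the cosine similarity of the resulting vectors; similar meanings produce vectors pointing in similar directions. A sketch in plain Python (the commented server.embeddings calls need a live server):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With a live server, embed two texts and compare them:
# v1 = server.embeddings("nomic-embed-text", "Students felt overwhelmed.")["data"][0]["embedding"]
# v2 = server.embeddings("nomic-embed-text", "Students reported high stress.")["data"][0]["embedding"]
# print(cosine_similarity(v1, v2))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical vectors: 1.0
```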
server = GSEAIServer(os.environ["GSEAI_API_TOKEN"])
result = server.embeddings("nomic-embed-text", "Students described feeling overwhelmed.")
vector = result["data"][0]["embedding"]  # a list of numbers
Audio transcription
Transcribe spoken audio to text. Supported formats: WAV, MP3, M4A, OGG, FLAC.
server = GSEAIServer(os.environ["GSEAI_API_TOKEN"])
result = server.transcribe("whisper-1", "interview.mp3")
print(result["text"])
You can also produce SRT or VTT subtitle files:
srt = server.transcribe("whisper-1", "lecture.mp3", response_format="srt")
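If you have a whole folder of recordings, you can loop over them. A sketch built on the server.transcribe call above, assuming your audio files live in a recordings/ directory; the transcribe_folder helper is illustrative, not part of gseai:

```python
from pathlib import Path

def transcribe_folder(server, folder, model="whisper-1"):
    """Transcribe every .mp3 file in a folder; returns {filename: transcript}."""
    transcripts = {}
    for path in sorted(Path(folder).glob("*.mp3")):
        result = server.transcribe(model, str(path))
        transcripts[path.name] = result["text"]
    return transcripts

# With a live server:
# print(transcribe_folder(server, "recordings"))
```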
From the CLI:
gseai audio transcribe whisper-1 interview.mp3
gseai audio transcribe whisper-1 lecture.mp3 --format srt
Image generation
Generate an image from a text description.
import base64
server = GSEAIServer(os.environ["GSEAI_API_TOKEN"])
result = server.generate_image("sd-1.5-ggml", "a red barn in a snowy field")
image_bytes = base64.b64decode(result["data"][0]["b64_json"])
with open("barn.png", "wb") as f:
    f.write(image_bytes)
From the CLI (saves the file automatically):
gseai images generate sd-1.5-ggml "a red barn in a snowy field" --output barn.png
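Putting it together
These capabilities compose naturally. For example, you can transcribe an interview and then ask a chat model to summarize it. The summarize_interview helper below is a sketch built on the server.transcribe and server.chat calls shown above, not part of gseai:

```python
def summarize_interview(server, audio_path, chat_model="gemma-4-e2b-it",
                        audio_model="whisper-1"):
    """Transcribe an audio file, then ask a chat model for a short summary."""
    transcript = server.transcribe(audio_model, audio_path)["text"]
    response = server.chat(
        chat_model,
        f"Summarize this interview in three sentences:\n\n{transcript}",
        system_prompt="You are a helpful research assistant.",
    )
    return response["choices"][0]["message"]["content"]

# With a live server:
# print(summarize_interview(server, "interview.mp3"))
```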
For the complete API and CLI reference, see the gseai documentation.