BRING YOUR
OWN BRAIN.
Don't marry your AI. Swap it.
TOSS is an open runtime for intelligence. Load Llama for chat, Phi for math, or Gemma for creative writing.
Smol Brain (T5)
Logic Driver
Weight Size: 250MB
Instant reflex. Runs on a toaster. Perfect for background tasks and quick fact checks.
Inference Speed: 95/100
Reasoning Depth: 40/100
Memory Load: 15%
Native .GGUF Support.
We don't reinvent the wheel. TOSS loads the same GGUF model files that llama.cpp uses. If it runs on llama.cpp, it runs on your phone.
.GGUF
Standard Quantized Format
.ONNX
Microsoft Runtime
.TFLITE
Legacy Mobile Models
.BIN
Raw Binary Weights
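For the curious, here is a rough sketch of how a loader might recognize the formats listed above. It is not TOSS's actual code: the `FORMATS` table and the `describe_model_file` helper are hypothetical names made up for illustration. The one hard fact it relies on is that GGUF files start with the ASCII magic `GGUF` followed by a little-endian 32-bit version number.

```python
import struct
from pathlib import Path

# Labels copied from the format list above; the mapping itself is illustrative.
FORMATS = {
    ".gguf": "Standard Quantized Format",
    ".onnx": "Microsoft Runtime",
    ".tflite": "Legacy Mobile Models",
    ".bin": "Raw Binary Weights",
}


def describe_model_file(path):
    """Return (label, gguf_version); gguf_version is None unless the file is GGUF."""
    path = Path(path)
    suffix = path.suffix.lower()
    label = FORMATS.get(suffix)
    if label is None:
        raise ValueError(f"unsupported model format: {suffix!r}")

    version = None
    if suffix == ".gguf":
        with path.open("rb") as f:
            header = f.read(8)
        # GGUF header: 4-byte magic "GGUF", then a little-endian uint32 version.
        if len(header) < 8 or header[:4] != b"GGUF":
            raise ValueError("file has a .gguf extension but no GGUF magic")
        (version,) = struct.unpack("<I", header[4:8])
    return label, version
```

Checking the magic bytes as well as the extension catches renamed or truncated downloads before any weights are loaded.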
Optimization
Big brains. Small footprint.
TOSS uses 4-bit quantization (Q4_K_M) to compress massive neural networks to roughly a quarter of their FP16 size, into files smaller than a movie.
Download Once
Get the model from Hugging Face directly in the app. No PC required.
Tune Parameters
Adjust Temperature, Top-K, and Context Window to change the AI's personality.
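To give a feel for what Temperature and Top-K actually do, here is a minimal sketch of one common way to sample the next token. It assumes the model hands back a plain list of raw scores (logits). The `sample_token` function and its default values are hypothetical, not TOSS settings. Context Window is not covered here, since it limits how much text the model sees rather than how a token is picked.

```python
import math
import random


def sample_token(logits, temperature=0.8, top_k=40, rng=None):
    """Pick one token index from raw logits using temperature and top-k."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Top-K: keep only the k highest-scoring candidates.
    k = max(1, min(top_k, len(logits)))
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [logits[i] / temperature for i in candidates]

    # Softmax weights, shifted by the max score for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `top_k=1` the function always returns the highest-scoring token, which gives the most predictable output. Raising either setting lets less likely tokens through, which is where the change in "personality" comes from.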
Q4
FP16: 12GB
INT4: 3GB
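The two figures above follow from simple arithmetic: file size is roughly parameter count times bits per weight. The sketch below reproduces them, assuming a model of about 6 billion parameters (a number implied by 12GB at 16 bits per weight, not one stated on this page) and decimal gigabytes. `weight_size_gb` is an illustrative helper.

```python
def weight_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: parameter count inferred from 12 GB at 16 bits per weight.
N_PARAMS = 6_000_000_000

fp16_gb = weight_size_gb(N_PARAMS, 16)
int4_gb = weight_size_gb(N_PARAMS, 4)
print(f"FP16: {fp16_gb:.0f}GB, INT4: {int4_gb:.0f}GB")
```

Real quantized files typically come out somewhat larger than this estimate, because formats such as Q4_K_M store scaling metadata alongside the 4-bit values.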