Tag: inference
2 articles tagged "inference"
A post-mortem on building a local LLM serving layer: llama.cpp integration, model management, and where existing tools constrain research.
What happens when you run a full LLM on mobile hardware with zero cloud dependency: memory, latency, and model quality on consumer devices.