SLLM
Share a GPU node with other devs — unlimited tokens from $10/month
SLLM is a cohort-based GPU sharing service that lets developers pool costs to access large language models via an OpenAI-compatible API. You join a cohort, and when the group fills up, everyone is charged and gains access to a shared vLLM node running models like Llama 4 Scout, Qwen 3.5, GLM-5, Kimi K2.5, and DeepSeek variants. Plans run $10–40/month with throughput of 15–35 tokens/second.

The pitch is simple: most developers don't need a dedicated GPU, but they also don't want per-token billing anxiety. By splitting a node, you amortize the cost dramatically and get predictable flat-rate access. The API is fully OpenAI-compatible, so existing integrations just need a base URL swap, as sketched below.

The HN discussion revealed genuine enthusiasm for the concept but raised practical concerns about cohort fill times (you might wait weeks before your cohort opens) and whether 15–25 tok/s shared among hundreds of users is actually usable for interactive workflows. The founder was active in the thread defending the model.
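For the curious, here is a minimal sketch of what that swap looks like with the official OpenAI Python SDK. The base URL, API key, and model identifier are placeholders, since SLLM's actual endpoint and model names aren't documented here.

```python
# Minimal sketch: pointing an existing OpenAI integration at an
# OpenAI-compatible endpoint. Only the base_url and credentials change.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sllm.example/v1",  # hypothetical SLLM endpoint
    api_key="YOUR_SLLM_KEY",                 # issued when your cohort opens
)

# Everything downstream of the client is unchanged from stock OpenAI usage.
response = client.chat.completions.create(
    model="llama-4-scout",  # assumed model identifier, not confirmed
    messages=[{"role": "user", "content": "Summarize this changelog."}],
)
print(response.choices[0].message.content)
```

This is the usual pattern for vLLM-backed services, which expose the same chat-completions routes as OpenAI, so most SDKs and frameworks work with only configuration changes.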
Panel Reviews
“The flat-rate model removes token anxiety entirely. For background tasks, batch processing, or low-traffic tools, $10/month for unlimited calls to Llama 4 Scout is a no-brainer. Just make sure your use case tolerates variable latency — this isn't for real-time chat.”
“Cohort fill times are the killer here. You could sign up and wait a month before your cohort has enough members to open. And once it does, 15 tok/s shared across potentially hundreds of users during peak hours is borderline unusable. The economics only work until a large cohort gets noisy neighbors.”
“Cooperative compute is underexplored. SLLM is the first serious take I've seen on LLM time-sharing — a model that worked for mainframes and could work for GPU inference. If the cohort mechanics get refined with smarter scheduling, this becomes a real alternative to per-token clouds.”
“Tried to sign up, but the Join button gave no pricing info before checkout. The UX friction combined with cohort uncertainty made me bounce. I'll revisit when the onboarding is clearer and there's a track record of cohorts actually filling at reasonable speeds.”
Community Sentiment
“Brilliant concept but cohort fill times could kill it”
“15 tok/s shared with hundreds of users isn't usable”
“$10/month unlimited LLM access sounds too good”