Scraper
Spider

2026-02-18 17:30
gpt-oss
gpt-oss stories from the last 14 days
93.  HN Self-Hosted LLM Upgrade on AMD: Kimi Linear 48B, Qwen3 Coder Next, and Q2_K_XL
The blog post describes testing new AI models on an AMD-based homelab built for self-hosting large language models (LLMs), focusing on Kimi Linear 48B and Qwen3 Coder Next. The author evaluates each model on latency, resource consumption, and a subjective "Vibe Score" that combines quality with speed. The hardware consists of two AMD AI Max+ 395 systems whose large unified memory allows several models to run concurrently. The post emphasizes a shift toward open-source models, driven by rapid research progress and communities such as LocalLLaMA, with the goal of replacing costly proprietary cloud services with local alternatives of similar quality at lower cost. Testing covers coding, chat, and multimodal tasks, and highlights gains from newer architectures such as Mixture of Experts (MoE) and from quantization. Despite hardware constraints, Kimi Linear 48B and Qwen3 Coder Next are identified as viable for general-purpose use and AI-assisted development. The author notes that open-source models increasingly compete with proprietary ones on quality, which broadens access to capable AI tools without cloud dependency. The post concludes by calling for more streamlined model evaluation so that new models are easier to test and adopt.
Keywords: #phi4, AI Models, AMD, Arize Phoenix, Attention Mechanisms, Function Calling, GLM-Air-REAP, GPT-OSS, GPU Memory, Homelab, Kimi Linear, Latency, Linear Attention, Local AI, MoE Architectures, Model Evaluation, NVIDIA, Open Source, OpenWebUI, Quantization, Qwen3 Coder Next, ROCm, Roo Code, Self-Hosted LLM, Vibe Score, Vulkan
    site.bhamm-lab.com 6 hours ago