Agent Self-Hosting Project

Project Status: In Progress 🚧

Introduction 🚀#

After completing my PhD, I’m finally diving into a project I’ve long envisioned: self-hosting LLMs and services on a consumer-grade mini-PC. Driven by a fascination with the web and inspired by the LLM wave, this blog series will document my journey as I build a unified system to integrate various services and elicit agentic behaviors—all from my modest homelab.

The Vision 🌟#

Ultimately, my goal is to orchestrate various self-hosted services and specialized LLMs through a unified chat interface. This isn’t just about hosting—it’s about integration. I envision a system where:

  • 📩 Notifications from different chatbot services reach me seamlessly.
  • 💬 I can send actionable messages to these services from one platform.

For now, a Matrix server with a distinct room per service, giving quick access to each one, seems to me the best option to achieve this.
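
To give a first taste of what that could look like, here is a minimal sketch of a service pushing a notification into its dedicated room through the Matrix client-server API. The homeserver URL, access token, and room ID are placeholders, not my actual setup.

```python
# Minimal sketch: a service posts a notification into its Matrix room.
# Uses the standard client-server API endpoint for sending room events.
import time
import requests

HOMESERVER = "https://matrix.example.org"  # hypothetical homeserver URL
ACCESS_TOKEN = "syt_..."                   # placeholder bot account token
ROOM_ID = "!abc123:example.org"            # hypothetical room dedicated to one service

def notify(text: str) -> None:
    """Push a plain-text notification into the service's Matrix room."""
    txn_id = str(int(time.time() * 1000))  # transaction ID, unique per event
    url = (f"{HOMESERVER}/_matrix/client/v3/rooms/{ROOM_ID}"
           f"/send/m.room.message/{txn_id}")
    resp = requests.put(
        url,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json={"msgtype": "m.text", "body": text},
    )
    resp.raise_for_status()

notify("📩 Daily news digest is ready")
```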

The Setup 🛠️#

My Homelab 🏡#

My homelab is modest but equipped for experimentation:

  • Compute server: 32 cores, 64 threads, 128GB RAM, 10TB storage, 2 GPUs (40GB total VRAM)
  • Storage server: 32TB, perfect for datasets
  • Raspberry Pi 4: 8GB RAM for IoT and home automation
  • Inference server: A small mini-PC (Beelink Mini S12 Pro), frugal in power consumption, for serving models and services

With this setup, I’m ready to embark on my self-hosting journey. Self-hosting LLMs might seem daunting, but recent advancements have made it far more approachable. Innovations in pretraining, parameter pruning, quantization, compression, distillation, and context reduction have dramatically lowered the barrier to entry. Tools like llama.cpp and projects from contributors like Tom Jobbins (TheBloke) have even made running Llama-2 7B on a Raspberry Pi a reality (source).
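
To illustrate how approachable this has become, here is a minimal sketch of loading a quantized GGUF model (the format popularized by TheBloke's releases) with the llama-cpp-python bindings for llama.cpp. The model path and parameter values are illustrative assumptions, not a fixed recommendation.

```python
# A minimal sketch of CPU inference with a quantized model via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",  # hypothetical quantized weights
    n_ctx=2048,    # context window
    n_threads=4,   # match the mini-PC's four cores
)

out = llm("Q: What is self-hosting? A:", max_tokens=64, stop=["\n"])
print(out["choices"][0]["text"])
```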

Moreover, self-hosting lets you run fine-tuned models that are often more efficient and more task-specific than large general-purpose LLM APIs driven by prompt engineering. It’s an empowering alternative to the constraints and costs of third-party services.

Why I Chose a Mini-PC Over a Raspberry Pi for Hosting 🤔#

| Feature | Raspberry Pi 5 | Beelink Mini S12 Pro |
| --- | --- | --- |
| CPU | Broadcom BCM2712 (quad-core, 2.4GHz) | Intel N100 (quad-core, up to 3.4GHz) |
| RAM | 8GB max | 16GB |
| Storage | MicroSD card | 500GB SSD |
| Power consumption | 3–4W idle | 8W idle |
| Architecture | ARM-based | x86 |
| Wake-on-LAN | Not supported | Supported |
| Price (approx.) | ~€150 (with accessories) | ~€180 |

The comparison table highlights the strengths and trade-offs between the Raspberry Pi 5 and Beelink Mini S12 Pro. While the Raspberry Pi 5 excels in energy efficiency (3–4W idle), its ARM-based architecture, limited RAM (8GB max), and reliance on microSD cards for storage make it less suited for resource-heavy workloads.

In contrast, the Beelink Mini S12 Pro offers significant advantages:

  • Performance: The Intel N100 processor (up to 3.4GHz) and 16GB of RAM deliver far superior computational power for hosting multiple services or running LLMs.
  • Storage: An integrated 500GB SSD provides speed and reliability compared to the Pi’s microSD cards.
  • Compatibility: Its x86 architecture supports a broader range of software and avoids potential compatibility issues often encountered with ARM systems.
  • Utility Features: Wake-on-LAN allows remote boot-up, which is handy for remote management of the homelab (see the sketch after this list).
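
Wake-on-LAN is also refreshingly simple as a protocol: the "magic packet" is just six 0xFF bytes followed by the target MAC address repeated 16 times, broadcast over UDP. Here is a minimal sketch; the MAC address is a placeholder.

```python
# A minimal sketch of Wake-on-LAN: broadcast the magic packet over UDP.
import socket

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast a Wake-on-LAN magic packet for the given MAC address."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    packet = b"\xff" * 6 + mac_bytes * 16  # 6x 0xFF, then the MAC 16 times
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

wake("aa:bb:cc:dd:ee:ff")  # hypothetical MAC of the mini-PC's NIC
```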

While the Beelink has a slightly higher power consumption (8W idle) and costs around €30 more when fully equipped, these trade-offs are outweighed by its scalability and versatility for hosting demanding applications. For lightweight automation or ultra-low-power needs, the Raspberry Pi remains an excellent choice, but the Beelink’s advantages make it the clear winner for my self-hosting goals.

Ideas of Services to Implement 🧩#

Here’s a snapshot of my broader ambitions:

  • Obsidian integration: Read and write directly to my vault
  • Voice note-taking: Send voice messages and convert them to text (see the sketch after this list)
  • Automated Anki cards: Generate cards from my readings effortlessly
  • Daily news digests: Summarize global news every morning from diverse sources
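
For the voice note-taking idea, here is a minimal sketch using the open-source openai-whisper package; the file name and model size are illustrative assumptions.

```python
# A minimal sketch of voice note transcription with Whisper.
import whisper

model = whisper.load_model("base")           # small enough for CPU inference
result = model.transcribe("voice_note.ogg")  # hypothetical Matrix voice message
print(result["text"])                        # plain text, ready for the vault
```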

While these could be achieved using existing LLM providers like OpenAI, I prefer the independence and control of building them myself. Self-hosting offers customization, freedom from API price changes, and deeper integration with other self-hosted services. And if I ever need to scale up, switching to a cloud-based provider should be straightforward, since the underlying knowledge and skills transfer. It’s going to be challenging, but the knowledge, skill-building, and complete ownership are, I think, well worth it.

Author: Evan Dufraisse, PhD
Published: 2024-12-29