After months of tinkering, cursing, and late-night debugging sessions, my Raspberry Pi 5-powered smart speaker is finally operational. While commercial options like Amazon Echo and Google Home dominate the market, there's something deeply satisfying about building your own system from scratch—one that you control completely, with no data being shipped off to corporate servers.
In this post, I'll break down the intent routing system that powers my speaker, with a special focus on how I integrated WiZ smart bulbs using nothing but simple UDP commands. I'll also dive into the challenges of running everything locally on a Pi 5, and how I structured the system to handle multiple types of intents efficiently.
The Architecture
At a high level, the system works like this:
- Local speech recognition captures voice commands
- A compact language model detects intent and extracts parameters
- The intent router dispatches commands to appropriate handlers
- Handlers execute specific functions (lights, music, weather, etc.)
- Text-to-speech reads back responses
The whole stack runs on a Raspberry Pi 5 with 8GB RAM, which provides enough horsepower to handle local speech processing without cloud dependencies. I'm using a custom 3D-printed enclosure with an integrated microphone array and a decent 3W speaker.
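The five stages listed above can be sketched as a chain of plain functions. This is only an illustration of the data flow: `run_pipeline` and its four callable parameters are hypothetical names, standing in for the real speech recognizer, intent model, router, and text-to-speech engine.

```python
from typing import Any, Callable, Dict

Intent = Dict[str, Any]


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    detect_intent: Callable[[str], Intent],
    route: Callable[[Intent], Dict[str, str]],
    speak: Callable[[str], None],
) -> Dict[str, str]:
    """Run one voice command through the stages in order."""
    text = transcribe(audio)        # speech recognition
    intent = detect_intent(text)    # intent category, command, parameters
    result = route(intent)          # router picks a handler, handler executes
    speak(result["message"])        # text-to-speech reads the response back
    return result


if __name__ == "__main__":
    # Stub stages so the flow can be exercised without any hardware.
    spoken = []
    outcome = run_pipeline(
        b"fake-audio",
        transcribe=lambda audio: "turn on the living room lights",
        detect_intent=lambda text: {
            "category": "lights",
            "command": "turn_on",
            "parameters": {"room": "living_room"},
        },
        route=lambda intent: {
            "status": "success",
            "message": f"Turned on {intent['parameters']['room']} lights",
        },
        speak=spoken.append,
    )
    print(outcome, spoken)
```

Passing the stages in as callables keeps each one swappable, which makes it easy to test the router without a microphone attached.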
Intent Recognition and Routing
For intent recognition, I initially tried running a full-sized language model on the Pi, but quickly discovered that was too ambitious. Instead, I fine-tuned a smaller model specifically for intent detection. The model takes transcribed speech and outputs a structured JSON object containing:
- Intent category (lights, music, weather, etc.)
- Command type (on/off, color change, volume, etc.)
- Parameters (which lights, what color, etc.)
{
    "category": "lights",
    "command": "change_color",
    "parameters": {
        "room": "living_room",
        "color": "daylight"
    }
}
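Since a small model can occasionally emit malformed output, it may be worth validating this JSON before it reaches the router. Here is a minimal sketch, assuming the three categories named in this post; `Intent`, `parse_intent`, and `KNOWN_CATEGORIES` are illustrative names, not part of the actual system.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict

# Assumption: only the categories mentioned in the post are valid.
KNOWN_CATEGORIES = {"lights", "music", "weather"}


@dataclass
class Intent:
    category: str
    command: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def parse_intent(raw: str) -> Intent:
    """Parse the model's JSON output, raising ValueError on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("intent must be a JSON object")

    category = data.get("category")
    if not isinstance(category, str) or category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")

    command = data.get("command")
    if not isinstance(command, str) or not command:
        raise ValueError("missing command")

    # A missing or null "parameters" field is treated as empty.
    parameters = data.get("parameters") or {}
    if not isinstance(parameters, dict):
        raise ValueError("parameters must be a JSON object")

    return Intent(category, command, parameters)


if __name__ == "__main__":
    raw = (
        '{"category": "lights", "command": "change_color", '
        '"parameters": {"room": "living_room", "color": "daylight"}}'
    )
    print(parse_intent(raw))
```

Rejecting bad output at this boundary means the handlers can assume every intent has a known category and a command string.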
The intent router is a FastAPI service that receives this output and dispatches to the appropriate handler class. Here's a simplified version of my router:
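from fastapi import FastAPI

app = FastAPI()

# light_handler, music_handler, and weather_handler are handler instances
# created elsewhere in the service.
@app.post("/process-intent")
async def process_intent(intent_data: dict):
    # Route on the top-level category produced by the intent model.
    category = intent_data.get("category")
    if category == "lights":
        return await light_handler.process(intent_data)
    elif category == "music":
        return await music_handler.process(intent_data)
    elif category == "weather":
        return await weather_handler.process(intent_data)
    return {"status": "error", "message": "Unknown intent category"}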
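As the number of categories grows, the if/elif chain can be swapped for a lookup table. The sketch below shows one way to do that with a small registry. It is an alternative design, not the router described above, and `HANDLERS`, `register`, `dispatch`, and `handle_lights` are hypothetical names. FastAPI is left out so the example runs on the standard library alone.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]

# Maps an intent category to the coroutine that handles it.
HANDLERS: Dict[str, Handler] = {}


def register(category: str) -> Callable[[Handler], Handler]:
    """Decorator that files a handler under its category name."""
    def decorator(func: Handler) -> Handler:
        HANDLERS[category] = func
        return func
    return decorator


@register("lights")
async def handle_lights(intent: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder standing in for light_handler.process(intent).
    return {"status": "success", "message": f"lights: {intent.get('command')}"}


async def dispatch(intent: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the handler for the intent's category and await it."""
    handler = HANDLERS.get(intent.get("category"))
    if handler is None:
        return {"status": "error", "message": "Unknown intent category"}
    return await handler(intent)


if __name__ == "__main__":
    print(asyncio.run(dispatch({"category": "lights", "command": "turn_on"})))
    print(asyncio.run(dispatch({"category": "coffee"})))
```

With this layout, adding a new device type means writing one decorated function rather than editing the router.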
The Beauty of WiZ Lights: Just UDP
The most elegant part of this project has been integrating WiZ smart bulbs. Unlike many smart home devices that require cloud services, complicated APIs, or proprietary bridges, WiZ bulbs communicate over your local network using simple UDP messages. This was a revelation.
After discovering their protocol through some reverse engineering and community documentation, I was able to create a clean Python class that communicates directly with the bulbs:
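import json
import socket

class WiZController:
    def __init__(self, ip_list=None, discover=False):
        self.PORT = 38899
        self.bulbs = list(ip_list or [])
        # State used by discovery; it must exist before start() runs.
        self.timeout = 2.0    # seconds to wait for replies (assumed value)
        self.bulb_info = {}   # ip -> most recent getPilot result
        if discover:
            self.start()

    def send(self, ip, payload):
        # Fire-and-forget: one JSON datagram per command, no reply awaited.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(json.dumps(payload).encode(), (ip, self.PORT))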
The commands themselves are remarkably simple JSON structures. Want to turn on a light? It's just:
{
    "method": "setState",
    "params": {
        "state": true
    }
}
Want to change the color temperature to daylight? Just:
{
    "method": "setState",
    "params": {
        "state": true,
        "temp": 6500
    }
}
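To make the mapping from Python to the wire format concrete, here is a small sketch that builds the two commands above as dictionaries and serializes them. The helper names (`power_payload`, `temperature_payload`, `encode`, `send_udp`) are illustrative, and the method name and fields are simply the ones used in this post.

```python
import json
import socket

WIZ_PORT = 38899  # the port WiZController uses in this post


def power_payload(on: bool) -> dict:
    """Command that switches a bulb on or off."""
    return {"method": "setState", "params": {"state": on}}


def temperature_payload(kelvin: int) -> dict:
    """Command that turns a bulb on at a given color temperature."""
    return {"method": "setState", "params": {"state": True, "temp": kelvin}}


def encode(payload: dict) -> bytes:
    # json.dumps turns Python's True/False into JSON's true/false.
    return json.dumps(payload).encode()


def send_udp(ip: str, payload: dict, port: int = WIZ_PORT) -> None:
    """Send one command as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode(payload), (ip, port))


if __name__ == "__main__":
    print(encode(power_payload(True)))
    print(encode(temperature_payload(6500)))
```

UDP gives no delivery guarantee, so a command sent this way can be lost silently on a busy network.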
Challenges and Solutions
Challenge 1: Bulb Discovery
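# Method of WiZController (uses ipaddress, json, socket, and time).
def start(self):
    local_ip = self._get_local_ip()
    # Assumes a /24 home network; every host address on it gets probed.
    network = ipaddress.IPv4Interface(f"{local_ip}/24").network
    discovery_msg = json.dumps({"method": "getPilot", "params": {}}).encode()
    discovered = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(0.5)
        sock.bind(('', 0))
        for ip in network.hosts():
            try:
                sock.sendto(discovery_msg, (str(ip), self.PORT))
            except OSError:
                pass  # unreachable address; keep scanning
        # Collect replies until the overall discovery timeout expires.
        start_time = time.time()
        while time.time() - start_time < self.timeout:
            try:
                data, addr = sock.recvfrom(1024)
            except socket.timeout:
                continue
            ip = addr[0]
            if ip in discovered or ip in self.bulbs:
                continue
            try:
                response = json.loads(data.decode())
            except ValueError:
                continue  # not JSON, so not a bulb reply
            if isinstance(response, dict) and "result" in response:
                discovered.append(ip)
                self.bulb_info[ip] = response["result"]
    # Record the new bulbs; otherwise the scan results are never used.
    self.bulbs.extend(discovered)
    return discovered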
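The discovery code relies on a `_get_local_ip` helper that isn't shown. One common way to write it, sketched here as an assumption rather than the actual implementation, is to "connect" a UDP socket and ask the OS which local address it picked. The names `get_local_ip` and `scan_targets` are hypothetical, and the probe address is a reserved documentation address that never receives traffic from this code.

```python
import ipaddress
import socket
from typing import List


def get_local_ip(probe_host: str = "192.0.2.1") -> str:
    """Return the IPv4 address of the interface used for outbound traffic.

    connect() on a UDP socket sends no packets. It only makes the OS choose
    a route, after which getsockname() reports the local address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_host, 80))
            return sock.getsockname()[0]
        except OSError:
            # No usable route (for example, offline): fall back to loopback.
            return "127.0.0.1"


def scan_targets(local_ip: str, prefix: int = 24) -> List[str]:
    """List every host address on the local subnet, as the scan loop does."""
    network = ipaddress.IPv4Interface(f"{local_ip}/{prefix}").network
    return [str(host) for host in network.hosts()]


if __name__ == "__main__":
    ip = get_local_ip()
    print(ip, len(scan_targets(ip)))
```

On a /24 network the scan covers 254 addresses, which is why sending all probes first and collecting replies afterwards is much faster than waiting on each host in turn.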
Challenge 2: Color vs. Temperature Confusion
TEMP_MAP = {
    "warm white": 2700,
    "soft white": 3000,
    "neutral": 4000,
    "cool white": 5000,
    "daylight": 6500
}

def set_temperature(self, ip_list, temp):
    temp = max(2200, min(6500, temp))
    for ip in ip_list:
        self.send(ip, {
            "method": "setState",
            "params": {
                "state": True,
                "temp": temp
            }
        })
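The handler later calls `set_color` with a name like "daylight", which is a color temperature rather than an RGB color. A hedged sketch of how a name could be resolved to the right kind of parameters follows. `color_params` and `RGB_MAP` are hypothetical, and the RGB values are plain illustrative choices.

```python
# Temperature names from the post, in kelvin.
TEMP_MAP = {
    "warm white": 2700,
    "soft white": 3000,
    "neutral": 4000,
    "cool white": 5000,
    "daylight": 6500,
}

# Hypothetical RGB table; the post does not list its color names.
RGB_MAP = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "purple": (128, 0, 128),
}


def color_params(name: str) -> dict:
    """Resolve a spoken color name to temperature or RGB parameters."""
    # Normalize so "Warm_White" and "warm white" hit the same key.
    key = name.strip().lower().replace("_", " ")
    if key in TEMP_MAP:
        return {"state": True, "temp": TEMP_MAP[key]}
    if key in RGB_MAP:
        r, g, b = RGB_MAP[key]
        return {"state": True, "r": r, "g": g, "b": b}
    raise ValueError(f"unknown color name: {name!r}")


if __name__ == "__main__":
    print(color_params("daylight"))
    print(color_params("purple"))
```

Checking the temperature table first keeps the whites on the bulb's dedicated white channel. Note that the handler's default color, "white", is in neither table here, so a real version would need an alias for it.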
Challenge 3: Rave Mode and Threading
# Methods of WiZController (use random, threading, and time).
def rave_mode(self, ip_list, interval=1.0):
    self.stop_rave()  # stop any worker that is already running
    self.rave_running = True
    self.rave_thread = threading.Thread(
        target=self._rave_worker,
        args=(ip_list, interval),
        daemon=True
    )
    self.rave_thread.start()

def _rave_worker(self, ip_list, interval):
    try:
        while self.rave_running:
            # Give every bulb its own random color, then pause.
            for ip in ip_list:
                r, g, b = [random.randint(0, 255) for _ in range(3)]
                self.send(ip, {
                    "method": "setState",
                    "params": {
                        "state": True,
                        "r": r, "g": g, "b": b
                    }
                })
            time.sleep(interval)
    except Exception as e:
        print(f"Rave mode error: {e}")
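`rave_mode` calls a `stop_rave` method that isn't shown. The sketch below is one possible shape for the start/stop pair, using a `threading.Event` instead of a boolean flag so the worker wakes up immediately when asked to stop. `RaveLoop` and its `tick` callback are hypothetical names; in the real controller, `tick` would be the code that sends random colors to each bulb.

```python
import threading
import time
from typing import Callable, Optional


class RaveLoop:
    """Calls `tick` every `interval` seconds until stop() is called."""

    def __init__(self, tick: Callable[[], None], interval: float = 1.0):
        self._tick = tick
        self._interval = interval
        self._stop_event: Optional[threading.Event] = None
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        self.stop()  # never run two workers at once
        # A fresh event per worker, so an old worker can never be revived.
        stop_event = threading.Event()
        self._stop_event = stop_event
        self._thread = threading.Thread(
            target=self._run, args=(stop_event,), daemon=True
        )
        self._thread.start()

    def stop(self) -> None:
        if self._stop_event is not None:
            self._stop_event.set()
        if self._thread is not None and self._thread.is_alive():
            self._thread.join(timeout=self._interval + 1.0)
        self._thread = None

    def _run(self, stop_event: threading.Event) -> None:
        while not stop_event.is_set():
            self._tick()
            # wait() returns early as soon as stop() sets the event.
            stop_event.wait(self._interval)


if __name__ == "__main__":
    loop = RaveLoop(lambda: print("flash"), interval=0.2)
    loop.start()
    time.sleep(1.0)
    loop.stop()
```

Compared with `time.sleep(interval)`, waiting on the event means a "stop" command takes effect right away instead of after the current interval finishes.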
Integration With the Voice Assistant
class LightHandler:
    # Static room-to-bulb mapping.
    ROOM_MAP = {
        "living_room": ["192.168.1.107", "192.168.1.143"],
        "bedroom": ["192.168.1.158", "192.168.1.166"]
    }

    def __init__(self):
        self.controller = WiZController(discover=True)

    async def process(self, intent_data):
        command = intent_data.get("command")
        params = intent_data.get("parameters", {})
        room = params.get("room", "all")
        if room == "all":
            target_bulbs = self.controller.bulbs
        else:
            target_bulbs = self.ROOM_MAP.get(room, [])

        if command == "turn_on":
            self.controller.set_state(target_bulbs, True)
            return {"status": "success", "message": f"Turned on {room} lights"}
        elif command == "turn_off":
            self.controller.set_state(target_bulbs, False)
            return {"status": "success", "message": f"Turned off {room} lights"}
        elif command == "change_color":
            color = params.get("color", "white")
            self.controller.set_color(target_bulbs, color)
            return {"status": "success", "message": f"Changed {room} lights to {color}"}
        elif command == "set_brightness":
            level = params.get("level", 50)
            self.controller.set_brightness(target_bulbs, level)
            return {"status": "success", "message": f"Set {room} brightness to {level}%"}
        elif command == "rave_mode":
            self.controller.rave_mode(target_bulbs)
            return {"status": "success", "message": "Rave mode activated"}
        return {"status": "error", "message": "Unknown light command"}
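The handler calls `set_state` and `set_brightness`, neither of which appears in the controller excerpts. The following is a rough sketch of what they might look like, under two assumptions that should be checked against your own bulbs: that brightness travels in a `dimming` field as a percentage, and that the bulbs accept values from 10 to 100. `LightCommands` is a hypothetical class that takes the send function as an argument so it can be tried without any bulbs.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Send = Callable[[str, Dict], None]


class LightCommands:
    """Hypothetical stand-ins for the controller methods the handler uses."""

    def __init__(self, send: Send):
        self._send = send  # for example, WiZController.send

    def set_state(self, ip_list: Iterable[str], on: bool) -> None:
        """Switch every bulb in ip_list on or off."""
        for ip in ip_list:
            self._send(ip, {"method": "setState", "params": {"state": bool(on)}})

    def set_brightness(self, ip_list: Iterable[str], level: int) -> None:
        """Set brightness as a percentage, clamped to 10..100."""
        # Assumption: "dimming" is the field name and 10 is the lowest value.
        level = max(10, min(100, int(level)))
        for ip in ip_list:
            self._send(ip, {
                "method": "setState",
                "params": {"state": True, "dimming": level},
            })


if __name__ == "__main__":
    sent: List[Tuple[str, Dict]] = []
    lights = LightCommands(lambda ip, payload: sent.append((ip, payload)))
    lights.set_state(["192.168.1.107"], True)
    lights.set_brightness(["192.168.1.107"], 50)
    for item in sent:
        print(item)
```

Clamping the level inside the controller keeps a misheard "set brightness to 500" from sending an out-of-range value.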
Performance on the Pi 5
One of the most pleasant surprises has been how well the Raspberry Pi 5 handles this workload. With its improved CPU and 8GB of RAM, it manages the following simultaneously with minimal lag:
- Local speech recognition using the Whisper small model
- Intent classification
- UDP communication with up to 18 WiZ bulbs
- Local music playback via MPD
- Text-to-speech response generation
The system typically responds within 1–2 seconds of a command, which is impressively close to commercial smart speakers—all while keeping data private and running 100% locally.
What's Next
- Adding room presence detection to automatically contextualize commands
- Implementing scenes that combine lighting, music, and other elements
- Creating a web dashboard for configuration and monitoring
- Expanding to control more device types using similar direct-communication approaches
Final Thoughts
Building this system has reinforced something I've long suspected: much of the complexity in commercial smart home products is unnecessary, serving business models rather than users. The WiZ bulbs demonstrate that smart home technology can be simultaneously powerful and simple.
There's something deeply satisfying about saying "Computer, set living room to daylight" and watching the lights respond—knowing that the command never left my local network, wasn't processed by a corporation, and doesn't depend on someone else's servers staying online.
That's the beauty of DIY smart home technology: it's truly yours.