Architecture Decisions & Patterns
Purpose: Record of important decisions, patterns, and "why we do it this way"
Update Frequency: When making significant architectural choices
Service Organization
Authentication Strategy
Decision: Services use their own built-in authentication, not Authelia
Reason: Most *arr services and media tools have robust auth systems
Exception: Consider Authelia for future services that lack authentication
LXC vs Docker
Keep in Docker:
- NZBGet (requires specific volume mapping, works well in Docker)
- Multi-container stacks
- Services requiring Docker-specific features
Migrate to LXC:
- Single-purpose services (Sonarr, Radarr, etc.)
- Services benefiting from isolation
- Stateless applications
File Permissions
Media Files
Standard: All media files and folders must be 777
Reason:
- NFS mounts between multiple systems with different UID mappings
- Jellyfin runs in LXC with UID namespace mapping (100107)
- Sonarr runs in LXC with different UID mapping
- NZBGet runs in Docker with UID 1000
Implementation:
- NZBGet: UMask=0000 to create files with 777
- Sonarr: Media management → Set permissions → chmod 777
- Manual fixes: chmod -R 777 on media directories as needed
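Before running a recursive chmod, it can help to see which paths actually deviate from the 777 standard. The following is a minimal sketch, not part of the original setup; the function name `list_not_777` is hypothetical, and it assumes a `find` that supports `-perm` with an exact octal mode (GNU and BSD both do).

```shell
# list_not_777 DIR: print every file or directory under DIR
# whose permission bits are not exactly 777.
list_not_777() {
    # "-perm 0777" matches the exact mode; "!" inverts the match.
    # Symlinks are skipped because their own mode bits are not meaningful.
    find "$1" ! -type l ! -perm 0777 -print
}
```

For example, `list_not_777 /mnt/pve/elantris-media/tv` would list only the entries that still need fixing; an empty result suggests the tree already meets the standard.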
Network Architecture
VLAN Strategy (Planned)
Decision: Segment network into 4 VLANs
See: NETWORK-UPGRADE-PLAN.md
| VLAN | Name | Subnet | Purpose |
|---|---|---|---|
| 1 | Default | 10.4.2.0/24 | Management, trusted PCs, Proxmox hosts |
| 10 | Servers | 10.4.10.0/24 | Server containers, NAS |
| 20 | IoT | 10.4.20.0/24 | Cameras, smart home, Home Assistant |
| 30 | Guest | 10.4.30.0/24 | Guest WiFi, isolated |
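As a quick way to sanity-check which VLAN an address belongs to under this plan, the table can be expressed as a small lookup. This is an illustrative sketch only; `vlan_for_ip` is a hypothetical helper, and it matches on the /24 prefix as text rather than validating the address.

```shell
# vlan_for_ip IP: print the planned VLAN ID for an address,
# based on the subnets in the table above.
vlan_for_ip() {
    case "$1" in
        10.4.2.*)  echo 1 ;;    # Default: management, trusted PCs, Proxmox hosts
        10.4.10.*) echo 10 ;;   # Servers
        10.4.20.*) echo 20 ;;   # IoT
        10.4.30.*) echo 30 ;;   # Guest
        *)         echo "unknown"; return 1 ;;
    esac
}
```

Because each pattern ends in a dot before the wildcard, `10.4.20.5` falls into VLAN 20 and is not mistaken for the `10.4.2.0/24` range.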
VLAN Tagging Methods:
- WiFi: UniFi APs (SSID → VLAN mapping)
- Cameras: GS308EP (port-based VLAN)
- Containers: Proxmox (bridge VLAN tag)
- Wired PCs: Untagged (VLAN 1 via unmanaged switches)
Router/Firewall (Planned)
Decision: OPNsense VM on Elantris
Reason:
- Free, full-featured firewall/router
- VLAN routing and inter-VLAN firewall rules
- IDS/IPS capability
- Elantris has ample resources (128GB RAM)
Alternative Considered: Ubiquiti Dream Machine
- Rejected due to cost and ecosystem lock-in
- OPNsense more flexible for homelab
10G Backhaul (Planned)
Decision: 10G RJ45 between server closet and basement
Hardware: 2× GiGaPlus 6-Port 10G PoE switches ($101 each)
Why GiGaPlus over UniFi:
- Native 10G RJ45 (no SFP+ transceivers needed)
- Includes PoE for APs
- $202 total vs $800+ for UniFi equivalent
- Cat6 can handle 10G at house distances (<55m)
WiFi (Planned)
Decision: UniFi APs with mixed models
Hardware:
- 1× U6 Enterprise (existing) - server closet/upstairs
- 2× U7 Pro ($189 each) - basement + main floor
Why UniFi:
- Multiple SSIDs mapped to VLANs
- Seamless roaming between APs
- Centralized management via controller
- Better than Asus mesh for VLAN support
Controller: LXC on Proxmox (free) via community helper script
Reverse Proxy
Decision: Single Traefik instance handles all external access
Location: LXC 104 on pm2
Benefits:
- Single point for SSL/TLS management
- Automatic Let's Encrypt certificate renewal
- Centralized routing configuration
- DNS-01 challenge for wildcard certificates
Service Domains
Pattern: <service>.kavcorp.com
DNS: All subdomains point to public IP (99.74.188.161)
Routing: Traefik inspects Host header and routes internally
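Since routing depends only on the Host header, a route can be checked from inside the network without relying on public DNS. The sketch below is an assumption-laden example: `service_fqdn` and `check_route` are hypothetical helper names, and the Traefik address passed to `check_route` is whatever IP LXC 104 actually has (not stated here).

```shell
DOMAIN="kavcorp.com"

# service_fqdn SERVICE: build the hostname from the <service>.kavcorp.com pattern.
service_fqdn() {
    echo "$1.$DOMAIN"
}

# check_route SERVICE TRAEFIK_IP: print the HTTP status Traefik returns
# for the service's hostname.
check_route() {
    host="$(service_fqdn "$1")"
    # --resolve pins the hostname to the given IP, so the request carries
    # the expected Host header and SNI without a DNS lookup.
    curl -sS -o /dev/null -w '%{http_code}\n' \
        --resolve "$host:443:$2" "https://$host/"
}
```

A call such as `check_route sonarr <traefik-ip>` returning 404 would usually point to a missing or mismatched router rule rather than a problem with the backend service.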
Storage Architecture
Media Storage
Decision: NFS mount from elantris for all media
Path: /mnt/pve/elantris-media → elantris /el-pool/media
Reason:
- Centralized storage
- Accessible from all cluster nodes
- Large capacity (24TB ZFS pool)
- Easy to backup/snapshot
LXC Root Filesystems
Decision: Store on KavNas NFS for most services
Reason:
- Easy backups
- Portable between nodes
- Network storage sufficient for most workloads
Exception: High I/O services use local-lvm
Monitoring & Maintenance
Configuration Management
Decision: Manual configuration with documentation
Reason: Small scale doesn't justify Ansible/Terraform complexity
Trade-off: Requires disciplined documentation updates
Backup Strategy
Decision: Proxmox built-in backup to KavNas
Frequency: [To be determined]
Retention: [To be determined]
Common Patterns
Adding a New Service Behind Traefik
- Deploy service with static IP in 10.4.2.0/24 range
- Create Traefik config in /etc/traefik/conf.d/<service>.yaml
- Use pattern:

      http:
        routers:
          <service>:
            rule: "Host(`<service>.kavcorp.com`)"
            entryPoints: [websecure]
            service: <service>
            tls:
              certResolver: letsencrypt
        services:
          <service>:
            loadBalancer:
              servers:
                - url: "http://<ip>:<port>"

- Traefik auto-reloads (no restart needed)
- Update docs/INFRASTRUCTURE.md with service details
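The pattern above can be filled in by a small generator so the three placeholders are substituted consistently. This is a sketch rather than an existing script in the repo; `render_traefik_config` is a hypothetical name, and it simply prints the YAML to stdout.

```shell
# render_traefik_config SERVICE IP PORT: print a Traefik dynamic config
# for one service, following the pattern used in /etc/traefik/conf.d/.
render_traefik_config() {
    svc="$1"; ip="$2"; port="$3"
    # Backticks are escaped so the heredoc emits them literally
    # instead of treating them as command substitution.
    cat <<EOF
http:
  routers:
    ${svc}:
      rule: "Host(\`${svc}.kavcorp.com\`)"
      entryPoints: [websecure]
      service: ${svc}
      tls:
        certResolver: letsencrypt
  services:
    ${svc}:
      loadBalancer:
        servers:
          - url: "http://${ip}:${port}"
EOF
}
```

On the Traefik host the output could be redirected into place, for example `render_traefik_config <service> <ip> <port> > /etc/traefik/conf.d/<service>.yaml`, after which the file provider should pick it up.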
Troubleshooting Permission Issues
- Check file ownership: ls -la /path/to/file
- Check if 777: stat /path/to/file
- Fix permissions: chmod -R 777 /path/to/directory
- For NZBGet: Verify UMask=0000 in nzbget.conf
- For Sonarr/Radarr: Check Settings → Media Management → Set Permissions
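Two of these checks are easy to script. The helpers below are a sketch under assumptions: `mode_of` relies on GNU coreutils `stat` (the `-c` flag differs on BSD), `nzbget_umask` assumes the usual `Option=value` layout of nzbget.conf, and both function names are hypothetical.

```shell
# mode_of PATH: print only the octal permission bits, e.g. 777.
mode_of() {
    stat -c '%a' "$1"
}

# nzbget_umask FILE: print the value of the last uncommented UMask= line.
nzbget_umask() {
    # Only lines that start with "UMask=" match, so "# UMask=..." is ignored.
    sed -n 's/^UMask=//p' "$1" | tail -n 1
}
```

Comparing `mode_of /path/to/file` against `777` gives a direct yes/no answer, which is easier to read than the full `stat` output.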
Node SSH Access
From local machine:
- User: kavren
- Key: ~/.ssh/id_ed25519
Between cluster nodes:
- User: root
- Each node has other nodes' keys in /root/.ssh/authorized_keys
- Proxmox web UI uses node SSH for shell access
Known Issues & Workarounds
Jellyfin Not Seeing Media After Import
Symptom: Files imported to /media/tv but Jellyfin shows empty
Cause: Jellyfin LXC mount not active or permissions wrong
Fix:
- Restart Jellyfin LXC: pct stop 121 && pct start 121
- Verify mount inside LXC: pct exec 121 -- ls -la /media/tv/
- Fix permissions if needed: chmod -R 777 /mnt/pve/elantris-media/tv/
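The first two steps of the fix can be wrapped into one helper that restarts the container and reports whether the media path is populated afterwards. This is a rough sketch meant to run on the Proxmox node that hosts the container; `restart_and_check_mount` is a hypothetical name, and treating an empty listing as "mount not active" is an assumption that would misfire on a genuinely empty library.

```shell
# restart_and_check_mount VMID PATH: restart an LXC and check that PATH
# inside it contains at least one entry.
restart_and_check_mount() {
    vmid="$1"; path="$2"
    pct stop "$vmid" || return 1
    pct start "$vmid" || return 1
    # "ls -A" prints nothing for an empty directory, so an empty result
    # is treated as a sign that the mount did not come up.
    listing="$(pct exec "$vmid" -- ls -A "$path" 2>/dev/null)"
    if [ -n "$listing" ]; then
        echo "ok: $path is populated in CT $vmid"
    else
        echo "empty: $path in CT $vmid; check the mount and permissions" >&2
        return 1
    fi
}
```

For the case described here the call would be `restart_and_check_mount 121 /media/tv`; if it reports an empty path, the permissions fix on /mnt/pve/elantris-media/tv/ is the next thing to try.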
Sonarr/Radarr Import Failures
Symptom: "Access denied" errors in logs
Cause: Permission mismatch between download client and *arr service
Fix: Ensure download folder has 777 permissions
Future Considerations
- Automated backup strategy
- Monitoring/alerting system (Prometheus + Grafana?)
- Consider Authelia for future services without built-in auth
- Document disaster recovery procedures
- Consider consolidating Docker hosts