Architecture Decisions & Patterns
Purpose: Record of important decisions, patterns, and "why we do it this way"
Update Frequency: When making significant architectural choices
Service Organization
Authentication Strategy
Decision: Services use their own built-in authentication, not Authelia
Reason: Most *arr services and media tools have robust auth systems
Exception: Consider Authelia for future services that lack authentication
LXC vs Docker
Keep in Docker:
- NZBGet (requires specific volume mapping, works well in Docker)
- Multi-container stacks
- Services requiring Docker-specific features
Migrate to LXC:
- Single-purpose services (Sonarr, Radarr, etc.)
- Services benefiting from isolation
- Stateless applications
File Permissions
Media Files
Standard: All media files and folders must be 777
Reason:
- NFS mounts between multiple systems with different UID mappings
- Jellyfin runs in LXC with UID namespace mapping (100107)
- Sonarr runs in LXC with different UID mapping
- NZBGet runs in Docker with UID 1000
Implementation:
- NZBGet: UMask=0000 to create files with 777
- Sonarr: Media management → Set permissions → chmod 777
- Manual fixes: chmod -R 777 on media directories as needed
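The effect of a zero umask can be illustrated with plain shell tools. This is a minimal sketch, not NZBGet's own behavior: `show_modes_under_umask` is a hypothetical helper, and it assumes GNU `stat` (as found on Debian-based hosts). With ordinary tools, a zero umask yields 666 for regular files and 777 for directories, because regular files are not created with execute bits; a later chmod may still be needed to reach 777 on files.

```shell
# Hypothetical helper: create a file and a directory under the given umask
# and print their resulting octal modes. Assumes GNU stat (-c).
show_modes_under_umask() {
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        umask "$1"          # only affects this subshell
        touch file
        mkdir dir
        stat -c '%a %n' file dir
    )
    rm -rf "$workdir"
}
```

Calling `show_modes_under_umask 0000` and then `show_modes_under_umask 0022` shows how much the mask removes from newly created entries.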
Network Architecture
Reverse Proxy
Decision: Single Traefik instance handles all external access
Location: LXC 104 on pm2
Benefits:
- Single point for SSL/TLS management
- Automatic Let's Encrypt certificate renewal
- Centralized routing configuration
- DNS-01 challenge for wildcard certificates
Service Domains
Pattern: <service>.kavcorp.com
DNS: All subdomains point to public IP (99.74.188.161)
Routing: Traefik inspects Host header and routes internally
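The naming pattern and Host-based routing can be exercised from a shell. The sketch below is illustrative only: `service_host` and `probe_service` are hypothetical helpers, and the Traefik address passed to `probe_service` is supplied by the caller because it is not recorded in this document.

```shell
# Build the public hostname for a service, following <service>.kavcorp.com.
service_host() {
    printf '%s.kavcorp.com\n' "$1"
}

# Hypothetical check: send a request for the service to a given Traefik
# address and print only the HTTP status code. --resolve pins the hostname
# to that address, so the Host header and TLS name still match the service.
probe_service() {
    host=$(service_host "$1")
    curl -sS -o /dev/null -w '%{http_code}\n' \
        --resolve "${host}:443:$2" "https://${host}/"
}
```

For example, `probe_service sonarr <traefik-ip>` would show whether Traefik answers for that hostname without depending on public DNS.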
Storage Architecture
Media Storage
Decision: NFS mount from elantris for all media
Path: /mnt/pve/elantris-media → elantris /el-pool/media
Reason:
- Centralized storage
- Accessible from all cluster nodes
- Large capacity (24TB ZFS pool)
- Easy to back up and snapshot
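Because every media consumer depends on this mount, a quick check that it is active can save debugging time. A minimal sketch, assuming the `mountpoint` utility from util-linux is available on the node; `media_mounted` is a hypothetical helper name.

```shell
# Return success only if the given path is an active mount point.
# Defaults to the media path recorded in this section.
media_mounted() {
    mountpoint -q "${1:-/mnt/pve/elantris-media}"
}
```

A typical use would be `media_mounted || echo "elantris media is not mounted" >&2` before starting an import or a permissions fix.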
LXC Root Filesystems
Decision: Store on KavNas NFS for most services
Reason:
- Easy backups
- Portable between nodes
- Network storage sufficient for most workloads
Exception: High I/O services use local-lvm
Monitoring & Maintenance
Configuration Management
Decision: Manual configuration with documentation
Reason: Small scale doesn't justify Ansible/Terraform complexity
Trade-off: Requires disciplined documentation updates
Backup Strategy
Decision: Proxmox built-in backup to KavNas
Frequency: [To be determined]
Retention: [To be determined]
Common Patterns
Adding a New Service Behind Traefik
- Deploy service with static IP in 10.4.2.0/24 range
- Create Traefik config in /etc/traefik/conf.d/<service>.yaml
- Use pattern:

    http:
      routers:
        <service>:
          rule: "Host(`<service>.kavcorp.com`)"
          entryPoints: [websecure]
          service: <service>
          tls:
            certResolver: letsencrypt
      services:
        <service>:
          loadBalancer:
            servers:
              - url: "http://<ip>:<port>"

- Traefik auto-reloads (no restart needed)
- Update docs/INFRASTRUCTURE.md with service details
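The config-creation step can be scripted so that every service file follows the same pattern. This is a sketch under stated assumptions: `write_traefik_config` is a hypothetical helper, and the service name, IP, and port in the usage note are example values, not entries from the actual service map.

```shell
# Hypothetical helper: write a Traefik dynamic config for one service.
# Usage: write_traefik_config <service> <ip> <port> [conf_dir]
write_traefik_config() {
    service=$1 ip=$2 port=$3
    conf_dir=${4:-/etc/traefik/conf.d}
    # Backticks are escaped so the heredoc emits them literally.
    cat > "${conf_dir}/${service}.yaml" <<EOF
http:
  routers:
    ${service}:
      rule: "Host(\`${service}.kavcorp.com\`)"
      entryPoints: [websecure]
      service: ${service}
      tls:
        certResolver: letsencrypt
  services:
    ${service}:
      loadBalancer:
        servers:
          - url: "http://${ip}:${port}"
EOF
}
```

For instance, `write_traefik_config sonarr 10.4.2.50 8989` would write `/etc/traefik/conf.d/sonarr.yaml`; passing a fourth argument writes to a scratch directory instead, which is handy for reviewing the output first.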
Troubleshooting Permission Issues
- Check file ownership: ls -la /path/to/file
- Check if 777: stat /path/to/file
- Fix permissions: chmod -R 777 /path/to/directory
- For NZBGet: Verify UMask=0000 in nzbget.conf
- For Sonarr/Radarr: Check Settings → Media Management → Set Permissions
Node SSH Access
From local machine:
- User: kavren
- Key: ~/.ssh/id_ed25519
Between cluster nodes:
- User: root
- Each node has other nodes' keys in /root/.ssh/authorized_keys
- Proxmox web UI uses node SSH for shell access
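For access from the local machine, the user and key listed above can be wrapped in a small function. This is only a sketch: `node_ssh` and the `SSH_BIN` override are hypothetical, and it assumes the node name passed in resolves from the local machine.

```shell
# Hypothetical wrapper for logging in to a node from the local machine.
# SSH_BIN can be overridden (for example with "echo") to preview the command.
node_ssh() {
    node=$1
    shift
    "${SSH_BIN:-ssh}" -i "$HOME/.ssh/id_ed25519" "kavren@${node}" "$@"
}
```

For example, `node_ssh pm2 uptime` would run a single command on pm2, while `node_ssh pm2` opens an interactive shell.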
Known Issues & Workarounds
Jellyfin Not Seeing Media After Import
Symptom: Files imported to /media/tv but Jellyfin shows empty
Cause: Jellyfin LXC mount not active or permissions wrong
Fix:
- Restart Jellyfin LXC: pct stop 121 && pct start 121
- Verify mount inside LXC: pct exec 121 -- ls -la /media/tv/
- Fix permissions if needed: chmod -R 777 /mnt/pve/elantris-media/tv/
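The restart and verify steps can be chained so that the listing only runs if the container came back up. A minimal sketch, meant to be run on the node hosting container 121: `jellyfin_remount` and the `PCT_BIN` override are hypothetical names introduced here.

```shell
# Sketch of the restart-and-verify sequence for container 121 (Jellyfin).
# PCT_BIN can be overridden (for example with "echo") for a dry run.
jellyfin_remount() {
    pct_bin=${PCT_BIN:-pct}
    "$pct_bin" stop 121 &&
        "$pct_bin" start 121 &&
        "$pct_bin" exec 121 -- ls -la /media/tv/
}
```

If the listing comes back empty or shows unexpected modes, the chmod step on /mnt/pve/elantris-media/tv/ is the next thing to try.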
Sonarr/Radarr Import Failures
Symptom: "Access denied" errors in logs
Cause: Permission mismatch between download client and *arr service
Fix: Ensure download folder has 777 permissions
Future Considerations
- Automated backup strategy
- Monitoring/alerting system (Prometheus + Grafana?)
- Consider Authelia for future services without built-in auth
- Document disaster recovery procedures
- Consider consolidating Docker hosts