# Architecture Decisions & Patterns

> **Purpose**: Record of important decisions, patterns, and "why we do it this way"
>
> **Update Frequency**: When making significant architectural choices

## Service Organization

### Authentication Strategy

**Decision**: Services use their own built-in authentication, not Authelia

**Reason**: Most *arr services and media tools have robust auth systems

**Exception**: Consider Authelia for future services that lack authentication

### LXC vs Docker

**Keep in Docker**:
- NZBGet (requires specific volume mapping, works well in Docker)
- Multi-container stacks
- Services requiring Docker-specific features

**Migrate to LXC**:
- Single-purpose services (Sonarr, Radarr, etc.)
- Services benefiting from isolation
- Stateless applications

## File Permissions

### Media Files

**Standard**: All media files and folders must be 777

**Reason**:
- NFS mounts between multiple systems with different UID mappings
- Jellyfin runs in LXC with UID namespace mapping (100107)
- Sonarr runs in LXC with different UID mapping
- NZBGet runs in Docker with UID 1000

**Implementation**:
- NZBGet: `UMask=0000` to create files with 777
- Sonarr: Media management → Set permissions → chmod 777
- Manual fixes: `chmod -R 777` on media directories as needed

## Network Architecture

### Reverse Proxy

**Decision**: Single Traefik instance handles all external access

**Location**: LXC 104 on pm2

**Benefits**:
- Single point for SSL/TLS management
- Automatic Let's Encrypt certificate renewal
- Centralized routing configuration
- DNS-01 challenge for wildcard certificates

### Service Domains

**Pattern**: `.kavcorp.com`

**DNS**: All subdomains point to public IP (99.74.188.161)

**Routing**: Traefik inspects Host header and routes internally

## Storage Architecture

### Media Storage

**Decision**: NFS mount from elantris for all media

**Path**: `/mnt/pve/elantris-media` → elantris `/el-pool/media`

**Reason**:
- Centralized storage
- Accessible from all cluster nodes
- Large capacity (24TB ZFS pool)
- Easy to backup/snapshot

### LXC Root Filesystems

**Decision**: Store on KavNas NFS for most services

**Reason**:
- Easy backups
- Portable between nodes
- Network storage sufficient for most workloads

**Exception**: High I/O services use local-lvm

## Monitoring & Maintenance

### Configuration Management

**Decision**: Manual configuration with documentation

**Reason**: Small scale doesn't justify Ansible/Terraform complexity

**Trade-off**: Requires disciplined documentation updates

### Backup Strategy

**Decision**: Proxmox built-in backup to KavNas

**Frequency**: [To be determined]

**Retention**: [To be determined]

## Common Patterns

### Adding a New Service Behind Traefik

1. Deploy service with static IP in 10.4.2.0/24 range
2. Create Traefik config in `/etc/traefik/conf.d/.yaml`
3. Use pattern:
   ```yaml
   http:
     routers:
       :
         rule: "Host(`.kavcorp.com`)"
         entryPoints: [websecure]
         service:
         tls:
           certResolver: letsencrypt
     services:
       :
         loadBalancer:
           servers:
             - url: "http://:"
   ```
4. Traefik auto-reloads (no restart needed)
5. Update `docs/INFRASTRUCTURE.md` with service details

### Troubleshooting Permission Issues

1. Check file ownership: `ls -la /path/to/file`
2. Check if 777: `stat /path/to/file`
3. Fix permissions: `chmod -R 777 /path/to/directory`
4. For NZBGet: Verify `UMask=0000` in nzbget.conf
5. For Sonarr/Radarr: Check Settings → Media Management → Set Permissions

### Node SSH Access

**From local machine**:
- User: `kavren`
- Key: `~/.ssh/id_ed25519`

**Between cluster nodes**:
- User: `root`
- Each node has other nodes' keys in `/root/.ssh/authorized_keys`
- Proxmox web UI uses node SSH for shell access

## Known Issues & Workarounds

### Jellyfin Not Seeing Media After Import

**Symptom**: Files imported to `/media/tv` but Jellyfin shows empty

**Cause**: Jellyfin LXC mount not active or permissions wrong

**Fix**:
1. Restart Jellyfin LXC: `pct stop 121 && pct start 121`
2. Verify mount inside LXC: `pct exec 121 -- ls -la /media/tv/`
3. Fix permissions if needed: `chmod -R 777 /mnt/pve/elantris-media/tv/`

### Sonarr/Radarr Import Failures

**Symptom**: "Access denied" errors in logs

**Cause**: Permission mismatch between download client and *arr service

**Fix**: Ensure download folder has 777 permissions

## Future Considerations

- [ ] Automated backup strategy
- [ ] Monitoring/alerting system (Prometheus + Grafana?)
- [ ] Consider Authelia for future services without built-in auth
- [ ] Document disaster recovery procedures
- [ ] Consider consolidating Docker hosts
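## Appendix: Permission Audit Sketch

The permission checks under "Troubleshooting Permission Issues" and "Known Issues & Workarounds" inspect one file at a time with `ls` and `stat`. As a rough sketch, the same check could be run across a whole directory tree with `find`. The function name `audit_media_perms` is hypothetical and not part of the existing setup, and the sketch assumes a `find` that supports `-perm` with an octal mode (GNU and BSD versions do).

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every file or directory under the given path
# whose permission bits are not exactly 777.
audit_media_perms() {
  local root="$1"
  # "-perm 0777" matches entries whose mode is exactly 777;
  # the leading "!" inverts the match, so only deviating paths are printed.
  find "$root" ! -perm 0777 -print
}

# Example call, using the media path documented above:
#   audit_media_perms /mnt/pve/elantris-media/tv
```

Running this before `chmod -R 777` shows which paths the recursive fix would actually change. Empty output means everything under the path already matches the 777 standard.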