Initial commit: KavCorp infrastructure documentation

- CLAUDE.md: Project configuration for Claude Code
- docs/: Infrastructure documentation
  - INFRASTRUCTURE.md: Service map, storage, network
  - CONFIGURATIONS.md: Service configs and credentials
  - CHANGELOG.md: Change history
  - DECISIONS.md: Architecture decisions
  - TASKS.md: Task tracking
- scripts/: Automation scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:07:01 -05:00
commit 120c2ec809
19 changed files with 3448 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,239 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repository Purpose
Infrastructure documentation and management repository for the **KavCorp** Proxmox cluster - a 5-node homelab cluster running self-hosted services. This repository supports migration from Docker containers to Proxmox LXCs where appropriate.
## Cluster Architecture
**Cluster Name**: KavCorp
**Nodes**: 5 (pm1, pm2, pm3, pm4, elantris)
**Network**: 10.4.2.0/24
**Primary Management Node**: pm2 (10.4.2.6)
### Node IP Mapping
- pm1: 10.4.2.2
- pm2: 10.4.2.6 (primary for new LXC deployment)
- pm3: 10.4.2.3
- pm4: 10.4.2.5
- elantris: 10.4.2.14 (largest node, 128GB RAM, ZFS storage)
## Common Commands
### Cluster Management
```bash
# Access cluster (use pm2 as primary management node)
ssh pm2
# View cluster status
pvecm status
pvecm nodes
# List all VMs/LXCs across cluster
pvesh get /cluster/resources --type vm --output-format json
# List all nodes
pvesh get /cluster/resources --type node --output-format json
# List storage
pvesh get /cluster/resources --type storage --output-format json
```
### LXC Management
```bash
# List LXCs on a specific node
pct list
# Get LXC configuration
pvesh get /nodes/<node>/lxc/<vmid>/config
pct config <vmid>
# Start/stop/restart LXC
pct start <vmid>
pct stop <vmid>
pct restart <vmid>
# Execute command in LXC
pct exec <vmid> -- <command>
# Enter LXC console
pct enter <vmid>
# Create LXC from template
pct create <vmid> <template> --hostname <name> --cores <n> --memory <mb> --rootfs <storage>:<size>
```
### Network Configuration
```bash
# View network interfaces
ip -br addr show
# Network config location
/etc/network/interfaces
# Standard bridge: vmbr0 (connected to eno1 physical interface)
# Gateway: 10.4.2.254
```
## Storage Architecture
### Storage Pools
**Local Storage** (per-node):
- `local`: Directory storage on each node, for backups/templates/ISOs (~100GB each)
- `local-lvm`: LVM thin pool on each node, for VM/LXC disks (~350-375GB each)
**ZFS Pools**:
- `el-pool`: ZFS pool on elantris (24TB), used for large data storage
**NFS Mounts** (shared):
- `KavNas`: Primary NFS share from Synology NAS (10.4.2.13), ~23TB - used for backups, ISOs, and LXC storage
- `elantris-downloads`: NFS share from elantris, ~23TB - used for media downloads
### Storage Recommendations
- **New LXC containers**: Use `KavNas` for rootfs (NFS, easily backed up)
- **High-performance workloads**: Use `local-lvm` on the host node
- **Large data storage**: Use `elantris-downloads` or `el-pool`
- **Templates and ISOs**: Store in `KavNas` or node's `local`
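Putting these recommendations together, creating a new container might look roughly like the sketch below (the VMID, template filename, and IP are placeholders, not real allocations):
```bash
# Sketch only: adjust VMID, template, IP, and sizing to the actual deployment
pct create 130 KavNas:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst \
  --hostname new-service \
  --cores 2 --memory 1024 \
  --rootfs KavNas:8 \
  --net0 name=eth0,bridge=vmbr0,ip=10.4.2.100/24,gw=10.4.2.254 \
  --unprivileged 1 --features nesting=1
```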
## Service Categories (by tags)
- **arr**: Media automation (*arr stack - Sonarr, Radarr, Prowlarr, Bazarr, Whisparr)
- **media**: Media servers (Jellyfin, Jellyseerr, Kometa)
- **proxy**: Reverse proxy (Traefik)
- **authenticator**: Authentication (Authelia)
- **nvr**: Network Video Recorder (Shinobi)
- **docker**: Docker host LXCs (docker-pm2, docker-pm4) and VMs (docker-pm3)
- **proxmox-helper-scripts**: Deployed via community scripts
- **community-script**: Deployed via ProxmoxVE Helper Scripts
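Tags are applied per container with `pct set` (sketch; Proxmox separates multiple tags with `;`):
```bash
# Example: tag the Sonarr LXC, then verify
pct set 105 --tags "arr;community-script"
pct config 105 | grep tags
```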
## Migration Strategy
**Goal**: Move services from Docker containers to dedicated LXCs where it makes sense.
**Good candidates for LXC migration**:
- Single-purpose services
- Services with simple dependencies
- Stateless applications
- Services that benefit from isolation
**Keep in Docker**:
- Complex multi-container stacks
- Services requiring Docker-specific features
- Temporary/experimental services
**Current Docker Hosts**:
- VM 109: docker-pm3 (on pm3, 4 CPU, 12GB RAM)
- LXC 110: docker-pm4 (on pm4, 4 CPU, 8GB RAM)
- LXC 113: docker-pm2 (on pm2, 4 CPU, 8GB RAM)
- LXC 107: dockge (on pm3, 12 CPU, 8GB RAM) - Docker management UI
## IP Address Allocation
**Infrastructure Services**:
- 10.4.2.10: traefik (LXC 104)
- 10.4.2.13: KavNas (Synology NAS)
- 10.4.2.14: elantris
**Media Stack**:
- 10.4.2.15: sonarr (LXC 105)
- 10.4.2.16: radarr (LXC 108)
- 10.4.2.17: prowlarr (LXC 114)
- 10.4.2.18: bazarr (LXC 119)
- 10.4.2.19: whisparr (LXC 117)
- 10.4.2.20: jellyseerr (LXC 115)
- 10.4.2.21: kometa (LXC 120)
- 10.4.2.22: jellyfin (LXC 121)
**Other Services**:
- 10.4.2.23: authelia (LXC 116)
- 10.4.2.24: notifiarr (LXC 118)
*Note: Update docs/network.md when allocating new IPs*
## Documentation Structure
**CRITICAL**: Always read `docs/README.md` first to understand the documentation system.
### Core Documentation Files (ALWAYS UPDATE, NEVER CREATE NEW)
1. **`docs/INFRASTRUCTURE.md`** - Single source of truth
- **CHECK THIS FIRST** for node IPs, service locations, storage paths
- Update whenever infrastructure changes
2. **`docs/CONFIGURATIONS.md`** - Service configurations
- API keys, config snippets, copy/paste ready configs
- Update when service configs change
3. **`docs/DECISIONS.md`** - Architecture decisions
- Why we made choices, common patterns, troubleshooting
- Update when making decisions or discovering patterns
4. **`docs/TASKS.md`** - Current work tracking
- Active, pending, blocked, and completed tasks
- Update at start and end of work sessions
5. **`docs/CHANGELOG.md`** - Historical record
- Date-stamped entries for all changes
- Update after completing any significant work
### Documentation Workflow
**MANDATORY - Before ANY work session**:
1. Read `docs/README.md` - Understand the documentation system
2. Check `docs/INFRASTRUCTURE.md` - Get current infrastructure state
3. Check `docs/TASKS.md` - See what's already in progress or pending
**MANDATORY - During work**:
- When you need node IPs, service locations, or paths → Read `docs/INFRASTRUCTURE.md`
- When you need config snippets or API keys → Read `docs/CONFIGURATIONS.md`
- When wondering "why is it done this way?" → Read `docs/DECISIONS.md`
- When you discover a pattern or make a decision → Immediately update `docs/DECISIONS.md`
- When you encounter issues → Check `docs/DECISIONS.md` Known Issues section first
**MANDATORY - After completing ANY work**:
1. Update the relevant core doc:
- Infrastructure change? → Update `docs/INFRASTRUCTURE.md`
- Config change? → Update `docs/CONFIGURATIONS.md`
- New pattern/decision? → Update `docs/DECISIONS.md`
2. Add dated entry to `docs/CHANGELOG.md` describing what changed
3. Update `docs/TASKS.md` to mark work complete or add new tasks
4. Update "Last Updated" date in `docs/README.md`
**STRICTLY FORBIDDEN**:
- Creating new documentation files without explicit user approval
- Leaving documentation outdated after making changes
- Creating session-specific notes files (use CHANGELOG for history)
- Skipping documentation updates "to save time"
- Assuming you remember infrastructure details (always check docs)
### When to Update Which File
| You just did... | Update this file |
|----------------|------------------|
| Added/removed a service | `INFRASTRUCTURE.md` (service map) |
| Changed an IP address | `INFRASTRUCTURE.md` (service map) |
| Modified service config | `CONFIGURATIONS.md` (add/update config snippet) |
| Changed API key | `CONFIGURATIONS.md` (update credentials) |
| Made architectural decision | `DECISIONS.md` (add to decisions section) |
| Discovered troubleshooting pattern | `DECISIONS.md` (add to common patterns) |
| Hit a recurring issue | `DECISIONS.md` (add to known issues) |
| Completed a task | `TASKS.md` (mark complete) + `CHANGELOG.md` (add entry) |
| Started new work | `TASKS.md` (add to in progress) |
| ANY significant change | `CHANGELOG.md` (always add dated entry) |
## Scripts
- `scripts/provisioning/`: LXC/VM creation scripts
- `scripts/backup/`: Backup automation scripts
- `scripts/monitoring/`: Monitoring and health check scripts
## Workflow Notes
- New LXCs are primarily deployed on **pm2**
- Use ProxmoxVE Helper Scripts (https://helper-scripts.com) for common services
- Always tag LXCs appropriately for organization
- Document service URLs and access details in `docs/services.md`
- Keep inventory documentation in sync with changes

README.md Normal file

@@ -0,0 +1,116 @@
# KavCorp Proxmox Infrastructure
Documentation and management repository for the KavCorp Proxmox cluster.
## Quick Start
```bash
# Connect to primary management node
ssh pm2
# View cluster status
pvecm status
# List all containers
pvesh get /cluster/resources --type vm --output-format json
```
## Repository Structure
```
proxmox-infra/
├── CLAUDE.md                # Development guidance for Claude Code
├── README.md                # This file
├── docs/                    # Documentation
│   ├── cluster-state.md     # Current cluster topology
│   ├── inventory.md         # VM/LXC inventory with specs
│   ├── network.md           # Network topology and IP assignments
│   ├── storage.md           # Storage layout and usage
│   └── services.md          # Service mappings and dependencies
└── scripts/                 # Management scripts
    ├── backup/              # Backup automation
    ├── provisioning/        # LXC/VM creation scripts
    └── monitoring/          # Health checks and monitoring
```
## Cluster Overview
- **Cluster Name**: KavCorp
- **Nodes**: 5 (pm1, pm2, pm3, pm4, elantris)
- **Total VMs**: 2
- **Total LXCs**: 19
- **Primary Network**: 10.4.2.0/24
- **Management Node**: pm2 (10.4.2.6)
### Nodes
| Node | IP | CPU | RAM | Role |
|---|---|---|---|---|
| pm1 | 10.4.2.2 | 4 cores | 16GB | General purpose |
| pm2 | 10.4.2.6 | 12 cores | 31GB | **Primary management, media stack** |
| pm3 | 10.4.2.3 | 16 cores | 33GB | Docker, NVR, gaming |
| pm4 | 10.4.2.5 | 12 cores | 31GB | Docker, NVR |
| elantris | 10.4.2.14 | 16 cores | 128GB | **Storage node, media server** |
## Key Services
- **Traefik** (10.4.2.10): Reverse proxy
- **Jellyfin** (10.4.2.22): Media server - **Recently added to Traefik**
- **Media Automation**: Sonarr, Radarr, Prowlarr, Bazarr, Whisparr (on pm2)
- **Home Assistant** (VMID 100): Home automation
- **Frigate** (VMID 111): NVR with object detection
## Recent Changes
**2025-11-16**:
- ✅ Created initial repository structure and documentation
- ✅ Documented 5-node cluster configuration
- ✅ Added Jellyfin to Traefik configuration (jellyfin.kavcorp.com)
- ✅ Inventoried 21 containers (2 VMs, 19 LXCs)
## Documentation
See the `docs/` directory for detailed information:
- [Cluster State](docs/cluster-state.md) - Node details and health
- [Inventory](docs/inventory.md) - Complete VM/LXC listing
- [Network](docs/network.md) - IP allocations and network topology
- [Storage](docs/storage.md) - Storage pools and usage
- [Services](docs/services.md) - Service mappings and access URLs
## Common Tasks
### Managing LXCs
```bash
# Start/stop/restart
pct start <vmid>
pct stop <vmid>
pct restart <vmid>
# View config
pct config <vmid>
# Execute command
pct exec <vmid> -- <command>
```
### Checking Resources
```bash
# Cluster-wide resources
pvesh get /cluster/resources --output-format json
# Storage usage
pvesh get /cluster/resources --type storage --output-format json
```
## Access
- **Web UI**: https://pm2.kavcorp.com:8006 (or any node)
- **Traefik Dashboard**: https://traefik.kavcorp.com
- **Jellyfin**: https://jellyfin.kavcorp.com
## Notes
- This is a migration project from a messy `~/infrastructure` repo
- Goal: Move services from Docker to LXCs where appropriate
- Primary new LXC deployment node: **pm2**
- Most services use community helper scripts from https://helper-scripts.com

docs/CHANGELOG.md Normal file

@@ -0,0 +1,153 @@
# Changelog
> **Purpose**: Historical record of all significant infrastructure changes
## 2025-12-07
### Service Additions
- **Vaultwarden**: Created new password manager LXC
- LXC 125 on pm4
- IP: 10.4.2.212
- Domain: vtw.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/vaultwarden.yaml`
- Tagged: community-script, password-manager
- **Immich**: Migrated from Docker (dockge LXC 107 on pm3) to native LXC
- LXC 126 on pm4
- IP: 10.4.2.24:2283
- Domain: immich.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/immich.yaml`
- Library storage: NFS mount from elantris (`/el-pool/downloads/immich/`)
- 38GB photo library transferred via rsync
- Fresh database (version incompatibility: old v1.129.0 → new v2.3.1)
- Services: immich-web.service, immich-ml.service
- Tagged: community-script, photos
### Infrastructure Maintenance
- **Traefik (LXC 104)**: Fixed disk full issue
- Truncated 895MB access log that filled 2GB rootfs
- Added logrotate config to prevent recurrence (50MB max, 7 day rotation)
- Cleaned apt cache and journal logs
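  - A logrotate rule matching that description would look roughly like this (sketch; the actual file and log path on LXC 104 may differ):
```
# /etc/logrotate.d/traefik (illustrative)
/var/log/traefik/*.log {
    size 50M
    rotate 7
    missingok
    notifempty
    compress
    copytruncate
}
```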
## 2025-11-20
### Service Changes
- **AMP**: Added to Traefik reverse proxy
- LXC 124 on elantris (10.4.2.26:8080)
- Domain: amp.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/amp.yaml`
- Purpose: Game server management via CubeCoders AMP
## 2025-11-19
### Service Changes
- **LXC 123 (elantris)**: Migrated from Ollama to llama.cpp
- Removed Ollama installation and service
- Built llama.cpp from source with CURL support
- Downloaded TinyLlama 1.1B Q4_K_M model (~667MB)
- Created systemd service for llama.cpp server
- Server running on port 11434 (OpenAI-compatible API)
- Model path: `/opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf`
- Service: `llama-cpp.service`
- Domain remains: ollama.kavcorp.com (pointing to llama.cpp now)
- **LXC 124 (elantris)**: Created new AMP (Application Management Panel) container
- IP: 10.4.2.26
- Resources: 4 CPU cores, 4GB RAM, 16GB storage
- Storage: local-lvm on elantris
- OS: Ubuntu 24.04 LTS
- Purpose: Game server management via CubeCoders AMP
- Tagged: gaming, amp
## 2025-11-17
### Service Additions
- **Ollama**: Added to Traefik reverse proxy
- LXC 123 on elantris
- IP: 10.4.2.224:11434
- Domain: ollama.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/ollama.yaml`
- Downloaded Qwen 3 Coder 30B model
- **Frigate**: Added to Traefik reverse proxy
- LXC 111 on pm3
- IP: 10.4.2.215:5000
- Domain: frigate.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/frigate.yaml`
- **Foundry VTT**: Added to Traefik reverse proxy
- LXC 112 on pm3
- IP: 10.4.2.37:30000
- Domain: vtt.kavcorp.com
- Traefik config: `/etc/traefik/conf.d/foundry.yaml`
### Infrastructure Changes
- **SSH Access**: Regenerated SSH keys on pm2 and distributed to all cluster nodes
- pm3 SSH service was down, enabled and configured
- All nodes (pm1, pm2, pm3, pm4, elantris) now accessible from pm2 via Proxmox web UI
### Service Configuration
- **NZBGet**: Fixed file permissions
- Set `UMask=0000` in nzbget.conf to create files with 777 permissions
- Fixed permission issues causing Sonarr import failures
- **Sonarr**: Enabled automatic permission setting
- Media Management → Set Permissions → chmod 777
- Ensures imported files are accessible by Jellyfin
- **Jellyseerr**: Fixed Traefik routing
- Corrected IP from 10.4.2.20 to 10.4.2.18 in media-services.yaml
- **Jellyfin**: Fixed LXC mount issues
- Restarted LXC 121 to activate media mounts
- Media now visible in `/media/tv`, `/media/movies`, `/media/anime`
### Documentation
- **Major Reorganization**: Consolidated scattered docs into structured system
- Created `README.md` - Documentation index and guide
- Created `INFRASTRUCTURE.md` - All infrastructure details
- Created `CONFIGURATIONS.md` - Service configurations
- Created `DECISIONS.md` - Architecture decisions and patterns
- Created `TASKS.md` - Current and pending tasks
- Created `CHANGELOG.md` - This file
- Updated `CLAUDE.md` - Added documentation policy
## 2025-11-16
### Service Deployments
- **Home Assistant**: Added to Traefik reverse proxy
- Domain: hass.kavcorp.com
- Configured trusted proxies in Home Assistant
- **Frigate**: Added to Traefik reverse proxy
- Domain: frigate.kavcorp.com
- **Proxmox**: Added to Traefik reverse proxy
- Domain: pm.kavcorp.com
- Backend: pm2 (10.4.2.6:8006)
- **Recyclarr**: Configured TRaSH Guides automation
- Sonarr and Radarr quality profiles synced
- Dolby Vision blocking implemented
- Daily sync schedule via cron
### Configuration Changes
- **Traefik**: Removed Authelia from *arr services
- Services now use only built-in authentication
- Simplified access for Sonarr, Radarr, Prowlarr, Bazarr, Whisparr, NZBGet
### Issues Encountered
- Media organization script moved files incorrectly
- Sonarr database corruption (lost TV series tracking)
- Permission issues with NZBGet downloads
- Jellyfin LXC mount not active after deployment
### Lessons Learned
- Always verify file permissions (777 required for NFS media)
- Backup service databases before running automation scripts
- LXC mounts may need container restart to activate
- Traefik auto-reloads configs, no restart needed
## Earlier History
*To be documented from previous sessions if needed*

docs/CONFIGURATIONS.md Normal file

@@ -0,0 +1,312 @@
# Configuration Reference
> **Purpose**: Detailed configuration for all services - copy/paste ready configs and settings
> **Update Frequency**: When service configurations change
## Traefik
### SSL/TLS with Let's Encrypt
**Location**: LXC 104 on pm2
**Environment Variables** (`/etc/systemd/system/traefik.service.d/override.conf`):
```bash
NAMECHEAP_API_USER=kavren
NAMECHEAP_API_KEY=8156f3d9ef664c91b95f029dfbb62ad5
NAMECHEAP_PROPAGATION_TIMEOUT=3600
NAMECHEAP_POLLING_INTERVAL=30
NAMECHEAP_TTL=300
```
**Main Config** (`/etc/traefik/traefik.yaml`):
```yaml
certificatesResolvers:
  letsencrypt:
    acme:
      email: cory.bailey87@gmail.com
      storage: /etc/traefik/ssl/acme.json
      dnsChallenge:
        provider: namecheap
        resolvers:
          - "1.1.1.1:53"
          - "8.8.8.8:53"
```
### Service Routing Examples
**Home Assistant** (`/etc/traefik/conf.d/home-automation.yaml`):
```yaml
http:
  routers:
    homeassistant:
      rule: "Host(`hass.kavcorp.com`)"
      entryPoints:
        - websecure
      service: homeassistant
      tls:
        certResolver: letsencrypt
  services:
    homeassistant:
      loadBalancer:
        servers:
          - url: "http://10.4.2.62:8123"
```
**Ollama** (`/etc/traefik/conf.d/ollama.yaml`):
```yaml
http:
  routers:
    ollama:
      rule: "Host(`ollama.kavcorp.com`)"
      entryPoints:
        - websecure
      service: ollama
      tls:
        certResolver: letsencrypt
  services:
    ollama:
      loadBalancer:
        servers:
          - url: "http://10.4.2.224:11434"
```
**Frigate** (`/etc/traefik/conf.d/frigate.yaml`):
```yaml
http:
  routers:
    frigate:
      rule: "Host(`frigate.kavcorp.com`)"
      entryPoints:
        - websecure
      service: frigate
      tls:
        certResolver: letsencrypt
  services:
    frigate:
      loadBalancer:
        servers:
          - url: "http://10.4.2.215:5000"
```
**Foundry VTT** (`/etc/traefik/conf.d/foundry.yaml`):
```yaml
http:
  routers:
    foundry:
      rule: "Host(`vtt.kavcorp.com`)"
      entryPoints:
        - websecure
      service: foundry
      tls:
        certResolver: letsencrypt
  services:
    foundry:
      loadBalancer:
        servers:
          - url: "http://10.4.2.37:30000"
```
**Proxmox** (`/etc/traefik/conf.d/proxmox.yaml`):
```yaml
http:
  routers:
    proxmox:
      rule: "Host(`pm.kavcorp.com`)"
      entryPoints:
        - websecure
      service: proxmox
      tls:
        certResolver: letsencrypt
  services:
    proxmox:
      loadBalancer:
        servers:
          - url: "https://10.4.2.6:8006"
        serversTransport: proxmox-transport
  serversTransports:
    proxmox-transport:
      insecureSkipVerify: true
```
## AMP (Application Management Panel)
**Location**: LXC 124 on elantris
**IP**: 10.4.2.26:8080
**Domain**: amp.kavcorp.com
**Traefik Config** (`/etc/traefik/conf.d/amp.yaml`):
```yaml
http:
  routers:
    amp:
      rule: "Host(`amp.kavcorp.com`)"
      entryPoints:
        - websecure
      service: amp
      tls:
        certResolver: letsencrypt
  services:
    amp:
      loadBalancer:
        servers:
          - url: "http://10.4.2.26:8080"
```
## Home Assistant
**Location**: VM 100 on pm1
**IP**: 10.4.2.62:8123
**Reverse Proxy Config** (`/config/configuration.yaml`):
```yaml
http:
  use_x_forwarded_for: true
  trusted_proxies:
    - 10.4.2.10       # Traefik IP
    - 172.30.0.0/16   # Home Assistant internal network (for add-ons)
```
## Sonarr
**Location**: LXC 105 on pm2
**IP**: 10.4.2.15:8989
**API Key**: b331fe18ec2144148a41645d9ce8b249
**Media Management Settings**:
- Permissions: Enabled, chmod 777
- Hardlinks: Enabled
- Episode title required: Always
- Free space check: 100MB minimum
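A quick way to confirm the API key and service health (Sonarr v3 API; substitute the key above):
```bash
curl -s -H "X-Api-Key: <api-key-above>" http://10.4.2.15:8989/api/v3/system/status
```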
## Radarr
**Location**: LXC 108
**IP**: 10.4.2.16:7878
**API Key**: 5e6796988abf4d6d819a2b506a44f422
## NZBGet
**Location**: Docker on kavnas (10.4.2.13)
**Port**: 6789
**Web User**: kavren
**Web Password**: fre8ub2ax8
**Key Settings** (`/volume1/docker/nzbget/config/nzbget.conf`):
```ini
MainDir=/config
DestDir=/downloads/completed
InterDir=/downloads/intermediate
UMask=0000 # Creates files with 777 permissions
```
**Docker Mounts**:
- Config: `/volume1/docker/nzbget/config:/config`
- Downloads: `/volume1/Media/downloads:/downloads`
## Recyclarr
**Location**: LXC 122 on pm2
**IP**: 10.4.2.25
**Binary**: `/usr/local/bin/recyclarr`
**Config**: `/root/.config/recyclarr/recyclarr.yml`
**Sync Schedule**: Daily at 3 AM via cron
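The daily sync is typically a root crontab entry along these lines (sketch; the actual entry may differ):
```bash
# crontab -e (root): run Recyclarr sync daily at 3 AM
0 3 * * * /usr/local/bin/recyclarr sync
```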
**Configured Profiles**:
- **Radarr**: HD Bluray + WEB (1080p), Remux-1080p - Anime
- **Sonarr**: WEB-1080p, Remux-1080p - Anime
- **Custom Formats**: TRaSH Guides synced (Dolby Vision blocked, release group tiers)
## Jellyfin
**Location**: LXC 121 on elantris
**IP**: 10.4.2.21:8096
**Media Mounts** (inside LXC):
- `/media/tv` → `/el-pool/media/tv`
- `/media/anime` → `/el-pool/media/anime`
- `/media/movies` → `/el-pool/media/movies`
**Permissions**: Files must be 777 for Jellyfin user (UID 100107 in LXC) to access
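The mounts above are bind mounts from the elantris host into the LXC; recreating one would look roughly like this (mount-point indices are hypothetical, check `pct config 121` for the real ones):
```bash
pct set 121 -mp0 /el-pool/media/tv,mp=/media/tv
pct set 121 -mp1 /el-pool/media/movies,mp=/media/movies
pct set 121 -mp2 /el-pool/media/anime,mp=/media/anime
```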
## Vaultwarden
**Location**: LXC 125 on pm4
**IP**: 10.4.2.212:80
**Domain**: vtw.kavcorp.com
**Traefik Config** (`/etc/traefik/conf.d/vaultwarden.yaml`):
```yaml
http:
  routers:
    vaultwarden:
      rule: "Host(`vtw.kavcorp.com`)"
      entryPoints:
        - websecure
      service: vaultwarden
      tls:
        certResolver: letsencrypt
  services:
    vaultwarden:
      loadBalancer:
        servers:
          - url: "http://10.4.2.212:80"
```
## Immich
**Location**: LXC 126 on pm4
**IP**: 10.4.2.24:2283
**Domain**: immich.kavcorp.com
**Config** (`/opt/immich/.env`):
```bash
TZ=America/Indiana/Indianapolis
IMMICH_VERSION=release
NODE_ENV=production
DB_HOSTNAME=127.0.0.1
DB_USERNAME=immich
DB_PASSWORD=AulF5JhgWXrRxtaV05
DB_DATABASE_NAME=immich
DB_VECTOR_EXTENSION=pgvector
REDIS_HOSTNAME=127.0.0.1
IMMICH_MACHINE_LEARNING_URL=http://127.0.0.1:3003
MACHINE_LEARNING_CACHE_FOLDER=/opt/immich/cache
IMMICH_MEDIA_LOCATION=/mnt/immich-library
```
**NFS Mount** (configured via `pct set 126 -mp0`):
- Host path: `/mnt/pve/elantris-downloads/immich`
- Container path: `/mnt/immich-library`
- Source: elantris (`/el-pool/downloads/immich/`)
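Recreating the mount point from the values above would look like this (sketch, run on pm4):
```bash
pct set 126 -mp0 /mnt/pve/elantris-downloads/immich,mp=/mnt/immich-library
```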
**Systemd Services**:
- `immich-web.service` - Web UI and API
- `immich-ml.service` - Machine learning service
**Traefik Config** (`/etc/traefik/conf.d/immich.yaml`):
```yaml
http:
  routers:
    immich:
      rule: "Host(`immich.kavcorp.com`)"
      entryPoints:
        - websecure
      service: immich
      tls:
        certResolver: letsencrypt
  services:
    immich:
      loadBalancer:
        servers:
          - url: "http://10.4.2.24:2283"
```

docs/DECISIONS.md Normal file

@@ -0,0 +1,163 @@
# Architecture Decisions & Patterns
> **Purpose**: Record of important decisions, patterns, and "why we do it this way"
> **Update Frequency**: When making significant architectural choices
## Service Organization
### Authentication Strategy
**Decision**: Services use their own built-in authentication, not Authelia
**Reason**: Most *arr services and media tools have robust auth systems
**Exception**: Consider Authelia for future services that lack authentication
### LXC vs Docker
**Keep in Docker**:
- NZBGet (requires specific volume mapping, works well in Docker)
- Multi-container stacks
- Services requiring Docker-specific features
**Migrate to LXC**:
- Single-purpose services (Sonarr, Radarr, etc.)
- Services benefiting from isolation
- Stateless applications
## File Permissions
### Media Files
**Standard**: All media files and folders must be 777
**Reason**:
- NFS mounts between multiple systems with different UID mappings
- Jellyfin runs in LXC with UID namespace mapping (100107)
- Sonarr runs in LXC with different UID mapping
- NZBGet runs in Docker with UID 1000
**Implementation**:
- NZBGet: `UMask=0000` to create files with 777
- Sonarr: Media management → Set permissions → chmod 777
- Manual fixes: `chmod -R 777` on media directories as needed
## Network Architecture
### Reverse Proxy
**Decision**: Single Traefik instance handles all external access
**Location**: LXC 104 on pm2
**Benefits**:
- Single point for SSL/TLS management
- Automatic Let's Encrypt certificate renewal
- Centralized routing configuration
- DNS-01 challenge for wildcard certificates
### Service Domains
**Pattern**: `<service>.kavcorp.com`
**DNS**: All subdomains point to public IP (99.74.188.161)
**Routing**: Traefik inspects Host header and routes internally
## Storage Architecture
### Media Storage
**Decision**: NFS mount from elantris for all media
**Path**: `/mnt/pve/elantris-media` → elantris `/el-pool/media`
**Reason**:
- Centralized storage
- Accessible from all cluster nodes
- Large capacity (24TB ZFS pool)
- Easy to backup/snapshot
### LXC Root Filesystems
**Decision**: Store on KavNas NFS for most services
**Reason**:
- Easy backups
- Portable between nodes
- Network storage sufficient for most workloads
**Exception**: High I/O services use local-lvm
## Monitoring & Maintenance
### Configuration Management
**Decision**: Manual configuration with documentation
**Reason**: Small scale doesn't justify Ansible/Terraform complexity
**Trade-off**: Requires disciplined documentation updates
### Backup Strategy
**Decision**: Proxmox built-in backup to KavNas
**Frequency**: [To be determined]
**Retention**: [To be determined]
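A one-off manual backup with the built-in tooling would look roughly like this (illustrative VMID; scheduled jobs belong under Datacenter → Backup once frequency and retention are decided):
```bash
vzdump 125 --storage KavNas --mode snapshot --compress zstd
```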
## Common Patterns
### Adding a New Service Behind Traefik
1. Deploy service with static IP in 10.4.2.0/24 range
2. Create Traefik config in `/etc/traefik/conf.d/<service>.yaml`
3. Use pattern:
```yaml
http:
  routers:
    <service>:
      rule: "Host(`<service>.kavcorp.com`)"
      entryPoints: [websecure]
      service: <service>
      tls:
        certResolver: letsencrypt
  services:
    <service>:
      loadBalancer:
        servers:
          - url: "http://<ip>:<port>"
```
4. Traefik auto-reloads (no restart needed)
5. Update `docs/INFRASTRUCTURE.md` with service details
### Troubleshooting Permission Issues
1. Check file ownership: `ls -la /path/to/file`
2. Check if 777: `stat /path/to/file`
3. Fix permissions: `chmod -R 777 /path/to/directory`
4. For NZBGet: Verify `UMask=0000` in nzbget.conf
5. For Sonarr/Radarr: Check Settings → Media Management → Set Permissions
### Node SSH Access
**From local machine**:
- User: `kavren`
- Key: `~/.ssh/id_ed25519`
**Between cluster nodes**:
- User: `root`
- Each node has other nodes' keys in `/root/.ssh/authorized_keys`
- Proxmox web UI uses node SSH for shell access
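Regenerating and distributing a node key follows the usual pattern (illustrative; run on pm2 as root and repeat `ssh-copy-id` per node):
```bash
ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N ""
ssh-copy-id -i /root/.ssh/id_ed25519.pub root@10.4.2.3   # pm3
```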
## Known Issues & Workarounds
### Jellyfin Not Seeing Media After Import
**Symptom**: Files imported to `/media/tv` but Jellyfin shows empty
**Cause**: Jellyfin LXC mount not active or permissions wrong
**Fix**:
1. Restart Jellyfin LXC: `pct stop 121 && pct start 121`
2. Verify mount inside LXC: `pct exec 121 -- ls -la /media/tv/`
3. Fix permissions if needed: `chmod -R 777 /mnt/pve/elantris-media/tv/`
### Sonarr/Radarr Import Failures
**Symptom**: "Access denied" errors in logs
**Cause**: Permission mismatch between download client and *arr service
**Fix**: Ensure download folder has 777 permissions
## Future Considerations
- [ ] Automated backup strategy
- [ ] Monitoring/alerting system (Prometheus + Grafana?)
- [ ] Consider Authelia for future services without built-in auth
- [ ] Document disaster recovery procedures
- [ ] Consider consolidating Docker hosts

docs/INFRASTRUCTURE.md Normal file

@@ -0,0 +1,120 @@
# Infrastructure Reference
> **Purpose**: Single source of truth for all infrastructure details - nodes, IPs, services, storage, network
> **Update Frequency**: Immediately when infrastructure changes
## Proxmox Cluster Nodes
| Hostname | IP Address | Role | Resources |
|----------|-------------|------|-----------|
| pm1 | 10.4.2.2 | Proxmox cluster node | - |
| pm2 | 10.4.2.6 | Proxmox cluster node (primary management) | - |
| pm3 | 10.4.2.3 | Proxmox cluster node | - |
| pm4 | 10.4.2.5 | Proxmox cluster node | - |
| elantris | 10.4.2.14 | Proxmox cluster node (Debian-based) | 128GB RAM, ZFS storage (24TB) |
**Cluster Name**: KavCorp
**Network**: 10.4.2.0/24
**Gateway**: 10.4.2.254
## Service Map
| Service | IP:Port | Location | Domain | Auth |
|---------|---------|----------|--------|------|
| **Proxmox Web UI** | 10.4.2.6:8006 | pm2 | pm.kavcorp.com | Proxmox built-in |
| **Traefik** | 10.4.2.10 | LXC 104 (pm2) | - | None (reverse proxy) |
| **Authelia** | 10.4.2.19 | LXC 116 (pm2) | auth.kavcorp.com | SSO provider |
| **Sonarr** | 10.4.2.15:8989 | LXC 105 (pm2) | sonarr.kavcorp.com | Built-in |
| **Radarr** | 10.4.2.16:7878 | LXC 108 (pm2) | radarr.kavcorp.com | Built-in |
| **Prowlarr** | 10.4.2.17:9696 | LXC 114 (pm2) | prowlarr.kavcorp.com | Built-in |
| **Jellyseerr** | 10.4.2.18:5055 | LXC 115 (pm2) | jellyseerr.kavcorp.com | Built-in |
| **Whisparr** | 10.4.2.20:6969 | LXC 117 (pm2) | whisparr.kavcorp.com | Built-in |
| **Notifiarr** | 10.4.2.21 | LXC 118 (pm2) | - | API key |
| **Jellyfin** | 10.4.2.21:8096 | LXC 121 (elantris) | jellyfin.kavcorp.com | Built-in |
| **Bazarr** | 10.4.2.22:6767 | LXC 119 (pm2) | bazarr.kavcorp.com | Built-in |
| **Kometa** | 10.4.2.23 | LXC 120 (pm2) | - | N/A |
| **Recyclarr** | 10.4.2.25 | LXC 122 (pm2) | - | CLI only |
| **NZBGet** | 10.4.2.13:6789 | Docker (kavnas) | nzbget.kavcorp.com | Built-in |
| **Home Assistant** | 10.4.2.62:8123 | VM 100 (pm1) | hass.kavcorp.com | Built-in |
| **Frigate** | 10.4.2.215:5000 | LXC 111 (pm3) | frigate.kavcorp.com | Built-in |
| **Foundry VTT** | 10.4.2.37:30000 | LXC 112 (pm3) | vtt.kavcorp.com | Built-in |
| **llama.cpp** | 10.4.2.224:11434 | LXC 123 (elantris) | ollama.kavcorp.com | None (API) |
| **AMP** | 10.4.2.26:8080 | LXC 124 (elantris) | amp.kavcorp.com | Built-in |
| **Vaultwarden** | 10.4.2.212 | LXC 125 (pm4) | vtw.kavcorp.com | Built-in |
| **Immich** | 10.4.2.24:2283 | LXC 126 (pm4) | immich.kavcorp.com | Built-in |
| **KavNas** | 10.4.2.13 | Synology NAS | - | NAS auth |
## Storage Architecture
### NFS Mounts (Shared)
| Mount Name | Source | Mount Point | Size | Usage |
|------------|--------|-------------|------|-------|
| elantris-media | elantris:/el-pool/media | /mnt/pve/elantris-media | ~24TB | Media files (movies, TV, anime) |
| KavNas | kavnas (10.4.2.13):/volume1 | /mnt/pve/KavNas | ~23TB | Backups, ISOs, LXC storage, downloads |
### Local Storage (Per-Node)
| Storage | Type | Size | Usage |
|---------|------|------|-------|
| local | Directory | ~100GB | Backups, templates, ISOs |
| local-lvm | LVM thin pool | ~350-375GB | VM/LXC disks |
### ZFS Pools
| Pool | Location | Size | Usage |
|------|----------|------|-------|
| el-pool | elantris | 24TB | Large data storage |
### Media Folders
| Path | Type | Permissions | Notes |
|------|------|-------------|-------|
| /mnt/pve/elantris-media/movies | NFS | 777 | Movie library |
| /mnt/pve/elantris-media/tv | NFS | 777 | TV show library |
| /mnt/pve/elantris-media/anime | NFS | 777 | Anime library |
| /mnt/pve/elantris-media/processing | NFS | 777 | Processing/cleanup folder |
| /mnt/pve/KavNas/downloads | NFS | 777 | Download client output |
## Network Configuration
### DNS & Domains
**Domain**: kavcorp.com
**DNS Provider**: Namecheap
**Public IP**: 99.74.188.161
All `*.kavcorp.com` subdomains route through Traefik reverse proxy (10.4.2.10) for SSL termination and routing.
### Standard Bridge
**Bridge**: vmbr0
**Physical Interface**: eno1
**CIDR**: 10.4.2.0/24
**Gateway**: 10.4.2.254
## Access & Credentials
### SSH Access
- **User**: kavren (from local machine)
- **User**: root (between cluster nodes)
- **Key Type**: ed25519
- **Node-to-Node**: Passwordless SSH configured for cluster operations
### Important Paths
**Traefik (LXC 104)**:
- Config: `/etc/traefik/traefik.yaml`
- Service configs: `/etc/traefik/conf.d/*.yaml`
- SSL certs: `/etc/traefik/ssl/acme.json`
- Service file: `/etc/systemd/system/traefik.service.d/override.conf`
**Media Services**:
- Sonarr config: `/var/lib/sonarr/`
- Radarr config: `/var/lib/radarr/`
- Recyclarr config: `/root/.config/recyclarr/recyclarr.yml`
**NZBGet (Docker on kavnas)**:
- Config: `/volume1/docker/nzbget/config/nzbget.conf`
- Downloads: `/volume1/Media/downloads/`

docs/README.md Normal file

@@ -0,0 +1,145 @@
# Documentation Index
> **Last Updated**: 2025-11-17 (Added Frigate and Foundry VTT to Traefik)
> **IMPORTANT**: Update this index whenever you modify documentation files
## Quick Reference
Need to know... | Check this file
--- | ---
Node IPs, service locations, storage paths | `INFRASTRUCTURE.md`
Service configs, API keys, copy/paste configs | `CONFIGURATIONS.md`
Why we made a decision, common patterns | `DECISIONS.md`
What's currently being worked on | `TASKS.md`
Recent changes and when they happened | `CHANGELOG.md`
## Core Documentation Files
### INFRASTRUCTURE.md
**Purpose**: Single source of truth for all infrastructure
**Contains**:
- Cluster node IPs and specs
- Complete service map with IPs, ports, domains
- Storage architecture (NFS mounts, local storage, ZFS)
- Network configuration
- Important file paths
**Update when**: Infrastructure changes (new service, IP change, storage mount)
---
### CONFIGURATIONS.md
**Purpose**: Detailed service configurations
**Contains**:
- Traefik SSL/TLS setup
- Service routing examples
- API keys and credentials
- Copy/paste ready config snippets
- Service-specific settings
**Update when**: Service configuration changes, API keys rotate, new services added
---
### DECISIONS.md
**Purpose**: Architecture decisions and patterns
**Contains**:
- Why we chose LXC vs Docker for services
- Authentication strategy
- File permission standards (777 for media)
- Common troubleshooting patterns
- Known issues and workarounds
**Update when**: Making architectural decisions, discovering new patterns, solving recurring issues
---
### TASKS.md
**Purpose**: Track ongoing work and TODO items
**Contains**:
- Active tasks being worked on
- Pending tasks
- Blocked items
- Task priority
**Update when**: Starting new work, completing tasks, discovering new work
---
### CHANGELOG.md
**Purpose**: Historical record of changes
**Contains**:
- Date-stamped entries for all significant changes
- Who made the change (user/Claude)
- What was changed and why
- Links to relevant commits or files
**Update when**: After completing any significant work
---
## Legacy Files (To Be Removed)
These files will be consolidated into the core docs above:
- ~~`infrastructure-map.md`~~ → Merged into `INFRASTRUCTURE.md`
- ~~`home-assistant-traefik.md`~~ → Merged into `CONFIGURATIONS.md`
- ~~`traefik-ssl-setup.md`~~ → Merged into `CONFIGURATIONS.md`
- ~~`recyclarr-setup.md`~~ → Merged into `CONFIGURATIONS.md`
Keep for reference (detailed info):
- `cluster-state.md` - Detailed cluster topology
- `inventory.md` - Complete VM/LXC inventory
- `network.md` - Detailed network info
- `storage.md` - Detailed storage info
- `services.md` - Service dependencies and details
## Documentation Workflow
### When Making Changes
1. **Before starting**: Check `INFRASTRUCTURE.md` for current state
2. **During work**: Note what you're changing
3. **After completing**:
- Update relevant core doc (`INFRASTRUCTURE.md`, `CONFIGURATIONS.md`, or `DECISIONS.md`)
- Add entry to `CHANGELOG.md` with date and description
- Update `TASKS.md` to mark work complete
- Update `README.md` (this file) Last Updated date
### Example Workflow
```
Task: Add new service "Tautulli" to monitor Jellyfin
1. Check INFRASTRUCTURE.md → Find next available IP
2. Deploy service
3. Update INFRASTRUCTURE.md → Add Tautulli to service map
4. Update CONFIGURATIONS.md → Add Tautulli config snippet
5. Update CHANGELOG.md → "2025-11-17: Added Tautulli LXC..."
6. Update TASKS.md → Mark "Deploy Tautulli" as complete
7. Update README.md → Change Last Updated date
```
## File Organization
```
docs/
├── README.md ← You are here (index and guide)
├── INFRASTRUCTURE.md ← Infrastructure reference
├── CONFIGURATIONS.md ← Service configurations
├── DECISIONS.md ← Architecture decisions
├── TASKS.md ← Current/ongoing tasks
├── CHANGELOG.md ← Historical changes
├── cluster-state.md ← [Keep] Detailed topology
├── inventory.md ← [Keep] Full VM/LXC list
├── network.md ← [Keep] Network details
├── storage.md ← [Keep] Storage details
└── services.md ← [Keep] Service details
```
## Maintenance
- Review and update docs weekly
- Clean up completed tasks monthly
- Archive old changelog entries yearly
- Verify INFRASTRUCTURE.md matches reality regularly

docs/TASKS.md Normal file

@@ -0,0 +1,37 @@
# Current Tasks
> **Last Updated**: 2025-11-17
## In Progress
None currently.
## Pending
### Media Organization
- [ ] Verify Jellyfin can see all imported media
- [ ] Clean up `.processing-loose-episodes` folder
- [ ] Review and potentially restore TV shows from processing folder
### Configuration
- [ ] Consider custom format to prefer English audio releases
- [ ] Review Sonarr language profiles for non-English releases
### Infrastructure
- [ ] Define backup strategy and schedule
- [ ] Set up monitoring/alerting system
- [ ] Document disaster recovery procedures
## Completed (Recent)
- [x] Fixed SSH access between cluster nodes (pm2 can access all nodes)
- [x] Fixed NZBGet permissions (UMask=0000 for 777 files)
- [x] Fixed Sonarr permissions (chmod 777 on imports)
- [x] Fixed Jellyfin LXC mounts (restarted LXC)
- [x] Fixed Jellyseerr IP in Traefik config
- [x] Consolidated documentation structure
- [x] Created documentation index
## Blocked
None currently.

docs/cluster-state.md Normal file

@@ -0,0 +1,115 @@
# KavCorp Proxmox Cluster State
**Last Updated**: 2025-11-16
## Cluster Overview
- **Cluster Name**: KavCorp
- **Config Version**: 6
- **Transport**: knet
- **Quorum Status**: Quorate (5/5 nodes online)
- **Total Nodes**: 5
- **Total VMs**: 2
- **Total LXCs**: 19
## Node Details
### pm1 (10.4.2.2)
- **CPU**: 4 cores
- **Memory**: 16GB (15.4 GiB)
- **Storage**: ~100GB local
- **Uptime**: ~52 hours
- **Status**: Online
- **Running Containers**:
- VMID 100: haos12.1 (VM - Home Assistant OS)
- VMID 101: twingate (LXC)
- VMID 102: zwave-js-ui (LXC)
### pm2 (10.4.2.6) - Primary Management Node
- **CPU**: 12 cores
- **Memory**: 31GB (29.3 GiB)
- **Storage**: ~100GB local
- **Uptime**: ~52 hours
- **Status**: Online
- **Running Containers**:
- VMID 104: traefik (LXC - Reverse Proxy)
- VMID 105: sonarr (LXC)
- VMID 108: radarr (LXC)
- VMID 113: docker-pm2 (LXC - Docker host)
- VMID 114: prowlarr (LXC)
- VMID 115: jellyseerr (LXC)
- VMID 116: authelia (LXC)
- VMID 117: whisparr (LXC)
- VMID 118: notifiarr (LXC)
- VMID 119: bazarr (LXC)
- VMID 120: kometa (LXC)
### pm3 (10.4.2.3)
- **CPU**: 16 cores
- **Memory**: 33GB (30.7 GiB)
- **Storage**: ~100GB local
- **Uptime**: ~319 hours (~13 days)
- **Status**: Online
- **Running Containers**:
- VMID 106: mqtt (LXC)
- VMID 107: dockge (LXC - Docker management UI, 12 CPU, 8GB RAM)
- VMID 109: docker-pm3 (VM - Docker host, 4 CPU, 12GB RAM)
- VMID 111: frigate (LXC - NVR)
- VMID 112: foundryvtt (LXC - Virtual tabletop)
### pm4 (10.4.2.5)
- **CPU**: 12 cores
- **Memory**: 31GB (29.3 GiB)
- **Storage**: ~100GB local
- **Uptime**: ~52 hours
- **Status**: Online
- **Running Containers**:
- VMID 103: shinobi (LXC - NVR)
- VMID 110: docker-pm4 (LXC - Docker host)
### elantris (10.4.2.14) - Storage Node
- **CPU**: 16 cores
- **Memory**: 128GB (125.7 GiB) - **Largest node**
- **Storage**: ~100GB local + 24TB ZFS pool (el-pool)
- **Uptime**: ~26 minutes (recently rebooted)
- **Status**: Online
- **Running Containers**:
- VMID 121: jellyfin (LXC - Media server)
## Cluster Health
- **Quorum**: Yes (3/5 required, 5/5 available)
- **Expected Votes**: 5
- **Total Votes**: 5
- **All Nodes**: Online and healthy
## Network Architecture
- **Primary Network**: 10.4.2.0/24
- **Gateway**: 10.4.2.254
- **Bridge**: vmbr0 (on all nodes, bridged to eno1)
- **DNS**: Managed by gateway/router
## Storage Summary
### Shared Storage
- **KavNas** (NFS): 23TB total, ~9.2TB used - Primary shared storage from Synology DS918+
- **elantris-downloads** (NFS): 23TB total, ~10.6TB used - Download storage from elantris
### Node-Local Storage
Each node has:
- **local**: ~100GB directory storage (backups, templates, ISOs)
- **local-lvm**: ~350-375GB LVM thin pool (VM/LXC disks)
### ZFS Storage
- **el-pool** (elantris only): 24TB ZFS pool, ~13.8TB used
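Pool usage can be checked on elantris directly (sketch):
```bash
zpool list el-pool
zfs list -o name,used,avail -r el-pool
```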
## Migration Status
Currently migrating services from Docker containers to dedicated LXCs. Most media stack services (Sonarr, Radarr, etc.) have been successfully migrated to LXCs on pm2.
**Active Docker Hosts**:
- docker-pm2 (LXC 113): Currently empty/minimal usage
- docker-pm3 (VM 109): Active, running containerized services
- docker-pm4 (LXC 110): Active
- dockge (LXC 107): Docker management UI with web interface


@@ -0,0 +1,304 @@
# Home Assistant + Traefik Configuration
**Last Updated**: 2025-11-16
## Overview
Home Assistant is configured to work behind Traefik as a reverse proxy, accessible via `https://hass.kavcorp.com`.
## Configuration Details
### Home Assistant
- **VMID**: 100
- **Node**: pm1
- **Type**: QEMU VM (Home Assistant OS)
- **Internal IP**: 10.4.2.62
- **Internal Port**: 8123
- **External URL**: https://hass.kavcorp.com
### Traefik Configuration
**Location**: `/etc/traefik/conf.d/home-automation.yaml` (inside Traefik LXC 104)
```yaml
http:
  routers:
    homeassistant:
      rule: "Host(`hass.kavcorp.com`)"
      entryPoints:
        - websecure
      service: homeassistant
      tls:
        certResolver: letsencrypt
      # Home Assistant has its own auth
  services:
    homeassistant:
      loadBalancer:
        servers:
          - url: "http://10.4.2.62:8123"
```
### Home Assistant Configuration
**File**: `/config/configuration.yaml` (inside Home Assistant VM)
Add or merge the following section:
```yaml
http:
  use_x_forwarded_for: true
  trusted_proxies:
    - 10.4.2.10       # Traefik IP
    - 172.30.0.0/16   # Home Assistant internal network (for add-ons)
```
#### Configuration Explanation:
- **`use_x_forwarded_for: true`**: Enables Home Assistant to read the real client IP from the `X-Forwarded-For` header that Traefik adds. This is important for:
- Accurate logging of client IPs
- IP-based authentication and blocking
- Geolocation features
- **`trusted_proxies`**: Whitelist of proxy IPs that Home Assistant will trust
- `10.4.2.10` - Traefik reverse proxy
- `172.30.0.0/16` - Home Assistant's internal Docker network (needed for add-ons to communicate)
## Setup Steps
### Method 1: Web UI (Recommended)
1. **Install File Editor Add-on** (if not already installed):
- Go to **Settings** → **Add-ons** → **Add-on Store**
- Search for "File editor"
- Click **Install**
2. **Edit Configuration**:
- Open the **File editor** add-on
- Navigate to `/config/configuration.yaml`
- Add the `http:` section shown above
- If an `http:` section already exists, merge the settings
- Save the file
3. **Check Configuration**:
- Go to **Developer Tools** → **YAML**
- Click **Check Configuration**
- Fix any errors if shown
4. **Restart Home Assistant**:
- Go to **Settings** → **System** → **Restart**
- Wait for Home Assistant to come back online
### Method 2: Terminal & SSH Add-on
If you have the **Terminal & SSH** add-on installed:
```bash
# Edit the configuration
nano /config/configuration.yaml
# Add the http section shown above
# Save with Ctrl+X, Y, Enter
# Check configuration
ha core check
# Restart Home Assistant
ha core restart
```
### Method 3: SSH to VM (Advanced)
If you have SSH access to the Home Assistant VM:
```bash
# SSH to pm1 first, then to the VM
ssh pm1
ssh root@10.4.2.62
# Edit configuration
vi /config/configuration.yaml
# Restart Home Assistant
ha core restart
```
## Verification
After configuration and restart:
1. **Test Internal Access**:
```bash
curl -I http://10.4.2.62:8123
```
Should return `HTTP/1.1 200 OK` or `405 Method Not Allowed`
2. **Test Traefik Proxy**:
```bash
curl -I https://hass.kavcorp.com
```
Should return `HTTP/2 200` with valid SSL certificate
3. **Check Logs**:
- In Home Assistant: **Settings** → **System** → **Logs**
- Look for any errors related to HTTP or trusted proxies
- Client IPs should now show actual client IPs, not Traefik's IP
4. **Verify Headers**:
- Open browser developer tools (F12)
- Go to **Network** tab
- Access `https://hass.kavcorp.com`
- Check response headers for `X-Forwarded-For`, `X-Forwarded-Proto`, etc.
## Troubleshooting
### 400 Bad Request / Untrusted Proxy
**Symptom**: Home Assistant returns 400 errors when accessing via Traefik
**Solution**: Verify the `trusted_proxies` configuration includes Traefik's IP (`10.4.2.10`)
```yaml
http:
  trusted_proxies:
    - 10.4.2.10
```
### Wrong Client IP in Logs
**Symptom**: All requests show Traefik's IP (10.4.2.10) instead of real client IP
**Solution**: Enable `use_x_forwarded_for`:
```yaml
http:
  use_x_forwarded_for: true
```
### Configuration Check Fails
**Symptom**: YAML validation fails with syntax errors
**Solution**:
- Ensure proper indentation (2 spaces per level, no tabs)
- Check for special characters that need quoting
- Use `ha core check` to see detailed error messages
### Cannot Access via Domain
**Symptom**: `https://hass.kavcorp.com` doesn't work but direct IP does
**Solution**:
1. Check Traefik logs:
```bash
ssh pm2 "pct exec 104 -- tail -f /var/log/traefik/traefik.log"
```
2. Verify DNS resolves correctly:
```bash
nslookup hass.kavcorp.com
```
3. Check Traefik config was loaded:
```bash
ssh pm2 "pct exec 104 -- cat /etc/traefik/conf.d/home-automation.yaml"
```
### SSL Certificate Issues
**Symptom**: Browser shows SSL certificate errors
**Solution**:
1. Check if Let's Encrypt certificate was generated:
```bash
ssh pm2 "pct exec 104 -- cat /etc/traefik/ssl/acme.json | grep hass"
```
2. Allow time for DNS propagation (up to 1 hour with Namecheap)
3. Check Traefik logs for ACME errors
## Security Considerations
### Authentication
- Home Assistant has its own authentication system
- No Authelia middleware is applied to this route
- Users must log in to Home Assistant directly
- Consider enabling **Multi-Factor Authentication** in Home Assistant:
- **Settings** → **People** → Your User → **Enable MFA**
### Trusted Networks
If you want to bypass authentication for local network access, add to `configuration.yaml`:
```yaml
homeassistant:
  auth_providers:
    - type: trusted_networks
      trusted_networks:
        - 10.4.2.0/24  # Local network
      allow_bypass_login: true
    - type: homeassistant
```
**Warning**: Only use this if your local network is secure!
### IP Banning
Home Assistant can automatically ban IPs after failed login attempts. Ensure `use_x_forwarded_for` is enabled so it bans the actual attacker's IP, not Traefik's IP.
## Related Services
### Frigate Integration
If Frigate is integrated with Home Assistant:
- Frigate is accessible via `https://frigate.kavcorp.com` (see separate Frigate documentation)
- Home Assistant can embed Frigate camera streams
- Both services trust Traefik as reverse proxy
### Add-ons and Internal Communication
Home Assistant add-ons communicate via the internal Docker network (`172.30.0.0/16`). This network must be in `trusted_proxies` for add-ons to work correctly when accessing the Home Assistant API.
## Updating Configuration
When making changes to Home Assistant configuration:
1. **Always check configuration** before restarting:
```bash
ha core check
```
2. **Back up configuration** before major changes:
- **Settings** → **System** → **Backups** → **Create Backup**
3. **Test changes** in a development environment if possible
4. **Monitor logs** after restarting for errors
## DNS Configuration
Ensure your DNS provider (Namecheap) has the correct A record:
```
hass.kavcorp.com → Your public IP (99.74.188.161)
```
Or use a CNAME if you have a wildcard:
```
*.kavcorp.com → Your public IP
```
Traefik handles the Let's Encrypt DNS-01 challenge automatically.
## Additional Resources
- [Home Assistant Reverse Proxy Documentation](https://www.home-assistant.io/integrations/http/#reverse-proxies)
- [Traefik Documentation](https://doc.traefik.io/traefik/)
- [TRaSH Guides - Traefik Setup](https://trash-guides.info/Hardlinks/Examples/)
## Change Log
**2025-11-16**:
- Initial configuration created
- Added Home Assistant to Traefik
- Configured trusted proxies
- Set up `hass.kavcorp.com` domain


@@ -0,0 +1,44 @@
# Infrastructure Map
## Proxmox Cluster Nodes
| Hostname | IP Address | Role |
|----------|-------------|------|
| pm1 | 10.4.2.2 | Proxmox cluster node |
| pm2 | 10.4.2.6 | Proxmox cluster node |
| pm3 | 10.4.2.3 | Proxmox cluster node |
| pm4 | 10.4.2.5 | Proxmox cluster node |
| elantris | 10.4.2.14 | Proxmox cluster node (Debian-based) |
## Key Services
| Service | IP:Port | Location | Notes |
|---------|---------|----------|-------|
| Sonarr | 10.4.2.15:8989 | LXC 105 on pm2 | TV shows |
| Radarr | 10.4.2.16:7878 | - | Movies |
| Prowlarr | 10.4.2.17:9696 | - | Indexer manager |
| Bazarr | 10.4.2.18:6767 | - | Subtitles |
| Whisparr | 10.4.2.19:6969 | - | Adult content |
| Jellyseerr | 10.4.2.20:5055 | LXC 115 on pm2 | Request management |
| Jellyfin | 10.4.2.21:8096 | LXC 121 on elantris | Media server |
| NZBGet | 10.4.2.13:6789 | Docker on kavnas | Download client |
| Traefik | 10.4.2.10 | LXC 104 on pm2 | Reverse proxy |
| Home Assistant | 10.4.2.62:8123 | VM 100 on pm1 | Home automation |
| Frigate | 10.4.2.63:5000 | - | NVR/Camera system |
## Storage
| Mount | Path | Notes |
|-------|------|-------|
| elantris-media | /mnt/pve/elantris-media | NFS from elantris:/el-pool/media |
| KavNas | /mnt/pve/KavNas | NFS from kavnas:/volume1 |
## Domain Mappings
All services accessible via `*.kavcorp.com` through Traefik reverse proxy:
- pm.kavcorp.com → pm2 (10.4.2.6:8006)
- sonarr.kavcorp.com → 10.4.2.15:8989
- radarr.kavcorp.com → 10.4.2.16:7878
- jellyfin.kavcorp.com → 10.4.2.21:8096
- hass.kavcorp.com → 10.4.2.62:8123
- etc.

docs/inventory.md Normal file

@@ -0,0 +1,289 @@
# VM and LXC Inventory
**Last Updated**: 2025-11-16
## Virtual Machines
### VMID 100 - haos12.1 (Home Assistant OS)
- **Node**: pm1
- **Type**: QEMU VM
- **CPU**: 2 cores
- **Memory**: 4GB
- **Disk**: 32GB
- **Status**: Running
- **Uptime**: ~52 hours
- **Tags**: proxmox-helper-scripts
- **Purpose**: Home automation platform
### VMID 109 - docker-pm3
- **Node**: pm3
- **Type**: QEMU VM
- **CPU**: 4 cores
- **Memory**: 12GB
- **Disk**: 100GB
- **Status**: Running
- **Uptime**: ~190 hours (~8 days)
- **Purpose**: Docker host for containerized services
- **Notes**: Primary Docker host, high network traffic
## LXC Containers
### Infrastructure Services
#### VMID 104 - traefik
- **Node**: pm2
- **IP**: 10.4.2.10
- **CPU**: 2 cores
- **Memory**: 2GB
- **Disk**: 10GB (KavNas)
- **Status**: Running
- **Tags**: community-script, proxy
- **Purpose**: Reverse proxy and load balancer
- **Features**: Unprivileged, nesting enabled
- **Uptime**: ~2.5 hours
#### VMID 106 - mqtt
- **Node**: pm3
- **CPU**: 1 core
- **Memory**: 512MB
- **Disk**: 2GB (local-lvm)
- **Status**: Running
- **Tags**: proxmox-helper-scripts
- **Purpose**: MQTT message broker for IoT devices
- **Uptime**: ~319 hours (~13 days)
- **Notes**: High inbound network traffic (3.4GB)
#### VMID 116 - authelia
- **Node**: pm2
- **IP**: 10.4.2.23
- **CPU**: 1 core
- **Memory**: 512MB
- **Disk**: 2GB (KavNas)
- **Status**: Running
- **Tags**: authenticator, community-script
- **Purpose**: Authentication and authorization server
- **Features**: Unprivileged, nesting enabled
- **Uptime**: ~1.9 hours
### Media Stack (*arr services)
#### VMID 105 - sonarr
- **Node**: pm2
- **IP**: 10.4.2.15
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (KavNas)
- **Mount Points**:
- /media → elantris-media (NFS)
- /mnt/kavnas → KavNas (NFS)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Uptime**: ~56 minutes
#### VMID 108 - radarr
- **Node**: pm2
- **IP**: 10.4.2.16
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (KavNas)
- **Mount Points**:
- /media → elantris-media (NFS)
- /mnt/kavnas → KavNas (NFS)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Uptime**: ~56 minutes
#### VMID 114 - prowlarr
- **Node**: pm2
- **IP**: 10.4.2.17
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (KavNas)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Indexer manager for *arr services
- **Uptime**: ~56 minutes
#### VMID 117 - whisparr
- **Node**: pm2
- **IP**: 10.4.2.19
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (KavNas)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Uptime**: ~56 minutes
#### VMID 119 - bazarr
- **Node**: pm2
- **IP**: 10.4.2.18
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (KavNas)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Subtitle management for Sonarr/Radarr
- **Uptime**: ~56 minutes
### Media Servers
#### VMID 115 - jellyseerr
- **Node**: pm2
- **IP**: 10.4.2.20
- **CPU**: 4 cores
- **Memory**: 4GB
- **Disk**: 8GB (KavNas)
- **Status**: Running
- **Tags**: community-script, media
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Request management for Jellyfin
- **Uptime**: ~56 minutes
#### VMID 120 - kometa
- **Node**: pm2
- **IP**: 10.4.2.21
- **CPU**: 2 cores
- **Memory**: 4GB
- **Disk**: 8GB (KavNas)
- **Status**: Running
- **Tags**: community-script, media, streaming
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Media library metadata manager
- **Uptime**: ~1.9 hours
#### VMID 121 - jellyfin
- **Node**: elantris
- **IP**: 10.4.2.22
- **CPU**: 2 cores
- **Memory**: 2GB
- **Disk**: 16GB (el-pool)
- **Status**: Running
- **Tags**: community-script, media
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Media server
- **Uptime**: ~19 minutes
- **Notes**: Recently migrated to elantris
#### VMID 118 - notifiarr
- **Node**: pm2
- **IP**: 10.4.2.24
- **CPU**: 1 core
- **Memory**: 512MB
- **Disk**: 2GB (KavNas)
- **Status**: Running
- **Tags**: arr, community-script
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Notification service for *arr apps
- **Uptime**: ~1.9 hours
### Docker Hosts
#### VMID 107 - dockge
- **Node**: pm3
- **CPU**: 12 cores
- **Memory**: 8GB
- **Disk**: 120GB (local-lvm)
- **Status**: Running
- **Tags**: proxmox-helper-scripts
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Docker Compose management UI
- **Uptime**: ~319 hours (~13 days)
#### VMID 110 - docker-pm4
- **Node**: pm4
- **CPU**: 4 cores
- **Memory**: 8GB
- **Disk**: 10GB (local-lvm)
- **Status**: Running
- **Tags**: community-script, docker
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Docker host
- **Uptime**: ~45 hours
#### VMID 113 - docker-pm2
- **Node**: pm2
- **CPU**: 4 cores
- **Memory**: 8GB
- **Disk**: 10GB (local-lvm)
- **Status**: Running
- **Tags**: community-script, docker
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Docker host
- **Uptime**: ~45 hours
- **Notes**: Currently empty/minimal usage
### Smart Home & IoT
#### VMID 101 - twingate
- **Node**: pm1
- **CPU**: 1 core
- **Memory**: 512MB
- **Disk**: 8GB (local-lvm)
- **Status**: Running
- **Features**: Unprivileged
- **Purpose**: Zero-trust network access
- **Uptime**: ~52 hours
#### VMID 102 - zwave-js-ui
- **Node**: pm1
- **CPU**: 2 cores
- **Memory**: 1GB
- **Disk**: 4GB (local-lvm)
- **Status**: Running
- **Tags**: proxmox-helper-scripts
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Z-Wave device management
- **Uptime**: ~52 hours
### Surveillance & NVR
#### VMID 103 - shinobi
- **Node**: pm4
- **CPU**: 2 cores
- **Memory**: 2GB
- **Disk**: 8GB (local-lvm)
- **Status**: Running
- **Tags**: community-script, nvr
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Network Video Recorder
- **Uptime**: ~52 hours
- **Notes**: Very high network traffic (407GB in, 162GB out)
#### VMID 111 - frigate
- **Node**: pm3
- **CPU**: 4 cores
- **Memory**: 8GB
- **Disk**: 120GB (local-lvm)
- **Status**: Running
- **Tags**: proxmox-helper-scripts
- **Features**: Unprivileged, nesting enabled
- **Purpose**: NVR with object detection
- **Uptime**: ~18 hours
- **Notes**: High storage and network usage
### Gaming
#### VMID 112 - foundryvtt
- **Node**: pm3
- **CPU**: 4 cores
- **Memory**: 6GB
- **Disk**: 100GB (local-lvm)
- **Status**: Running
- **Features**: Unprivileged, nesting enabled
- **Purpose**: Virtual tabletop gaming platform
- **Uptime**: ~116 hours (~5 days)
## Summary Statistics
- **Total Guests**: 21 (2 VMs + 19 LXCs)
- **All Running**: Yes
- **Total CPU Allocation**: 62 cores
- **Total Memory Allocation**: 63.5GB
- **Primary Storage**: KavNas (NFS) for most LXCs
- **Most Active Node**: pm2 (11 containers)
- **Newest Deployments**: Media stack on pm2 (mostly < 2 hours uptime)
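These allocation totals can be re-derived at any time from the cluster API. A minimal sketch, assuming `jq` is available on the machine running the command (it is not part of a stock Proxmox install):
```bash
# Summarize running guests: count, allocated cores, allocated memory (GiB).
# jq runs locally here; adjust if it is installed on pm2 instead.
ssh pm2 "pvesh get /cluster/resources --type vm --output-format json" | \
  jq '[.[] | select(.status=="running")]
      | {guests: length,
         cores: (map(.maxcpu) | add),
         memory_gib: ((map(.maxmem) | add) / 1073741824 | round)}'
```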

docs/network.md Normal file

@@ -0,0 +1,132 @@
# Network Architecture
**Last Updated**: 2025-11-16
## Network Overview
- **Primary Network**: 10.4.2.0/24
- **Gateway**: 10.4.2.254
- **Bridge**: vmbr0 (standard on all nodes)
## Node Network Configuration
All Proxmox nodes use a similar network configuration:
- **Physical Interface**: eno1 (1Gbps Ethernet)
- **Bridge**: vmbr0 (Linux bridge)
- **Bridge Config**: STP off, forward delay 0
### Example Configuration (pm2)
```
auto vmbr0
iface vmbr0 inet static
address 10.4.2.6/24
gateway 10.4.2.254
bridge-ports eno1
bridge-stp off
bridge-fd 0
```
## IP Address Allocation
### Infrastructure Devices
| IP | Device | Type | Notes |
|---|---|---|---|
| 10.4.2.2 | pm1 | Proxmox Node | 4 cores, 16GB RAM |
| 10.4.2.3 | pm3 | Proxmox Node | 16 cores, 33GB RAM |
| 10.4.2.5 | pm4 | Proxmox Node | 12 cores, 31GB RAM |
| 10.4.2.6 | pm2 | Proxmox Node | 12 cores, 31GB RAM (primary mgmt) |
| 10.4.2.13 | KavNas | Synology DS918+ | Primary NFS storage |
| 10.4.2.14 | elantris | Proxmox Node | 16 cores, 128GB RAM, Storage node |
| 10.4.2.254 | Gateway | Router | Network gateway |
### Service IPs (LXC/VM)
#### Reverse Proxy & Auth
| IP | Service | VMID | Node | Purpose |
|---|---|---|---|---|
| 10.4.2.10 | traefik | 104 | pm2 | Reverse proxy |
| 10.4.2.23 | authelia | 116 | pm2 | Authentication |
#### Media Automation Stack
| IP | Service | VMID | Node | Purpose |
|---|---|---|---|---|
| 10.4.2.15 | sonarr | 105 | pm2 | TV show management |
| 10.4.2.16 | radarr | 108 | pm2 | Movie management |
| 10.4.2.17 | prowlarr | 114 | pm2 | Indexer manager |
| 10.4.2.18 | bazarr | 119 | pm2 | Subtitle management |
| 10.4.2.19 | whisparr | 117 | pm2 | Adult content management |
| 10.4.2.24 | notifiarr | 118 | pm2 | Notification service |
#### Media Servers
| IP | Service | VMID | Node | Purpose |
|---|---|---|---|---|
| 10.4.2.20 | jellyseerr | 115 | pm2 | Request management |
| 10.4.2.21 | kometa | 120 | pm2 | Metadata manager |
| 10.4.2.22 | jellyfin | 121 | elantris | Media server |
### Dynamic/DHCP Services
The following services currently use DHCP or don't have static IPs documented:
- VMID 100: haos12.1 (Home Assistant)
- VMID 101: twingate
- VMID 102: zwave-js-ui
- VMID 103: shinobi
- VMID 106: mqtt
- VMID 107: dockge
- VMID 109: docker-pm3
- VMID 110: docker-pm4
- VMID 111: frigate
- VMID 112: foundryvtt
- VMID 113: docker-pm2
## Reserved IP Ranges
**Recommendation**: Reserve IP ranges for different service types (a `pct set` sketch for pinning a guest into its range follows this list):
- `10.4.2.1-10.4.2.20`: Infrastructure and core services
- `10.4.2.21-10.4.2.50`: Media services
- `10.4.2.51-10.4.2.100`: Home automation and IoT
- `10.4.2.101-10.4.2.150`: General applications
- `10.4.2.151-10.4.2.200`: Testing and development
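A sketch for pinning one of the DHCP guests listed above into its recommended range; the VMID, address, and interface settings are illustrative only, and `pct set` must run on the node that hosts the container:
```bash
# Example: give docker-pm2 (VMID 113, hosted on pm2) a static address in the
# "general applications" range; the values below are placeholders.
ssh pm2 "pct set 113 -net0 name=eth0,bridge=vmbr0,ip=10.4.2.110/24,gw=10.4.2.254"
ssh pm2 "pct reboot 113"   # restart so the container picks up the new address
```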
## NFS Mounts
### KavNas (10.4.2.13)
- **Source**: Synology DS918+ NAS
- **Mount**: Available on all Proxmox nodes
- **Capacity**: 23TB total
- **Usage**: ~9.2TB used
- **Purpose**: Primary shared storage for LXC rootfs, backups, ISOs, templates
- **Mount Point on Nodes**: `/mnt/pve/KavNas`
### elantris-downloads (10.4.2.14)
- **Source**: elantris node
- **Mount**: Available on all Proxmox nodes
- **Capacity**: 23TB total
- **Usage**: ~10.6TB used
- **Purpose**: Download storage, media staging
- **Mount Point on Nodes**: `/mnt/pve/elantris-downloads`
### elantris-media
- **Source**: elantris node
- **Mount**: Used by media services
- **Purpose**: Media library storage
- **Mounted in LXCs**: sonarr, radarr (mounted at `/media`)
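To confirm these shares are mounted and healthy, run the following on any Proxmox node:
```bash
# Storage status as Proxmox sees it
pvesm status | grep -E 'KavNas|elantris'
# Underlying NFS mounts and free space
df -h /mnt/pve/KavNas /mnt/pve/elantris-downloads
```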
## Firewall Notes
*TODO: Document firewall rules and port forwarding as configured*
## VLAN Configuration
Currently using a flat network (no VLANs configured). Consider implementing VLANs for:
- Management network (Proxmox nodes)
- Service network (LXC/VM services)
- IoT network (smart home devices)
- Storage network (NFS traffic)
## Future Network Improvements
- [ ] Implement VLANs for network segmentation
- [ ] Document all static IP assignments
- [ ] Set up monitoring for network traffic
- [ ] Consider 10GbE for storage traffic between nodes
- [ ] Implement proper DNS (currently using gateway)

docs/recyclarr-setup.md Normal file

@@ -0,0 +1,178 @@
# Recyclarr Setup - TRaSH Guides Automation
**Last Updated**: 2025-11-16
## Overview
Recyclarr automatically syncs TRaSH Guides recommended custom formats and quality profiles to Radarr and Sonarr.
## Installation Details
- **LXC**: VMID 122 on pm2
- **IP Address**: 10.4.2.25
- **Binary**: `/usr/local/bin/recyclarr`
- **Config**: `/root/.config/recyclarr/recyclarr.yml`
## Configuration Summary
### Radarr (Movies)
- **URL**: http://10.4.2.16:7878
- **API Key**: 5e6796988abf4d6d819a2b506a44f422
- **Quality Profiles**:
- HD Bluray + WEB (1080p standard)
- Remux-1080p - Anime
- **Custom Formats**: 34 formats synced
- **Dolby Vision**: **BLOCKED** (DV w/o HDR fallback scored at -10000)
**Key Settings**:
- Standard profile prefers 1080p Bluray and WEB releases
- Anime profile includes Remux with merged quality groups
- Blocks Dolby Vision Profile 5 (no HDR fallback) on standard profile
- Blocks unwanted formats (BR-DISK, LQ, x265 HD, 3D, AV1, Extras)
- Uses TRaSH Guides release group tiers (BD, WEB, Anime BD, Anime WEB)
### Sonarr (TV Shows)
- **URL**: http://10.4.2.15:8989
- **API Key**: b331fe18ec2144148a41645d9ce8b249
- **Quality Profiles**:
- WEB-1080p (standard)
- Remux-1080p - Anime
- **Custom Formats**: 29 formats synced
- **Dolby Vision**: **BLOCKED** (DV w/o HDR fallback scored at -10000)
**Key Settings**:
- Standard profile prefers 1080p WEB releases (WEB-DL and WEBRip)
- Anime profile includes Bluray Remux with merged quality groups
- Blocks Dolby Vision Profile 5 (no HDR fallback) on standard profile
- Blocks unwanted formats (BR-DISK, LQ, x265 HD, AV1, Extras)
- Uses TRaSH Guides WEB release group tiers and Anime tiers
## Automated Sync Schedule
Recyclarr runs daily at 6:00 AM via cron:
```bash
0 6 * * * /usr/local/bin/recyclarr sync > /dev/null 2>&1
```
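To confirm the schedule is actually installed inside the LXC (assuming the job lives in root's crontab, as set up above):
```bash
ssh pm2 "pct exec 122 -- crontab -l | grep recyclarr"
```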
## Manual Sync
To manually trigger a sync:
```bash
ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr sync"
```
## Dolby Vision Blocking
Both Radarr and Sonarr are configured to **completely block** Dolby Vision releases without HDR10 fallback (Profile 5). These releases will receive a score of **-10000**, ensuring they are never downloaded.
**What this blocks**:
- WEB-DL releases with Dolby Vision Profile 5 (no HDR10 fallback)
- Any release that only plays back in DV without falling back to HDR10
**What this allows**:
- HDR10 releases
- HDR10+ releases
- Dolby Vision Profile 7 with HDR10 fallback (from UHD Blu-ray)
## Custom Format Details
### Blocked Formats (Score: -10000)
- **DV (w/o HDR fallback)**: Blocks DV Profile 5
- **BR-DISK**: Blocks full BluRay disc images
- **LQ**: Blocks low-quality releases
- **x265 (HD)**: Blocks x265 encoded HD content (720p/1080p)
- **3D**: Blocks 3D releases
- **AV1**: Blocks AV1 codec
- **Extras**: Blocks extras, featurettes, etc.
### Preferred Formats
- **WEB Tier 01-03**: Scored 1600-1700 (high-quality WEB groups)
- **UHD Bluray Tier 01-03**: Scored 1700 (Radarr only)
- **Streaming Services**: Neutral score (AMZN, ATVP, DSNP, HBO, etc.)
- **Repack/Proper**: Scored 5-7 (prefers repacks over originals)
## Monitoring
Check Recyclarr logs:
```bash
ssh pm2 "pct exec 122 -- cat /root/.config/recyclarr/logs/recyclarr.log"
```
View last sync results:
```bash
ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr sync --preview"
```
## Updating Configuration
1. Edit config: `ssh pm2 "pct exec 122 -- nano /root/.config/recyclarr/recyclarr.yml"`
2. Test config: `ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr config check"`
3. Run sync: `ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr sync"`
## Troubleshooting
### Check if sync is working
```bash
ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr sync --preview"
```
### Verify API connectivity
```bash
# Test Radarr
curl -H "X-Api-Key: 5e6796988abf4d6d819a2b506a44f422" http://10.4.2.16:7878/api/v3/system/status
# Test Sonarr
curl -H "X-Api-Key: b331fe18ec2144148a41645d9ce8b249" http://10.4.2.15:8989/api/v3/system/status
```
### Force resync all custom formats
```bash
ssh pm2 "pct exec 122 -- /usr/local/bin/recyclarr sync --force"
```
## Important Notes
- **Do not modify custom format scores manually** in Radarr/Sonarr web UI - they will be overwritten on next sync
- **Quality profile changes** made in the web UI may be preserved unless they conflict with Recyclarr config
- **The DV blocking is automatic** - no manual intervention needed
- Recyclarr keeps custom formats up-to-date with TRaSH Guides automatically
## Next Steps
- Monitor downloads to ensure DV content is properly blocked
- Adjust quality profiles in Recyclarr config if needed (e.g., prefer 1080p over 4K)
- Review TRaSH Guides for additional custom formats: https://trash-guides.info/
## Anime Configuration
Both Radarr and Sonarr include a dedicated "Remux-1080p - Anime" quality profile for anime content.
**Key Anime Settings**:
- **Quality groups merged** per TRaSH Guides (Remux + Bluray + WEB + HDTV in combined groups)
- **Anime BD Tiers 01-08**: Scored 1300-1400 (SeaDex muxers, remuxes, fansubs, P2P, mini encodes)
- **Anime WEB Tiers 01-06**: Scored 150-350 (muxers, top fansubs, official subs)
- **Dual Audio preferred**: +101 score for releases with both Japanese and English audio
- **Unwanted blocked**: Same as standard profile (BR-DISK, LQ, x265 HD, AV1, Extras)
**Scoring Differences from Standard Profile**:
- Anime Web Tier 01 scores 350 (vs 1600 for standard WEB Tier 01)
- Emphasizes BD quality over WEB for anime (BD Tier 01 = 1400)
- Merged quality groups allow HDTV to be considered alongside WEB for anime releases
**To use anime profile**:
1. In Radarr/Sonarr, edit a movie or series
2. Change quality profile to "Remux-1080p - Anime"
3. Recyclarr will automatically manage custom format scores
## Inventory Update
Added to cluster inventory:
- **VMID**: 122
- **Name**: recyclarr
- **Node**: pm2
- **IP**: 10.4.2.25
- **CPU**: 1 core
- **Memory**: 512MB
- **Disk**: 2GB (KavNas)
- **Purpose**: TRaSH Guides automation for Radarr/Sonarr
- **Tags**: arr, community-script

docs/services.md Normal file

@@ -0,0 +1,222 @@
# Service Mappings and Dependencies
**Last Updated**: 2025-11-16
## Service Categories
### Reverse Proxy & Authentication
#### Traefik (VMID 104)
- **Node**: pm2
- **IP**: 10.4.2.10
- **Port**: 80, 443
- **Purpose**: Reverse proxy and load balancer
- **Config Location**: `/etc/traefik/traefik.yaml` (static config) and `/etc/traefik/conf.d/*.yaml` (dynamic routes) inside LXC 104; see `docs/traefik-ssl-setup.md`
- **Dependencies**: None
- **Backends**: Routes traffic to all web services
#### Authelia (VMID 116)
- **Node**: pm2
- **IP**: 10.4.2.23
- **Purpose**: Single sign-on and authentication
- **Dependencies**: Traefik
- **Protected Services**: *TODO: Document which services require auth*
### Media Automation Stack
#### Prowlarr (VMID 114)
- **Node**: pm2
- **IP**: 10.4.2.17
- **Port**: 9696 (default)
- **Purpose**: Indexer manager for *arr services
- **Dependencies**: None
- **Integrated With**: Sonarr, Radarr, Whisparr
#### Sonarr (VMID 105)
- **Node**: pm2
- **IP**: 10.4.2.15
- **Port**: 8989 (default)
- **Purpose**: TV show automation
- **Dependencies**: Prowlarr
- **Mount Points**:
- `/media` - Media library
- `/mnt/kavnas` - Download staging
- **Integrated With**: Jellyfin, Jellyseerr, Bazarr
#### Radarr (VMID 108)
- **Node**: pm2
- **IP**: 10.4.2.16
- **Port**: 7878 (default)
- **Purpose**: Movie automation
- **Dependencies**: Prowlarr
- **Mount Points**:
- `/media` - Media library
- `/mnt/kavnas` - Download staging
- **Integrated With**: Jellyfin, Jellyseerr, Bazarr
#### Whisparr (VMID 117)
- **Node**: pm2
- **IP**: 10.4.2.19
- **Port**: 6969 (default)
- **Purpose**: Adult content automation
- **Dependencies**: Prowlarr
- **Integrated With**: Jellyfin
#### Bazarr (VMID 119)
- **Node**: pm2
- **IP**: 10.4.2.18
- **Port**: 6767 (default)
- **Purpose**: Subtitle automation
- **Dependencies**: Sonarr, Radarr
- **Integrated With**: Jellyfin
### Media Servers & Requests
#### Jellyfin (VMID 121)
- **Node**: elantris
- **IP**: 10.4.2.22
- **Port**: 8096 (default)
- **Purpose**: Media server
- **Dependencies**: None (reads media library)
- **Media Sources**: *TODO: Document media library paths*
- **Status**: Needs to be added to Traefik config
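Before wiring it into Traefik, a quick reachability check from any machine on the LAN; the unauthenticated `/health` endpoint is assumed to be available (it is on recent Jellyfin releases) and returns `Healthy`:
```bash
# Assumes Jellyfin exposes /health; the HTTP status code is printed as a fallback check
curl -s -w "\n%{http_code}\n" http://10.4.2.22:8096/health
```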
#### Jellyseerr (VMID 115)
- **Node**: pm2
- **IP**: 10.4.2.20
- **Port**: 5055 (default)
- **Purpose**: Media request management
- **Dependencies**: Jellyfin, Sonarr, Radarr
- **Integrated With**: Jellyfin (for library data)
#### Kometa (VMID 120)
- **Node**: pm2
- **IP**: 10.4.2.21
- **Purpose**: Automated metadata and collection management for Jellyfin
- **Dependencies**: Jellyfin
- **Run Mode**: Scheduled/automated (not web UI)
#### Notifiarr (VMID 118)
- **Node**: pm2
- **IP**: 10.4.2.24
- **Purpose**: Notification relay for *arr apps
- **Dependencies**: Sonarr, Radarr, Prowlarr, etc.
- **Notifications For**: Downloads, upgrades, errors
### Docker Hosts
#### dockge (VMID 107)
- **Node**: pm3
- **Purpose**: Docker Compose management web UI
- **Port**: 5001 (default)
- **Manages**: Docker containers across docker-pm2, docker-pm3, docker-pm4
- **Web UI**: Accessible via browser
#### docker-pm2 (VMID 113)
- **Node**: pm2
- **Purpose**: Docker host (currently empty/minimal)
- **Status**: Available for new containerized services
#### docker-pm3 (VMID 109)
- **Node**: pm3
- **Purpose**: Primary Docker host
- **Status**: Running containerized services (details TBD)
#### docker-pm4 (VMID 110)
- **Node**: pm4
- **Purpose**: Docker host
- **Status**: Running containerized services
### Smart Home & IoT
#### Home Assistant (VMID 100)
- **Node**: pm1
- **Purpose**: Home automation platform
- **Port**: 8123 (default)
- **Type**: Full VM (HAOS)
- **Integrations**: Z-Wave, MQTT, Twingate
#### Z-Wave JS UI (VMID 102)
- **Node**: pm1
- **Purpose**: Z-Wave device management
- **Port**: 8091 (default)
- **Dependencies**: USB Z-Wave stick
- **Integrated With**: Home Assistant
#### MQTT (VMID 106)
- **Node**: pm3
- **Port**: 1883 (MQTT), 9001 (WebSocket)
- **Purpose**: Message broker for IoT devices
- **Dependencies**: None
- **Clients**: Home Assistant, IoT devices
#### Twingate (VMID 101)
- **Node**: pm1
- **Purpose**: Zero-trust network access
- **Type**: VPN alternative
### Surveillance & NVR
#### Frigate (VMID 111)
- **Node**: pm3
- **Port**: 5000 (default)
- **Purpose**: NVR with AI object detection
- **Dependencies**: None
- **Storage**: High (120GB allocated)
- **Features**: Object detection, motion detection
- **Integrated With**: Home Assistant
#### Shinobi (VMID 103)
- **Node**: pm4
- **Port**: 8080 (default)
- **Purpose**: Network Video Recorder
- **Network**: Very high traffic (407GB in, 162GB out)
- **Status**: May be deprecated in favor of Frigate
### Gaming
#### FoundryVTT (VMID 112)
- **Node**: pm3
- **Port**: 30000 (default)
- **Purpose**: Virtual tabletop for RPG gaming
- **Storage**: 100GB (for assets, maps, modules)
- **Access**: Password protected
## Service Access URLs
*TODO: Document Traefik routes for each service*
Expected format:
- Jellyfin: https://jellyfin.kavcorp.com
- Sonarr: https://sonarr.kavcorp.com
- Radarr: https://radarr.kavcorp.com
- etc.
## Service Dependencies Map
```
Traefik (proxy)
├── Authelia (auth)
├── Jellyfin (media server)
├── Jellyseerr (requests) → Jellyfin, Sonarr, Radarr
├── Sonarr → Prowlarr, Bazarr
├── Radarr → Prowlarr, Bazarr
├── Whisparr → Prowlarr
├── Prowlarr (indexers)
├── Bazarr → Sonarr, Radarr
├── Home Assistant → MQTT, Z-Wave JS UI
├── Frigate → Home Assistant (optional)
└── FoundryVTT
```
## Migration Candidates (Docker → LXC)
Services currently in Docker that could be migrated to LXC:
- *TODO: Document after reviewing Docker container inventory*
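A quick way to gather that inventory from the Docker host LXCs documented above (a sketch; `docker ps --format` with a Go template is standard Docker CLI):
```bash
# Node / VMID pairs taken from the Docker Hosts entries above
ssh pm3 "pct exec 109 -- docker ps --format '{{.Names}}\t{{.Image}}\t{{.Status}}'"
ssh pm4 "pct exec 110 -- docker ps --format '{{.Names}}\t{{.Image}}\t{{.Status}}'"
ssh pm2 "pct exec 113 -- docker ps --format '{{.Names}}\t{{.Image}}\t{{.Status}}'"
```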
## Service Maintenance Notes
- Most services auto-update or have update notifications
- Monitor Frigate storage usage (generates large video files)
- Dockge provides easy UI for managing Docker stacks
- *arr services should be updated together to maintain compatibility

docs/storage.md Normal file

@@ -0,0 +1,184 @@
# Storage Architecture
**Last Updated**: 2025-11-16
## Storage Overview
The KavCorp cluster uses a multi-tiered storage approach:
1. **Local node storage**: For node-specific data, templates, ISOs
2. **NFS shared storage**: For LXC containers, backups, and shared data
3. **ZFS pools**: For high-performance storage on specific nodes
## Storage Pools
### Local Storage (Per-Node)
Each node has two local storage pools:
#### `local` - Directory Storage
- **Type**: Directory
- **Size**: ~100GB per node
- **Content Types**: backup, vztmpl (templates), iso
- **Location**: `/var/lib/vz`
- **Usage**: Node-specific backups, templates, ISO images
- **Shared**: No
**Per-Node Status**:
| Node | Used | Total | Available |
|---|---|---|---|
| pm1 | 10.1GB | 100.9GB | 90.8GB |
| pm2 | 8.0GB | 100.9GB | 92.9GB |
| pm3 | 6.9GB | 100.9GB | 94.0GB |
| pm4 | 7.5GB | 100.9GB | 93.4GB |
| elantris | 4.1GB | 100.9GB | 96.8GB |
#### `local-lvm` - LVM Thin Pool
- **Type**: LVM Thin
- **Size**: ~350-375GB per node (varies)
- **Content Types**: rootdir, images
- **Usage**: High-performance VM/LXC disks
- **Shared**: No
- **Best For**: Services requiring fast local storage
**Per-Node Status**:
| Node | Used | Total | Available |
|---|---|---|---|
| pm1 | 16.9GB | 374.5GB | 357.6GB |
| pm2 | 0GB | 374.5GB | 374.5GB |
| pm3 | 178.8GB | 362.8GB | 184.0GB |
| pm4 | 0GB | 374.5GB | 374.5GB |
| elantris | 0GB | 362.8GB | 362.8GB |
**Note**: pm3's local-lvm is heavily used (178.8GB actual) because it hosts the largest allocated disks; the pool is thin-provisioned, so the allocations below exceed the space actually consumed:
- VMID 107: dockge (120GB)
- VMID 111: frigate (120GB)
- VMID 112: foundryvtt (100GB)
### NFS Shared Storage
#### `KavNas` - Primary Shared Storage
- **Type**: NFS
- **Source**: 10.4.2.13 (Synology DS918+ NAS)
- **Size**: 23TB (23,029,958,311,936 bytes)
- **Used**: 9.2TB (9,241,738,215,424 bytes)
- **Available**: 13.8TB
- **Content Types**: snippets, iso, images, backup, rootdir, vztmpl
- **Shared**: Yes (available on all nodes)
- **Best For**:
- LXC container rootfs (most new containers use this)
- Backups
- ISO images
- Templates
- Data that needs to be accessible across nodes
**Current Usage**:
- Most LXC containers on pm2 use KavNas for rootfs
- Provides easy migration between nodes
- Centralized backup location
#### `elantris-downloads` - Download Storage
- **Type**: NFS
- **Source**: 10.4.2.14 (elantris node)
- **Size**: 23TB (23,116,582,486,016 bytes)
- **Used**: 10.6TB (10,630,966,804,480 bytes)
- **Available**: 12.5TB
- **Content Types**: rootdir, images
- **Shared**: Yes (available on all nodes)
- **Best For**:
- Download staging area
- Media downloads
- Large file operations
### ZFS Storage
#### `el-pool` - ZFS Pool (elantris)
- **Type**: ZFS
- **Node**: elantris only
- **Size**: 24TB (26,317,550,091,635 bytes)
- **Used**: 13.8TB (13,831,934,311,603 bytes)
- **Available**: 12.5TB
- **Content Types**: images, rootdir
- **Shared**: No (elantris only)
- **Best For**:
- High-performance storage on elantris
- Large data sets requiring ZFS features
- Services that benefit from compression/deduplication
**Current Usage**:
- VMID 121: jellyfin (16GB on el-pool)
**Status on Other Nodes**: Shows as "unknown" - ZFS pool is local to elantris only
## Storage Recommendations
### For New LXC Containers
**General Purpose Services** (web apps, APIs, small databases):
- **Storage**: `KavNas`
- **Disk Size**: 4-10GB
- **Rationale**: Shared, easy to migrate, automatically backed up
**High-Performance Services** (databases, caches):
- **Storage**: `local-lvm`
- **Disk Size**: As needed
- **Rationale**: Fast local SSD storage
**Large Storage Services** (media, file storage):
- **Storage**: `elantris-downloads` or `el-pool`
- **Disk Size**: As needed
- **Rationale**: Large capacity, optimized for bulk storage
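A creation sketch following the general-purpose recommendation above; the VMID, hostname, IP, and template filename are hypothetical and should be adjusted to what actually exists on `KavNas`:
```bash
# Run on pm2: 8GB rootfs on shared NFS storage, unprivileged with nesting enabled.
# The Debian template name below is a placeholder; check `pveam list KavNas`.
pct create 130 KavNas:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst \
  --hostname example-app \
  --cores 2 --memory 2048 \
  --rootfs KavNas:8 \
  --net0 name=eth0,bridge=vmbr0,ip=10.4.2.111/24,gw=10.4.2.254 \
  --unprivileged 1 --features nesting=1
```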
### Mount Points for Media Services
Media-related LXCs typically mount:
```
mp0: /mnt/pve/elantris-media,mp=/media,ro=0
mp1: /mnt/pve/KavNas,mp=/mnt/kavnas
```
This provides:
- Access to media library via `/media`
- Access to NAS storage via `/mnt/kavnas`
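To attach the same mounts to another container (a sketch; VMID 130 is hypothetical, and the command must run on the container's host node):
```bash
# Bind-mount the media library and the NAS share into the container
pct set 130 -mp0 /mnt/pve/elantris-media,mp=/media,ro=0 \
            -mp1 /mnt/pve/KavNas,mp=/mnt/kavnas
```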
## Storage Performance Notes
### Best Performance
1. `local-lvm` (local SSD on each node)
### Best Redundancy/Availability
1. `KavNas` (NAS with RAID, accessible from all nodes)
2. `elantris-downloads` (large capacity, shared)
### Best for Large Files
1. `el-pool` (ZFS on elantris, 24TB)
2. `elantris-downloads` (23TB NFS)
3. `KavNas` (23TB NFS)
## Backup Strategy
**Current Setup**:
- Backups stored on `KavNas` NFS share
- All nodes can write backups to KavNas
- Centralized backup location
**Recommendations**:
- [ ] Document automated backup schedules
- [ ] Implement off-site backup rotation
- [ ] Test restore procedures
- [ ] Monitor KavNas free space (currently ~40% used)
## Storage Monitoring
**Watch These Metrics**:
- pm3 `local-lvm`: 49% used (178.8GB / 362.8GB)
- KavNas: 40% used (9.2TB / 23TB)
- elantris-downloads: 46% used (10.6TB / 23TB)
- el-pool: 53% used (13.8TB / 24TB)
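These figures can be pulled live from the cluster API; a minimal sketch, assuming `jq` is available on the machine running the command:
```bash
# Per-storage usage percentage across the cluster
ssh pm2 "pvesh get /cluster/resources --type storage --output-format json" | \
  jq -r '.[] | select(.maxdisk > 0)
        | [.storage, .node, (((.disk / .maxdisk) * 100 | floor | tostring) + "%")]
        | @tsv' | sort
```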
## Future Storage Improvements
- [ ] Set up automated cleanup of old backups
- [ ] Implement storage quotas for LXC containers
- [ ] Consider SSD caching for NFS mounts
- [ ] Document backup retention policies
- [ ] Set up alerts for storage thresholds (80%, 90%)

docs/traefik-ssl-setup.md Normal file

@@ -0,0 +1,131 @@
# Traefik SSL/TLS Setup with Namecheap
**Last Updated**: 2025-11-16
## Configuration Summary
Traefik is configured to use Let's Encrypt with DNS-01 challenge via Namecheap for wildcard SSL certificates.
### Environment Variables
Located in: `/etc/systemd/system/traefik.service.d/override.conf` (inside Traefik LXC 104)
```bash
NAMECHEAP_API_USER=kavren
NAMECHEAP_API_KEY=8156f3d9ef664c91b95f029dfbb62ad5
NAMECHEAP_PROPAGATION_TIMEOUT=3600 # 1 hour timeout for DNS propagation
NAMECHEAP_POLLING_INTERVAL=30 # Check every 30 seconds
NAMECHEAP_TTL=300 # 5 minute TTL for DNS records
```
### Traefik Configuration
File: `/etc/traefik/traefik.yaml`
```yaml
certificatesResolvers:
  letsencrypt:
    acme:
      email: cory.bailey87@gmail.com
      storage: /etc/traefik/ssl/acme.json
      dnsChallenge:
        provider: namecheap
        resolvers:
          - "1.1.1.1:53"
          - "8.8.8.8:53"
```
### Wildcard Certificate
Configured for:
- Main domain: `kavcorp.com`
- Wildcard: `*.kavcorp.com`
## Namecheap API Requirements
1. **API Access Enabled**: Must have API access enabled in Namecheap account
2. **IP Whitelisting**: Public IP `99.74.188.161` must be whitelisted
3. **API Key**: Must have valid API key with DNS modification permissions
### Verifying API Access
Test Namecheap API from Traefik LXC:
```bash
pct exec 104 -- curl -s 'https://api.namecheap.com/xml.response?ApiUser=kavren&ApiKey=8156f3d9ef664c91b95f029dfbb62ad5&UserName=kavren&Command=namecheap.domains.getList&ClientIp=99.74.188.161'
```
## Existing Certificates
Valid Let's Encrypt certificates already obtained:
- `traefik.kavcorp.com`
- `sonarr.kavcorp.com`
- `radarr.kavcorp.com`
Stored in: `/etc/traefik/ssl/acme.json`
## Troubleshooting
### Common Issues
**DNS Propagation Timeout**:
- Error: "propagation: time limit exceeded"
- Solution: Increased `NAMECHEAP_PROPAGATION_TIMEOUT` to 3600 seconds (1 hour)
**API Authentication Failed**:
- Verify IP whitelisted: 99.74.188.161
- Verify API key is correct
- Check API access is enabled in Namecheap
**Deprecated Configuration Warning**:
- Fixed: Removed deprecated `delayBeforeCheck` option
- Now using default propagation settings controlled by environment variables
### Monitoring Certificate Generation
Check Traefik logs:
```bash
ssh pm2 "pct exec 104 -- tail -f /var/log/traefik/traefik.log"
```
Filter for ACME/certificate errors:
```bash
ssh pm2 "pct exec 104 -- cat /var/log/traefik/traefik.log | grep -i 'acme\|certificate\|error'"
```
### Manual Certificate Renewal
Certificates auto-renew. To force renewal:
```bash
# Delete acme.json and restart Traefik (will regenerate all certs)
ssh pm2 "pct exec 104 -- rm /etc/traefik/ssl/acme.json && systemctl restart traefik"
```
**WARNING**: Only do this if necessary, as Let's Encrypt has rate limits!
## Certificate Request Flow
1. New service added to `/etc/traefik/conf.d/*.yaml`
2. Traefik detects new route requiring HTTPS
3. Checks if certificate exists in acme.json
4. If not, initiates DNS-01 challenge:
- Creates TXT record via Namecheap API: `_acme-challenge.subdomain.kavcorp.com`
- Waits for DNS propagation (up to 1 hour)
- Polls DNS servers every 30 seconds
- Let's Encrypt verifies TXT record
- Certificate issued and stored in acme.json
5. Certificate served for HTTPS connections
## Next Steps
When adding new services:
1. Add route configuration to `/etc/traefik/conf.d/media-services.yaml` (or create a new file; a hedged example follows this list)
2. Traefik will automatically request certificate on first HTTPS request
3. Monitor logs for any DNS propagation issues
4. Certificate will be cached and auto-renewed before expiration
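Following step 1, a hedged sketch of what such a route file might look like, using Jellyfin as the example; the entrypoint name (`websecure`) and the file name are assumptions and should match the existing static configuration:
```bash
# From pm2, open a shell in the Traefik LXC
ssh pm2 "pct enter 104"

# Inside the container, create the route file (the quoted heredoc keeps the
# backticks in the Host() rule literal). "websecure" is an assumed entrypoint name.
cat > /etc/traefik/conf.d/jellyfin.yaml <<'EOF'
http:
  routers:
    jellyfin:
      rule: "Host(`jellyfin.kavcorp.com`)"
      entryPoints:
        - websecure
      service: jellyfin
      tls:
        certResolver: letsencrypt
  services:
    jellyfin:
      loadBalancer:
        servers:
          - url: "http://10.4.2.22:8096"
EOF
```
Per the request flow above, Traefik's file provider should pick the new file up automatically and request a certificate on the first HTTPS hit.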
## Notes
- Traefik v3.6.1 in use
- DNS-01 challenge allows wildcard certificates
- Certificates valid for 90 days, auto-renewed at 60 days
- Rate limit: 50 certificates per domain per week (Let's Encrypt)

scripts/cleanup/README.md Normal file

@@ -0,0 +1,155 @@
# Media Organization Script
## Purpose
This script identifies and organizes media files by comparing them against what Radarr and Sonarr are actively managing. Files that are not managed by either service are moved to a processing folder for manual review.
## Location
Script: `/home/kavren/proxmox-infra/scripts/cleanup/organize-media.py`
## Usage
### On pm2 (where media is mounted)
The script needs to be run on pm2 where the media directories are mounted.
```bash
# Copy script to pm2
scp /home/kavren/proxmox-infra/scripts/cleanup/organize-media.py pm2:/root/organize-media.py
# Run in DRY RUN mode (recommended first)
ssh pm2 "python3 /root/organize-media.py"
# Run with execution (actually move files)
ssh pm2 "python3 /root/organize-media.py --execute"
# Run quietly (only show summary)
ssh pm2 "python3 /root/organize-media.py --quiet"
```
## What It Does
1. **Queries Radarr API** (http://10.4.2.16:7878)
- Gets all movies and their file paths
- Identifies which files are actively managed
2. **Queries Sonarr API** (http://10.4.2.15:8989)
- Gets all TV series and their episode files
- Identifies which files are actively managed
3. **Scans Media Directories** (mounted on pm2 at `/mnt/pve/elantris-media`)
- `/mnt/pve/elantris-media/movies` - all video files
- `/mnt/pve/elantris-media/tv` - all video files
- `/mnt/pve/elantris-media/anime` - all video files
- Supported extensions: .mkv, .mp4, .avi, .m4v, .ts, .wmv, .flv, .webm
4. **Categorizes Files**
- **Managed**: Files that exist in Radarr/Sonarr (kept in place)
- **Unmanaged**: Files not in Radarr/Sonarr (marked for moving)
5. **Processes Unmanaged Files** (when --execute is used)
- Creates `/mnt/pve/elantris-media/processing/from-movies/`, `/mnt/pve/elantris-media/processing/from-tv/`, `/mnt/pve/elantris-media/processing/from-anime/`
- Moves unmanaged files preserving relative directory structure
- Creates log file: `/mnt/pve/elantris-media/processing/cleanup-log-{timestamp}.txt`
6. **Reports Empty Directories**
- Lists directories that would be empty after cleanup
- Does NOT automatically delete them (for safety)
## Safety Features
- **DRY RUN by default**: Shows what would happen without actually moving files
- **Requires --execute flag**: Must explicitly enable actual file operations
- **Detailed logging**: All operations logged with timestamps
- **Preserves structure**: Maintains relative paths when moving files
- **Permission handling**: Gracefully handles access errors
- **Empty directory detection**: Only reports, doesn't delete
## Output
The script provides:
- Real-time progress updates (unless --quiet is used)
- Summary report showing:
- Total files scanned
- Files managed by Radarr/Sonarr
- Unmanaged files found
- Breakdown by media type
- Empty directories detected
- Log file written to `/mnt/pve/elantris-media/processing/cleanup-log-{timestamp}.txt`
## Example Output
```
================================================================================
SUMMARY REPORT
================================================================================
Mode: DRY RUN MODE
Total files scanned: 2847
Files managed by Radarr/Sonarr: 2847
Unmanaged files found: 0
Unmanaged files by category:
movies: 0 files
tv: 0 files
anime: 0 files
================================================================================
```
## Configuration
The script has hardcoded configuration at the top:
```python
RADARR_URL = "http://10.4.2.16:7878"
RADARR_API_KEY = "5e6796988abf4d6d819a2b506a44f422"
SONARR_URL = "http://10.4.2.15:8989"
SONARR_API_KEY = "b331fe18ec2144148a41645d9ce8b249"
MEDIA_DIRS = {
    "movies": "/mnt/pve/elantris-media/movies",
    "tv": "/mnt/pve/elantris-media/tv",
    "anime": "/mnt/pve/elantris-media/anime"
}
# Radarr/Sonarr report paths under /media/*; PATH_MAPPING translates them
# to the node-side mount before comparing against the scan results.
PATH_MAPPING = {
    "/media/movies": "/mnt/pve/elantris-media/movies",
    "/media/tv": "/mnt/pve/elantris-media/tv",
    "/media/anime": "/mnt/pve/elantris-media/anime"
}
PROCESSING_DIR = "/mnt/pve/elantris-media/processing"
VIDEO_EXTENSIONS = {'.mkv', '.mp4', '.avi', '.m4v', '.ts', '.wmv', '.flv', '.webm'}
```
## Troubleshooting
### Permission Errors
If you see permission errors, ensure the script is running as root on pm2:
```bash
ssh pm2 "whoami" # Should show 'root'
```
### API Connection Errors
If the script can't connect to Radarr/Sonarr:
- Verify the services are running
- Check the URLs and API keys are correct
- Ensure network connectivity from pm2 to the services
### Missing Directories
If media directories don't exist, the script will log warnings and skip them.
## Maintenance
After running with --execute and reviewing files in `/media/processing/`:
1. Review the moved files
2. Add them to Radarr/Sonarr if needed
3. Delete if they're truly unwanted
4. Review empty directory list from log
5. Manually remove empty directories if desired
## Future Enhancements
Possible improvements:
- Add support for custom media directories via CLI arguments
- Add configuration file support
- Add ability to automatically delete empty directories
- Add dry-run output to file for review
- Add email notifications on completion

scripts/cleanup/organize-media.py Executable file

@@ -0,0 +1,409 @@
#!/usr/bin/env python3
"""
Media Organization Script
Compares media files against Radarr/Sonarr managed files and moves unmanaged files to processing folder.
"""
import argparse
import json
import os
import shutil
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Set, Tuple
import urllib.request
import urllib.error
# Configuration
RADARR_URL = "http://10.4.2.16:7878"
RADARR_API_KEY = "5e6796988abf4d6d819a2b506a44f422"
SONARR_URL = "http://10.4.2.15:8989"
SONARR_API_KEY = "b331fe18ec2144148a41645d9ce8b249"
MEDIA_DIRS = {
"movies": "/mnt/pve/elantris-media/movies",
"tv": "/mnt/pve/elantris-media/tv",
"anime": "/mnt/pve/elantris-media/anime"
}
# Path translation: Radarr/Sonarr see /media/* but files are at /mnt/pve/elantris-media/*
PATH_MAPPING = {
"/media/movies": "/mnt/pve/elantris-media/movies",
"/media/tv": "/mnt/pve/elantris-media/tv",
"/media/anime": "/mnt/pve/elantris-media/anime"
}
PROCESSING_DIR = "/mnt/pve/elantris-media/processing"
VIDEO_EXTENSIONS = {'.mkv', '.mp4', '.avi', '.m4v', '.ts', '.wmv', '.flv', '.webm'}
class MediaOrganizer:
def __init__(self, dry_run: bool = True, verbose: bool = True):
self.dry_run = dry_run
self.verbose = verbose
self.managed_files: Set[str] = set()
self.unmanaged_files: Dict[str, List[Path]] = {
"movies": [],
"tv": [],
"anime": []
}
self.stats = {
"total_scanned": 0,
"managed": 0,
"unmanaged": 0,
"moved": 0,
"errors": 0
}
self.log_entries: List[str] = []
def log(self, message: str, level: str = "INFO"):
"""Log a message to console and internal log"""
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
log_entry = f"[{timestamp}] [{level}] {message}"
self.log_entries.append(log_entry)
if self.verbose:
print(log_entry)
def translate_path(self, path: str) -> str:
"""Translate Radarr/Sonarr paths to actual filesystem paths"""
for api_path, real_path in PATH_MAPPING.items():
if path.startswith(api_path):
return path.replace(api_path, real_path, 1)
return path
def api_request(self, url: str, api_key: str, endpoint: str) -> dict:
"""Make an API request to Radarr or Sonarr"""
full_url = f"{url}/api/v3/{endpoint}"
headers = {"X-Api-Key": api_key}
try:
req = urllib.request.Request(full_url, headers=headers)
with urllib.request.urlopen(req, timeout=30) as response:
return json.loads(response.read().decode())
except urllib.error.URLError as e:
self.log(f"API request failed for {full_url}: {e}", "ERROR")
return None
except json.JSONDecodeError as e:
self.log(f"Failed to decode JSON response from {full_url}: {e}", "ERROR")
return None
def get_radarr_files(self) -> Set[str]:
"""Get all file paths managed by Radarr"""
self.log("Querying Radarr for managed movie files...")
managed_files = set()
movies = self.api_request(RADARR_URL, RADARR_API_KEY, "movie")
if not movies:
self.log("Failed to retrieve movies from Radarr", "ERROR")
return managed_files
for movie in movies:
# Get the movie file path if it exists
if movie.get("hasFile") and "movieFile" in movie:
file_path = movie["movieFile"].get("path")
if file_path:
# Translate API path to real filesystem path
real_path = self.translate_path(file_path)
managed_files.add(real_path)
self.log(f" Radarr manages: {file_path} -> {real_path}", "DEBUG")
self.log(f"Found {len(managed_files)} files managed by Radarr")
return managed_files
def get_sonarr_files(self) -> Set[str]:
"""Get all file paths managed by Sonarr"""
self.log("Querying Sonarr for managed TV series files...")
managed_files = set()
series = self.api_request(SONARR_URL, SONARR_API_KEY, "series")
if not series:
self.log("Failed to retrieve series from Sonarr", "ERROR")
return managed_files
for show in series:
series_id = show.get("id")
if not series_id:
continue
# Get episode files for this series
episode_files = self.api_request(
SONARR_URL,
SONARR_API_KEY,
f"episodefile?seriesId={series_id}"
)
if episode_files:
for episode_file in episode_files:
file_path = episode_file.get("path")
if file_path:
# Translate API path to real filesystem path
real_path = self.translate_path(file_path)
managed_files.add(real_path)
self.log(f" Sonarr manages: {file_path} -> {real_path}", "DEBUG")
self.log(f"Found {len(managed_files)} files managed by Sonarr")
return managed_files
def scan_directory(self, directory: Path, media_type: str) -> List[Path]:
"""Scan a directory recursively for video files"""
self.log(f"Scanning {directory} for video files...")
video_files = []
if not directory.exists():
self.log(f"Directory does not exist: {directory}", "WARNING")
return video_files
try:
for root, dirs, files in os.walk(directory):
for file in files:
file_path = Path(root) / file
if file_path.suffix.lower() in VIDEO_EXTENSIONS:
video_files.append(file_path)
self.stats["total_scanned"] += 1
except PermissionError as e:
self.log(f"Permission denied accessing {directory}: {e}", "ERROR")
self.stats["errors"] += 1
except Exception as e:
self.log(f"Error scanning {directory}: {e}", "ERROR")
self.stats["errors"] += 1
self.log(f"Found {len(video_files)} video files in {directory}")
return video_files
def categorize_files(self):
"""Scan media directories and categorize files as managed or unmanaged"""
self.log("\n" + "="*80)
self.log("STEP 1: Querying Radarr and Sonarr for managed files")
self.log("="*80)
# Get managed files from Radarr and Sonarr
radarr_files = self.get_radarr_files()
sonarr_files = self.get_sonarr_files()
self.managed_files = radarr_files | sonarr_files
self.log(f"\nTotal managed files: {len(self.managed_files)}")
self.log("\n" + "="*80)
self.log("STEP 2: Scanning media directories")
self.log("="*80)
# Scan each media directory
for media_type, directory in MEDIA_DIRS.items():
dir_path = Path(directory)
video_files = self.scan_directory(dir_path, media_type)
# Categorize each file
for file_path in video_files:
file_str = str(file_path)
if file_str in self.managed_files:
self.stats["managed"] += 1
self.log(f" MANAGED: {file_path}", "DEBUG")
else:
self.stats["unmanaged"] += 1
self.unmanaged_files[media_type].append(file_path)
self.log(f" UNMANAGED: {file_path}", "DEBUG")
def create_processing_structure(self):
"""Create processing directory structure"""
self.log("\n" + "="*80)
self.log("STEP 3: Creating processing directory structure")
self.log("="*80)
processing_path = Path(PROCESSING_DIR)
for media_type in MEDIA_DIRS.keys():
subdir = processing_path / f"from-{media_type}"
if self.dry_run:
self.log(f"[DRY RUN] Would create directory: {subdir}")
else:
try:
subdir.mkdir(parents=True, exist_ok=True)
self.log(f"Created directory: {subdir}")
except Exception as e:
self.log(f"Failed to create directory {subdir}: {e}", "ERROR")
self.stats["errors"] += 1
def move_unmanaged_files(self):
"""Move unmanaged files to processing folder"""
self.log("\n" + "="*80)
self.log("STEP 4: Moving unmanaged files to processing folder")
self.log("="*80)
processing_path = Path(PROCESSING_DIR)
for media_type, files in self.unmanaged_files.items():
if not files:
self.log(f"No unmanaged files found in {media_type}")
continue
self.log(f"\nProcessing {len(files)} unmanaged files from {media_type}...")
source_dir = Path(MEDIA_DIRS[media_type])
dest_base = processing_path / f"from-{media_type}"
for file_path in files:
try:
# Preserve relative path structure
relative_path = file_path.relative_to(source_dir)
dest_path = dest_base / relative_path
if self.dry_run:
self.log(f"[DRY RUN] Would move: {file_path}")
self.log(f" To: {dest_path}")
else:
# Create destination directory if needed
dest_path.parent.mkdir(parents=True, exist_ok=True)
# Move the file
shutil.move(str(file_path), str(dest_path))
self.log(f"Moved: {file_path} -> {dest_path}")
self.stats["moved"] += 1
except Exception as e:
self.log(f"Failed to move {file_path}: {e}", "ERROR")
self.stats["errors"] += 1
def find_empty_directories(self) -> List[Path]:
"""Find directories that would be empty after moving files"""
self.log("\n" + "="*80)
self.log("STEP 5: Identifying empty directories")
self.log("="*80)
empty_dirs = []
for media_type, directory in MEDIA_DIRS.items():
dir_path = Path(directory)
if not dir_path.exists():
continue
try:
for root, dirs, files in os.walk(dir_path, topdown=False):
root_path = Path(root)
# Skip if this is the root media directory
if root_path == dir_path:
continue
# Check if directory is empty or would be empty
try:
contents = list(root_path.iterdir())
if not contents:
empty_dirs.append(root_path)
self.log(f"Empty directory: {root_path}")
except PermissionError:
self.log(f"Permission denied checking {root_path}", "WARNING")
except Exception as e:
self.log(f"Error finding empty directories in {directory}: {e}", "ERROR")
return empty_dirs
def write_log_file(self):
"""Write log file to processing directory"""
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
log_path = Path(PROCESSING_DIR) / f"cleanup-log-{timestamp}.txt"
try:
if self.dry_run:
self.log(f"\n[DRY RUN] Would write log file to: {log_path}")
else:
with open(log_path, 'w') as f:
f.write('\n'.join(self.log_entries))
self.log(f"\nLog file written to: {log_path}")
except Exception as e:
self.log(f"Failed to write log file: {e}", "ERROR")
def print_summary(self, empty_dirs: List[Path]):
"""Print summary report"""
self.log("\n" + "="*80)
self.log("SUMMARY REPORT")
self.log("="*80)
mode = "DRY RUN MODE" if self.dry_run else "EXECUTION MODE"
self.log(f"\nMode: {mode}")
self.log(f"\nTotal files scanned: {self.stats['total_scanned']}")
self.log(f"Files managed by Radarr/Sonarr: {self.stats['managed']}")
self.log(f"Unmanaged files found: {self.stats['unmanaged']}")
if not self.dry_run:
self.log(f"Files successfully moved: {self.stats['moved']}")
if self.stats['errors'] > 0:
self.log(f"Errors encountered: {self.stats['errors']}", "WARNING")
self.log("\nUnmanaged files by category:")
for media_type, files in self.unmanaged_files.items():
self.log(f" {media_type}: {len(files)} files")
if empty_dirs:
self.log(f"\nEmpty directories found: {len(empty_dirs)}")
self.log("(These directories can be manually removed if desired)")
self.log("\n" + "="*80)
def run(self):
"""Main execution method"""
self.log("="*80)
self.log("MEDIA ORGANIZATION SCRIPT")
self.log("="*80)
self.log(f"Mode: {'DRY RUN' if self.dry_run else 'EXECUTE'}")
self.log(f"Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
# Step 1 & 2: Categorize files
self.categorize_files()
# Step 3: Create processing structure
self.create_processing_structure()
# Step 4: Move unmanaged files
self.move_unmanaged_files()
# Step 5: Find empty directories
empty_dirs = self.find_empty_directories()
# Print summary
self.print_summary(empty_dirs)
# Write log file
self.write_log_file()
return self.stats
def main():
parser = argparse.ArgumentParser(
description="Organize media files by comparing against Radarr/Sonarr managed files"
)
parser.add_argument(
"--execute",
action="store_true",
help="Actually move files (default is dry run mode)"
)
parser.add_argument(
"--quiet",
action="store_true",
help="Reduce verbosity (only show summary)"
)
args = parser.parse_args()
# Create organizer instance
organizer = MediaOrganizer(
dry_run=not args.execute,
verbose=not args.quiet
)
# Run the organization
stats = organizer.run()
# Exit with appropriate code
if stats["errors"] > 0:
sys.exit(1)
else:
sys.exit(0)
if __name__ == "__main__":
main()