
kavren 120c2ec809 Initial commit: KavCorp infrastructure documentation

- CLAUDE.md: Project configuration for Claude Code
- docs/: Infrastructure documentation
  - INFRASTRUCTURE.md: Service map, storage, network
  - CONFIGURATIONS.md: Service configs and credentials
  - CHANGELOG.md: Change history
  - DECISIONS.md: Architecture decisions
  - TASKS.md: Task tracking
- scripts/: Automation scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-07 22:07:01 -05:00

7.9 KiB


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Repository Purpose

Infrastructure documentation and management repository for the KavCorp Proxmox cluster - a 5-node homelab cluster running self-hosted services. This repository supports migration from Docker containers to Proxmox LXCs where appropriate.

Cluster Architecture

Cluster Name: KavCorp
Nodes: 5 (pm1, pm2, pm3, pm4, elantris)
Network: 10.4.2.0/24
Primary Management Node: pm2 (10.4.2.6)

Node IP Mapping

pm1: 10.4.2.2
pm2: 10.4.2.6 (primary for new LXC deployment)
pm3: 10.4.2.3
pm4: 10.4.2.5
elantris: 10.4.2.14 (largest node, 128GB RAM, ZFS storage)

Common Commands

Cluster Management

# Access cluster (use pm2 as primary management node)
ssh pm2

# View cluster status
pvecm status
pvecm nodes

# List all VMs/LXCs across cluster
pvesh get /cluster/resources --type vm --output-format json

# List all nodes
pvesh get /cluster/resources --type node --output-format json

# List storage
pvesh get /cluster/resources --type storage --output-format json
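
The pvesh calls above report cluster-wide resources, whereas pct only sees containers on the node where it runs. A minimal sketch of looping over the nodes to collect per-node `pct list` output follows. It assumes the node hostnames resolve and that key-based SSH works between them; the function name `list_all_lxcs` is hypothetical.

```shell
#!/bin/sh
# Node hostnames as listed under "Node IP Mapping".
NODES="pm1 pm2 pm3 pm4 elantris"

# Run `pct list` on every node. The first argument is the command used to
# reach a node (defaults to ssh); passing `echo` gives a dry run that only
# prints what would be executed.
list_all_lxcs() (
  runner="${1:-ssh}"
  for node in $NODES; do
    echo "== $node =="
    "$runner" "$node" pct list
  done
)
```

Calling `list_all_lxcs echo` prints the planned commands without touching any node; calling `list_all_lxcs` with no argument runs them over SSH.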

LXC Management

# List LXCs on the current node (run this on the node you want to inspect)
pct list

# Get LXC configuration
pvesh get /nodes/<node>/lxc/<vmid>/config
pct config <vmid>

# Start/stop/reboot LXC
pct start <vmid>
pct stop <vmid>
pct reboot <vmid>

# Execute command in LXC
pct exec <vmid> -- <command>

# Open a shell inside the LXC
pct enter <vmid>

# Create LXC from template
pct create <vmid> <template> --hostname <name> --cores <n> --memory <mb> --rootfs <storage>:<size>
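
As an illustration of the `pct create` template above, the sketch below assembles a full command using the bridge (vmbr0), gateway (10.4.2.254), and KavNas rootfs mentioned elsewhere in this file. It only prints the command. The function name `build_pct_create` and the core, memory, and disk values are assumptions for illustration, not settings taken from the repository.

```shell
#!/bin/sh
# Print (do not run) a pct create command with this document's network
# defaults: bridge vmbr0, gateway 10.4.2.254, KavNas for the rootfs.
# Usage: build_pct_create <vmid> <template> <hostname> <last-octet>
# The 2 cores, 2048 MB of memory, and 8 GiB disk are illustrative defaults.
build_pct_create() (
  if [ "$#" -ne 4 ]; then
    echo "usage: build_pct_create <vmid> <template> <hostname> <last-octet>" >&2
    exit 2
  fi
  vmid="$1"; template="$2"; name="$3"; octet="$4"
  printf 'pct create %s %s --hostname %s --cores 2 --memory 2048 --rootfs KavNas:8 --net0 name=eth0,bridge=vmbr0,ip=10.4.2.%s/24,gw=10.4.2.254\n' \
    "$vmid" "$template" "$name" "$octet"
)
```

Review the printed command before running it on the target node; the VMID, template volume, and address passed in must be ones that are actually free and present on the cluster.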

Network Configuration

# View network interfaces
ip -br addr show

# View network config (file: /etc/network/interfaces)
cat /etc/network/interfaces

# Standard bridge: vmbr0 (connected to eno1 physical interface)
# Gateway: 10.4.2.254

Storage Architecture

Storage Pools

Local Storage (per-node):

local: Directory storage on each node, for backups/templates/ISOs (~100GB each)
local-lvm: LVM thin pool on each node, for VM/LXC disks (~350-375GB each)

ZFS Pools:

el-pool: ZFS pool on elantris (24TB), used for large data storage

NFS Mounts (shared):

KavNas: Primary NFS share from Synology NAS (10.4.2.13), ~23TB - used for backups, ISOs, and LXC storage
elantris-downloads: NFS share from elantris, ~23TB - used for media downloads

Storage Recommendations

New LXC containers: Use KavNas for rootfs (NFS, easily backed up)
High-performance workloads: Use local-lvm on the host node
Large data storage: Use elantris-downloads or el-pool
Templates and ISOs: Store in KavNas or node's local
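
The recommendations above amount to a small lookup, sketched below as a shell function. The workload labels (`lxc-rootfs`, `high-performance`, `large-data`, `template`, `iso`) and the function name `pick_storage` are hypothetical; only the storage names come from this file.

```shell
#!/bin/sh
# Map a workload label to the storage this document recommends for it.
# Where the document offers two options, the first one is returned and the
# alternative is noted in a comment.
pick_storage() {
  case "$1" in
    lxc-rootfs)       echo "KavNas" ;;      # NFS, easily backed up
    high-performance) echo "local-lvm" ;;   # LVM thin pool on the host node
    large-data)       echo "el-pool" ;;     # alternative: elantris-downloads
    template|iso)     echo "KavNas" ;;      # alternative: the node's local
    *) echo "unknown workload: $1" >&2; return 1 ;;
  esac
}
```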

Service Categories (by tags)

arr: Media automation (*arr stack - Sonarr, Radarr, Prowlarr, Bazarr, Whisparr)
media: Media servers (Jellyfin, Jellyseerr, Kometa)
proxy: Reverse proxy (Traefik)
authenticator: Authentication (Authelia)
nvr: Network Video Recorder (Shinobi)
docker: Docker host LXCs (docker-pm2, docker-pm4) and VMs (docker-pm3)
proxmox-helper-scripts: Deployed via community scripts
community-script: Deployed via ProxmoxVE Helper Scripts
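
Tags can be applied from the command line with `pct set <vmid> --tags`. The sketch below builds such a command and rejects tags that are not in the list above. It assumes a semicolon-separated tag list, and the function name `build_tag_cmd` is hypothetical. Note that `--tags` sets the whole list rather than appending to it.

```shell
#!/bin/sh
# Tags documented under "Service Categories (by tags)".
KNOWN_TAGS="arr media proxy authenticator nvr docker proxmox-helper-scripts community-script"

# Print (do not run) a pct set command that applies the given tags.
# Usage: build_tag_cmd <vmid> <tag> [<tag> ...]
# Fails if any tag is not in KNOWN_TAGS.
build_tag_cmd() (
  vmid="$1"; shift
  joined=""
  for tag in "$@"; do
    case " $KNOWN_TAGS " in
      *" $tag "*) joined="${joined:+$joined;}$tag" ;;
      *) echo "unknown tag: $tag" >&2; exit 1 ;;
    esac
  done
  printf "pct set %s --tags '%s'\n" "$vmid" "$joined"
)
```

The printed tag list is single-quoted because an unquoted semicolon would end the command in the shell.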

Migration Strategy

Goal: Move services from Docker containers to dedicated LXCs where it makes sense.

Good candidates for LXC migration:

Single-purpose services
Services with simple dependencies
Stateless applications
Services that benefit from isolation

Keep in Docker:

Complex multi-container stacks
Services requiring Docker-specific features
Temporary/experimental services

Current Docker Hosts:

VM 109: docker-pm3 (on pm3, 4 CPU, 12GB RAM)
LXC 110: docker-pm4 (on pm4, 4 CPU, 8GB RAM)
LXC 113: docker-pm2 (on pm2, 4 CPU, 8GB RAM)
LXC 107: dockge (on pm3, 12 CPU, 8GB RAM) - Docker management UI

IP Address Allocation

Infrastructure Services:

10.4.2.10: traefik (LXC 104)
10.4.2.13: KavNas (Synology NAS)
10.4.2.14: elantris

Media Stack:

10.4.2.15: sonarr (LXC 105)
10.4.2.16: radarr (LXC 108)
10.4.2.17: prowlarr (LXC 114)
10.4.2.18: bazarr (LXC 119)
10.4.2.19: whisparr (LXC 117)
10.4.2.20: jellyseerr (LXC 115)
10.4.2.21: kometa (LXC 120)
10.4.2.22: jellyfin (LXC 121)

Other Services:

10.4.2.23: authelia (LXC 116)
10.4.2.24: notifiarr (LXC 118)

Note: Update docs/INFRASTRUCTURE.md (service map) when allocating new IPs
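
When picking an address for a new service, a quick way to find the next unused last octet is sketched below. The `USED_OCTETS` list only reflects the addresses named in this file (nodes, NAS, listed services, and the gateway). It is an assumption that nothing else is in use, so check docs/INFRASTRUCTURE.md before allocating. The function name `next_free_octet` is hypothetical.

```shell
#!/bin/sh
# Last octets of 10.4.2.0/24 addresses that this file lists as in use.
USED_OCTETS="2 3 5 6 10 13 14 15 16 17 18 19 20 21 22 23 24 254"

# Print the first octet at or above the start value (default 10) that is not
# in USED_OCTETS. Stops at 253 because 254 is the gateway.
next_free_octet() (
  octet="${1:-10}"
  while [ "$octet" -le 253 ]; do
    case " $USED_OCTETS " in
      *" $octet "*) ;;                 # taken, keep looking
      *) echo "$octet"; exit 0 ;;      # free
    esac
    octet=$((octet + 1))
  done
  exit 1                               # nothing free in range
)
```

For example, `echo "10.4.2.$(next_free_octet 15)"` continues after the service range listed above.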

Documentation Structure

CRITICAL: Always read docs/README.md first to understand the documentation system.

Core Documentation Files (ALWAYS UPDATE, NEVER CREATE NEW)

docs/INFRASTRUCTURE.md - Single source of truth
- CHECK THIS FIRST for node IPs, service locations, storage paths
- Update whenever infrastructure changes
docs/CONFIGURATIONS.md - Service configurations
- API keys, config snippets, copy/paste ready configs
- Update when service configs change
docs/DECISIONS.md - Architecture decisions
- Why we made choices, common patterns, troubleshooting
- Update when making decisions or discovering patterns
docs/TASKS.md - Current work tracking
- Active, pending, blocked, and completed tasks
- Update at start and end of work sessions
docs/CHANGELOG.md - Historical record
- Date-stamped entries for all changes
- Update after completing any significant work

Documentation Workflow

MANDATORY - Before ANY work session:

Read docs/README.md - Understand the documentation system
Check docs/INFRASTRUCTURE.md - Get current infrastructure state
Check docs/TASKS.md - See what's already in progress or pending

MANDATORY - During work:

When you need node IPs, service locations, or paths → Read docs/INFRASTRUCTURE.md
When you need config snippets or API keys → Read docs/CONFIGURATIONS.md
When wondering "why is it done this way?" → Read docs/DECISIONS.md
When you discover a pattern or make a decision → Immediately update docs/DECISIONS.md
When you encounter issues → Check docs/DECISIONS.md Known Issues section first

MANDATORY - After completing ANY work:

Update the relevant core doc:
- Infrastructure change? → Update docs/INFRASTRUCTURE.md
- Config change? → Update docs/CONFIGURATIONS.md
- New pattern/decision? → Update docs/DECISIONS.md
Add dated entry to docs/CHANGELOG.md describing what changed
Update docs/TASKS.md to mark work complete or add new tasks
Update "Last Updated" date in docs/README.md
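
Adding the dated CHANGELOG entry can be scripted. The sketch below appends one entry to a file. The entry layout (a `## YYYY-MM-DD` heading followed by a bullet) is an assumption, since the actual format of docs/CHANGELOG.md is not shown here, and the function name `changelog_entry` is hypothetical.

```shell
#!/bin/sh
# Append a dated entry to a changelog file.
# Usage: changelog_entry <file> <message> [<YYYY-MM-DD>]
# The date defaults to today. The heading-plus-bullet layout is an assumed
# format; adjust it to match the existing entries in the file.
changelog_entry() (
  file="$1"; message="$2"; day="${3:-$(date +%Y-%m-%d)}"
  printf '\n## %s\n\n- %s\n' "$day" "$message" >> "$file"
)
```

A call such as `changelog_entry docs/CHANGELOG.md "Moved a service to a new LXC"` appends to the end of the file; if the changelog is kept newest-first, the entry would need to be inserted at the top instead.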

STRICTLY FORBIDDEN:

Creating new documentation files without explicit user approval
Leaving documentation outdated after making changes
Creating session-specific notes files (use CHANGELOG for history)
Skipping documentation updates "to save time"
Assuming you remember infrastructure details (always check docs)

When to Update Which File

| You just did... | Update this file |
|---|---|
| Added/removed a service | `INFRASTRUCTURE.md` (service map) |
| Changed an IP address | `INFRASTRUCTURE.md` (service map) |
| Modified service config | `CONFIGURATIONS.md` (add/update config snippet) |
| Changed API key | `CONFIGURATIONS.md` (update credentials) |
| Made architectural decision | `DECISIONS.md` (add to decisions section) |
| Discovered troubleshooting pattern | `DECISIONS.md` (add to common patterns) |
| Hit a recurring issue | `DECISIONS.md` (add to known issues) |
| Completed a task | `TASKS.md` (mark complete) + `CHANGELOG.md` (add entry) |
| Started new work | `TASKS.md` (add to in progress) |
| ANY significant change | `CHANGELOG.md` (always add dated entry) |

Scripts

scripts/provisioning/: LXC/VM creation scripts
scripts/backup/: Backup automation scripts
scripts/monitoring/: Monitoring and health check scripts

Workflow Notes

New LXCs are primarily deployed on pm2
Use ProxmoxVE Helper Scripts (https://helper-scripts.com) for common services
Always tag LXCs appropriately for organization
Document service URLs and access details in docs/INFRASTRUCTURE.md (service map); keep credentials in docs/CONFIGURATIONS.md
Keep inventory documentation in sync with changes
