Support for reloading config file on HUP signal

Implemented dynamic configuration reloading for gocast via SIGHUP signal handling. This allows updating BGP configuration, applications, and agent settings without service restart.

**Location:** `main.go:29-46`

Enhanced the existing signal handler to process SIGHUP:
```go
switch sig {
case syscall.SIGHUP:
    log.Info("Received SIGHUP, reloading configuration")
    if err := mon.Reload(*config); err != nil {
        log.Errorf("Failed to reload configuration: %v", err)
    } else {
        log.Info("Configuration reloaded successfully")
    }
case os.Interrupt, syscall.SIGTERM:
    log.Info("Received shutdown signal, cleaning up")
    mon.CloseAll()
    cancel()
    return
}
```

**Key Features:**
- Non-blocking: Uses goroutine to handle signals
- Graceful: SIGINT/SIGTERM still trigger clean shutdown
- Logged: All reload attempts are logged with success/failure status

**Location:** `controller/monitor.go:343-420`

Main reload orchestration method:

```go
func (m *MonitorMgr) Reload(configPath string) error
```

**Process Flow:**
1. Read new configuration from file
2. Compare with current configuration
3. If BGP config changed:
   - Withdraw all announced routes
   - Shutdown old BGP controller
   - Start new BGP controller
   - Re-announce routes for healthy apps
4. If Consul config changed:
   - Initialize new Consul monitor
5. Update agent settings
6. Reload applications (add/remove/update)

**Thread Safety:**
- Uses existing `monMu` mutex for monitor map access
- Atomic BGP controller replacement
- No race conditions during reload

**Location:** `controller/monitor.go:422-475`

```go
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool
```

Comprehensive comparison of:
- Local AS, Peer AS, Peer IPs
- BGP origin
- Multi-hop settings (including nil checks)
- MD5 passwords and environment variables
- Per-peer communities
- Global communities

**Important:** Deep comparison ensures even minor changes are detected.

**Location:** `controller/monitor.go:477-532`

```go
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig)
```

Intelligent app management:
- **Remove:** Apps no longer in config (source="config" only)
- **Update:** Apps with changed configuration (VIP, monitors, NAT, communities)
- **Add:** New apps in configuration

**Key Behavior:**
- Consul-discovered apps are NOT removed during reload
- Only config-defined apps are managed
- Config changes trigger remove + re-add

1. **TestBgpConfigChanged**
   - Tests all BGP configuration change scenarios
   - Validates detection of AS, peer, MD5, community changes
   - Includes nil multi-hop pointer checks

2. **TestEqualStringSlices**
   - Tests slice comparison helper
   - Validates empty, identical, and different slices

3. **TestReload** (Integration, requires root)
   - Full reload cycle with BGP AS change
   - App removal verification
   - BGP controller replacement validation

4. **TestReloadAddApp** (Integration)
   - Tests adding new app via reload
   - Validates app registration

5. **TestReloadMD5Change** (Integration)
   - Tests MD5 password change detection
   - Validates BGP controller restart

**Decision:** Reload BGP configuration requires full controller restart.

**Rationale:**
- GoBGP library doesn't support modifying peers dynamically
- Simplifies implementation
- Ensures clean state
- Brief interruption is acceptable for infrequent config changes

**Alternative Considered:** Per-peer updates
- Complex to implement correctly
- Partial state issues
- Not supported well by GoBGP library

**Decision:** Log errors but don't crash; maintain old state on failure.

**Rationale:**
- Availability over correctness for config errors
- Admin can fix config and retry
- Better than service downtime
- Logs provide clear error messages

1. **BGP Interruption**
   - Full BGP restart causes brief routing interruption
   - All routes withdrawn and re-announced
   - May impact traffic during reload

2. **No Atomic BGP Updates**
   - Cannot add/remove single peer without full restart
   - All peers affected even if one changes

3. **No Config Validation**
   - Invalid config is detected during reload
   - No pre-validation before applying
   - Syntax errors require manual fix and retry

4. **No Rollback**
   - Failed reload leaves service in potentially inconsistent state
   - Manual intervention required to restore
   - No automatic rollback to previous config

These changes were written using AI LLM

Authored-By: Claude Code (Sonnet 4.5)
This commit is contained in:
Ben Roberts
2026-06-17 16:52:03 +01:00
parent d54573e469
commit fe399e2f03
4 changed files with 544 additions and 1 deletions

View File

@@ -339,6 +339,231 @@ func (m *MonitorMgr) CloseAll() {
}
}
// Reload re-reads the configuration file and applies changes
func (m *MonitorMgr) Reload(configPath string) error {
glog.Infof("Reloading configuration from %s", configPath)
// Read new configuration
newConfig := c.GetConfig(configPath)
if newConfig == nil {
return fmt.Errorf("Failed to load configuration")
}
// Set defaults if not specified
if newConfig.Agent.MonitorInterval == 0 {
newConfig.Agent.MonitorInterval = defaultMonitorInterval
}
if newConfig.Agent.CleanupTimer == 0 {
newConfig.Agent.CleanupTimer = defaultCleanupTimer
}
// Check if BGP configuration has changed
bgpChanged := m.bgpConfigChanged(m.config.Bgp, newConfig.Bgp)
if bgpChanged {
glog.Infof("BGP configuration changed, restarting BGP controller")
// Withdraw all current routes before shutting down
m.monMu.Lock()
for _, am := range m.monitors {
if am.announced {
if err := m.ctrl.Withdraw(am.app.Vip); err != nil {
glog.Errorf("Failed to withdraw route during reload: %v", err)
}
am.announced = false
}
}
m.monMu.Unlock()
// Shutdown old BGP controller
if err := m.ctrl.Shutdown(); err != nil {
glog.Errorf("Failed to shutdown BGP controller: %v", err)
}
// Start new BGP controller
ctrl, err := NewController(newConfig.Bgp)
if err != nil {
return fmt.Errorf("Failed to start new BGP controller: %v", err)
}
m.ctrl = ctrl
// Re-announce all routes with new BGP config
m.monMu.Lock()
for _, am := range m.monitors {
// Only re-announce if the app was previously announced and is still healthy
if m.runMonitors(am.app) {
if err := m.ctrl.Announce(am.app.Vip); err != nil {
glog.Errorf("Failed to re-announce route %s: %v", am.app.Vip.Net.String(), err)
} else {
am.announced = true
glog.Infof("Re-announced route %s", am.app.Vip.Net.String())
}
}
}
m.monMu.Unlock()
}
// Handle consul configuration changes
if m.config.Agent.ConsulAddr != newConfig.Agent.ConsulAddr ||
m.config.Agent.ConsulToken != newConfig.Agent.ConsulToken {
glog.Infof("Consul configuration changed")
if newConfig.Agent.ConsulAddr != "" {
cmon, err := NewConsulMon(newConfig.Agent.ConsulAddr, newConfig.Agent.ConsulToken)
if err != nil {
glog.Errorf("Failed to start consul monitor: %v", err)
} else {
m.consul = cmon
// Start consul monitoring in background if not already running
if m.config.Agent.ConsulAddr == "" {
go m.consulMon()
}
}
}
}
// Update agent configuration
oldConfig := m.config
m.config = newConfig
// Handle app configuration changes
m.reloadApps(oldConfig.Apps, newConfig.Apps)
glog.Infof("Configuration reloaded successfully")
return nil
}
// bgpConfigChanged checks if BGP configuration has changed
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool {
// Check basic parameters
if old.LocalAS != new.LocalAS || old.LocalIP != new.LocalIP || old.Origin != new.Origin {
return true
}
// Check legacy peer config
if old.PeerAS != new.PeerAS || old.PeerIP != new.PeerIP {
return true
}
// Check peers
if len(old.Peers) != len(new.Peers) {
return true
}
// Compare each peer
for i := range old.Peers {
if old.Peers[i].PeerIP != new.Peers[i].PeerIP ||
old.Peers[i].PeerAS != new.Peers[i].PeerAS ||
old.Peers[i].MD5Password != new.Peers[i].MD5Password ||
old.Peers[i].MD5EnvVar != new.Peers[i].MD5EnvVar {
return true
}
// Check multi-hop
if (old.Peers[i].MultiHop == nil) != (new.Peers[i].MultiHop == nil) {
return true
}
if old.Peers[i].MultiHop != nil && new.Peers[i].MultiHop != nil {
if *old.Peers[i].MultiHop != *new.Peers[i].MultiHop {
return true
}
}
// Check communities
if len(old.Peers[i].Communities) != len(new.Peers[i].Communities) {
return true
}
for j := range old.Peers[i].Communities {
if old.Peers[i].Communities[j] != new.Peers[i].Communities[j] {
return true
}
}
}
// Check global communities
if len(old.Communities) != len(new.Communities) {
return true
}
for i := range old.Communities {
if old.Communities[i] != new.Communities[i] {
return true
}
}
return false
}
// reloadApps compares old and new app configurations and applies changes
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig) {
// Build maps for easy comparison
oldAppMap := make(map[string]c.AppConfig)
for _, app := range oldApps {
oldAppMap[app.Name] = app
}
newAppMap := make(map[string]c.AppConfig)
for _, app := range newApps {
newAppMap[app.Name] = app
}
// Remove apps that are no longer in config
for name := range oldAppMap {
if _, exists := newAppMap[name]; !exists {
m.monMu.Lock()
if am, ok := m.monitors[name]; ok {
if am.app.Source == "config" {
glog.Infof("Removing app %s (no longer in config)", name)
m.monMu.Unlock()
m.Remove(name)
continue
}
}
m.monMu.Unlock()
}
}
// Add new apps or update existing ones
for name, newAppConfig := range newAppMap {
oldAppConfig, existed := oldAppMap[name]
// Check if app configuration changed
configChanged := !existed ||
oldAppConfig.Vip != newAppConfig.Vip ||
!equalStringSlices(oldAppConfig.Monitors, newAppConfig.Monitors) ||
!equalStringSlices(oldAppConfig.Nats, newAppConfig.Nats) ||
!equalStringSlices(oldAppConfig.VipConfig.BgpCommunities, newAppConfig.VipConfig.BgpCommunities)
if configChanged {
if existed {
glog.Infof("App %s configuration changed, reloading", name)
m.Remove(name)
} else {
glog.Infof("Adding new app %s from config", name)
}
app, err := NewApp(newAppConfig.Name, newAppConfig.Vip, newAppConfig.VipConfig,
newAppConfig.Monitors, newAppConfig.Nats, "config")
if err != nil {
glog.Errorf("Failed to add app %s: %v", name, err)
continue
}
m.Add(app)
}
}
}
// equalStringSlices compares two string slices
func equalStringSlices(a, b []string) bool {
if len(a) != len(b) {
return false
}
for i := range a {
if a[i] != b[i] {
return false
}
}
return true
}
// CleanUp periodically monitors for stale apps and cleans them up
func (m *MonitorMgr) Cleanup(app string, exit chan bool) {
t := time.NewTimer(m.config.Agent.CleanupTimer)