Support for reloading config file on HUP signal
Implemented dynamic configuration reloading for gocast via SIGHUP signal handling. This allows updating BGP configuration, applications, and agent settings without service restart.
**Location:** `main.go:29-46`
Enhanced the existing signal handler to process SIGHUP:
```go
switch sig {
case syscall.SIGHUP:
log.Info("Received SIGHUP, reloading configuration")
if err := mon.Reload(*config); err != nil {
log.Errorf("Failed to reload configuration: %v", err)
} else {
log.Info("Configuration reloaded successfully")
}
case os.Interrupt, syscall.SIGTERM:
log.Info("Received shutdown signal, cleaning up")
mon.CloseAll()
cancel()
return
}
```
**Key Features:**
- Non-blocking: Uses goroutine to handle signals
- Graceful: SIGINT/SIGTERM still trigger clean shutdown
- Logged: All reload attempts are logged with success/failure status
**Location:** `controller/monitor.go:343-420`
Main reload orchestration method:
```go
func (m *MonitorMgr) Reload(configPath string) error
```
**Process Flow:**
1. Read new configuration from file
2. Compare with current configuration
3. If BGP config changed:
- Withdraw all announced routes
- Shutdown old BGP controller
- Start new BGP controller
- Re-announce routes for healthy apps
4. If Consul config changed:
- Initialize new Consul monitor
5. Update agent settings
6. Reload applications (add/remove/update)
**Thread Safety:**
- Uses existing `monMu` mutex for monitor map access
- Atomic BGP controller replacement
- No race conditions during reload
**Location:** `controller/monitor.go:422-475`
```go
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool
```
Comprehensive comparison of:
- Local AS, Peer AS, Peer IPs
- BGP origin
- Multi-hop settings (including nil checks)
- MD5 passwords and environment variables
- Per-peer communities
- Global communities
**Important:** Deep comparison ensures even minor changes are detected.
**Location:** `controller/monitor.go:477-532`
```go
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig)
```
Intelligent app management:
- **Remove:** Apps no longer in config (source="config" only)
- **Update:** Apps with changed configuration (VIP, monitors, NAT, communities)
- **Add:** New apps in configuration
**Key Behavior:**
- Consul-discovered apps are NOT removed during reload
- Only config-defined apps are managed
- Config changes trigger remove + re-add
1. **TestBgpConfigChanged**
- Tests all BGP configuration change scenarios
- Validates detection of AS, peer, MD5, community changes
- Includes nil multi-hop pointer checks
2. **TestEqualStringSlices**
- Tests slice comparison helper
- Validates empty, identical, and different slices
3. **TestReload** (Integration, requires root)
- Full reload cycle with BGP AS change
- App removal verification
- BGP controller replacement validation
4. **TestReloadAddApp** (Integration)
- Tests adding new app via reload
- Validates app registration
5. **TestReloadMD5Change** (Integration)
- Tests MD5 password change detection
- Validates BGP controller restart
**Decision:** Reload BGP configuration requires full controller restart.
**Rationale:**
- GoBGP library doesn't support modifying peers dynamically
- Simplifies implementation
- Ensures clean state
- Brief interruption is acceptable for infrequent config changes
**Alternative Considered:** Per-peer updates
- Complex to implement correctly
- Partial state issues
- Not supported well by GoBGP library
**Decision:** Log errors but don't crash; maintain old state on failure.
**Rationale:**
- Availability over correctness for config errors
- Admin can fix config and retry
- Better than service downtime
- Logs provide clear error messages
1. **BGP Interruption**
- Full BGP restart causes brief routing interruption
- All routes withdrawn and re-announced
- May impact traffic during reload
2. **No Atomic BGP Updates**
- Cannot add/remove single peer without full restart
- All peers affected even if one changes
3. **No Config Validation**
- Invalid config is detected during reload
- No pre-validation before applying
- Syntax errors require manual fix and retry
4. **No Rollback**
- Failed reload leaves service in potentially inconsistent state
- Manual intervention required to restore
- No automatic rollback to previous config
These changes were written using AI LLM
Authored-By: Claude Code (Sonnet 4.5)
This commit is contained in:
@@ -339,6 +339,231 @@ func (m *MonitorMgr) CloseAll() {
|
||||
}
|
||||
}
|
||||
|
||||
// Reload re-reads the configuration file and applies changes
|
||||
func (m *MonitorMgr) Reload(configPath string) error {
|
||||
glog.Infof("Reloading configuration from %s", configPath)
|
||||
|
||||
// Read new configuration
|
||||
newConfig := c.GetConfig(configPath)
|
||||
if newConfig == nil {
|
||||
return fmt.Errorf("Failed to load configuration")
|
||||
}
|
||||
|
||||
// Set defaults if not specified
|
||||
if newConfig.Agent.MonitorInterval == 0 {
|
||||
newConfig.Agent.MonitorInterval = defaultMonitorInterval
|
||||
}
|
||||
if newConfig.Agent.CleanupTimer == 0 {
|
||||
newConfig.Agent.CleanupTimer = defaultCleanupTimer
|
||||
}
|
||||
|
||||
// Check if BGP configuration has changed
|
||||
bgpChanged := m.bgpConfigChanged(m.config.Bgp, newConfig.Bgp)
|
||||
|
||||
if bgpChanged {
|
||||
glog.Infof("BGP configuration changed, restarting BGP controller")
|
||||
|
||||
// Withdraw all current routes before shutting down
|
||||
m.monMu.Lock()
|
||||
for _, am := range m.monitors {
|
||||
if am.announced {
|
||||
if err := m.ctrl.Withdraw(am.app.Vip); err != nil {
|
||||
glog.Errorf("Failed to withdraw route during reload: %v", err)
|
||||
}
|
||||
am.announced = false
|
||||
}
|
||||
}
|
||||
m.monMu.Unlock()
|
||||
|
||||
// Shutdown old BGP controller
|
||||
if err := m.ctrl.Shutdown(); err != nil {
|
||||
glog.Errorf("Failed to shutdown BGP controller: %v", err)
|
||||
}
|
||||
|
||||
// Start new BGP controller
|
||||
ctrl, err := NewController(newConfig.Bgp)
|
||||
if err != nil {
|
||||
return fmt.Errorf("Failed to start new BGP controller: %v", err)
|
||||
}
|
||||
m.ctrl = ctrl
|
||||
|
||||
// Re-announce all routes with new BGP config
|
||||
m.monMu.Lock()
|
||||
for _, am := range m.monitors {
|
||||
// Only re-announce if the app was previously announced and is still healthy
|
||||
if m.runMonitors(am.app) {
|
||||
if err := m.ctrl.Announce(am.app.Vip); err != nil {
|
||||
glog.Errorf("Failed to re-announce route %s: %v", am.app.Vip.Net.String(), err)
|
||||
} else {
|
||||
am.announced = true
|
||||
glog.Infof("Re-announced route %s", am.app.Vip.Net.String())
|
||||
}
|
||||
}
|
||||
}
|
||||
m.monMu.Unlock()
|
||||
}
|
||||
|
||||
// Handle consul configuration changes
|
||||
if m.config.Agent.ConsulAddr != newConfig.Agent.ConsulAddr ||
|
||||
m.config.Agent.ConsulToken != newConfig.Agent.ConsulToken {
|
||||
glog.Infof("Consul configuration changed")
|
||||
if newConfig.Agent.ConsulAddr != "" {
|
||||
cmon, err := NewConsulMon(newConfig.Agent.ConsulAddr, newConfig.Agent.ConsulToken)
|
||||
if err != nil {
|
||||
glog.Errorf("Failed to start consul monitor: %v", err)
|
||||
} else {
|
||||
m.consul = cmon
|
||||
// Start consul monitoring in background if not already running
|
||||
if m.config.Agent.ConsulAddr == "" {
|
||||
go m.consulMon()
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Update agent configuration
|
||||
oldConfig := m.config
|
||||
m.config = newConfig
|
||||
|
||||
// Handle app configuration changes
|
||||
m.reloadApps(oldConfig.Apps, newConfig.Apps)
|
||||
|
||||
glog.Infof("Configuration reloaded successfully")
|
||||
return nil
|
||||
}
|
||||
|
||||
// bgpConfigChanged checks if BGP configuration has changed
|
||||
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool {
|
||||
// Check basic parameters
|
||||
if old.LocalAS != new.LocalAS || old.LocalIP != new.LocalIP || old.Origin != new.Origin {
|
||||
return true
|
||||
}
|
||||
|
||||
// Check legacy peer config
|
||||
if old.PeerAS != new.PeerAS || old.PeerIP != new.PeerIP {
|
||||
return true
|
||||
}
|
||||
|
||||
// Check peers
|
||||
if len(old.Peers) != len(new.Peers) {
|
||||
return true
|
||||
}
|
||||
|
||||
// Compare each peer
|
||||
for i := range old.Peers {
|
||||
if old.Peers[i].PeerIP != new.Peers[i].PeerIP ||
|
||||
old.Peers[i].PeerAS != new.Peers[i].PeerAS ||
|
||||
old.Peers[i].MD5Password != new.Peers[i].MD5Password ||
|
||||
old.Peers[i].MD5EnvVar != new.Peers[i].MD5EnvVar {
|
||||
return true
|
||||
}
|
||||
|
||||
// Check multi-hop
|
||||
if (old.Peers[i].MultiHop == nil) != (new.Peers[i].MultiHop == nil) {
|
||||
return true
|
||||
}
|
||||
if old.Peers[i].MultiHop != nil && new.Peers[i].MultiHop != nil {
|
||||
if *old.Peers[i].MultiHop != *new.Peers[i].MultiHop {
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
// Check communities
|
||||
if len(old.Peers[i].Communities) != len(new.Peers[i].Communities) {
|
||||
return true
|
||||
}
|
||||
for j := range old.Peers[i].Communities {
|
||||
if old.Peers[i].Communities[j] != new.Peers[i].Communities[j] {
|
||||
return true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Check global communities
|
||||
if len(old.Communities) != len(new.Communities) {
|
||||
return true
|
||||
}
|
||||
for i := range old.Communities {
|
||||
if old.Communities[i] != new.Communities[i] {
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
return false
|
||||
}
|
||||
|
||||
// reloadApps compares old and new app configurations and applies changes
|
||||
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig) {
|
||||
// Build maps for easy comparison
|
||||
oldAppMap := make(map[string]c.AppConfig)
|
||||
for _, app := range oldApps {
|
||||
oldAppMap[app.Name] = app
|
||||
}
|
||||
|
||||
newAppMap := make(map[string]c.AppConfig)
|
||||
for _, app := range newApps {
|
||||
newAppMap[app.Name] = app
|
||||
}
|
||||
|
||||
// Remove apps that are no longer in config
|
||||
for name := range oldAppMap {
|
||||
if _, exists := newAppMap[name]; !exists {
|
||||
m.monMu.Lock()
|
||||
if am, ok := m.monitors[name]; ok {
|
||||
if am.app.Source == "config" {
|
||||
glog.Infof("Removing app %s (no longer in config)", name)
|
||||
m.monMu.Unlock()
|
||||
m.Remove(name)
|
||||
continue
|
||||
}
|
||||
}
|
||||
m.monMu.Unlock()
|
||||
}
|
||||
}
|
||||
|
||||
// Add new apps or update existing ones
|
||||
for name, newAppConfig := range newAppMap {
|
||||
oldAppConfig, existed := oldAppMap[name]
|
||||
|
||||
// Check if app configuration changed
|
||||
configChanged := !existed ||
|
||||
oldAppConfig.Vip != newAppConfig.Vip ||
|
||||
!equalStringSlices(oldAppConfig.Monitors, newAppConfig.Monitors) ||
|
||||
!equalStringSlices(oldAppConfig.Nats, newAppConfig.Nats) ||
|
||||
!equalStringSlices(oldAppConfig.VipConfig.BgpCommunities, newAppConfig.VipConfig.BgpCommunities)
|
||||
|
||||
if configChanged {
|
||||
if existed {
|
||||
glog.Infof("App %s configuration changed, reloading", name)
|
||||
m.Remove(name)
|
||||
} else {
|
||||
glog.Infof("Adding new app %s from config", name)
|
||||
}
|
||||
|
||||
app, err := NewApp(newAppConfig.Name, newAppConfig.Vip, newAppConfig.VipConfig,
|
||||
newAppConfig.Monitors, newAppConfig.Nats, "config")
|
||||
if err != nil {
|
||||
glog.Errorf("Failed to add app %s: %v", name, err)
|
||||
continue
|
||||
}
|
||||
m.Add(app)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// equalStringSlices compares two string slices
|
||||
func equalStringSlices(a, b []string) bool {
|
||||
if len(a) != len(b) {
|
||||
return false
|
||||
}
|
||||
for i := range a {
|
||||
if a[i] != b[i] {
|
||||
return false
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
// CleanUp periodically monitors for stale apps and cleans them up
|
||||
func (m *MonitorMgr) Cleanup(app string, exit chan bool) {
|
||||
t := time.NewTimer(m.config.Agent.CleanupTimer)
|
||||
|
||||
Reference in New Issue
Block a user