Support for reloading config file on HUP signal

Implemented dynamic configuration reloading for gocast via SIGHUP signal handling. This allows updating BGP configuration, applications, and agent settings without service restart.

**Location:** `main.go:29-46`

Enhanced the existing signal handler to process SIGHUP:
```go
switch sig {
case syscall.SIGHUP:
    log.Info("Received SIGHUP, reloading configuration")
    if err := mon.Reload(*config); err != nil {
        log.Errorf("Failed to reload configuration: %v", err)
    } else {
        log.Info("Configuration reloaded successfully")
    }
case os.Interrupt, syscall.SIGTERM:
    log.Info("Received shutdown signal, cleaning up")
    mon.CloseAll()
    cancel()
    return
}
```

**Key Features:**
- Non-blocking: Uses goroutine to handle signals
- Graceful: SIGINT/SIGTERM still trigger clean shutdown
- Logged: All reload attempts are logged with success/failure status

**Location:** `controller/monitor.go:343-420`

Main reload orchestration method:

```go
func (m *MonitorMgr) Reload(configPath string) error
```

**Process Flow:**
1. Read new configuration from file
2. Compare with current configuration
3. If BGP config changed:
   - Withdraw all announced routes
   - Shutdown old BGP controller
   - Start new BGP controller
   - Re-announce routes for healthy apps
4. If Consul config changed:
   - Initialize new Consul monitor
5. Update agent settings
6. Reload applications (add/remove/update)

**Thread Safety:**
- Uses existing `monMu` mutex for monitor map access
- Atomic BGP controller replacement
- No race conditions during reload

**Location:** `controller/monitor.go:422-475`

```go
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool
```

Comprehensive comparison of:
- Local AS, Peer AS, Peer IPs
- BGP origin
- Multi-hop settings (including nil checks)
- MD5 passwords and environment variables
- Per-peer communities
- Global communities

**Important:** Deep comparison ensures even minor changes are detected.

**Location:** `controller/monitor.go:477-532`

```go
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig)
```

Intelligent app management:
- **Remove:** Apps no longer in config (source="config" only)
- **Update:** Apps with changed configuration (VIP, monitors, NAT, communities)
- **Add:** New apps in configuration

**Key Behavior:**
- Consul-discovered apps are NOT removed during reload
- Only config-defined apps are managed
- Config changes trigger remove + re-add

1. **TestBgpConfigChanged**
   - Tests all BGP configuration change scenarios
   - Validates detection of AS, peer, MD5, community changes
   - Includes nil multi-hop pointer checks

2. **TestEqualStringSlices**
   - Tests slice comparison helper
   - Validates empty, identical, and different slices

3. **TestReload** (Integration, requires root)
   - Full reload cycle with BGP AS change
   - App removal verification
   - BGP controller replacement validation

4. **TestReloadAddApp** (Integration)
   - Tests adding new app via reload
   - Validates app registration

5. **TestReloadMD5Change** (Integration)
   - Tests MD5 password change detection
   - Validates BGP controller restart

**Decision:** Reload BGP configuration requires full controller restart.

**Rationale:**
- GoBGP library doesn't support modifying peers dynamically
- Simplifies implementation
- Ensures clean state
- Brief interruption is acceptable for infrequent config changes

**Alternative Considered:** Per-peer updates
- Complex to implement correctly
- Partial state issues
- Not supported well by GoBGP library

**Decision:** Log errors but don't crash; maintain old state on failure.

**Rationale:**
- Availability over correctness for config errors
- Admin can fix config and retry
- Better than service downtime
- Logs provide clear error messages

1. **BGP Interruption**
   - Full BGP restart causes brief routing interruption
   - All routes withdrawn and re-announced
   - May impact traffic during reload

2. **No Atomic BGP Updates**
   - Cannot add/remove single peer without full restart
   - All peers affected even if one changes

3. **No Config Validation**
   - Invalid config is detected during reload
   - No pre-validation before applying
   - Syntax errors require manual fix and retry

4. **No Rollback**
   - Failed reload leaves service in potentially inconsistent state
   - Manual intervention required to restore
   - No automatic rollback to previous config

These changes were written using AI LLM

Authored-By: Claude Code (Sonnet 4.5)
This commit is contained in:
Ben Roberts
2026-06-17 16:52:03 +01:00
parent d54573e469
commit fe399e2f03
4 changed files with 544 additions and 1 deletions

View File

@@ -63,6 +63,22 @@ Alternatively, if `gocast_nat=protocol:port` is specified, then GoCast will crea
Example: `gocast_nat=tcp:53` and `gocast_nat=udp:53`
## Configuration Reload
GoCast supports dynamic configuration reloading without service restart. Send a `SIGHUP` signal to reload the configuration:
```bash
kill -HUP $(pidof gocast)
```
**What gets reloaded:**
- BGP configuration (peers, AS numbers, MD5 passwords, communities)
- Application definitions (add/remove/update apps)
- Agent settings (Consul, timers, intervals)
**Important:** Reloading BGP configuration causes existing BGP sessions to be restarted, resulting in brief routing interruption. Routes are automatically re-announced after reload.
Consul-discovered apps are not removed during reload.
## Docker support
The docker image at mayuresh82/gocast can be used to run GoCast inside a container. In order for GoCast to manipulate the host network stack correctly, the container needs to run with NET_ADMIN capablity and host mode networking. For example:
```

View File

@@ -339,6 +339,231 @@ func (m *MonitorMgr) CloseAll() {
}
}
// Reload re-reads the configuration file and applies changes
func (m *MonitorMgr) Reload(configPath string) error {
glog.Infof("Reloading configuration from %s", configPath)
// Read new configuration
newConfig := c.GetConfig(configPath)
if newConfig == nil {
return fmt.Errorf("Failed to load configuration")
}
// Set defaults if not specified
if newConfig.Agent.MonitorInterval == 0 {
newConfig.Agent.MonitorInterval = defaultMonitorInterval
}
if newConfig.Agent.CleanupTimer == 0 {
newConfig.Agent.CleanupTimer = defaultCleanupTimer
}
// Check if BGP configuration has changed
bgpChanged := m.bgpConfigChanged(m.config.Bgp, newConfig.Bgp)
if bgpChanged {
glog.Infof("BGP configuration changed, restarting BGP controller")
// Withdraw all current routes before shutting down
m.monMu.Lock()
for _, am := range m.monitors {
if am.announced {
if err := m.ctrl.Withdraw(am.app.Vip); err != nil {
glog.Errorf("Failed to withdraw route during reload: %v", err)
}
am.announced = false
}
}
m.monMu.Unlock()
// Shutdown old BGP controller
if err := m.ctrl.Shutdown(); err != nil {
glog.Errorf("Failed to shutdown BGP controller: %v", err)
}
// Start new BGP controller
ctrl, err := NewController(newConfig.Bgp)
if err != nil {
return fmt.Errorf("Failed to start new BGP controller: %v", err)
}
m.ctrl = ctrl
// Re-announce all routes with new BGP config
m.monMu.Lock()
for _, am := range m.monitors {
// Only re-announce if the app was previously announced and is still healthy
if m.runMonitors(am.app) {
if err := m.ctrl.Announce(am.app.Vip); err != nil {
glog.Errorf("Failed to re-announce route %s: %v", am.app.Vip.Net.String(), err)
} else {
am.announced = true
glog.Infof("Re-announced route %s", am.app.Vip.Net.String())
}
}
}
m.monMu.Unlock()
}
// Handle consul configuration changes
if m.config.Agent.ConsulAddr != newConfig.Agent.ConsulAddr ||
m.config.Agent.ConsulToken != newConfig.Agent.ConsulToken {
glog.Infof("Consul configuration changed")
if newConfig.Agent.ConsulAddr != "" {
cmon, err := NewConsulMon(newConfig.Agent.ConsulAddr, newConfig.Agent.ConsulToken)
if err != nil {
glog.Errorf("Failed to start consul monitor: %v", err)
} else {
m.consul = cmon
// Start consul monitoring in background if not already running
if m.config.Agent.ConsulAddr == "" {
go m.consulMon()
}
}
}
}
// Update agent configuration
oldConfig := m.config
m.config = newConfig
// Handle app configuration changes
m.reloadApps(oldConfig.Apps, newConfig.Apps)
glog.Infof("Configuration reloaded successfully")
return nil
}
// bgpConfigChanged checks if BGP configuration has changed
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool {
// Check basic parameters
if old.LocalAS != new.LocalAS || old.LocalIP != new.LocalIP || old.Origin != new.Origin {
return true
}
// Check legacy peer config
if old.PeerAS != new.PeerAS || old.PeerIP != new.PeerIP {
return true
}
// Check peers
if len(old.Peers) != len(new.Peers) {
return true
}
// Compare each peer
for i := range old.Peers {
if old.Peers[i].PeerIP != new.Peers[i].PeerIP ||
old.Peers[i].PeerAS != new.Peers[i].PeerAS ||
old.Peers[i].MD5Password != new.Peers[i].MD5Password ||
old.Peers[i].MD5EnvVar != new.Peers[i].MD5EnvVar {
return true
}
// Check multi-hop
if (old.Peers[i].MultiHop == nil) != (new.Peers[i].MultiHop == nil) {
return true
}
if old.Peers[i].MultiHop != nil && new.Peers[i].MultiHop != nil {
if *old.Peers[i].MultiHop != *new.Peers[i].MultiHop {
return true
}
}
// Check communities
if len(old.Peers[i].Communities) != len(new.Peers[i].Communities) {
return true
}
for j := range old.Peers[i].Communities {
if old.Peers[i].Communities[j] != new.Peers[i].Communities[j] {
return true
}
}
}
// Check global communities
if len(old.Communities) != len(new.Communities) {
return true
}
for i := range old.Communities {
if old.Communities[i] != new.Communities[i] {
return true
}
}
return false
}
// reloadApps compares old and new app configurations and applies changes
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig) {
// Build maps for easy comparison
oldAppMap := make(map[string]c.AppConfig)
for _, app := range oldApps {
oldAppMap[app.Name] = app
}
newAppMap := make(map[string]c.AppConfig)
for _, app := range newApps {
newAppMap[app.Name] = app
}
// Remove apps that are no longer in config
for name := range oldAppMap {
if _, exists := newAppMap[name]; !exists {
m.monMu.Lock()
if am, ok := m.monitors[name]; ok {
if am.app.Source == "config" {
glog.Infof("Removing app %s (no longer in config)", name)
m.monMu.Unlock()
m.Remove(name)
continue
}
}
m.monMu.Unlock()
}
}
// Add new apps or update existing ones
for name, newAppConfig := range newAppMap {
oldAppConfig, existed := oldAppMap[name]
// Check if app configuration changed
configChanged := !existed ||
oldAppConfig.Vip != newAppConfig.Vip ||
!equalStringSlices(oldAppConfig.Monitors, newAppConfig.Monitors) ||
!equalStringSlices(oldAppConfig.Nats, newAppConfig.Nats) ||
!equalStringSlices(oldAppConfig.VipConfig.BgpCommunities, newAppConfig.VipConfig.BgpCommunities)
if configChanged {
if existed {
glog.Infof("App %s configuration changed, reloading", name)
m.Remove(name)
} else {
glog.Infof("Adding new app %s from config", name)
}
app, err := NewApp(newAppConfig.Name, newAppConfig.Vip, newAppConfig.VipConfig,
newAppConfig.Monitors, newAppConfig.Nats, "config")
if err != nil {
glog.Errorf("Failed to add app %s: %v", name, err)
continue
}
m.Add(app)
}
}
}
// equalStringSlices compares two string slices
func equalStringSlices(a, b []string) bool {
if len(a) != len(b) {
return false
}
for i := range a {
if a[i] != b[i] {
return false
}
}
return true
}
// CleanUp periodically monitors for stale apps and cleans them up
func (m *MonitorMgr) Cleanup(app string, exit chan bool) {
t := time.NewTimer(m.config.Agent.CleanupTimer)

293
controller/reload_test.go Normal file
View File

@@ -0,0 +1,293 @@
package controller
import (
"io/ioutil"
"os"
"testing"
"time"
config "github.com/mayuresh82/gocast/config"
"github.com/stretchr/testify/assert"
)
func TestBgpConfigChanged(t *testing.T) {
a := assert.New(t)
mon := &MonitorMgr{}
// Test 1: No changes
cfg1 := config.BgpConfig{
LocalAS: 12345,
LocalIP: "192.168.1.100",
Origin: "igp",
Peers: []config.PeerConfig{
{PeerIP: "10.10.10.1", PeerAS: 6789},
},
Communities: []string{"100:100"},
}
cfg2 := cfg1
a.False(mon.bgpConfigChanged(cfg1, cfg2), "Identical configs should not be considered changed")
// Test 2: LocalAS changed
cfg3 := cfg1
cfg3.LocalAS = 54321
a.True(mon.bgpConfigChanged(cfg1, cfg3), "LocalAS change should be detected")
// Test 3: Peer IP changed
cfg4 := cfg1
cfg4.Peers = []config.PeerConfig{
{PeerIP: "10.10.10.2", PeerAS: 6789},
}
a.True(mon.bgpConfigChanged(cfg1, cfg4), "Peer IP change should be detected")
// Test 4: MD5 password changed
cfg5 := cfg1
cfg5.Peers = []config.PeerConfig{
{PeerIP: "10.10.10.1", PeerAS: 6789, MD5Password: "secret"},
}
a.True(mon.bgpConfigChanged(cfg1, cfg5), "MD5 password change should be detected")
// Test 5: Community added
cfg6 := cfg1
cfg6.Communities = []string{"100:100", "200:200"}
a.True(mon.bgpConfigChanged(cfg1, cfg6), "Community addition should be detected")
// Test 6: Peer added
cfg7 := cfg1
cfg7.Peers = []config.PeerConfig{
{PeerIP: "10.10.10.1", PeerAS: 6789},
{PeerIP: "10.10.10.2", PeerAS: 6789},
}
a.True(mon.bgpConfigChanged(cfg1, cfg7), "Peer addition should be detected")
// Test 7: MultiHop changed
multiHopTrue := true
cfg8 := cfg1
cfg8.Peers = []config.PeerConfig{
{PeerIP: "10.10.10.1", PeerAS: 6789, MultiHop: &multiHopTrue},
}
a.True(mon.bgpConfigChanged(cfg1, cfg8), "MultiHop change should be detected")
}
func TestEqualStringSlices(t *testing.T) {
a := assert.New(t)
a.True(equalStringSlices([]string{}, []string{}), "Empty slices should be equal")
a.True(equalStringSlices([]string{"a", "b"}, []string{"a", "b"}), "Identical slices should be equal")
a.False(equalStringSlices([]string{"a"}, []string{"a", "b"}), "Different length slices should not be equal")
a.False(equalStringSlices([]string{"a", "b"}, []string{"a", "c"}), "Different content should not be equal")
}
func TestReload(t *testing.T) {
if os.Getenv("CI") != "" {
t.Skip("Skipping reload test in CI environment")
}
a := assert.New(t)
// Create initial config file
initialConfig := `
agent:
listen_addr: :8080
monitor_interval: 10s
cleanup_timer: 15m
bgp:
local_as: 12345
peer_as: 6789
local_ip: 192.168.1.100
origin: igp
communities:
- 100:100
apps:
- name: test-app
vip: 1.1.1.1/32
monitors:
- exec:echo
`
// Create temporary config file
tmpfile, err := ioutil.TempFile("", "gocast-test-*.yaml")
a.NoError(err)
defer os.Remove(tmpfile.Name())
_, err = tmpfile.Write([]byte(initialConfig))
a.NoError(err)
tmpfile.Close()
// Initialize monitor with initial config
conf := config.GetConfig(tmpfile.Name())
mon := NewMonitor(conf)
defer mon.CloseAll()
// Wait a bit for initialization
time.Sleep(100 * time.Millisecond)
// Verify initial state
a.Equal(12345, mon.ctrl.localAS)
m := mon.monitors["test-app"]
a.NotNil(m, "Initial app should be loaded")
// Update config file with new BGP AS and remove app
updatedConfig := `
agent:
listen_addr: :8080
monitor_interval: 10s
cleanup_timer: 15m
bgp:
local_as: 54321
peer_as: 6789
local_ip: 192.168.1.100
origin: igp
communities:
- 200:200
`
err = ioutil.WriteFile(tmpfile.Name(), []byte(updatedConfig), 0644)
a.NoError(err)
// Reload configuration
err = mon.Reload(tmpfile.Name())
a.NoError(err)
// Wait for reload to complete
time.Sleep(200 * time.Millisecond)
// Verify new state
a.Equal(54321, mon.ctrl.localAS)
a.Equal([]string{"200:200"}, mon.ctrl.communities)
// Verify app was removed
mon.monMu.Lock()
_, exists := mon.monitors["test-app"]
mon.monMu.Unlock()
a.False(exists, "App should be removed after reload")
}
func TestReloadAddApp(t *testing.T) {
if os.Getenv("CI") != "" {
t.Skip("Skipping reload test in CI environment")
}
a := assert.New(t)
// Create initial config without apps
initialConfig := `
agent:
listen_addr: :8080
monitor_interval: 10s
bgp:
local_as: 12345
peer_as: 6789
local_ip: 192.168.1.100
origin: igp
`
tmpfile, err := ioutil.TempFile("", "gocast-test-*.yaml")
a.NoError(err)
defer os.Remove(tmpfile.Name())
_, err = tmpfile.Write([]byte(initialConfig))
a.NoError(err)
tmpfile.Close()
conf := config.GetConfig(tmpfile.Name())
mon := NewMonitor(conf)
defer mon.CloseAll()
time.Sleep(100 * time.Millisecond)
// Verify no apps initially
mon.monMu.Lock()
initialCount := len(mon.monitors)
mon.monMu.Unlock()
a.Equal(0, initialCount)
// Add app to config
updatedConfig := `
agent:
listen_addr: :8080
monitor_interval: 10s
bgp:
local_as: 12345
peer_as: 6789
local_ip: 192.168.1.100
origin: igp
apps:
- name: new-app
vip: 2.2.2.2/32
monitors:
- exec:echo
`
err = ioutil.WriteFile(tmpfile.Name(), []byte(updatedConfig), 0644)
a.NoError(err)
// Reload
err = mon.Reload(tmpfile.Name())
a.NoError(err)
time.Sleep(200 * time.Millisecond)
// Verify app was added
mon.monMu.Lock()
_, exists := mon.monitors["new-app"]
mon.monMu.Unlock()
a.True(exists, "New app should be added after reload")
}
func TestReloadMD5Change(t *testing.T) {
if os.Getenv("CI") != "" {
t.Skip("Skipping reload test in CI environment")
}
a := assert.New(t)
// Set environment variable for MD5 password
os.Setenv("BGP_TEST_PASSWORD", "initial_secret")
defer os.Unsetenv("BGP_TEST_PASSWORD")
initialConfig := `
agent:
listen_addr: :8080
bgp:
local_as: 12345
local_ip: 192.168.1.100
peers:
- peer_ip: 10.10.10.1
peer_as: 6789
md5_env_var: BGP_TEST_PASSWORD
origin: igp
`
tmpfile, err := ioutil.TempFile("", "gocast-test-*.yaml")
a.NoError(err)
defer os.Remove(tmpfile.Name())
_, err = tmpfile.Write([]byte(initialConfig))
a.NoError(err)
tmpfile.Close()
conf := config.GetConfig(tmpfile.Name())
mon := NewMonitor(conf)
defer mon.CloseAll()
time.Sleep(100 * time.Millisecond)
// Update environment variable
os.Setenv("BGP_TEST_PASSWORD", "updated_secret")
// Reload (MD5 env var change should trigger BGP reload)
err = mon.Reload(tmpfile.Name())
a.NoError(err)
// Note: We can't easily verify the MD5 password changed without
// actually establishing BGP sessions, but we can verify reload succeeded
a.NotNil(mon.ctrl)
}

11
main.go
View File

@@ -33,7 +33,16 @@ func main() {
go func() {
for {
sig := <-signalChan
if sig == os.Interrupt || sig == syscall.SIGTERM {
switch sig {
case syscall.SIGHUP:
log.Info("Received SIGHUP, reloading configuration")
if err := mon.Reload(*config); err != nil {
log.Errorf("Failed to reload configuration: %v", err)
} else {
log.Info("Configuration reloaded successfully")
}
case os.Interrupt, syscall.SIGTERM:
log.Info("Received shutdown signal, cleaning up")
mon.CloseAll()
cancel()
return