Implemented dynamic configuration reloading for gocast via SIGHUP signal handling. This allows updating BGP configuration, applications, and agent settings without service restart.
**Location:** `main.go:29-46`
Enhanced the existing signal handler to process SIGHUP:
```go
switch sig {
case syscall.SIGHUP:
	log.Info("Received SIGHUP, reloading configuration")
	if err := mon.Reload(*config); err != nil {
		log.Errorf("Failed to reload configuration: %v", err)
	} else {
		log.Info("Configuration reloaded successfully")
	}
case os.Interrupt, syscall.SIGTERM:
	log.Info("Received shutdown signal, cleaning up")
	mon.CloseAll()
	cancel()
	return
}
```
**Key Features:**
- Non-blocking: Signals are handled in a dedicated goroutine
- Graceful: SIGINT/SIGTERM still trigger clean shutdown
- Logged: All reload attempts are logged with success/failure status
**Location:** `controller/monitor.go:343-420`
Main reload orchestration method:
```go
func (m *MonitorMgr) Reload(configPath string) error
```
**Process Flow:**
1. Read new configuration from file
2. Compare with current configuration
3. If BGP config changed:
   - Withdraw all announced routes
   - Shutdown old BGP controller
   - Start new BGP controller
   - Re-announce routes for healthy apps
4. If Consul config changed:
   - Initialize new Consul monitor
5. Update agent settings
6. Reload applications (add/remove/update)
**Thread Safety:**
- Uses existing `monMu` mutex for monitor map access
- Atomic BGP controller replacement
- Intended to avoid race conditions during reload
**Location:** `controller/monitor.go:422-475`
```go
func (m *MonitorMgr) bgpConfigChanged(old, new c.BgpConfig) bool
```
Comprehensive comparison of:
- Local AS, Peer AS, Peer IPs
- BGP origin
- Multi-hop settings (including nil checks)
- MD5 passwords and environment variables
- Per-peer communities
- Global communities
**Important:** Deep comparison ensures even minor changes are detected.
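A minimal sketch of this kind of field-by-field comparison follows. The struct definitions are simplified, hypothetical stand-ins for the real `c.BgpConfig` (MD5 fields are left out), and it assumes that multi-hop is stored as a `*bool`, which is what the nil checks suggest. It also compares peers positionally, so reordering peers counts as a change.

```go
package main

import "fmt"

// peerConfig and bgpConfig are simplified, hypothetical stand-ins.
type peerConfig struct {
	PeerIP      string
	PeerAS      int
	MultiHop    *bool // nil means "not set in the config"
	Communities []string
}

type bgpConfig struct {
	LocalAS     int
	Origin      string
	Peers       []peerConfig
	Communities []string
}

// equalStringSlices reports whether a and b hold the same strings in order.
func equalStringSlices(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// equalBoolPtr treats nil as a distinct value from both true and false.
func equalBoolPtr(a, b *bool) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// bgpConfigChanged reports whether any compared field differs.
func bgpConfigChanged(oldCfg, newCfg bgpConfig) bool {
	if oldCfg.LocalAS != newCfg.LocalAS || oldCfg.Origin != newCfg.Origin ||
		len(oldCfg.Peers) != len(newCfg.Peers) ||
		!equalStringSlices(oldCfg.Communities, newCfg.Communities) {
		return true
	}
	for i, o := range oldCfg.Peers {
		n := newCfg.Peers[i]
		if o.PeerIP != n.PeerIP || o.PeerAS != n.PeerAS ||
			!equalBoolPtr(o.MultiHop, n.MultiHop) ||
			!equalStringSlices(o.Communities, n.Communities) {
			return true
		}
	}
	return false
}

func main() {
	enabled := true
	base := bgpConfig{LocalAS: 12345, Origin: "igp",
		Peers: []peerConfig{{PeerIP: "10.10.10.1", PeerAS: 6789}}}
	next := base
	next.Peers = []peerConfig{{PeerIP: "10.10.10.1", PeerAS: 6789, MultiHop: &enabled}}

	fmt.Println(bgpConfigChanged(base, base)) // identical configs
	fmt.Println(bgpConfigChanged(base, next)) // multi-hop went from unset to true
}
```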
**Location:** `controller/monitor.go:477-532`
```go
func (m *MonitorMgr) reloadApps(oldApps, newApps []c.AppConfig)
```
Intelligent app management:
- **Remove:** Apps no longer in config (source="config" only)
- **Update:** Apps with changed configuration (VIP, monitors, NAT, communities)
- **Add:** New apps in configuration
**Key Behavior:**
- Consul-discovered apps are NOT removed during reload
- Only config-defined apps are managed
- Config changes trigger remove + re-add
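The add/remove/update rules above can be illustrated with a small planning function. This is a sketch under assumptions: the `app` struct, the `reloadPlan` type, and `planReload` are hypothetical names, and the real code acts on monitors directly rather than returning a plan.

```go
package main

import (
	"fmt"
	"sort"
)

// app is a hypothetical, flattened view of an application entry.
type app struct {
	Name   string
	VIP    string
	Source string // "config" or "consul"
}

// reloadPlan lists app names by the action a reload would take.
type reloadPlan struct {
	Remove, Update, Add []string
}

// planReload diffs the running apps against the apps in the new config.
// Consul-discovered apps are never scheduled for removal or update.
func planReload(running map[string]app, desired []app) reloadPlan {
	var p reloadPlan
	wanted := make(map[string]bool, len(desired))
	for _, d := range desired {
		wanted[d.Name] = true
		cur, ok := running[d.Name]
		switch {
		case !ok:
			p.Add = append(p.Add, d.Name)
		case cur.Source == "config" && cur != d:
			p.Update = append(p.Update, d.Name) // applied as remove + re-add
		}
	}
	for name, cur := range running {
		if cur.Source == "config" && !wanted[name] {
			p.Remove = append(p.Remove, name)
		}
	}
	sort.Strings(p.Remove) // map iteration order is random
	return p
}

func main() {
	running := map[string]app{
		"web":   {Name: "web", VIP: "10.0.0.1/32", Source: "config"},
		"old":   {Name: "old", VIP: "10.0.0.2/32", Source: "config"},
		"cache": {Name: "cache", VIP: "10.0.0.3/32", Source: "consul"},
	}
	desired := []app{
		{Name: "web", VIP: "10.0.0.9/32", Source: "config"},
		{Name: "api", VIP: "10.0.0.4/32", Source: "config"},
	}
	fmt.Printf("%+v\n", planReload(running, desired))
}
```

In this example `web` is updated, `old` is removed, `api` is added, and the Consul-sourced `cache` is left alone.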
1. **TestBgpConfigChanged**
   - Tests all BGP configuration change scenarios
   - Validates detection of AS, peer, MD5, community changes
   - Includes nil multi-hop pointer checks
2. **TestEqualStringSlices**
   - Tests slice comparison helper
   - Validates empty, identical, and different slices
3. **TestReload** (Integration, requires root)
   - Full reload cycle with BGP AS change
   - App removal verification
   - BGP controller replacement validation
4. **TestReloadAddApp** (Integration)
   - Tests adding new app via reload
   - Validates app registration
5. **TestReloadMD5Change** (Integration)
   - Tests MD5 password change detection
   - Validates BGP controller restart
**Decision:** Reloading BGP configuration requires a full controller restart.
**Rationale:**
- GoBGP library doesn't support modifying peers dynamically
- Simplifies implementation
- Ensures clean state
- Brief interruption is acceptable for infrequent config changes
**Alternative Considered:** Per-peer updates
- Complex to implement correctly
- Partial state issues
- Not supported well by GoBGP library
**Decision:** Log errors but don't crash; maintain old state on failure.
**Rationale:**
- Availability over correctness for config errors
- Admin can fix config and retry
- Better than service downtime
- Logs provide clear error messages
1. **BGP Interruption**
   - Full BGP restart causes a brief routing interruption
   - All routes are withdrawn and re-announced
   - May impact traffic during reload
2. **No Atomic BGP Updates**
   - Cannot add/remove a single peer without a full restart
   - All peers are affected even if only one changes
3. **No Config Validation**
   - Invalid config is detected only while the reload is being applied
   - No pre-validation before applying
   - Syntax errors require a manual fix and retry
4. **No Rollback**
   - A reload that fails partway through can leave the service in an inconsistent state
   - Manual intervention is required to restore
   - No automatic rollback to the previous config
These changes were written using AI LLM
Authored-By: Claude Code (Sonnet 4.5)
The BGP controller now supports announcing routes to multiple BGP peers for redundancy and resilience. If one peer fails, route announcements continue to succeed for other healthy peers.
```yaml
bgp:
  local_as: 12345
  local_ip: 192.168.1.100  # optional
  peers:
    - peer_ip: 10.10.10.1
      peer_as: 6789
      communities:  # per-peer communities (optional)
        - 100:100
    - peer_ip: 10.10.10.2
      peer_as: 6789
      communities:
        - 100:101
      multi_hop: true  # optional, disabled by default
  communities:  # global communities applied to all peers
    - 1000:1000
  origin: igp
```
```yaml
bgp:
  local_as: 12345
  peer_as: 6789
  peer_ip: 10.10.10.1
  communities:
    - 100:100
  origin: igp
```
Legacy configurations are automatically converted to the new format internally, ensuring backward compatibility.
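A rough sketch of how such a conversion could look is shown here. The struct layout and the `normalize` helper are assumptions made for illustration; the real logic lives in `config/config.go` and may differ, for example in how it treats a config that sets both the legacy fields and `peers`.

```go
package main

import (
	"errors"
	"fmt"
)

// peerConfig and bgpConfig are simplified, hypothetical stand-ins.
type peerConfig struct {
	PeerIP string
	PeerAS int
}

type bgpConfig struct {
	PeerIP string // legacy single-peer field
	PeerAS int    // legacy single-peer field
	Peers  []peerConfig
}

// normalize turns a legacy single-peer config into a one-element Peers
// slice. A config with neither form is rejected.
func normalize(c bgpConfig) (bgpConfig, error) {
	if len(c.Peers) > 0 {
		return c, nil // already in multi-peer form
	}
	if c.PeerIP == "" {
		return c, errors.New("no BGP peers configured")
	}
	c.Peers = []peerConfig{{PeerIP: c.PeerIP, PeerAS: c.PeerAS}}
	return c, nil
}

func main() {
	legacy := bgpConfig{PeerIP: "10.10.10.1", PeerAS: 6789}
	converted, err := normalize(legacy)
	fmt.Println(converted.Peers, err)

	_, err = normalize(bgpConfig{})
	fmt.Println(err)
}
```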
Routes are announced to all configured peers. If announcement to one peer fails, the operation continues for other peers. Errors are aggregated and returned, but partial success is allowed.
Communities are merged in the following order:
1. **Global communities** (defined at `bgp.communities`)
2. **Per-peer communities** (defined at `bgp.peers[].communities`)
3. **Per-route communities** (defined at `apps[].vip_config.bgp_communities`)
Example: If global communities are `[1000:1000]`, peer communities are `[100:100]`, and route communities are `[5000:5000]`, the announced route will have all three: `[1000:1000, 100:100, 5000:5000]`.
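The merge order can be expressed as a simple concatenation. The helper name `mergeCommunities` is hypothetical, and this sketch does not deduplicate, since the description above does not say whether duplicates are removed.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeCommunities concatenates global, per-peer, and per-route
// communities in that order. Duplicates are kept as-is.
func mergeCommunities(global, peer, route []string) []string {
	merged := make([]string, 0, len(global)+len(peer)+len(route))
	merged = append(merged, global...)
	merged = append(merged, peer...)
	merged = append(merged, route...)
	return merged
}

func main() {
	merged := mergeCommunities(
		[]string{"1000:1000"}, // bgp.communities
		[]string{"100:100"},   // bgp.peers[].communities
		[]string{"5000:5000"}, // apps[].vip_config.bgp_communities
	)
	fmt.Println(strings.Join(merged, ", "))
}
```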
- **Default behavior**: Multi-hop is disabled by default
- **Enable**: Set `multi_hop: true` per peer to explicitly enable multi-hop BGP
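If the per-peer setting is modeled as an optional boolean (an assumption here, suggested by the nil multi-hop checks in the tests), the default-off rule reduces to a one-line helper. `multiHopEnabled` is a hypothetical name.

```go
package main

import "fmt"

// multiHopEnabled returns true only when the setting is present and true.
// A nil pointer means multi_hop was omitted, which counts as disabled.
func multiHopEnabled(setting *bool) bool {
	return setting != nil && *setting
}

func main() {
	on, off := true, false
	fmt.Println("omitted:", multiHopEnabled(nil))
	fmt.Println("false:  ", multiHopEnabled(&off))
	fmt.Println("true:   ", multiHopEnabled(&on))
}
```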
The `/info` endpoint now returns an array of peer information instead of a single peer object:
**Before:**
```json
{
  "conf": {
    "neighbor_address": "10.10.10.1",
    "peer_as": 6789
  },
  "state": {...}
}
```
**After:**
```json
[
  {
    "conf": {
      "neighbor_address": "10.10.10.1",
      "peer_as": 6789
    },
    "state": {...}
  },
  {
    "conf": {
      "neighbor_address": "10.10.10.2",
      "peer_as": 6789
    },
    "state": {...}
  }
]
```
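Since the response shape changes from an object to an array, API clients need to be updated. As an illustration, a client that must talk to both old and new versions could try the array form first and fall back to the single object. The `peerInfo` struct below models only the `conf` fields shown above, and `decodePeers` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// peerInfo models only the fields shown in the examples above;
// other fields such as "state" are ignored by the decoder.
type peerInfo struct {
	Conf struct {
		NeighborAddress string `json:"neighbor_address"`
		PeerAS          uint32 `json:"peer_as"`
	} `json:"conf"`
}

// decodePeers accepts either the new array form or the old single-object
// form and always returns a slice.
func decodePeers(body []byte) ([]peerInfo, error) {
	var peers []peerInfo
	if err := json.Unmarshal(body, &peers); err == nil {
		return peers, nil
	}
	var single peerInfo
	if err := json.Unmarshal(body, &single); err != nil {
		return nil, fmt.Errorf("decode /info response: %w", err)
	}
	return []peerInfo{single}, nil
}

func main() {
	newForm := []byte(`[
		{"conf": {"neighbor_address": "10.10.10.1", "peer_as": 6789}, "state": {}},
		{"conf": {"neighbor_address": "10.10.10.2", "peer_as": 6789}, "state": {}}
	]`)
	oldForm := []byte(`{"conf": {"neighbor_address": "10.10.10.1", "peer_as": 6789}, "state": {}}`)

	for _, body := range [][]byte{newForm, oldForm} {
		peers, err := decodePeers(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		for _, p := range peers {
			fmt.Println(p.Conf.NeighborAddress, p.Conf.PeerAS)
		}
	}
}
```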
- `config/config.go`: Added `PeerConfig` struct and `Peers` slice to `BgpConfig`
- `controller/bgp.go`: Refactored to support multiple peers with best-effort semantics
- `controller/monitor.go`: Updated `GetInfo()` to return slice of peers
- `server/server.go`: Updated info handler to return array of peers
1. **Controller struct** now stores `[]PeerConfig` instead of single peer fields
2. **Announce/Withdraw** methods loop through all peers with error aggregation
3. **getApiPath** accepts a `PeerConfig` parameter for per-peer community merging
4. **addPeer** determines multi-hop settings per peer
5. **PeerInfo** returns information for all configured peers
6. **Shutdown** gracefully shuts down all peer sessions
The implementation includes comprehensive test coverage:
1. **TestLegacyConfigConversion** - Verifies backward compatibility by testing that legacy single-peer configs are automatically converted to multi-peer format
2. **TestMultiPeerConfig** - Tests that new multi-peer configurations are properly loaded with multiple peers
3. **TestNoPeersConfigError** - Ensures proper error handling when no peers are configured
4. **TestCommunityMerging** - Validates that global, per-peer, and per-route communities are correctly merged in order
5. **TestMultiHopConfiguration** - Tests multi-hop BGP settings with various scenarios:
   - Default behavior (multi-hop disabled)
   - Explicit multi-hop disable
   - Explicit multi-hop enable
6. **TestBestEffortAnnouncement** - Verifies that announcements succeed even when individual peers may have issues
7. **TestWithdrawMultiplePeers** - Tests route withdrawal across multiple peers
8. **TestPeerInfoMultiplePeers** - Validates that peer information is correctly returned for all configured peers
- **TestBgpNew** - Full integration test with actual BGP listeners (requires root, skipped in CI)
- **TestMultiPeerAnnouncement** - Tests actual route announcements to multiple BGP listeners (requires root, skipped in CI)
Existing configurations using `peer_ip` and `peer_as` continue to work without modification.
To add a second peer for resilience:
```yaml
bgp:
  local_as: 12345
  # Keep existing config for backward compatibility, or remove these lines
  # peer_as: 6789
  # peer_ip: 10.10.10.1
  # Add new multi-peer config
  peers:
    - peer_ip: 10.10.10.1
      peer_as: 6789
    - peer_ip: 10.10.10.2  # redundant peer
      peer_as: 6789
  communities:
    - 100:100
  origin: igp
```
All operations (Announce, Withdraw, Shutdown) use best-effort error handling:
- Operations continue even if individual peers fail
- Errors are collected and returned as aggregated error messages
- Format: `"announcement errors: [peer 10.10.10.1: error message, peer 10.10.10.2: error message]"`
These changes were authored via AI LLM.
Authored-By: Claude Code (Sonnet 4.5)