ovn_maintenance_worker deleting a healthy router external gw LRP
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| neutron | New | Undecided | Unassigned | |
Bug Description
## Summary
In a Kolla-Ansible OVN deployment, `neutron_
After that:
- Neutron still shows the router external gateway
- the Neutron gateway port still exists and is `ACTIVE`
- the OVN gateway `Logical_
- new floating IP create/repair fails with `AttributeError: 'NoneType' object has no attribute 'options'`
## High Level Description
I am trying to use floating IPs behind an OVN Neutron router in a multi-node Kolla-Ansible deployment. A healthy router can be broken by the periodic `neutron_
## Pre-conditions
- reporter privileges: admin (used while reproducing the issue)
- deployment: multi-node Kolla-Ansible
- OpenStack networking backend: ML2/OVN
- external/provider network on `physnet1`
- OVN gateway scheduling enabled
Test objects used in reproduction:
- `test_router_1`: `febfe173-
- `test_gateway_
- `test_floating_
- `test_floating_
## Step-by-step Reproduction
1. Create an isolated router with an external gateway on `public`.
2. Add a tenant subnet behind that router.
3. Create a tenant port and associate `test_floating_
4. Verify `test_floating_
- gateway LRP `lrp-test_
- a `dnat_and_snat` NAT row for `test_floating_
5. Wait for the periodic `neutron_
6. Create a second tenant port and associate `test_floating_
## Expected Output
- the maintenance worker must not break a healthy router
- if maintenance rewrites the external gateway, `lrp-test_
- `test_floating_
- floating IP create/repair must not crash when the gateway LRP lookup returns `None`
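The last expectation can be illustrated with a minimal sketch. This is not Neutron's actual code: `FakeLrp`, `lookup_lrp`, `GatewayPortMissing`, and `get_gateway_options` are hypothetical names, and a plain dictionary stands in for the OVN northbound database. It only shows the general shape of guarding a lookup that may return `None` before reading `.options`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class GatewayPortMissing(Exception):
    """Hypothetical error: OVN has no row for the requested gateway LRP."""


@dataclass
class FakeLrp:
    """Stand-in for an OVN row; only the `options` column is modelled."""
    name: str
    options: Dict[str, str] = field(default_factory=dict)


def lookup_lrp(nb: Dict[str, FakeLrp], name: str) -> Optional[FakeLrp]:
    """Return the row for `name`, or None when it does not exist."""
    return nb.get(name)


def get_gateway_options(nb: Dict[str, FakeLrp], lrp_name: str) -> Dict[str, str]:
    """Read the LRP options, failing with a descriptive error when the row is gone."""
    lrp = lookup_lrp(nb, lrp_name)
    if lrp is None:
        # Without this guard, `lrp.options` raises
        # AttributeError: 'NoneType' object has no attribute 'options'.
        raise GatewayPortMissing(f"gateway LRP {lrp_name!r} not found in OVN")
    return lrp.options
```

With a guard like this, the floating IP path would surface a specific, actionable error (or could trigger a repair) instead of an unhandled `AttributeError`.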
## Actual Output
- maintenance first marks `test_router_1` inconsistent and rewrites its external gateway
- after that rewrite, OVN no longer has `lrp-test_
- `test_floating_
- no OVN NAT row is created for `test_floating_
- Neutron raises `AttributeError: 'NoneType' object has no attribute 'options'`
- Neutron still shows the router external gateway and the gateway port still exists and is `ACTIVE`
## Version
- OpenStack release: `2025.2`
- deployment mechanism: `Kolla-Ansible`
- Linux distro on a controller: `debian 13`
- kernel on a controller: `6.12.74+
- OVN version: `ovn-nbctl 25.09.0`
- Open vSwitch version: `ovs-vsctl (Open vSwitch) 3.6.0`
## Environment
- multi-node control/network deployment
- service-to-service interaction involved:
  - Neutron API / OVN maintenance worker
  - OVN northbound database
  - floating IP NAT programming on a router external gateway
## Perceived Severity
High.
The maintenance worker can corrupt a healthy router external gateway and leave floating IPs unusable until operator repair. This affects core north-south connectivity.
## Unknowns / Troubleshooting Notes
- The exact reason `test_router_1` was initially classified as inconsistent is unknown.
- The maintenance worker saw at least one bookkeeping problem on the gateway port during the repair:
- `No revision row found for 68f0bf00-
- It is unclear why that `router_ports` revision row was missing.
- Most routers in the environment appear to continue working normally. The problem seems to affect only routers that the maintenance worker decides to repair, not every router in the deployment. A user reported it while deploying a Kubernetes cluster using cluster-api.
## Relevant Log
### Maintenance worker starts router repair
`2026-04-13 16:28:30`
- `Maintenance task: Synchronizing Neutron and OVN databases started`
- `Number of inconsistencies found at create/update: networks=1, subnets=1, routers=1, router_ports=61, floatingips=2`
- `Fixing resource test_router_1 (type: routers)`
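For triage across many log files, the per-type counts in the inconsistency summary can be extracted with a small helper. This is a sketch that assumes the message keeps the `name=count` layout quoted above; `parse_inconsistency_counts` is a hypothetical helper, not part of Neutron.

```python
import re
from typing import Dict

# Matches pairs such as "router_ports=61".
_COUNT_RE = re.compile(r"(\w+)=(\d+)")


def parse_inconsistency_counts(line: str) -> Dict[str, int]:
    """Return {resource_type: count} parsed from the summary log message."""
    # Only look at the text after the "create/update:" prefix; if the prefix
    # is absent, the tail is empty and an empty dict is returned.
    _, _, tail = line.partition("create/update:")
    return {name: int(count) for name, count in _COUNT_RE.findall(tail)}


line = (
    "Number of inconsistencies found at create/update: "
    "networks=1, subnets=1, routers=1, router_ports=61, floatingips=2"
)
counts = parse_inconsistency_counts(line)
```

Here `counts["router_ports"]` is `61`, which makes the unusually high number of inconsistent router ports in this run easy to spot or alert on.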
### Maintenance worker issues gateway rewrite
`2026-04-13 16:28:31`
- `DeleteLRouterE
- `AddLRouterPort
### OVN monitor sees deletes
Same window:
- delete `Logical_
- delete NAT row
- delete `Gateway_Chassis`
- delete `Logical_
### FIP path then crashes
`2026-04-13 16:29:07`
- `AttributeError: 'NoneType' object has no attribute 'options'`
