OCPBUGS-75932: [release-4.19] fix(cpo): Correct route labeling logic for HCP router infrastructure #7643
csrwng wants to merge 1 commit into openshift:release-4.19
Skipping CI for Draft Pull Request.
/test verify
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: csrwng
@csrwng: all tests passed!
/jira cherry-pick OCPBUGS-75931
@csrwng: Jira Issue OCPBUGS-75931 has been cloned as Jira Issue OCPBUGS-75932. Will retitle bug to link to clone.
@csrwng: This pull request references Jira Issue OCPBUGS-75932, which is invalid.

The bug has been updated to refer to the pull request using the external bug tracker.
During upgrades from 4.18 to 4.19, an unexpected LoadBalancer service named "router" was created, blocking upgrades on platforms with limited LoadBalancer IPs. This occurred because the previous fix (f0f8b08) used incorrect logic to determine when to create the HCP router infrastructure.

Root Cause:
Previous fixes (f0f8b08 and ee45d1b) treated route labeling as a per-service decision, checking if individual services had DNS hostnames. This created router infrastructure even when routes should use the management cluster router.

For PublicAndPrivate clusters with KAS LoadBalancer + OAuth Route with hostname:
- Old logic: IsPublicWithDNS() returned true (OAuth has DNS)
- Result: Created router LB when not needed -> upgrade blocked
- Routes got labeled inconsistently based on per-service DNS config

Why the previous approach was wrong:
1. Route labeling should be a cluster-level infrastructure decision, not per-service
2. Router infrastructure availability is determined by KAS publishing strategy
3. Checking if ANY service has DNS doesn't indicate if HCP router exists
4. Could label routes for HCP router even when no HCP router infrastructure exists

Correct Solution:
Routes should be labeled for HCP router based on HCP router infrastructure availability, which is determined by the KAS publishing strategy:
- Label routes for HCP router when:
  1. Cluster uses PrivateLink (AWS PublicAndPrivate or Private), OR
  2. Cluster is public with dedicated DNS for KAS (KAS uses Route with hostname)
- For PrivateLink with KAS LoadBalancer:
  - External route (OAuth) uses management cluster router
  - Internal routes (Konnectivity and Ignition) are handled by HCP router

Implementation:
- Added util.LabelHCPRoutes() as single source of truth for labeling decisions
- Updated all route reconciliation (OAuth, Konnectivity, Ignition) to use unified logic
- Fixed router service creation to align with labeling: only create when routes need it
- Removed incorrect per-service DNS functions (UseDedicatedDNSForOAuth, etc.)
- Removed IsPublicWithDNS() functions that checked if ANY service had DNS
- Removed validation that relied on IsPublicWithDNS()

Changes:
- support/util/visibility.go: Added LabelHCPRoutes(), removed IsPublicWithDNS functions
- support/util/expose.go: Removed per-service DNS helper functions
- hostedcontrolplane_controller.go: Use LabelHCPRoutes() for all route labeling
- v2/ignitionserver/route.go: Use LabelHCPRoutes()
- v2/router/component.go: Use LabelHCPRoutes()
- hostedcluster_controller.go: Removed incorrect validation

Result:
For PublicAndPrivate + KAS LoadBalancer + OAuth with hostname:
- OAuth route NOT labeled -> uses management cluster router
- Internal Router LB created -> only used for internal routes (Konnectivity and Ignition)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
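
The labeling rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Cluster` type, its fields, and the helpers `usesPrivateLink`, `kasHasDedicatedDNS`, `labelHCPRoutes`, and `labelRoute` are hypothetical stand-ins for the real HyperShift API types and `util.LabelHCPRoutes()`, whose actual signature is not shown here. The per-route helper is one reading of the "PrivateLink with KAS LoadBalancer" case and is an assumption.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real HyperShift API types.
type EndpointAccess string

const (
	Public           EndpointAccess = "Public"
	PublicAndPrivate EndpointAccess = "PublicAndPrivate"
	Private          EndpointAccess = "Private"
)

type PublishingType string

const (
	LoadBalancer PublishingType = "LoadBalancer"
	Route        PublishingType = "Route"
)

// Cluster holds only the fields this sketch needs. Note that it carries no
// OAuth hostname: per-service DNS does not feed into the decision.
type Cluster struct {
	Access      EndpointAccess
	KASType     PublishingType
	KASHostname string
}

// usesPrivateLink reports whether the cluster is PublicAndPrivate or Private.
func usesPrivateLink(c Cluster) bool {
	return c.Access == PublicAndPrivate || c.Access == Private
}

// kasHasDedicatedDNS reports whether KAS is published as a Route with a hostname.
func kasHasDedicatedDNS(c Cluster) bool {
	return c.KASType == Route && c.KASHostname != ""
}

// labelHCPRoutes is the cluster-level decision: HCP router infrastructure
// exists for PrivateLink clusters, or for public clusters with dedicated
// DNS for KAS.
func labelHCPRoutes(c Cluster) bool {
	return usesPrivateLink(c) || (c.Access == Public && kasHasDedicatedDNS(c))
}

// labelRoute applies the cluster-level decision to one route (assumption:
// with PrivateLink and a KAS LoadBalancer, external routes stay on the
// management cluster router while internal routes use the HCP router).
func labelRoute(c Cluster, external bool) bool {
	if !labelHCPRoutes(c) {
		return false
	}
	if external && usesPrivateLink(c) && c.KASType == LoadBalancer {
		return false
	}
	return true
}

func main() {
	// The upgrade scenario from the description: PublicAndPrivate + KAS LoadBalancer.
	c := Cluster{Access: PublicAndPrivate, KASType: LoadBalancer}
	fmt.Printf("external (OAuth) route labeled: %v\n", labelRoute(c, true))
	fmt.Printf("internal routes labeled: %v\n", labelRoute(c, false))
}
```

Keeping the decision in one cluster-level function means every route reconciler asks the same question, which is the inconsistency the per-service DNS checks introduced.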
74a7d50 to fa9b608
Manual backport of #7642