OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled #4800

Open: wants to merge 3 commits into base: master

Contributor

@sadasu sadasu commented Jan 22, 2025

Append entries to /etc/hosts that resolve the cluster API and API-Int URLs. /etc/hosts provides resolution for these URLs until kubelet joins the cluster and runs its CoreDNS pod, which then takes over resolution of those two URLs.
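
One detail worth illustrating: /etc/hosts maps IP addresses to hostnames, so the scheme and port of a URL have to be dropped before an entry is written. Below is a minimal sketch of that step in POSIX shell. `url_to_host` and `hosts_entry` are hypothetical helper names, and the address and domain are illustrative values, not taken from this PR's script.

```shell
# url_to_host prints the hostname part of a URL.
# Hypothetical helper, not part of the PR.
url_to_host() (
  host="${1#*://}"        # drop the scheme, e.g. "https://"
  host="${host%%[:/]*}"   # drop the port and any path
  printf '%s\n' "$host"
)

# hosts_entry prints one /etc/hosts line: "<ip> <hostname>".
hosts_entry() {
  printf '%s %s\n' "$1" "$(url_to_host "$2")"
}

# Illustrative values only.
hosts_entry 10.0.0.2 https://api-int.example.com:6443
```

With the illustrative values above, the last line prints `10.0.0.2 api-int.example.com`.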

- What I did

- How to verify it

- Description for the changelog

@sadasu sadasu changed the title GCP: Update /etc/hosts file when ClusterHostedDNS is enabled WIP: OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled Jan 22, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 22, 2025
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jan 22, 2025
@openshift-ci-robot
Contributor

@sadasu: This pull request references Jira Issue OCPBUGS-48469, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @jianli-wei

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sadasu sadasu force-pushed the gcp-update-etc-hosts branch 9 times, most recently from 1bc96ca to 3280ff8 Compare January 23, 2025 21:50
apiServerIntURL={{ .Infra.Status.APIServerInternalURL }}
# Add the /etc/hosts configuration file
mkdir -p /etc/hosts/conf.d
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf
Member

Could maybe name the file api.conf or something.

Contributor Author

I was hoping that naming it etc-hosts.conf would make it obvious that this file contains configuration for /etc/hosts. Happy to call it api.conf, since it contains information about resolving the API and API-Int URLs.

{{ else }}
exit 0
{{ end }}
if [ -z "${apiIntLBIPs}" ]; then
Member

Should this be:

Suggested change
- if [ -z "${apiIntLBIPs}" ]; then
+ if [ -z "{{$apiIntLBIPs}}" ]; then

?

We appear to be missing a line like:

  apiIntLBIPs={{$apiIntLBIPs}}

to get the template variable that we defined on line 19 into a bash variable.
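
To make the point concrete: a name that exists only as a template variable is never assigned in the rendered script, so the `-z` guard always fires. A minimal sketch in plain shell follows. `is_empty` is a hypothetical helper, and `10.0.0.2` is an illustrative stand-in for whatever the template would render.

```shell
# is_empty reports whether its first argument is the empty string,
# mirroring the script's `[ -z "${apiIntLBIPs}" ]` guard.
# Hypothetical helper, not part of the PR.
is_empty() {
  if [ -z "$1" ]; then echo yes; else echo no; fi
}

# Case 1: the template variable was never copied into a shell variable,
# so the guard fires even though the template had a value.
unset apiIntLBIPs
is_empty "${apiIntLBIPs}"

# Case 2: the rendered script contains an explicit assignment.
# 10.0.0.2 is an illustrative value standing in for the rendered output.
apiIntLBIPs="10.0.0.2"
is_empty "${apiIntLBIPs}"
```

The first call prints `yes` and the second prints `no`.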

mkdir -p /etc/hosts/conf.d
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf
# Added by OpenShift
${apiLBIPs[0]} ${apiServerURL}
Member

We are treating apiLBIPs as a bash array here. That's fine if it is, but bash arrays always seem like kind of a pain to set up to me. It might be simpler to do this in the template:

Suggested change
- ${apiLBIPs[0]} ${apiServerURL}
+ {{ index $apiLBIPs 0 }} ${apiServerURL}
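
For context on why the shell-side indexing is fragile: Go's default formatting prints a string slice as `[a b]`, and the journal output later in this thread shows exactly that form (`apiLBIPs=[34.54.248.13]`). That is a plain string in bash, not an array, so `${apiLBIPs[0]}` expands to the whole string, brackets included. If the value did have to be picked apart in the shell, a sketch could look like the following. `first_ip` is a hypothetical helper and the addresses are illustrative.

```shell
# first_ip prints the first address from a Go-formatted slice string
# such as "[10.0.0.2 10.0.0.3]". Hypothetical helper, not part of the PR.
first_ip() (
  list=$(printf '%s' "$1" | tr -d '[]')   # strip the surrounding brackets
  set -f                                  # no pathname expansion while splitting
  set -- $list                            # split on whitespace into $1, $2, ...
  printf '%s\n' "${1-}"
)

first_ip "[10.0.0.2 10.0.0.3]"
first_ip "[34.54.248.13]"
```

These calls print `10.0.0.2` and `34.54.248.13`. Indexing inside the template, as suggested above, avoids this parsing entirely.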

@@ -0,0 +1,11 @@
mode: 0755
path: "/usr/local/bin/update-etc-hosts"
contents:
Member

Potentially a simpler way, avoiding the script and systemd service:

path: "/etc/hosts"
append:
  - inline: |
      {{ if and (eq .Infra.Status.PlatformStatus.Type "GCP") (.Infra.Status.PlatformStatus.GCP) (.Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig) (eq .Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType "ClusterHosted") }}
      {{ $apiIntLBIPs := cloudPlatformAPIIntLoadBalancerIPs . }}
      {{ if gt (len $apiIntLBIPs) 0 }}
      {{ $apiLBIPs := cloudPlatformAPILoadBalancerIPs . }}
      {{ if gt (len $apiLBIPs) 0 }}{{ index $apiLBIPs 0 }}{{ else }}{{ index $apiIntLBIPs 0 }}{{ end }} {{ .Infra.Status.APIServerURL }}
      {{ index $apiIntLBIPs 0 }} {{ .Infra.Status.APIServerInternalURL }}
      {{ end }}
      {{ end }}

Contributor Author

@sadasu sadasu Jan 27, 2025

The systemd service is providing us a way to time the running of this script before kubelet.

Check whether the feature to run in-cluster DNS on GCP and AWS is
enabled by checking whether the value of
`PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType` is set to
`ClusterHosted`.
@sadasu sadasu force-pushed the gcp-update-etc-hosts branch 2 times, most recently from 15c0227 to 3d451a6 Compare January 27, 2025 19:21
@sadasu sadasu changed the title WIP: OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled Jan 27, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 27, 2025
@sadasu sadasu force-pushed the gcp-update-etc-hosts branch 4 times, most recently from 7ff1573 to 91d44ca Compare January 27, 2025 19:57
Append entries to /etc/hosts to resolve the cluster API and
API-Int URLs. /etc/hosts provides resolution for these URLs
until kubelet joins the cluster and runs its CoreDNS pod, which
then takes over resolution of those two URLs.
Added tests to accommodate GCP in-cluster DNS config
@sadasu sadasu force-pushed the gcp-update-etc-hosts branch from 91d44ca to 1b13a9c Compare January 27, 2025 21:57
Contributor

@yuqi-zhang yuqi-zhang left a comment

Logically seems fine, although I am unsure how to test, so let me know if you'd like any QE pre-merge testing on this

@@ -777,6 +778,33 @@ func cloudPlatformIngressLoadBalancerIPs(cfg RenderConfig) (interface{}, error)
}
}

// cloudPlatformLBIPAvailable returns true when DNSType is set to `ClusterHosted`
Contributor

Curious, based on the comment I'd expect some check for clusterhosted in the function. I guess it's implicit since the service enablement is dependent on this field?

(I know we do the same elsewhere in the template rendering, so I'm fine with it as is)

Contributor

openshift-ci bot commented Jan 28, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sadasu, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2025
Contributor

openshift-ci bot commented Jan 28, 2025

@sadasu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-op-single-node | 1b13a9c | link | true | `/test e2e-gcp-op-single-node` |
| ci/prow/e2e-hypershift | 1b13a9c | link | true | `/test e2e-hypershift` |
| ci/prow/e2e-gcp-op | 1b13a9c | link | true | `/test e2e-gcp-op` |
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | 1b13a9c | link | false | `/test e2e-azure-ovn-upgrade-out-of-change` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@gpei

gpei commented Jan 28, 2025

@sadasu Hi, I just tried to use the latest commit to test the installation of GCP custom DNS. The gcp-update-etc-hosts.service couldn't be started on masters because of the following syntax error:

[core@gpei-0128-gcpdns-5zs6h-master-0 ~]$ journalctl -u gcp-update-etc-hosts.service --no-pager
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 systemd[1]: Starting Update Default GCP /etc/hosts...
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 bash[1233]: /bin/bash: -c: line 1: syntax error near unexpected token `then'
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 bash[1233]: /bin/bash: -c: line 1: `    apiIntLBIPs=[10.0.0.2]      apiLBIPs=[34.54.248.13]    if [ -z $apiLBIPs ]; then    apiLBIPs=$apiIntLBIPs  fi  apiServerURL=https://api.gpei-0128-gcpdns.qe.gcp.devcluster.openshift.com:6443  apiServerIntURL=https://api-int.gpei-0128-gcpdns.qe.gcp.devcluster.openshift.com:6443  mkdir -p /etc/conf.d  cat <<EOF | tee /etc/conf.d/etc-hosts.conf              EOF  /usr/local/bin/update-etc-hosts'
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 systemd[1]: gcp-update-etc-hosts.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 systemd[1]: gcp-update-etc-hosts.service: Failed with result 'exit-code'.
Jan 28 05:58:05 gpei-0128-gcpdns-5zs6h-master-0 systemd[1]: Failed to start Update Default GCP /etc/hosts.
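
A note on reading that error: the journal shows the whole script arriving at bash as a single line. Once the newlines are gone, the assignments sit directly in front of `if`, so the shell treats `if` as an ordinary command name rather than a reserved word, and the later `then` becomes an unexpected token. The following is a minimal reproduction sketch, not the PR's script. `accepts` is a hypothetical helper, and it uses `sh -c`, where the parse behaves the same as in bash for this case.

```shell
# accepts runs a command string in a child shell and reports whether the
# shell ran it successfully. Hypothetical helper for this illustration.
accepts() {
  if sh -c "$1" >/dev/null 2>&1; then echo accepted; else echo rejected; fi
}

# Collapsed onto one line, as in the journal output: the assignment is followed
# directly by `if`, so `if` is not a reserved word there and `then` is unexpected.
accepts 'apiLBIPs=[34.54.248.13] if [ -z $apiLBIPs ]; then apiLBIPs=x; fi'

# Same logic with a separator after the assignment.
accepts 'apiLBIPs=34.54.248.13; if [ -z "$apiLBIPs" ]; then apiLBIPs=x; fi'
```

The first call prints `rejected` and the second prints `accepted`. The `cat <<EOF | tee ... EOF` portion of the journal line has the same underlying problem, since a here-document needs its body and terminator on separate lines.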

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
5 participants