
Add Failure Domains for Worker nodes feature for Nutanix provider #8837

Merged (4 commits) on Oct 14, 2024

Conversation

@adiantum (Contributor) commented Oct 9, 2024

Description of changes:
Add Failure Domains for Worker nodes feature for Nutanix provider.
Example configuration:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixDatacenterConfig
metadata:
  name: eksa-work-fds
spec:
  credentialRef:
    kind: Secret
    name: nutanix-credentials
  endpoint: prismcentral.ci.ntnxsherlock.com
  port: 9440
  failureDomains:
  - cluster:
      name: e2e-1
      type: name
    name: pe1
    subnets:
    - name: vlan1-1
      type: name
    workerMachineGroups:
    - md-0
  - cluster:
      name: e2e-2
      type: name
    name: pe2
    subnets:
    - name: vlan1-2
      type: name
    workerMachineGroups:
    - md-0
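
For context, the workerMachineGroups entries above are expected to name worker node groups defined in the EKS-A Cluster spec. A minimal, hypothetical matching fragment (values are illustrative, taken from the md-0 name used above):

```
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: eksa-work-fds
spec:
  # The failure domains reference this worker node group by name.
  workerNodeGroupConfigurations:
  - name: md-0
    count: 2
    machineGroupRef:
      kind: NutanixMachineConfig
      name: eksa-work-fds
```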

Implementation details: a workerMachineGroups parameter is added to the existing failureDomains entries. Each workerMachineGroups list must contain the names of the EKS-A cluster's workerNodeGroupConfigurations.
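As a rough illustration of the cross-check this implies, the sketch below validates that every workerMachineGroups entry names a known worker node group. This is a standalone sketch, not the actual EKS-A validator code; the type and function names are hypothetical:

```go
package main

import "fmt"

// FailureDomain mirrors only the fields relevant to this check
// (illustrative struct, not the real NutanixDatacenterConfig API type).
type FailureDomain struct {
	Name                string
	WorkerMachineGroups []string
}

// validateWorkerMachineGroups returns an error if any failure domain
// references a worker machine group that is not among the cluster's
// workerNodeGroupConfigurations names.
func validateWorkerMachineGroups(fds []FailureDomain, workerNodeGroupNames []string) error {
	known := make(map[string]bool, len(workerNodeGroupNames))
	for _, n := range workerNodeGroupNames {
		known[n] = true
	}
	for _, fd := range fds {
		for _, g := range fd.WorkerMachineGroups {
			if !known[g] {
				return fmt.Errorf("failure domain %q references unknown worker machine group %q", fd.Name, g)
			}
		}
	}
	return nil
}

func main() {
	fds := []FailureDomain{
		{Name: "pe1", WorkerMachineGroups: []string{"md-0"}},
		{Name: "pe2", WorkerMachineGroups: []string{"md-0"}},
	}
	// Matches the example config: both failure domains reference md-0.
	fmt.Println(validateWorkerMachineGroups(fds, []string{"md-0"}))
	// A misspelled group name would be rejected.
	fmt.Println(validateWorkerMachineGroups(fds, []string{"md-1"}))
}
```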

Testing (if applicable):

$ eksctl anywhere create cluster -f ./cluster-work-fds.yaml -v10 --bundles-override bin/local-bundle-release.yaml

2024-10-09T02:05:06.123Z	V6	Executing command	{"cmd": "/usr/bin/docker version --format {{.Client.Version}}"}
2024-10-09T02:05:06.143Z	V6	Executing command	{"cmd": "/usr/bin/docker info --format '{{json .MemTotal}}'"}
2024-10-09T02:05:06.186Z	V4	Reading bundles manifest	{"url": "bin/local-bundle-release.yaml"}
2024-10-09T02:05:06.205Z	V4	Using CAPI provider versions	{"Core Cluster API": "v1.8.3+35f247e", "Kubeadm Bootstrap": "v1.8.3+e2b444c", "Kubeadm Control Plane": "v1.8.3+85e0fa6", "External etcd Bootstrap": "v1.0.13+398c779", "External etcd Controller": "v1.0.23+76c47d8", "Cluster API Provider Nutanix": "v1.4.0+b6c034f"}
2024-10-09T02:05:06.338Z	V5	Retrier:	{"timeout": "2562047h47m16.854775807s", "backoffFactor": null}
2024-10-09T02:05:06.338Z	V2	Pulling docker image	{"image": "public.ecr.aws/l0g8r8j6/eks-anywhere-cli-tools:v0.20.7-eks-a-v0.21.0-dev-build.240"}
2024-10-09T02:05:06.338Z	V6	Executing command	{"cmd": "/usr/bin/docker pull public.ecr.aws/l0g8r8j6/eks-anywhere-cli-tools:v0.20.7-eks-a-v0.21.0-dev-build.240"}
2024-10-09T02:05:07.552Z	V5	Retry execution successful	{"retries": 1, "duration": "1.214491119s"}
2024-10-09T02:05:07.552Z	V3	Initializing long running container	{"name": "eksa_1728439506338083996", "image": "public.ecr.aws/l0g8r8j6/eks-anywhere-cli-tools:v0.20.7-eks-a-v0.21.0-dev-build.240"}
2024-10-09T02:05:07.552Z	V6	Executing command	{"cmd": "/usr/bin/docker run -d --name eksa_1728439506338083996 --network host -w /home/ubuntu/eksa-tests/worker-fds -v /var/run/docker.sock:/var/run/docker.sock -v /home/ubuntu/eksa-tests/worker-fds:/home/ubuntu/eksa-tests/worker-fds -v /home/ubuntu/eksa-tests/worker-fds:/home/ubuntu/eksa-tests/worker-fds --entrypoint sleep public.ecr.aws/l0g8r8j6/eks-anywhere-cli-tools:v0.20.7-eks-a-v0.21.0-dev-build.240 infinity"}
2024-10-09T02:05:08.515Z	V1	Using the eksa controller to create the management cluster
2024-10-09T02:05:08.515Z	V4	Task start	{"task_name": "setup-validate"}
2024-10-09T02:05:08.515Z	V0	Performing setup and validations
2024-10-09T02:05:08.515Z	V0	ValidateClusterSpec for Nutanix datacenter	{"NutanixDatacenter": "eksa-work-fds"}
2024-10-09T02:05:16.363Z	V0	✅ Nutanix Provider setup is valid
2024-10-09T02:05:16.363Z	V0	✅ Validate OS is compatible with registry mirror configuration
2024-10-09T02:05:16.363Z	V0	✅ Validate certificate for registry mirror
2024-10-09T02:05:16.363Z	V0	✅ Validate authentication for git provider
2024-10-09T02:05:16.363Z	V0	✅ Validate cluster's eksaVersion matches EKS-A version
2024-10-09T02:05:16.363Z	V4	Task finished	{"task_name": "setup-validate", "duration": "7.848345557s"}
2024-10-09T02:05:16.363Z	V4	----------------------------------
2024-10-09T02:05:16.363Z	V4	Task start	{"task_name": "bootstrap-cluster-init"}
2024-10-09T02:05:16.363Z	V0	Creating new bootstrap cluster
2024-10-09T02:05:16.364Z	V4	Creating kind cluster	{"name": "eksa-work-fds-eks-a-cluster", "kubeconfig": "eksa-work-fds/generated/eksa-work-fds.kind.kubeconfig"}
2024-10-09T02:05:16.364Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kind create cluster --name eksa-work-fds-eks-a-cluster --kubeconfig eksa-work-fds/generated/eksa-work-fds.kind.kubeconfig --image public.ecr.aws/l0g8r8j6/kubernetes-sigs/kind/node:v1.30.4-eks-d-1-30-15-eks-a-v0.21.0-dev-build.240 --config eksa-work-fds/generated/kind_tmp.yaml"}
2024-10-09T02:05:42.155Z	V5	Retrier:	{"timeout": "2562047h47m16.854775807s", "backoffFactor": null}
2024-10-09T02:05:42.155Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kubectl get namespace eksa-system --kubeconfig eksa-work-fds/generated/eksa-work-fds.kind.kubeconfig"}
2024-10-09T02:05:42.300Z	V9	docker	{"stderr": "Error from server (NotFound): namespaces \"eksa-system\" not found\n"}
2024-10-09T02:05:42.300Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kubectl create namespace eksa-system --kubeconfig eksa-work-fds/generated/eksa-work-fds.kind.kubeconfig"}
2024-10-09T02:05:42.432Z	V5	Retry execution successful	{"retries": 1, "duration": "276.957039ms"}
2024-10-09T02:05:42.432Z	V4	Task finished	{"task_name": "bootstrap-cluster-init", "duration": "26.068595898s"}
2024-10-09T02:05:42.432Z	V4	----------------------------------
2024-10-09T02:05:42.432Z	V4	Task start	{"task_name": "update-secrets-create"}
2024-10-09T02:05:42.432Z	V4	Task finished	{"task_name": "update-secrets-create", "duration": "1.491µs"}
2024-10-09T02:05:42.432Z	V4	----------------------------------
2024-10-09T02:05:42.432Z	V4	Task start	{"task_name": "install-capi-components-bootstrap"}
2024-10-09T02:05:42.432Z	V0	Provider specific pre-capi-install-setup on bootstrap cluster
2024-10-09T02:05:42.432Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kubectl apply -f - --kubeconfig eksa-work-fds/generated/eksa-work-fds.kind.kubeconfig"}
2024-10-09T02:05:43.167Z	V0	Installing cluster-api providers on bootstrap cluster
...
2024-10-09T02:19:44.735Z	V5	Retrier:	{"timeout": "2562047h47m16.854775807s", "backoffFactor": null}
2024-10-09T02:19:44.735Z	V4	Deleting kind cluster	{"name": "eksa-work-fds-eks-a-cluster"}
2024-10-09T02:19:44.735Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kind delete cluster --name eksa-work-fds-eks-a-cluster"}
2024-10-09T02:19:45.858Z	V5	Retry execution successful	{"retries": 1, "duration": "1.123019002s"}
2024-10-09T02:19:45.858Z	V0	🎉 Cluster created!
2024-10-09T02:19:45.858Z	V4	Task finished	{"task_name": "delete-kind-cluster", "duration": "1.515133532s"}
2024-10-09T02:19:45.858Z	V4	----------------------------------
2024-10-09T02:19:45.858Z	V4	Task start	{"task_name": "install-curated-packages"}
--------------------------------------------------------------------------------------
The Amazon EKS Anywhere Curated Packages are only available to customers with the
Amazon EKS Anywhere Enterprise Subscription
--------------------------------------------------------------------------------------
2024-10-09T02:19:45.858Z	V0	Enabling curated packages on the cluster
...
2024-10-09T02:20:17.908Z	V6	Executing command	{"cmd": "/usr/bin/docker exec -i eksa_1728439506338083996 kubectl get --ignore-not-found -o json --kubeconfig eksa-work-fds/eksa-work-fds-eks-a-cluster.kubeconfig namespace --namespace default eksa-packages-eksa-work-fds"}
2024-10-09T02:20:18.049Z	V6	found namespace	{"namespace": "eksa-packages-eksa-work-fds"}
2024-10-09T02:20:18.049Z	V4	Task finished	{"task_name": "install-curated-packages", "duration": "32.191131896s"}
2024-10-09T02:20:18.049Z	V4	----------------------------------
2024-10-09T02:20:18.049Z	V4	Tasks completed	{"duration": "15m9.534257846s"}
2024-10-09T02:20:18.050Z	V3	Cleaning up long running container	{"name": "eksa_1728439506338083996"}
2024-10-09T02:20:18.050Z	V6	Executing command	{"cmd": "/usr/bin/docker rm -f -v eksa_1728439506338083996"}

[Screenshot attached: 2024-10-09 04:16:22]

Documentation added/planned (if applicable):
The feature will be documented after merge.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@eks-distro-bot (Collaborator) commented:

Hi @adiantum. Thanks for your PR.

I'm waiting for an aws member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eks-distro-bot eks-distro-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 9, 2024
@sp1999 (Member) commented Oct 9, 2024

/ok-to-test

The codecov bot commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.77%. Comparing base (4d1408c) to head (2a8bf41).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8837      +/-   ##
==========================================
+ Coverage   73.74%   73.77%   +0.02%     
==========================================
  Files         578      578              
  Lines       36788    36832      +44     
==========================================
+ Hits        27130    27173      +43     
- Misses       7928     7929       +1     
  Partials     1730     1730              


@adiantum (Contributor Author) commented:

/retest

2 similar /retest comments followed.

Resolved review threads on:
pkg/providers/nutanix/validator.go
pkg/providers/nutanix/testdata/expected_wn.yaml
pkg/providers/nutanix/config/md-template.yaml
pkg/providers/nutanix/template.go
pkg/providers/nutanix/validator_test.go
@abhinavmpandey08 (Member) left a comment:

/approve
/lgtm

@eks-distro-bot (Collaborator) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavmpandey08

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@eks-distro-bot eks-distro-bot merged commit 57aff53 into aws:main Oct 14, 2024
13 checks passed
Labels
approved lgtm ok-to-test size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants