
[Bug] Kubernetes discovery - targets not being removed after pods tear down #634

Open · grzesuav opened this issue Sep 2, 2024 · 5 comments · May be fixed by #689
Labels: bug (Something isn't working)
Assignee: andrewazores
Project status: Backlog

grzesuav commented Sep 2, 2024

Current Behavior

Currently, in the Topology/target selection view I can still see old, non-existent targets, even about 5 minutes after the pods stopped.

Expected Behavior

Targets belonging to non-running pods are removed from the view.

Steps To Reproduce

❯ k get rs
NAME                           DESIRED   CURRENT   READY   AGE
registry-556c9d5446            2         2         2       17m
registry-6878b7c78b            0         0         0       70m
registry-f459568bf             0         0         0       9d

as you can see, ReplicaSet registry-f459568bf is quite old and does not currently have any running pods

❯ k get pods
NAME                                 READY   STATUS    RESTARTS   AGE
registry-556c9d5446-bzm2m            2/2     Running   0          18m
registry-556c9d5446-xh2nq            2/2     Running   0          20m
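
For anyone trying to reproduce this, a minimal way to force the kind of pod churn described above might look like the sketch below. It assumes the Deployment and namespace are both named registry, as in the listings here; the exact commands are an assumption, not the reporter's verified steps.

# Roll the deployment so every pod is replaced by a new one
kubectl -n registry rollout restart deployment registry
kubectl -n registry rollout status deployment registry

# Kubernetes state: the Endpoints object and pod list should now only
# reference the new pods
kubectl -n registry get endpoints registry -o yaml
kubectl -n registry get pods

# The reported bug: the old pod names keep showing up as targets in the
# Cryostat Topology / target selection view for several minutes or longer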

Environment

- OS: AKSUbuntu
- Environment: AKS 1.31
- Version: Cryostat 3.0

Anything else?

No response

grzesuav added the bug (Something isn't working) and needs-triage (Needs thorough attention from code reviewers) labels on Sep 2, 2024
andrewazores changed the title from "[Bug] Cryostat discovery - targets not being removed after pods tear down" to "[Bug] Kubernetes discovery - targets not being removed after pods tear down" on Sep 2, 2024

andrewazores commented Sep 2, 2024

@grzesuav are there any exceptions in the Cryostat container logs at the time (or within a few seconds after) you scale down or delete one of these deployments?

And could you paste the output from:

$ kubectl get -o yaml endpoints
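
Side note, not from the original comment: the Endpoints object lives in the application's namespace, so namespaced forms of the requested diagnostics could look roughly like the following; the deploy/cryostat name and the namespace placeholder are assumptions that depend on the installation.

# Endpoints for the affected Service, in its own namespace
$ kubectl -n registry get endpoints registry -o yaml

# Cryostat container logs around the time of the scale-down / deletion
$ kubectl -n <cryostat-namespace> logs deploy/cryostat --since=10m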


grzesuav commented Sep 2, 2024

❯ k get endpoints registry -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2024-09-02T15:26:47Z"
  name: registry
  namespace: registry
subsets:
- addresses:
  - ip: 10.184.uuu.xxx
    nodeName: aks-nodepool0609-redacted
    targetRef:
      kind: Pod
      name: registry-8b68c85b8-mjt7n
      namespace: registry
      uid: 852fcb5b-redacted
  - ip: 10.184.fff.rrr
    nodeName: aks-nodepool0609-redacted
    targetRef:
      kind: Pod
      name: registry-8b68c85b8-sqktd
      namespace: registry
      uid: 03b7ba07-redacted
  notReadyAddresses:
  - ip: 10.184.yyy.xxx
    nodeName: aks-nodepool0609-redacted
    targetRef:
      kind: Pod
      name: registry-8b68c85b8-2zj9n
      namespace: registry
      uid: 66499772-redacted
  ports:
  - name: http
    port: 9000
    protocol: TCP
  - name: jfr-jmx
    port: 9091
    protocol: TCP
  - name: http-prometheus
    port: 9090
    protocol: TCP
❯ k get pods -o wide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE                                   NOMINATED NODE   READINESS GATES
registry-8b68c85b8-2zj9n             2/2     Running   0          38s     10.184.   aks-nodepool0609-redacted   <none>           <none>
registry-8b68c85b8-mjt7n             2/2     Running   0          6m22s   10.184.   aks-nodepool0609-redacted   <none>           <none>
registry-8b68c85b8-sqktd             2/2     Running   0          6m43s   10.184.    aks-nodepool0609-redacted   <none>           <none>


grzesuav commented Sep 2, 2024

I see various errors in the Cryostat logs; I will continue tomorrow to provide more details.


grzesuav commented Sep 3, 2024

So actually today I see the same targets as in #634 (comment), even though the pods haven't been there for many hours.
Explore-logs-2024-09-03 12_26_01.txt

Attaching the logs which appear now when I am trying to connect.

@andrewazores what is the name of the Kubernetes discovery logger? Maybe I can filter the logs related to it to find something interesting?

@andrewazores

I think the Logger's name should be io.cryostat.discovery.KubeApiDiscovery.

Possibly related: #353 , #396
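
Not part of the original comment: given that logger name, one way to pull out only the discovery-related log lines, and to raise that single logger to DEBUG, could be something like the sketch below. It assumes the Cryostat 3.x server is Quarkus-based and that your deployment tolerates manually added environment variables; the deploy/cryostat name and namespace placeholder are assumptions.

# Filter existing container logs for the Kubernetes discovery logger
kubectl -n <cryostat-namespace> logs deploy/cryostat | grep KubeApiDiscovery

# Raise only that logger to DEBUG via the env-var form of the Quarkus property
# quarkus.log.category."io.cryostat.discovery.KubeApiDiscovery".level
kubectl -n <cryostat-namespace> set env deploy/cryostat \
  QUARKUS_LOG_CATEGORY__IO_CRYOSTAT_DISCOVERY_KUBEAPIDISCOVERY__LEVEL=DEBUG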

andrewazores removed the needs-triage (Needs thorough attention from code reviewers) label on Oct 9, 2024
andrewazores self-assigned this on Oct 9, 2024