Wait for panels state=Done #19

Draft
wants to merge 1 commit into
base: main
from

Conversation

intermittentnrg
Collaborator

Alternative method to use-panel-events and exposure-time options.
I think it still needs ~0.2 second exposure time.
Also not tested with older versions of Grafana.

I have been using grafanimate with some customizations to post videos of my Grafana to X/Twitter
https://x.com/IntermittentNRG/status/1712826595977674850

Submitted for your consideration.

@amotl
Contributor

amotl commented Oct 15, 2023

Hi @intermittentnrg,

I would not have expected that this program is useful any longer, and I am very happy to hear that it apparently still works, now even better with your patch.

Thank you so much for submitting this improvement, I love it.

With kind regards,
Andreas.

@amotl
Contributor

amotl commented Oct 15, 2023

Thoughts

Alternative method to use-panel-events and exposure-time options.

Without validating your patch yet, if Grafana (now?) offers a getPanelData() method and a corresponding .state property for each panel, it is absolutely the right approach to inquire that, in order to find out about whether data loading has finished.

While I believe it works like your patch demonstrates it, I am thinking about if we could mount it at the place where the synchronization between the Grafana/JavaScript and Python domains happens, also getting rid of the busy/delayed Python loop, which is currently polling the whole stack.

Details

At those spots, we staged an event-based synchronization mechanism, which, under the hood, also uses polling, but is based on Marionette's marionette_driver.wait.Wait primitive.

It does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event, which toggles the hasAllData state, which the Python domain is monitoring.

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L214-L215

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L231-L291

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana-studio.js#L158-L161

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/grafana.py#L101-L115
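
As a rough illustration of the mechanism described above, the sketch below polls a predicate until it holds or a timeout expires, which is the general shape of a wait primitive such as Marionette's Wait. It is not grafanimate's code: `wait_until`, `FakeStudio`, and the timing values are hypothetical stand-ins so the example runs without a browser.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Returns True as soon as the predicate holds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeStudio:
    """Hypothetical stand-in for the JavaScript side that flips `hasAllData`."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def has_all_data(self):
        # Pretend the "all-data-received" event fires after a few polls.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True


studio = FakeStudio(polls_until_ready=3)
ready = wait_until(studio.has_all_data, timeout=2.0, interval=0.01)
print(ready)
```

Unlike a fixed sleep, the loop returns as soon as the flag is set, so the timeout is only consumed in full when synchronization fails.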

@amotl
Contributor

amotl commented Oct 15, 2023

Alternative method to use-panel-events and exposure-time options.

The current code does the "all data loaded" check within the JavaScript domain, and emits a synthesized all-data-received event.

I think if your code could fit there, and emits this event appropriately, we could get rid of the manual exposure timing in the Python domain completely, which is effectively a time.sleep.

https://github.com/panodata/grafanimate/blob/7b68228247a962272565dec4084a01e1112e0ffa/grafanimate/animations.py#L103-L105

By doing so, grafanimate will become both more robust and more efficient, as originally intended. [1]

Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.

Footnotes

  1. And somehow working on Grafana 5, IIRC. The original variant did not use any time.sleep() calls at all, and exclusively relied on Marionette's Wait synchronization primitive. That's how it should be.

@intermittentnrg
Collaborator Author

It still works! But I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React.
There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.

I have made several more changes, but they're mostly hardcoded and need to be done as options.

I will look at your feedback and make changes to the PR.

@intermittentnrg intermittentnrg mentioned this pull request Oct 16, 2023
@intermittentnrg
Collaborator Author

Also what about support for old grafana? Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?

@amotl
Contributor

amotl commented Oct 16, 2023

Hi again,

I think grafana-studio.js is mostly not doing anything after Grafana changed from Angular to React.
There is still some support for Angular in Grafana so there are no errors, tho this will be removed in a later Grafana release.

I am not completely following to understand how grafanimate would properly work, if the code in grafana-studio.js does not work. Some of it may be optional, like styling the interface at runtime, but others, like opening and navigating to a dashboard, and driving the time range, is certainly not optional?

Also what about support for old grafana?

I don't think we need to be backwards-compatible. To be fair, we can run another maintenance release before bringing in breaking changes, so we can build upon that if there is demand.

Should old use-panel-events and similar code be deleted to remove things that don't work or don't do anything in current grafana?

I think it will be good to migrate the event handling to your proposal. Other things that don't work would also need to be modernized.

With kind regards,
Andreas.

@intermittentnrg
Collaborator Author

others, like opening and navigating to a dashboard, and driving the time range, is certainly not optional

Ok I glossed over those parts mainly looking at the styling. So I mean just the stuff that refers to elements and angular.

Also I will look at updating grafana-studio.js as suggested, but also it's kinda neat to keep the code mostly in python I think? But I understand your point why it's better in JS.

@amotl
Contributor

amotl commented Oct 16, 2023

I will love to have most code in Python. But it is not a good idea to send JavaScript code from Python each time you want to invoke it, as it needs to be parsed and compiled each and every time. Better to load it into the browser in a regular way, using JavaScript, and invoke it using Python/Marionette.
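
The difference between sending the full JavaScript on every call and loading it once can be sketched as follows. This is only an illustration under assumptions: `FakeBrowser` stands in for a browser automation client and merely records what it is asked to run, and `StudioBridge` and `window.studio` are hypothetical names, not grafanimate's API.

```python
class FakeBrowser:
    """Hypothetical stand-in for a browser automation client; it only records scripts."""

    def __init__(self):
        self.executed = []

    def execute_script(self, script):
        self.executed.append(script)


class StudioBridge:
    """Injects a helper script once, then sends only short call expressions."""

    def __init__(self, browser, helper_source):
        self.browser = browser
        self.helper_source = helper_source
        self.loaded = False

    def call(self, expression):
        if not self.loaded:
            # The large helper source is shipped a single time.
            self.browser.execute_script(self.helper_source)
            self.loaded = True
        # Later invocations are short one-liners that reference the loaded helpers.
        self.browser.execute_script(expression)


helper = "window.studio = { /* many helper functions */ };"
browser = FakeBrowser()
bridge = StudioBridge(browser, helper)
for _ in range(3):
    bridge.call("return window.studio.hasAllData;")

print(len(browser.executed))
```

Three calls result in four scripts being sent: the helper source once, followed by three short expressions.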

In other words, grafana-studio.js is a minimal SDK supporting the Python code to be able to just call into it conveniently, nothing more.

@intermittentnrg
Collaborator Author

Regarding grafana-studio.js. All CSS class names are now dynamic and will change when React components are updated. Discussed in grafana/grafana#71662

@amotl
Contributor

amotl commented Apr 21, 2024

Thanks for the heads up, @intermittentnrg. We will need to find a different solution. Do you have any suggestions?

@intermittentnrg
Collaborator Author

I came up with this selector for removing padding around panels:

$(".scrollbar-view > div").css("padding", "0");

Using the :has() pseudo-selector and similar trickery can also work, but I'm not sure it can work for all cases. My styling needs are simple tho.

The recommendation by Grafana is to use plugins? / adding files and rebuilding Grafana? Not really keen on this approach, I use their official docker image.

Anyway should we perhaps remove all broken css manipulation from grafana-studio.js? It can live on/die in git history.

The kiosk enabling doesn't work either but appending &kiosk=1 to query string works.
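
A minimal sketch of the query-string approach, using only the Python standard library. The helper name `with_kiosk` and the example URL are made up for illustration; only the `kiosk=1` parameter comes from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_kiosk(url):
    """Return `url` with `kiosk=1` set in its query string, keeping other parameters."""
    parts = urlsplit(url)
    # Drop any existing kiosk parameter so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "kiosk"]
    query.append(("kiosk", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_kiosk("http://localhost:3000/d/abc123/demo?orgId=1"))
# → http://localhost:3000/d/abc123/demo?orgId=1&kiosk=1
```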

@amotl
Contributor

amotl commented Apr 21, 2024

Hi @intermittentnrg. We will be happy to accept any patches to modernize grafanimate, making it compatible with recent versions of Grafana. We haven't been able to keep up with maintenance, and we are about to migrate it to https://github.com/grafana-toolbox. Let us know if we should add you as a collaborator on this project while we are already on the refactoring, so we can share future maintenance, if you like that idea.

@intermittentnrg
Collaborator Author

Thank you! I will still need your advice. I have used python on and off over the years, but don't usually use it.

I'm now experiencing skipped frames in my videos, not entirely sure why but I suspect waiting for state=Done is not enough.

I've just set up a pipeline in my local Jenkins; it turns out rendering a 1-2 minute video takes over an hour. But it's possibly slower now as it's running on a Raspberry Pi 4B.
[Screenshot: grafanimate job in Jenkins]
Some failures come from Marionette/Firefox sometimes not booting correctly in the container; unsure why.

Using scenarios-env.py that I created

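import os
from datetime import timedelta

# These imports are assumptions; the original snippet omitted them.
from dateutil.parser import parse
from dateutil.rrule import HOURLY
from grafanimate.model import AnimationScenario, AnimationSequence, SequencingMode
from grafanimate.timeutil import RecurrenceInfo


def scenario():
    # Build a windowed animation configured entirely through environment variables.
    return AnimationScenario(
        grafana_url=os.environ["GRAFANA_URL"],
        dashboard_uid=os.environ["DASHBOARD_UID"],
        sequences=[
            AnimationSequence(
                start=parse(os.environ['START']),
                stop=parse(os.environ['STOP']),
                recurrence=RecurrenceInfo(
                    frequency=HOURLY,
                    interval=int(os.environ['STEP_HOURS']),
                    duration=timedelta(days=int(os.environ['WINDOW_DAYS'])),
                    #every=None
                ),
                #every=None,
                mode=SequencingMode.WINDOW
            )
        ],
    )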

@amotl
Contributor

amotl commented Apr 21, 2024

I've just setup a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.

I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution. Or do we? Sorry if I lost track about any recent improvements from your pen, please educate me if I'm wrong.

Because synchronization is broken, rendering each frame will take so long because it will fully consume the timeout each time for each frame again. This makes usage unbearable.
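
To make the cost concrete, here is the worst-case arithmetic when every frame waits out the full timeout. The frame count and timeout are hypothetical figures chosen for illustration, not measurements from this thread.

```python
def worst_case_render_seconds(frame_count, timeout_seconds):
    """Total wait time if every frame consumes the full synchronization timeout."""
    return frame_count * timeout_seconds


# Hypothetical figures, not measurements from this thread.
frames = 600
timeout = 6.0
total = worst_case_render_seconds(frames, timeout)
print(f"{total / 60:.0f} minutes")
```

With these assumed numbers the waiting alone adds up to an hour, regardless of how quickly the dashboard actually finishes loading.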

If it works well on your workstation, but does not on your Raspberry Pi, it is yet another sign that this subsystem would need to be improved significantly. Until it is, please don't run it on a Pi.

@amotl
Contributor

amotl commented Apr 21, 2024

I've just setup a pipeline in my local Jenkins, turns out rendering a 1-2 minute video takes over an hour.

I guess it is because synchronization with Grafana is currently utterly broken, never has been stable, and we currently don't have a good solution.

Another thing comes to mind: Didn't the code introduce manual exposure times some time ago?
Anyway, all of that should probably not be discussed on this PR draft. [1] a) Should it? b) Maybe GH-16 is more appropriate if it's the same thing you are observing?

Footnotes

  1. The thing is, when this PR will be merged or otherwise closed, this discussion, making up a part of improving synchronization matters, will be out-of-band to the other conversation, so things would become more fragmented.

@amotl
Contributor

amotl commented Apr 21, 2024

I guess it is because synchronization with Grafana is currently utterly broken

Got it. Reading up on the conversation we had on this PR, I see that we may have stopped over at #19 (comment):

Do you think you may have the capacity to squeeze your code into that box, effectively replacing the ingredients of Grafana Studio's onDashboardRefresh? If you don't, please tell me, so I will pick it up on the next iteration.

@intermittentnrg
Collaborator Author

I have these 2 checks.

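// Check 1: every panel reports state "Done".
panels = Object.values(window.wrappedJSObject.grafanaRuntime.getPanelData())
return panels && panels.every(function(o) {return o?.state=='Done'})

// Check 2: no panel loading bar is present in the DOM.
return $('[aria-label="Panel loading bar"]').length == 0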

But they may only check whether the data request is complete, and not that the canvas has been redrawn.
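
The first check could also be evaluated on the Python side once the panel data has been fetched from the browser. A minimal sketch, assuming the data arrives as a mapping from panel id to an object with a `state` field; that shape and the name `all_panels_done` are assumptions made here.

```python
def all_panels_done(panel_data):
    """Return True only if there is at least one panel and every panel reports state 'Done'."""
    if not panel_data:
        # Design choice of this sketch: no panel data yet counts as "not ready".
        return False
    return all((panel or {}).get("state") == "Done" for panel in panel_data.values())


print(all_panels_done({"1": {"state": "Done"}, "2": {"state": "Loading"}}))
print(all_panels_done({"1": {"state": "Done"}, "2": {"state": "Done"}}))
```

Treating an empty result as not ready differs from calling every() on an empty array, which returns true. Like the checks above, this only reflects the data request state and says nothing about whether the canvas has been redrawn.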

It mostly works! But I'm noticing some skipped frames when running on Pi, I have a kubernetes cluster with 4x Pi. And it's probably fine on my desktop PC. But if it works consistently on Pi then it must be fully solved right?

Could compare the screenshot image if it's different to previous? But probably not a nice way to do it.
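
One way the comparison idea could look, sketched with placeholder byte strings instead of real screenshots. The function names are hypothetical, and hashing the raw bytes is just one possible comparison.

```python
import hashlib


def frame_digest(png_bytes):
    """Return a SHA-256 hex digest of raw screenshot bytes."""
    return hashlib.sha256(png_bytes).hexdigest()


def differs_from_previous(previous_digest, png_bytes):
    """Report whether a new screenshot differs from the previous one.

    Returns a (changed, digest) tuple so the caller can carry the digest forward.
    """
    digest = frame_digest(png_bytes)
    return digest != previous_digest, digest


# Placeholder byte strings stand in for real screenshots.
changed_first, digest = differs_from_previous(None, b"frame-a")
changed_same, digest = differs_from_previous(digest, b"frame-a")
changed_next, digest = differs_from_previous(digest, b"frame-b")
print(changed_first, changed_same, changed_next)
# → True False True
```

A caveat with this approach is that two consecutive time windows can legitimately render identical images, for example when the data is flat, so an unchanged frame does not prove that the redraw is still pending.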

@intermittentnrg
Collaborator Author

Also had a lot of startup errors, timeout and JS errors. Just installed vnc server inside the docker image to see what's going on as logs and gecko.log weren't helpful...

Inspired by docker-selenium, which has this super useful VNC feature: https://github.com/SeleniumHQ/docker-selenium?tab=readme-ov-file#using-a-vnc-client

@amotl amotl removed their assignment May 8, 2024