Proposing xarray upstream entry-points #271

Open
LecrisUT opened this issue Jul 12, 2024 · 7 comments

Comments

@LecrisUT

Having to write import pint_xarray is a bit clunky, especially since the imported name is never used explicitly and the line can be deleted by some linters. How about proposing that xarray upstream expose a new entry-point group that pint-xarray can hook into? They already have xarray.backends, but that doesn't feel like the right fit.

I am opening an issue here because I am not sure about the naming convention to propose, or how to give an example of what the hook should look like, e.g. at which stage these entry points should be imported.

@keewis
Collaborator

keewis commented Jul 16, 2024

This idea has come up before, see pydata/xarray#7348. You could imagine loading the entry-point library whenever this particular attribute is accessed, but __getattr__ on Dataset and DataArray is already complicated enough. So I'm not sure whether removing that one import line is worth the effort.

For now, I'm simply adding noqa: F401 comments to the import, which makes sure tools like ruff don't auto-remove it.
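
As a minimal sketch, the workaround described above looks roughly like this. The try/except guard is an addition for this sketch only, so that it also runs where pint-xarray is not installed; it is not part of the workaround itself.

```python
# Keep an import that is needed only for its side effect.
try:
    # pint_xarray is never referenced by name afterwards. The "noqa: F401"
    # comment silences the unused-import rule, so tools such as ruff do not
    # flag or auto-remove the line.
    import pint_xarray  # noqa: F401

    accessor_available = True
except ImportError:
    # Guard added for this sketch only (assumption: the package may be missing).
    accessor_available = False
```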

@LecrisUT
Author

> You could imagine loading the entrypoint library whenever this particular attribute is accessed, but __getattr__ on Dataset and DataArray is already complicated enough

I was considering a different interface: whenever the host module (e.g. xarray) is imported, loop through the registered entry points and load each one. E.g. define a function that loads the entry points (example) and run it early when the relevant module is loaded (example). The entry point just points to the package/module file, so loading it effectively just performs the import.

@keewis
Collaborator

keewis commented Jul 16, 2024

That could work, with the downside that import time now increases simply because the library is installed. Given that people have repeatedly complained about long import times (with pint also being pretty slow to import), I don't think this would be accepted.

@TomNicholas
Member

If I understand this correctly it basically involves the new entry point silently running completely arbitrary code at import time. This doesn't seem like a good idea to me.

Our existing entry points in Xarray plug into some well-defined interface, and only run in the context of some specific ABC. What you're suggesting here seems a lot more general and prone to abuse.

@dopplershift

What about exposing the DatasetAccessor and DataArrayAccessor subclasses as entry points, avoiding the need for the @xr.register_dataset_accessor decorator? I agree, I've always found it a little weird that I need to import one library just to then be able to do:

```python
import xarray as xr
import mylibrary  # never used by name; needed only so its accessor is registered

nc = xr.open_dataset('foo.nc')
nc.mylibrary.myfunc()
```

It's not so much saving a 1-line import as it is avoiding the oddity that you need to do an import but then avoid using the thing you imported directly.

@TomNicholas
Member

Rewriting the accessors to use entrypoints instead is an interesting idea... I'm still not quite sure I understand what this would look like but perhaps @dopplershift you could raise this upstream in Xarray for further discussion?

@LecrisUT
Author

> Our existing entry points in Xarray plug into some well-defined interface, and only run in the context of some specific ABC. What you're suggesting here seems a lot more general and prone to abuse.

Abuse-wise, it makes no difference whether the entry point points to a module or to an attribute, since the same import is executed either way. The only difference is whether the extension is loaded automatically at import time, or whether that automatic import is disabled and xarray controls how the extensions are registered.

But in the end it does not matter, as long as some process for automatically loading the extensions is in place. Sure, every import xarray call would be affected, but if the user installed the packages, don't they want them always loaded? At least with entry points the import can be deferred until later, and you keep control, e.g. by disabling one or all plugins via an environment variable or by altering a global variable.
