Infinite recursion when automounting snapshots #16510

Open
mmaybee opened this issue Sep 5, 2024 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

System information

Type | Version/Name
--- | ---
Distribution Name | Ubuntu
Distribution Version | 20.04
Kernel Version | 5.15.0-1066
Architecture | x86_64
OpenZFS Version | zfs-2.2.99

Describe the problem you're observing

After upgrading ZFS to a version that includes the fix to propagate dataset properties into mount options (commit 34118ea), automounts of snapshots hang indefinitely. The problem is present only until the system is rebooted into the updated kernel (with the new ZFS version), that is, while the new userland is running against the old kernel module.
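One way to tell whether a host is currently in this window is to compare the userland and kernel-module versions reported by `zfs version`. The following is a minimal sketch, assuming the usual two-line output of the form `zfs-<version>` and `zfs-kmod-<version>`; the function name `userland_kmod_mismatch` is hypothetical and not part of OpenZFS.

```python
import re


def userland_kmod_mismatch(zfs_version_output: str) -> bool:
    """Return True when the userland and kernel-module versions differ.

    Assumes the text looks like the output of `zfs version`, e.g.:
        zfs-2.2.99-1
        zfs-kmod-2.2.99-1
    """
    userland = None
    kmod = None
    for line in zfs_version_output.splitlines():
        line = line.strip()
        # Check the kmod line first, since it also starts with "zfs-".
        m = re.fullmatch(r"zfs-kmod-(\S+)", line)
        if m:
            kmod = m.group(1)
            continue
        m = re.fullmatch(r"zfs-(\S+)", line)
        if m:
            userland = m.group(1)
    if userland is None or kmod is None:
        raise ValueError("could not find both version lines")
    return userland != kmod
```

The input can be captured with `subprocess.run(["zfs", "version"], capture_output=True, text=True).stdout`. A mismatch only indicates that userland and the loaded module come from different builds; it does not by itself prove that the automount hang will occur.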

The following text from the commit indicates the reason for this issue:

Furthermore, don't run mount.zfs(8) helper for automounting snapshot.
The above change to make mount.zfs(8) to call 'zfs_mount_at'
apparently caused it to trigger an automount for the snapshot
directory. When the helper was invoked as a result of a snapshot
automount, an infinite recursion will occur.

The fix included in the commit adds the -i flag to mounts requested from zfsctl_snapshot_mount in the kernel, which prevents mount.zfs from being invoked. However, until the new kernel module is running, that flag is not passed: the old kernel code still invokes the mount.zfs helper, and the userland changes made by this commit (the helper now calling 'zfs_mount_at') cause the recursion described above.
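The interaction between the kernel and userland halves of the commit can be illustrated with a toy model. This is only a sketch of the control flow described above, not ZFS code: the function `automount`, its parameters, and the depth cap are all hypothetical, and the cap stands in for what is an indefinite hang on a real system.

```python
def automount(kernel_passes_i: bool, helper_calls_zfs_mount_at: bool,
              depth: int = 0, max_depth: int = 5) -> str:
    """Toy model of the snapshot automount path (not real ZFS code)."""
    if depth >= max_depth:
        # Stand-in for the real outcome, which is a hang rather than an error.
        return "recursion"
    # Kernel side: zfsctl_snapshot_mount requests a mount from userland.
    if kernel_passes_i:
        # New kernel: -i means the mount.zfs helper is never invoked.
        return "mounted"
    # Old kernel: the mount.zfs helper runs.
    if helper_calls_zfs_mount_at:
        # New helper: zfs_mount_at touches the snapshot directory,
        # which triggers another automount.
        return automount(kernel_passes_i, helper_calls_zfs_mount_at,
                         depth + 1, max_depth)
    # Old helper: mounts without re-triggering the automount.
    return "mounted"
```

Of the combinations, only the old kernel with the new helper, `automount(False, True)`, fails to terminate normally, which matches the upgrade-without-reboot window described in this issue.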

Describe how to reproduce the problem

See above description of the problem.

Include any warning/errors/backtraces from the system logs

Here is an example of the issue as it manifests in the kernel logs:

[ 2659.491665] INFO: task mount.zfs:121090 blocked for more than 120 seconds.
[ 2659.493543]       Tainted: P           OE     5.15.0-1063-dx2024080801-aeedf7247-oracle #69~20.04.1
[ 2659.495864] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2659.498040] task:mount.zfs       state:D stack:    0 pid:121090 ppid:121089 flags:0x00000000
[ 2659.498044] Call Trace:
[ 2659.498045]  <TASK>
[ 2659.498046]  __schedule+0x2c8/0x860
[ 2659.498048]  ? __smp_call_single_queue+0x59/0x90
[ 2659.498050]  ? usleep_range_state+0x90/0x90
[ 2659.498052]  schedule+0x69/0x110
[ 2659.498054]  schedule_timeout+0x20b/0x2d0
[ 2659.498055]  ? try_to_wake_up+0x216/0x5b0
[ 2659.498057]  ? zio_destroy+0x5d/0x90 [zfs]
[ 2659.498190]  ? zio_done+0x69a/0xea0 [zfs]
[ 2659.498318]  ? usleep_range_state+0x90/0x90
[ 2659.498320]  __wait_for_common+0xa8/0x160
[ 2659.498322]  wait_for_completion+0x24/0x30
[ 2659.498324]  call_usermodehelper_exec+0x14c/0x180
[ 2659.498326]  call_usermodehelper+0x93/0xc0
[ 2659.498328]  zfsctl_snapshot_mount+0x1bc/0x360 [zfs]
[ 2659.498461]  zpl_snapdir_automount+0x10/0x40 [zfs]
[ 2659.498607]  __traverse_mounts+0x8c/0x230
[ 2659.498609]  ? __legitimize_path.isra.0+0x50/0x70
[ 2659.498611]  step_into+0x251/0x3b0
[ 2659.498614]  walk_component+0x70/0x1c0
[ 2659.498616]  path_lookupat.isra.0+0x6e/0x150
[ 2659.498618]  filename_lookup+0xcf/0x1a0
[ 2659.498620]  ? mntget+0x18/0x30
[ 2659.498622]  ? path_get+0x27/0x30
[ 2659.498624]  ? audit_alloc_name+0x125/0x150
[ 2659.498626]  ? getname_flags+0xc2/0x1f0
[ 2659.498628]  user_path_at_empty+0x3f/0x60
[ 2659.498629]  user_statfs+0x48/0xb0
[ 2659.498632]  __do_sys_statfs+0x28/0x60
[ 2659.498634]  ? __audit_syscall_entry+0xdb/0x130
[ 2659.498636]  ? syscall_trace_enter.isra.0+0x143/0x1c0
[ 2659.498638]  __x64_sys_statfs+0x16/0x20
[ 2659.498640]  x64_sys_call+0x1cbd/0x1fa0
[ 2659.498642]  do_syscall_64+0x54/0xb0
[ 2659.498644]  ? exit_to_user_mode_prepare+0x3c/0x1d0
[ 2659.498646]  ? syscall_exit_to_user_mode+0x2c/0x50
[ 2659.498648]  ? do_syscall_64+0x61/0xb0
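When checking logs from several hosts for this signature, the hung-task header line can be matched mechanically. A minimal sketch, assuming the message format shown in the log above; the name `find_hung_tasks` is hypothetical.

```python
import re

# Matches lines such as:
#   INFO: task mount.zfs:121090 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text: str, comm: str = "mount.zfs") -> list[int]:
    """Return the PIDs of hung-task reports for the given task name."""
    return [
        int(m.group("pid"))
        for m in HUNG_TASK.finditer(log_text)
        if m.group("comm") == comm
    ]
```

A hit for mount.zfs is only a hint; the call trace still needs to show zfsctl_snapshot_mount waiting in call_usermodehelper, as in the example above, to match this issue.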