Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filesystem interaction #874

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

minhqdao
Copy link
Contributor

This PR introduces filesystem interactions by pragmatically invoking the system shell. It is part of the broader strategy outlined in #865 (see #865 (review)) to support file zipping and unzipping. These functionalities are essential for handling .npz files.

src/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Contributor

@zoziha zoziha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work, @minhqdao. Looking forward to the Stdlib starting to integrate filesystem-related routines. This is the beginning of a new category. In addition to the two noted issues, the function interfaces introduced by this PR should be documented in the specifications later.

(see also fpm::filesystem)

call test_failed(error, "Creating file 1 in directory '"//temp_list_dir//"' failed."); return
end if

call run('mkdir '//temp_list_dir//'/'//dir, iostat=stat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an error in CI under Windows OS. This is because the shell command mkdir called may follow the mkdir command in CMD.exe, like mkdir \a\b\c\d. The path connector uses \ instead of /. It may be possible to implement a routine to determine the current operating system in Stdlib later.

A temporary solution can be to make two judgments. If the / scheme fails, then try the \ scheme.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple solution that I've been using and think would be useful to have in stdlib would be to add in the constants module something like:

#ifdef _WIN32
  character(len=1), parameter, public :: os_sep='\'
#else
  character(len=1), parameter, public :: os_sep='/'
#endif

Which then would enable the equivalent of python

import os
print(os.sep)

Copy link
Contributor

@zoziha zoziha Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C preprocessor in Windows-GFortran does not predefine the _WIN32 macro definition. This may require CMake to determine the operating system and define _WIN32. There are indeed two solutions:

  1. During compilation, solve this problem through the C preprocessor and CMake macro definitions.
  2. During runtime, solve this problem by implementing a routine to determine the operating system. (see also fpm::get_os_type)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C preprocessor approach is already in use in stdlib_system, so it could be one solution. The runtime approach is also very good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose the runtime approach because, as mentioned by @zoziha, Windows-GFortran does not set the _WIN32 macro. We had that problem in fpm already and I don't think the solution in stdlib_system works on Windows using gfortran.

stat = 0

if (.not. exists(temp_dir)) then
call run('mkdir '//temp_dir, stat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that under Windows OS, mkdir temp will create a temp folder under the current path pwd. Is this appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a very pragmatic solution, but I think it should be fine for now

src/stdlib_io_filesystem.f90 Outdated Show resolved Hide resolved
!>
!> Run a command in the shell.
!> [Specification](../page/specs/stdlib_io.html#run)
subroutine run(command, iostat, iomsg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe an option to redirect the command output (stdout, stderr) to a string variable would be very useful. It is being done in the fpm version of this function already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it slightly exceeds the scope of this PR but it can easily be done in a follow-up PR

test/io/test_filesystem.f90 Outdated Show resolved Hide resolved
@arjenmarkus
Copy link
Member

arjenmarkus commented Sep 19, 2024 via email

@minhqdao minhqdao marked this pull request as draft September 20, 2024 18:07
@minhqdao
Copy link
Contributor Author

Changes made:

  • Activated cpp by using capitalized .F90 suffix.
  • Check is_windows during runtime.
  • Add platform-agnostic path_separator (/ vs. \).
  • Use rmdir \s\q on Windows to remove directories.
  • Use dir \b on Windows.
  • All tested.
  • Add documentation.

@zoziha @jalvesz @arjenmarkus @perazz

@minhqdao minhqdao marked this pull request as ready for review September 21, 2024 07:29
!>
!> Whether the operating system is Windows.
!> [Specification](../page/specs/stdlib_io.html#is_windows)
logical function is_windows()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fpm, the method of using local save variables seems very advisable to avoid repeatedly accessing environment variables when is_windows is frequently called.

@jalvesz
Copy link
Contributor

jalvesz commented Oct 2, 2024

Hi I went through the PR again and would like to make a few comments.

The first one is purely cosmetic: I think that all procedures should have end function or end subroutine, same for the module. While a standalone end works, it is not a common practice and it felt somehow unnatural. For the sake of coherence with the rest of the library, I would suggest to keep the full closing statement.

The second point: I would like to come back to the preprocessing... I know that it is a sad state of affairs that gfortran does not expose gcc defined macros. Yet, on one hand, I find terribly worrisome to add all those operations just to detirme the OS and then to get the proper path separator. I find it hard to justify all those runtime operations for something that can be a compile-time constant. So I would like to come back to proposing setting the path separator as a simple character(1), parameter, public constant which value is defined by the C preprocessor mechanism. Same for is_windows which can be a logical, parameter, public constant.

This does indeed impose: either relying on CMake, or documenting that one should pass the appropriate definition flags if using plain Makefiles or fpm. This build-time "burden" seems to me more reasonable compared to the runtime operations.

I've opened a discussion to see if fpm could also help improving this preprocessing situation here fortran-lang/fpm#1084

While I find it "tolerable" for fpm having to rely on the runtime implementation as a tool for building, I'm expecting stdlib to be used for more computationally intensive tasks. Thus, it seems to me better to shift the issue to the building documentation and not the library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants