Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide a scratchspace for AtlasRep data if necessary #1026

Conversation

ThomasBreuer
Copy link
Member

If AtlasRep does not know a writable directory for downloaded data, i.e., if the value of its user preference AtlasRepDataDirectory is an empty string, provide a scratchspace and adjust the user preference.

For that, we check for the empty string when GAP has been started (which may trigger that AtlasRep gets loaded)
and in GAP.Packages.load calls.

Thus this mechanism will fail if one starts GAP without packages and then loads GAP packages only via the GAP function LoadPackage.

If AtlasRep does not know a writable directory for downloaded data,
i.e., if the value of its user preference AtlasRepDataDirectory is
an empty string, provide a scratchspace and adjust the user preference.

For that, we check for the empty string when GAP has been started
(which may trigger that AtlasRep gets loaded)
and in `GAP.Packages.load` calls.

Thus this mechanism will fail if one starts GAP without packages
and then loads GAP packages only via the GAP function `LoadPackage`.
@ThomasBreuer ThomasBreuer added the kind: enhancement New feature or request label Aug 21, 2024
Copy link

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 65.38462% with 9 lines in your changes missing coverage. Please review.

Project coverage is 75.83%. Comparing base (48061b6) to head (9f2cada).
Report is 18 commits behind head on master.

Files with missing lines Patch % Lines
src/packages.jl 65.21% 8 Missing ⚠️
src/wrappers.jl 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1026      +/-   ##
==========================================
+ Coverage   75.77%   75.83%   +0.05%     
==========================================
  Files          51       51              
  Lines        4182     4204      +22     
==========================================
+ Hits         3169     3188      +19     
- Misses       1013     1016       +3     
Files with missing lines Coverage Δ
src/GAP.jl 86.20% <100.00%> (+0.11%) ⬆️
src/wrappers.jl 97.18% <50.00%> (-1.37%) ⬇️
src/packages.jl 70.54% <65.21%> (-0.37%) ⬇️

... and 1 file with indirect coverage changes

@ThomasBreuer
Copy link
Member Author

The feature proposed here is relevant if the computed default for AtlasRepDataDirectory is empty, which means that the directory of the AtlasRep package itself is not writable.
The package directories are in a Julia artifact, and the idea is to make them read-only.

(In my installation, the directories in question are in fact writable, thus the problem of an empty computed default for AtlasRepDataDirectory does not occur for me.)

The proposed change means that we may ignore the user's intention not to store downloaded files, by setting AtlasRepDataDirectory to an empty string in the gap.ini file. What is a solutoin for this problem?

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

# and AtlasRep.

# The path to the scratchspace, will be filled by `init_atlasrep()`
atlasrep_download_cache = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is type unstable. Better is something like this:

const atlasrep_download_cache = Ref{String}()

...

!isassigned(atlasrep_download_cache) || return

...

atlasrep_download_cache[] = String(val)

and so on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O.k.
(I had just copied the code from the example in Scratch.jl.)

atlasrep = GapObj("atlasrep")
AtlasRepDataDirectory = GapObj("AtlasRepDataDirectory")
val = Wrappers.UserPreference(atlasrep, AtlasRepDataDirectory)
if Wrappers.IsString(val)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not a string, then I think the only other value (baring a mistake in gap.ini) is fail, which means AtlasRep is not loaded yet and the user did not set a preference.

So isn't that a situation in which we should provide a path, too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, if AtlasRep does not get loaded at all then GAP.jl need not create a scratchspace at all.
As long as the preference AtlasRepDataDirectory is not set (value fail), we need not provide a default for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But creating a scratch space is super cheap. If the price for avoiding it is complicated and error-prone logic to detect whether the scratch space is really needed, I would just forget about that and always create the scratch space.

val = Wrappers.UserPreference(atlasrep, AtlasRepDataDirectory)
if Wrappers.IsString(val)
# A default is set.
if length(val) == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some tests with plain GAP locally: if I have no AtlasRepDataDirectory user pref set, and start a regular GAP session which auto-loads atlasrep, then AtlasRepDataDirectory is set to a default value (a path inside the atlasrep package directory).

So if I read it right, then with atlasrep autoloaded (which is the case right now with GAP.jl), this code would be never triggered?!

Can we perhaps instead somehow distinguish between "the user set a value (so honor that)" and "some kind of default value that we should override". E.g. perhaps one way to do that would be to perform just that check in gap/systemfile.g (so that it is run before any packages are loaded), i.e. basically: check if UserPreference("atlasrep", "AtlasRepDataDirectory") returns a value different from fail or not.

Anyway, another potential issue: What if we set a path to a scratch dir, but then the user calls WriteGapIniFile. Then the scratch path is put into the resulting gap.ini -- but it may be invalid the next time we run (and load gap.ini). So perhaps we need some logic that detects if the AtlasRepDataDirectory value "looks like a Julia scratch space path", and in that case, update it anyway?

In closing, I think for 99.9% of all users, they are not aware of this setting and don't care about it and with that in mind the simples solution would be to just call SetUserPreference unconditionally... but I realize this is annoying for a handful of power users (such as yourself), so I guess we can't really do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The computed default for AtlasRepDataDirectory is a path inside the package directory if this is writable, and in this case I want to keep this default.
  • Also in the case that the gap.ini file sets a path (such that no computed default gets set), I want to keep this path because the user wants it.
  • If the path is not writable (which will happen if the artifact in question is read-only --currently it seems to be writable) then the computed default is an empty string (meaning that downloaded data are not cached), and then the idea is that GAP.jl can provide a scratchspace as a way to enable caching.

Accidentally storing the path to the scratchspace in a gap.ini file is an issue, good point.
More generally, we can override any value of AtlasRepDataDirectory that is not a valid (writable) directory path.

What I had mentioned already is that the currently proposed code ignores that a user may set an empty string in order to disable caching. One solution would be to check in the case of an empty string whether the package directory is writable, and if yes to conclude that the empty string is intentional, and to keep it.

Concerning the fact that the replacement of the default does not get triggered if one starts GAP without packages and later on loads GAP packages only on the GAP side, we could argue that the situation is not worse as it was before, and if one does not use the features from the Julia side than one does not get its advantages.

(I had thought whether a clean solution would be to put the relevant Julia code into the function for the default computation in AtlasRep's PackageInfo.g file. This would remove the problem that one might load all packages only with GAP's LoadPackage. The problem with the scratchspace path stored in gap.ini cannot be solved then, but it could be documented in AtlasRep that storing the computed value is not a good idea.)

# If the AtlasRep package is loaded afterwards (perhaps indirectly)
# then provide its scratchspace if necessary.
if atlasrep_download_cache == "" && Wrappers.IsPackageLoaded(GapObj("AtlasRep"))
init_atlasrep()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not catch the cases were a user calls GAP code that does LoadPackage("atlasrep").

I don't think we need this if we call SetUserPreference earlier and in more cases, see below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set the value of the preference before AtlasRep gets loaded then we have to set it unconditionally, and create the scratchspace even if the package will not be loaded.

I am not sure how to deal with several available package versions: As long as the package is not loaded, we do not know which version will be loaded, and there is only one path which we can set.

@ThomasBreuer
Copy link
Member Author

Coming back to the idea to move the code to the GAP side, here is GAP code for the default component of the declaration of the user preference AtlasRepDataDirectory.

  function()
      local val, dir, julia, subdir, file;

      # In general, the default of a user preference gets computed 
      # also if a value is already stored.
      # (This can be regarded as a bug in the handling of user preferences.)
      # Here we return a stored value without running the computation.
      val:= UserPreference( "AtlasRep", "AtlasRepDataDirectory" );
      if val <> fail then
        return val;
      fi;

      dir:= DirectoriesPackageLibrary( "atlasrep", "" );
      if ForAll( [ "dataext", "datagens", "dataword" ],
                 subdir -> IsWritableFile( Filename( dir, subdir ) ) ) then
        # The package directory is the first default.
        return Filename( dir, "" );
      elif IsPackageLoaded( "JuliaInterface" ) then
        # Create/fetch a Julia scratchspace.
        ValueGlobal( "JuliaImportPackage" )( "Scratch" );
        julia:= ValueGlobal( "Julia" );
        val:= julia.Scratch.( "get_scratch!" )( julia.GAP,
                  julia.String( "atlasrep_cache" ) );
        val:= julia.GAP.GapObj( val );
        if not IsDirectoryPath( val ) then
          # Something went wrong.
          return "";
        fi;

        # Create writable subdirectories if necessary.
        dir:= Directory( val );
        for subdir in [ "datagens", "dataword", "dataext" ] do
          file:= Filename( dir, subdir );
          if not IsDirectoryPath( file ) then
            CreateDir( file );
          fi;
        od;

        return val;
      else
        return "";
      fi;
    end;

This changed function makes the current pull request obsolete.
@fingolfin I will add this function to AtlasRep and make a new release if you agree with this change.

@ThomasBreuer
Copy link
Member Author

One more thought:
If scratchspaces are of interest also for other GAP packages then GAP.jl could advertise a GAP function that creates/fetches a scratchspace.

@fingolfin
Copy link
Member

I think ValueGlobal( "JuliaImportPackage" )( "Scratch" ); will only work if the user has Scratch added to the active environment at the time that code is run.

However we could change GAP.jl to do import Scratch inside either the main module GAP or inside GAP.Setup, and then GAP packages could access GAP.Scratch resp. GAP.Setup.Scratch... but of course that requires a new GAP.jl release. At which point we might instead add a helper function GAP.get_scratch!(key) or perhaps GAP.get_scratch!(gap_package_name, key) which abstracts this away a bit. And then also add a GAP function GetJuliaScratchSpace(key) to JuliaInterface which takes care of the argument/return value conversion? I guess that's essentially what you had in mind in your last comment :-).

@fingolfin
Copy link
Member

Of course that doesn't solve the immediate problem of supporting this without a new GAP.jl release. I note the following trick:

julia> GAP.eval(:(import Scratch))

julia> GAP.Scratch
Scratch

resp. from GAP:

gap> Julia.GAP.eval(JuliaEvalString(":(import Scratch ; Scratch)"));
<Julia: Scratch>

@ThomasBreuer
Copy link
Member Author

@fingolfin thank you for your comments.

Concerning the remark

I think ValueGlobal( "JuliaImportPackage" )( "Scratch" ); will only work if the user has Scratch added to the active environment at the time that code is run.

I do not catch the point.
When IsPackageLoaded( "JuliaInterface" ) is true then Julia is running and has loaded GAP.jl, and Scratch.jl is a dependency of GAP.jl. How can it happen that Scratch.jl cannot be made available with JuliaImportPackage?
The code

Julia.GAP.eval(JuliaEvalString(":(import Scratch ; Scratch)"));

from your proposal is essentially what JuliaImportPackage tries, except that JuliaImportPackage does not return the module. (Should it perhaps do this?) What am I missing?

Note that the current pull request and the change of the default function in AtlasRep are independent.

  • I think the only clean solution is to change the default function in AtlasRep, and once we agree on how this function should look like, I will release a new AtlasRep version.
  • For situations where this change is not (yet) available, the current pull request describes a hack that works if the user loads AtlasRep on GAP startup or if the user loads some GAP package using GAP.jl's Packages.load.

@fingolfin
Copy link
Member

When IsPackageLoaded( "JuliaInterface" ) is true then Julia is running and has loaded GAP.jl, and Scratch.jl is a dependency of GAP.jl. How can it happen that Scratch.jl cannot be made available with JuliaImportPackage? The code

Julia.GAP.eval(JuliaEvalString(":(import Scratch ; Scratch)"));

from your proposal is essentially what JuliaImportPackage tries, except that JuliaImportPackage does not return the module. (Should it perhaps do this?) What am I missing?

What you are missing is the issue of scope / of different Julia environments.

The fact that GAP.jl's Project.toml lists Scratch means that source code of GAP.jl can do using Scratch.

But if you start up julia, and make sure that GAP is installed but Scratch is not (inside that enviroment, I mean -- so you ]add GAP and ]rm Scratch), then using Scratch will give an error when entered in the REPL.

The difference between JuliaImportPackage and my code is that the former executes the import in the Main package (so essentially it mimics what a user would get when doing this in the REPL), while my code executes the code inside the GAP module (that's what GAP.eval does, or more generally, MODULE.eval for any module MODULE).

@ThomasBreuer
Copy link
Member Author

@fingolfin Thanks. I have fixed the proposed default component of the declaration of the user preference AtlasRepDataDirectory, and released a new version of the AtlasRep package.

From this point of view, the current pull request is obsolete:
It requires a release of GAP.jl, but my understanding is that the next release of GAP.jl will rely on a new release of GAP itself, and then the effect of hacking GAP.Packages.load will be that the behaviour gets worse.
If we want a solution in Oscar.jl-dev before the new AtlasRep version is available in Oscar then a simpler hack there (unconditionally use a scratch space) might make sense, without changing GAP.jl.

The only improvement that remains for GAP.jl may be the idea to provide a small helper function for simplifying the Julia code in GAP functions that create/fetch Julia scratchspaces, but this can be done in a new pull request.

@ThomasBreuer
Copy link
Member Author

With the changed default for the user preference AtlasRepDataDirectory (and with #1030 merged), this hack is not needed anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants