Implement Preferences facility to handle different download strategies & cache locations #151

Open
vijayiyer05 opened this issue Oct 16, 2022 · 14 comments

@vijayiyer05
Collaborator

No description provided.

@vijayiyer05
Collaborator Author

Created issue for tracking progress on this new facility

@vijayiyer05
Collaborator Author

vijayiyer05 commented Oct 16, 2022

I think the code could be leaner using class properties directly instead of preferences (getpref/setpref). See Alternate Code below.

Probably the best approach would be to locate the actual class in a subpackage, e.g. “bot.instance.Preferences”. A root-folder function called “getPreferences” can then simply call the getSingleton factory function. The user can view the object directly, and can assign it to an output variable if they want to set a value. That last part probably has to be taught, but it is pretty lightweight.

Alternate Code

classdef Preferences < handle

    properties
        CacheDirectory  (1,1) string = ""   % Directory for local caching of downloaded data and files
        DialogMode      (1,1) string {mustBeMember(DialogMode,["Dialog Box" "Command Window"])} = "Dialog Box" % How to show dialogs to the user. Options: "Dialog Box" or "Command Window"
        DownloadFrom    (1,1) string {mustBeMember(DownloadFrom,["API" "S3"])} = "API"         % Where to download data from, i.e. the web API or the S3 bucket
        UseCacheOnCloud (1,1) logical = false         % Whether to use the cache when working on an AWS cloud computer
    end
    
    methods (Access = private)
        function obj = Preferences()
            % Private constructor: instances are created only via getSingleton
        end    
    end

    methods (Static, Hidden)

        function obj = getSingleton()

            persistent objStore

            if isempty(objStore)
                objStore = bot.Preferences();
            end
            
            obj = objStore;
            
        end
    end


end
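
The root-level accessor described above could be sketched as follows. This is a minimal sketch, assuming the class is reachable as `bot.Preferences` (as the `getSingleton` code implies) and that the function is saved as `+bot/getPreferences.m`; that file location is an assumption, not something confirmed in this thread.

```matlab
function prefs = getPreferences()
% getPreferences  Return the shared Preferences object.
%   Assumed location: +bot/getPreferences.m (hypothetical).
%   Because Preferences is a handle class, every caller receives a
%   reference to the same underlying object.

    prefs = bot.Preferences.getSingleton();
end
```

Calling `bot.getPreferences` without a semicolon displays the current values. To change one, assign the output first, e.g. `prefs = bot.getPreferences(); prefs.DownloadFrom = "S3";`. The change is visible to later callers in the same MATLAB session because they all share the one handle object.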

@vijayiyer05
Collaborator Author

Note that the suggested default cache directory (see comment in #149) could be implemented right in the property definition line, either inline or by calling a local function.
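
A minimal sketch of the local-function variant is shown here. The class name `ExamplePreferences`, the function name `defaultCacheDirectory`, and the folder it returns are all hypothetical placeholders; the default actually suggested in #149 is not reproduced in this thread.

```matlab
classdef ExamplePreferences < handle
    % Hypothetical class used only to illustrate a computed default value.

    properties
        % The default expression is evaluated when the class is loaded
        CacheDirectory (1,1) string = defaultCacheDirectory()
    end
end

function folder = defaultCacheDirectory()
    % Placeholder location (assumption): a subfolder of the system temp folder
    folder = string(fullfile(tempdir, "bot-cache-example"));
end
```

Keeping the logic in a local function leaves the property line short and gives a single place to change the rule later.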

vijayiyer05 changed the title from "Implement Preferences facility to handle different download strategies & cache preferences" to "Implement Preferences facility to handle different download strategies & cache locations" on Oct 16, 2022
@vijayiyer05
Collaborator Author

@ehennestad

In my first test of the MATLAB Online case, the only preference shown by default is the cache directory.

Are some preferences hidden based on the detected environment?

The "DownloadFrom" and "UseCacheOnCloud" preferences seem pertinent here too. I was able to set them even though they didn't appear. I could see that S3 item file retrieval is faster than API download. Nice!

@vijayiyer05
Collaborator Author

Should likely rename UseCacheOnCloud to UseCacheWithS3Mount, since MATLAB Online is a use case that's on the cloud but without an S3 mount (at least at this time).

@ehennestad
Collaborator

> In my first test of the MATLAB Online case, the only available preference showing by default is the cache directory.

This was probably an effect of preferences being added to the preference group "on demand". It will be fixed in an upcoming commit, where all preferences are initialized with their default values the first time the Preferences singleton instance is constructed.
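
To illustrate the difference between on-demand and up-front initialization, here is a minimal sketch using MATLAB's `ispref`/`addpref` functions. The group name and the function name `initializePreferenceGroup` are hypothetical, and this is not the toolbox's actual implementation.

```matlab
function initializePreferenceGroup()
% Register every preference with its default value, but only if it
% has not been stored yet, so existing user choices are left untouched.

    group = 'ExamplePreferenceGroup';   % hypothetical group name

    % Default values taken from the property list earlier in this thread
    defaults = struct( ...
        'CacheDirectory',  '', ...
        'DialogMode',      'Dialog Box', ...
        'DownloadFrom',    'API', ...
        'UseCacheOnCloud', false);

    names = fieldnames(defaults);
    for i = 1:numel(names)
        if ~ispref(group, names{i})
            addpref(group, names{i}, defaults.(names{i}));
        end
    end
end
```

After this runs once, `getpref('ExamplePreferenceGroup')` returns a struct containing all four preferences rather than only those that happened to be set.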

@ehennestad
Collaborator

ehennestad commented Oct 17, 2022

> Should likely rename UseCacheOnCloud to UseCacheWithS3Mount, since MATLAB Online is a use case that's on the cloud but without an S3 mount (at least at this time).

This is a good point. I have found yet another scenario where it should be possible to mount the S3 bucket as a file system on a local machine. I am not sure if there are any benefits, and when I tested it, my Mac became unresponsive to the point where I had to reboot (this might be related to the libraries I used).

I think this (local mount) could be interesting to look into with regard to using h5read directly with files in an S3 bucket. Accessing files with e.g. dir / copyfile / h5read over the S3 protocol (s3://...) appears to have some significant delays, and maybe this would be faster if the bucket is mounted locally.

@vijayiyer05
Collaborator Author

> Accessing files using e.g dir / copyfile / h5 read using the s3 protocol (s3://...) appears to have some significant delays, and maybe this would be faster if the bucket is mounted locally.

Have you updated to R2022b yet? I believe this release has relevant updates for H5 over S3.

@vijayiyer05
Collaborator Author

vijayiyer05 commented Oct 18, 2022

@ehennestad If you move towards an object-based representation of the preferences (vs getpref/setpref), as suggested above for cleaner code, then one option for persisting them could be saveobj/loadobj. You could still store the file in the preferences directory, which is accessible via prefdir.
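
One way this could look is sketched below. The class name `PersistentPreferences`, the method names `store`/`restore`/`getFilePath`, and the MAT-file name are hypothetical. The sketch relies on MATLAB's default object save/load behavior; custom `saveobj`/`loadobj` methods would only be needed to customize what gets written or to handle property changes between versions.

```matlab
classdef PersistentPreferences < handle
    % Hypothetical example class, not part of the toolbox.

    properties
        CacheDirectory (1,1) string = ""
        DownloadFrom   (1,1) string {mustBeMember(DownloadFrom,["API" "S3"])} = "API"
    end

    methods
        function store(obj)
            % Write the object to a MAT-file in the MATLAB preferences folder
            botPreferences = obj; %#ok<NASGU> saved by name on the next line
            save(PersistentPreferences.getFilePath(), "botPreferences");
        end
    end

    methods (Static)
        function obj = restore()
            % Load previously stored preferences, or fall back to defaults
            filePath = PersistentPreferences.getFilePath();
            if isfile(filePath)
                S = load(filePath, "botPreferences");
                obj = S.botPreferences;
            else
                obj = PersistentPreferences();
            end
        end

        function filePath = getFilePath()
            % The file name is an assumption made for illustration
            filePath = fullfile(prefdir, "bot_preferences_example.mat");
        end
    end
end
```

In a singleton design, `restore` would be called where the object is first constructed, and `store` whenever a value changes or on request.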

@vijayiyer05
Collaborator Author

vijayiyer05 commented Dec 4, 2022

Minor detail. Since the preferences are available via a root-level function (getPreferences), it doesn't seem quite right for the implementation to be in the +internal subpackage.

For future consideration: could make a subpackage called +advanced aimed at power-users. Then this class could be bot.advanced.Preferences.

@ehennestad
Collaborator

ehennestad commented Dec 17, 2022

Todo:

  • Remove code in getPreferences that returns preference value of selected preference
  • Hide some preferences that are not implemented yet (AutoDownloadFiles and DownloadMode)

@ehennestad
Collaborator

ehennestad commented Dec 17, 2022

@vijayiyer05 Suggestion: move Preferences from +internal to +util?

@ehennestad
Collaborator

ehennestad commented Jan 9, 2024

Restarting this conversation, since we can now add a preference for how to access data (@vijayiyer05).

Option 1: Download and cache entire files in a local directory.
This seems like the preferred option when working on a local computer with limited internet bandwidth and plenty of storage space.

Option 2: Read data (individual linked file properties) directly from h5/nwb files stored at remote locations (S3 bucket).
This seems like the preferred option on MATLAB Online (and possibly other cloud systems) with good bandwidth and limited storage space.

(Option 3: Read data (individual linked properties) as in Option 2 and cache these values in a local cache.)

Working name for this Preference: DownloadMode with options "File" and "Variable"

Note: This is currently mostly relevant for ophys sessions from the Visual Behavior Dataset where I have used the new HasLinkedFile/LinkedFile classes.
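
Options 1 and 2 could be sketched as a simple dispatch on the proposed DownloadMode value. The function name `readLinkedData` and its arguments are hypothetical, and this is not how the HasLinkedFile/LinkedFile classes are implemented. The sketch assumes a MATLAB release in which `copyfile` and `h5read` accept `s3://` locations, as discussed earlier in the thread, and that AWS access is already configured.

```matlab
function data = readLinkedData(fileLocation, datasetPath, cacheDirectory, downloadMode)
% Read one HDF5 dataset, either from a cached local copy ("File")
% or directly from the given location ("Variable").
    arguments
        fileLocation   (1,1) string   % e.g. an s3:// URI (assumed)
        datasetPath    (1,1) string   % HDF5 dataset path, e.g. "/some/dataset"
        cacheDirectory (1,1) string
        downloadMode   (1,1) string {mustBeMember(downloadMode,["File" "Variable"])}
    end

    switch downloadMode
        case "File"
            % Option 1: copy the whole file into the cache once, then read locally
            if ~isfolder(cacheDirectory)
                mkdir(cacheDirectory);
            end
            [~, name, ext] = fileparts(fileLocation);
            localFile = fullfile(cacheDirectory, name + ext);
            if ~isfile(localFile)
                copyfile(fileLocation, localFile);
            end
            data = h5read(localFile, datasetPath);

        case "Variable"
            % Option 2: read only the requested dataset from the source location
            data = h5read(fileLocation, datasetPath);
    end
end
```

Option 3 would add a second, per-variable cache on top of the "Variable" branch, which is omitted here.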

@ehennestad
Collaborator

Alternative preference name:

DownloadRemoteFiles with value true/false.

If true, files are downloaded and cached; if false, files are read directly from the remote location.

2 participants