Scraper and plugin manager #4242
Conversation
A small improvement may be to index the domains/URLs in scrapers so that you can search by that when looking up a scraper; we could even enumerate the tables based off this value, as I think in most cases people are looking for a specific domain, not necessarily a scraper package.
Selecting a domain would install the whole package, but I think this would be a better UX.
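For illustration, an index entry could expose the URL patterns its scrapers handle, so the UI can search and group by domain. A rough sketch only: the urls field and the site-scraper package are assumptions, not part of the current index format (the other fields follow the entry format shown in the next comment):

```yaml
- id: site-scraper            # hypothetical package
  name: Site
  version: 1.0-0000000
  # assumed field: domains/URL patterns matched by the scrapers in this
  # package, indexed so users can look up a scraper by site rather than
  # by package name
  urls:
    - site.com/en/video/
    - othersite.com/scenes/
```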
For plugins, can the path be separate from the Index URL? Say I wanted to add performerBodyCalculator to a hosted/existing index.yml with the following:

- id: performerBodyCalculator
  name: Performer Body Calculator
  description: Tags performers based on existing metadata, with tags matching the performers body type
  version: 1.0-b455ac6
  date: 2023-10-26 17:26:15 +0000
  path: https://github.com/stg-annon/performerBodyCalculator/releases/download/v1.0/performerBodyCalculator-1.0.zip
  sha256: 57A899ACC459383C4E74A5E7118D675EFA97556A5B0F4E0E327A0CF6EA8FA32A

Should this be possible? It would make for easier development, so we could manage a community index that could pull from various places after a review of the entry into the index.
Encountered an issue: some scrapers have many YMLs to a single py script; Algolia_* is a prime example of this. How do we want to deal with things like this? Discussion in Discord led to the conclusion that dependencies are likely the solution here; the question would be on implementation. In the simplest case, one package depends on another within the same repo:
name: "Site"
requires:
- algolia
sceneByURL:
- action: script
url:
- site.com/en/video/
script:
- python
- ../algolia/Algolia.py
- site
name: "Algolia Interface Package"
requires:
- py_common
name: "py_common Module" this existing dependency example shows the need to examine chains of dependencies within a given repo Dependencies Across Sources?should we allow for/support dependencies across repos and if so what would that look like Say this scraper is from another Source/Repo name: "Non Community Scraper"
requires:
- CommunityScrapers/algolia |
For Algolia, I would just put all of the related ymls into a single package. I haven't yet got a solution for dependencies, though the short-term solution would be to bundle the dependencies into the package. That obviously has issues with redundancy, but it's a less significant issue.
I encountered a few small issues while testing:
I have to say, great work with this already!! Installing/updating scrapers has been a major hassle and I love to see this addressed. Along with the UI plugin API, v24 shapes up to be an absolute game changer for versatility and usability.
But this doesn't solve the Cropper.js issue, as an example. It's far nicer if we have Cropper.js in its own plugin with a depends. Nor do we want more than one py_common, which has to be configured (needs a settings rewrite anyway).
I think the solution here is a very limited dependency implementation, where you can only depend on packages within the same Source and the dependency is the package ID within that source:

requires:
  - py_common

going through each dependency and recursively installing each pluginID/scraperID in the requires list. This makes it more reliant on Source/Package maintainers organizing things in a way that works, but it should be relatively easy to implement on the stash side of things. This will not do any cleanup of dependencies if a package is uninstalled; it essentially just automatically does what the user would have to do when a package calls for a dependency, i.e.
the user then goes and checks the box for the required package.

Implementation
Say at manager.go#L156 we add a function call to:

reqs, err := m.ListPackageRequirements(remoteURL, id)
if err != nil {
    return fmt.Errorf("retrieving requirements: %w", err)
}
if reqs != nil {
    // install each required package before the requested one;
    // error handling for the nested installs is omitted for brevity
    for _, reqID := range reqs {
        m.Install(remoteURL, reqID)
    }
}

Apologies if I am off base here; I don't know the nuances of Go, so I could be very wrong about how this would work.

Edit:
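A slightly fuller sketch of the same idea, with a guard for the requirement chains mentioned earlier. The interface and signatures below are assumptions for illustration (including that requirements come back as a slice of package IDs), not the actual manager.go API:

```go
package pkg

import "fmt"

// packageManager captures only the operations this sketch needs; the real
// manager API may differ (these signatures are assumptions).
type packageManager interface {
	ListPackageRequirements(remoteURL, id string) ([]string, error)
	Install(remoteURL, id string) error
}

// installWithRequirements installs the "requires" entries of a package
// (restricted to the same source) before the package itself. The seen map
// guards against requirement chains that loop back on themselves.
func installWithRequirements(m packageManager, remoteURL, id string, seen map[string]bool) error {
	if seen[id] {
		return nil
	}
	seen[id] = true

	reqs, err := m.ListPackageRequirements(remoteURL, id)
	if err != nil {
		return fmt.Errorf("retrieving requirements for %s: %w", id, err)
	}
	for _, reqID := range reqs {
		if err := installWithRequirements(m, remoteURL, reqID, seen); err != nil {
			return fmt.Errorf("installing requirement %s: %w", reqID, err)
		}
	}
	return m.Install(remoteURL, id)
}
```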
I'm confused why zip is being used? Where are these zips coming from? (I see the package manager unpacks them... but the creation is the part sticking in my craw.) Might be nicer if it could just get all files in/under a directory, and just verify each. The problem I see is, similar to the way Userscripts are handled, it looks like the dev has to rebuild in order to get the item updated. Meaning with every merged patch of the Community Repos, we'll have to rebuild?
ZIPs are transportable and can be hashed as a single file. They are generated and hosted on GitHub Pages to not run afoul of GitHub's TOS; it's more of an automated process with Actions, not a manual process. This approach helps with version control, as once the package is archived it will stay that way with that hash unless updated, so there is no concern of alterations after a package is approved and added by maintainers (Plugin v1.2.3 == sha256 hash). The way WP has implemented it is to retroactively work with the current CommunityRepos, but as I describe in one of my comments we could separate out the index from the hosting of the packages themselves: #4242 (comment)
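For a sense of what that automation could look like, here is a minimal sketch of such an Actions job; the workflow name, directory layout, and publishing step are assumptions, not the actual CommunityScrapers/CommunityScripts workflow:

```yaml
# Hypothetical Actions job: zip each scraper directory and record its hash
name: build-packages
on:
  push:
    branches: [master]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build package zips and hashes
        run: |
          mkdir -p _site
          for dir in scrapers/*/; do
            id=$(basename "$dir")
            (cd "$dir" && zip -r "../../_site/${id}.zip" .)
            sha256sum "_site/${id}.zip" >> _site/hashes.txt
          done
      # generating index.yml from the hashes and publishing _site to
      # GitHub Pages would follow here
```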
Remove fs repository in favour of file:// url
The current assumption that packages are installed in their own folder with name as
The config would apply to all packages, not a config for each package; similar to how we configure a plugins folder or scrapers folder within stash, this would be a "managed plugins folder". It really should not change any assumptions about the packages besides where we start to look for them. The idea is that we shift the folder down one level under the "root" scraper/plugin folder, allowing for the root to still be manually organized without cluttering that folder. I saw two options to do this.
Added the ability to set local paths for package sources, and changed the caching behaviour so that it stores package lists in the cache directory (if set) and only re-downloads if it is newer. Should be good for another round of testing. |
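A minimal sketch of that kind of freshness check, assuming the package list is cached as a file and the source URL returns a usable Last-Modified header; the names and mechanism here are illustrative assumptions, not the actual implementation:

```go
package pkg

import (
	"net/http"
	"os"
	"time"
)

// needsRefresh reports whether the remote package list appears newer than the
// cached copy. A missing cache file always triggers a refresh, as does a
// missing or unparsable Last-Modified header.
func needsRefresh(cachePath, remoteURL string) (bool, error) {
	fi, err := os.Stat(cachePath)
	if os.IsNotExist(err) {
		return true, nil
	}
	if err != nil {
		return false, err
	}

	resp, err := http.Head(remoteURL)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	lastMod, err := time.Parse(http.TimeFormat, resp.Header.Get("Last-Modified"))
	if err != nil {
		// header missing or unparsable: fall back to re-downloading
		return true, nil
	}
	return lastMod.After(fi.ModTime()), nil
}
```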
This still appears to be an issue #4242 (comment)
Should be addressed now. |
Yup looks good |
How do we want to deal with everything works as intended with the values defined in the Source
The |
Create index file for stashapp/stash#4242 feature
Wouldn't you want installed plugins to either be hidden or "greyed out"/marked as installed in the Available Plugins section of the manager?
* Add package manager
* Add SettingModal validate
* Reverse modal button order
* Add plugin package management
* Refactor ClearableInput
Adds a package manager to the scraper and plugin settings pages.
These allow the configuration of multiple package sources.
A package source URL can be a local path or URL, and the URL must return a yaml index file listing all of the packages contained in the source:
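As a rough sketch of what such an index file could contain (the first entry reuses the performerBodyCalculator example from the discussion above; the second entry and the relative-path behaviour are placeholder assumptions):

```yaml
- id: performerBodyCalculator
  name: Performer Body Calculator
  description: Tags performers based on existing metadata, with tags matching the performers body type
  version: 1.0-b455ac6
  date: 2023-10-26 17:26:15 +0000
  path: https://github.com/stg-annon/performerBodyCalculator/releases/download/v1.0/performerBodyCalculator-1.0.zip
  sha256: 57A899ACC459383C4E74A5E7118D675EFA97556A5B0F4E0E327A0CF6EA8FA32A
- id: anotherPlugin           # placeholder entry
  name: Another Plugin
  version: 0.1.0
  date: 2023-11-01 00:00:00 +0000
  path: anotherPlugin.zip     # assumed to resolve relative to the source URL
  sha256: <sha256 of anotherPlugin.zip>
```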
I have example sources deployed at the following URLs:
https://withoutpants.github.io/CommunityScrapers/develop/index.yml
https://withoutpants.github.io/CommunityScripts/develop/index.yml
The package manager unzips package zips into their own directory, named for the package id, in the applicable local directory (plugins or scrapers). It also writes a manifest file to track the package version and files:
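As a rough sketch of what such a manifest might contain; the field names are assumptions based on the description above, not the actual manifest schema:

```yaml
# hypothetical manifest, written alongside the unpacked package files
id: performerBodyCalculator
name: Performer Body Calculator
package_version: 1.0-b455ac6
date: 2023-10-26 17:26:15 +0000
files:
  - performerBodyCalculator.yml
  - performerBodyCalculator.py
```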
The main branches of https://github.com/WithoutPants/CommunityScripts and https://github.com/WithoutPants/CommunityScrapers have been modified to build the package sources from the source files. This should be considered a proof of concept and for testing purposes only.
Resolves #623