
[Feature] Netkan source for forum threads #2213

Open
HebaruSan opened this issue Dec 13, 2017 · 4 comments
Labels
Enhancement New features or functionality Infrastructure Issues affecting everything around CKAN (the GitHub repos, build process, CI, ...) Netkan Issues affecting the netkan data

Comments

@HebaruSan
Member

HebaruSan commented Dec 13, 2017

Problem

SearchAndRescue has had errors on http://status.ksp-ckan.org/ for some time now, and the latest indexed version is out of date. (Fixed by KSP-CKAN/NetKAN#6092)

This mod is hosted on Dropbox because the author prefers not to use SpaceDock or GitHub, so its metadata has to be maintained manually. (There is a SearchAndRescue.netkan file, but it essentially just automates populating download_size and download_hash; everything else has to be filled in by hand.)

Suggestion

I was trying to think of ways to improve this, and hit upon the idea of trying to get download links from forum threads, with a value like this in a netkan file:

    "$kref": "#/ckan/forum/123456-Topic/dropbox.com",

Proposed format, broken out by pieces of text between forward slashes:

  • Standard #/ckan kref prefix
  • forum to indicate the link is on a KSP forum thread
  • 123456-Topic to indicate the thread-specific part of the thread's URL, to be appended to https://forum.kerbalspaceprogram.com/index.php?/topic/
  • dropbox.com to specify a link search string to be matched
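To make the proposed format concrete, here is a minimal Python sketch of how such a kref might be split into its pieces. The names `ForumKref`, `parse_forum_kref`, and `topic_url` are hypothetical and not part of Netkan; the sketch also assumes the search string is everything after the fourth slash.

```python
from typing import NamedTuple

# Base URL from the proposal; the thread-specific part is appended to it.
FORUM_TOPIC_BASE = 'https://forum.kerbalspaceprogram.com/index.php?/topic/'


class ForumKref(NamedTuple):
    """Hypothetical container for the pieces of a forum kref."""
    topic: str   # thread-specific part of the URL, e.g. '123456-Topic'
    search: str  # link search string, e.g. 'dropbox.com'


def parse_forum_kref(kref: str) -> ForumKref:
    """Split '#/ckan/forum/<topic>/<search>' into its pieces."""
    # maxsplit=4 keeps any further slashes inside the search string (an assumption).
    parts = kref.split('/', 4)
    if len(parts) != 5 or parts[:3] != ['#', 'ckan', 'forum']:
        raise ValueError(f'Not a forum kref: {kref!r}')
    topic, search = parts[3], parts[4]
    if not topic or not search:
        raise ValueError(f'Forum kref is missing a topic or search string: {kref!r}')
    return ForumKref(topic, search)


def topic_url(kref: ForumKref) -> str:
    """Build the full thread URL for a parsed forum kref."""
    return FORUM_TOPIC_BASE + kref.topic
```

Rejecting anything that does not match the five-part shape keeps a malformed kref from silently pointing at the wrong thread.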

Netkan could:

  1. Download the HTML for the forum thread (or even better, use an API if one exists)
  2. Parse it looking for links
  3. Return the first link that matches the search string from the kref
  4. Download and process the file as normal to generate a ckan file
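Steps 2 and 3 could look roughly like the following Python sketch, which uses only the standard library's `html.parser`. The names `LinkCollector` and `first_matching_link` are hypothetical, and the sketch assumes download links appear as ordinary `<a href="...">` elements in the thread's HTML. Fetching the page (step 1) and processing the file (step 4) are left out.

```python
from html.parser import HTMLParser
from typing import List, Optional


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def first_matching_link(html: str, search: str) -> Optional[str]:
    """Return the first link containing the search string, or None."""
    collector = LinkCollector()
    collector.feed(html)
    return next((link for link in collector.links if search in link), None)
```

Because the first match wins, the result depends on the order of links in the post, which is the formatting sensitivity discussed under Caveats.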

This might be somewhat more automated than the current process for a mod like SearchAndRescue.

Caveats

This method would probably be a bit error-prone. It would be sensitive to the exact formatting of a post; an author might rearrange their list of downloads and find that the wrong ones were now being checked. But as long as the requirements were simple and clear, it ought to be possible to keep a thread formatted in a parseable way.

Less clear are the expectations that users might develop. Authors might expect dependencies or version requirements to be pulled from their threads, which probably isn't feasible because it would require free-form natural language processing. We could try inventing a simplified metadata language for specifying such things, but that could turn this into a very large project with requirements for reporting syntax errors, etc.

CKAN's currently indexed downloads are overwhelmingly on SpaceDock, GitHub, and archive.org:

[image: chart of CKAN's indexed downloads by host]

However, since nearly all mods have forum threads, some authors may be tempted to change their mods' metadata to check the forum thread instead. This should be avoided whenever possible; a forum thread should only be used for Dropbox-style hosts that have no formal organization of releases.

@HebaruSan HebaruSan added Enhancement New features or functionality Infrastructure Issues affecting everything around CKAN (the GitHub repos, build process, CI, ...) Netkan Issues affecting the netkan data labels Dec 13, 2017
@HebaruSan
Member Author

Another interesting possibility is a $vref for forum threads. Rather than trying to get all of the info about a mod from the forum, we could get most of it from the host with an existing $kref, and then just use the forum thread as the authoritative source for version info. This could have problems with getting out of sync, though, if a modder uploads a new version and forgets to update the forum thread for example.

@HebaruSan HebaruSan changed the title Feature suggestion: Netkan source for forum threads [Feature] Netkan source for forum threads Sep 26, 2019
@HebaruSan
Member Author

> This could have problems with getting out of sync, though, if a modder uploads a new version and forgets to update the forum thread for example.

We could solve that the same way KSP-AVC does: require the mod version to match. The forum $vref could stipulate a format for the title (and presumably raise a warning if it doesn't fit):

    [Min–Max] Mod Name v1.2.3

Then if the mod version is the same as what we're inflating, we use the compatibility from the title, otherwise we don't. That way we could be sure that we weren't applying it to the wrong version.
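As a rough illustration of that check, the Python sketch below parses a title in the stipulated format and returns the compatibility range only when the title's version matches the one being inflated. The names `TitleInfo`, `parse_title`, and `compat_from_title` are hypothetical, the regex is only one possible reading of the format, and comparing versions as plain strings is a simplification.

```python
import re
from typing import NamedTuple, Optional, Tuple

# One possible reading of "[Min–Max] Mod Name v1.2.3"; accepts a hyphen or en dash.
TITLE_RE = re.compile(
    r'^\[(?P<min>[0-9.]+)\s*[-–]\s*(?P<max>[0-9.]+)\]\s*'
    r'(?P<name>.+?)\s+v(?P<version>\S+)$'
)


class TitleInfo(NamedTuple):
    ksp_min: str
    ksp_max: str
    name: str
    version: str


def parse_title(title: str) -> Optional[TitleInfo]:
    """Return the parsed pieces, or None if the title doesn't fit the format."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        return None  # a real implementation would presumably raise a warning here
    return TitleInfo(match['min'], match['max'], match['name'], match['version'])


def compat_from_title(title: str, inflating_version: str) -> Optional[Tuple[str, str]]:
    """Use the title's compatibility only if it describes the version being inflated."""
    info = parse_title(title)
    if info is None or info.version != inflating_version:
        return None
    return (info.ksp_min, info.ksp_max)
```

Returning None for a version mismatch means a stale title is ignored rather than applied to the wrong release.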

I might look for some mods without $vrefs but with nicely formatted forum thread titles on which to pilot this...

@HebaruSan
Member Author

HebaruSan commented Apr 9, 2022

    from git import Repo
    from netkan.repos import NetkanRepo, CkanMetaRepo

    # Local clones of the NetKAN and CKAN-meta repositories
    nkr = NetkanRepo(Repo('/Users/User/github/NetKAN'))
    ckmr = CkanMetaRepo(Repo('/Users/User/github/CKAN-meta'))

    # Forum-thread homepages from the highest-versioned ckan of each mod whose
    # netkan has no $vref (and is not flagged on_netkan) and whose ckan has no
    # remote-avc resource
    [ck.resources['homepage']
     for ck in (max(ckmr.ckans(nk.identifier), default=None, key=lambda ck: ck.version)
                for nk in nkr.netkans()
                if not nk.has_vref and not nk.on_netkan)
     if hasattr(ck, 'resources')
     and 'remote-avc' not in ck.resources
     and ck.resources.get('homepage', '').startswith('https://forum.kerbalspaceprogram.com')]

@HebaruSan
Member Author
