Golang support #113

Open
wants to merge 2 commits into
base: main
80 changes: 80 additions & 0 deletions src/packageurl/contrib/purl2url.py
@@ -27,6 +27,7 @@
from packageurl import PackageURL
from packageurl.contrib.route import NoRouteAvailable
from packageurl.contrib.route import Router
import re

repo_router = Router()
download_router = Router()
@@ -237,6 +238,84 @@ def build_hackage_repo_url(purl):
    elif name:
        return f"https://hackage.haskell.org/package/{name}"

@repo_router.route("pkg:golang/.*")
def build_golang_pkg_go_repo_url(purl):
    """
    Return a download URL from the `purl` string for golang. Because Go package locations are not
    deterministic, this function works on a best-effort basis.
    """

    ##
    # This function was built by trial and error from the golang purls I ran across and needed to convert.
    # A more reliable algorithm would require golang to implement a deterministic way to refer to packages, or the
    # method should rely on the active golang proxies (https://proxy.golang.org/ or maybe
    # https://github.com/gomods/athens) to determine the URL.


    purl_data = PackageURL.from_string(purl)

    namespace = purl_data.namespace
    name = purl_data.name
    version = purl_data.version
    qualifiers = purl_data.qualifiers

    download_url = qualifiers.get("download_url")

    if download_url:
        return download_url

    if not (namespace and name and version):
        return

    print(f"namespace: {purl_data.namespace}, name {purl_data.name}, version: {purl_data.version}, qualifiers: {purl_data.qualifiers}")

    if "github.com" in purl_data.namespace:

        namespace = purl_data.namespace.split("/")
        exp = re.compile("v[0-9]+")

        # If the version is a pseudo-version made of several sections separated by "-", the last section is a git
        # commit id that should be referenced in the tree of the repo.
        # https://stackoverflow.com/questions/57355929/what-does-incompatible-in-go-mod-mean-will-it-cause-harm
        if "-" in purl_data.version:
            version = purl_data.version.split("-")
            if exp.match(purl_data.name):

What's the point of this check? Do go package names sometimes have v[0-9]+ in them? Why do we use the name in the download_url in that case?

@jasinner jasinner Feb 21, 2023

Maybe this check should actually be:
if len(namespace) >= 3:

Oh, perhaps this check is looking for modules with backward-incompatible changes, as explained here: https://go.dev/doc/modules/release-workflow

Maybe the checks could instead be replaced by:

                if len(namespace_parts) == 2:
                    return (
                        f"https://{namespace_parts[0]}/{namespace_parts[1]}/{purl_data.name}"
                        f"/tree/{version}"
                    )
                else:
                    return (
                        f"https://{namespace_parts[0]}/{namespace_parts[1]}/{namespace_parts[2]}"
                        f"/tree/{version}"
                    )

I don't see the need for the regular expression check, because the backward-incompatible changes always occur at the 3rd portion of the namespace anyway, right?
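
For discussion, the suggestion above could be sketched as a standalone helper along these lines. This is only a sketch: `github_tree_url` and its parameters are hypothetical names that are not part of the PR, and it assumes the last "-"-separated section of a pseudo-version is the commit id, as the code comment in the diff states.

```python
def github_tree_url(namespace, name, version):
    """Hypothetical helper: build a repo tree URL for a Go pseudo-version."""
    namespace_parts = namespace.split("/")
    if len(namespace_parts) < 2:
        # Not enough information to locate a repository.
        return None

    # Drop the "+incompatible" marker, then take the last "-"-separated
    # section, assumed here to be the commit id of the pseudo-version.
    commit = version.replace("+incompatible", "").split("-")[-1]

    if len(namespace_parts) == 2:
        # host/owner: the purl name is taken to be the repository name.
        repo = namespace_parts[2 - 2 + 0] and name
    else:
        # host/owner/repo[/subdir...]: the third part is the repository name.
        repo = namespace_parts[2]

    return f"https://{namespace_parts[0]}/{namespace_parts[1]}/{repo}/tree/{commit}"
```

With this shape no regular expression is needed for the pseudo-version branch: only the number of namespace parts decides where the repository name comes from.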

                return f"https://{namespace[0]}/{namespace[1]}/{namespace[2]}/tree/{version[-1]}"
            else:
                return f"https://{namespace[0]}/{namespace[1]}/{purl_data.name}/tree/{version[-1]}"

        # If the version refers to a module that uses semantic versioning but has not opted in to Go modules, it
        # carries a '+incompatible' marker in the version, which can simply be omitted in our case.
        # Ref: https://stackoverflow.com/questions/57355929/what-does-incompatible-in-go-mod-mean-will-it-cause-harm
        # Ref: https://github.com/golang/go/wiki/Modules#can-a-module-consume-a-package-that-has-not-opted-in-to-modules

        version = purl_data.version.replace("+incompatible", "")
        # If the referred module lives in a subdirectory of a repo, then the remaining path parts become part of the tag
        if len(namespace) >= 3:
            # Construct the base part of the URL
            url = f"https://{namespace[0]}/{namespace[1]}/{namespace[2]}/releases/tag/"
            # Add the remainder of the path to the tag
            for i in range(3, len(namespace)):
                url = url + namespace[i] + "%2F"
            # And finally add the version
            if exp.match(purl_data.name):
                url = url + f"{version}"
            else:
                url = url + f"{purl_data.name}%2F{version}"
            return url
        else:
            if exp.match(purl_data.name):
                return f"https://{purl_data.namespace}/releases/tag/{version}"
            else:
                print("No match")
                return f"https://{purl_data.namespace}/{purl_data.name}/releases/tag/{version}"
    else:
        if "-" in purl_data.version:
            # Version is not a semantic version, therefore not compatible with pkg.go.dev
            return
        else:
            return f"https://pkg.go.dev/{purl_data.namespace}/{purl_data.name}@{version}"
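
The release-tag branch above could also be written as a small standalone sketch, shown here only to make the tag construction easier to follow. `github_release_tag_url` and `MAJOR_VERSION` are hypothetical names that are not part of the PR. Two assumptions are made: a path element counts as a major-version suffix only when it matches `v[0-9]+` in full (with `re.match`, a name such as `v8go` would also match), and the tag is percent-encoded with `urllib.parse.quote` instead of appending `%2F` by hand.

```python
import re
from urllib.parse import quote

# Assumption: only a full match such as "v2" or "v10" is a major-version suffix.
MAJOR_VERSION = re.compile(r"v[0-9]+")


def github_release_tag_url(namespace, name, version):
    """Hypothetical helper: build a release tag URL for a tagged Go module."""
    segments = namespace.split("/") + [name]
    if MAJOR_VERSION.fullmatch(segments[-1]):
        # A trailing "vN" element is not part of the tag prefix.
        segments.pop()
    if len(segments) < 3:
        # host/owner/repo is the minimum needed to locate a repository.
        return None

    host, owner, repo, *subdir = segments
    # "+incompatible" is not part of the git tag, so it is dropped.
    version = version.replace("+incompatible", "")
    # Modules in a subdirectory are tagged as "<subdir>/<version>".
    tag = "/".join(subdir + [version])
    return f"https://{host}/{owner}/{repo}/releases/tag/{quote(tag, safe='')}"
```

For example, under these assumptions a namespace of `github.com/example/repo` with name `sub` and version `v1.2.3` yields a tag of `sub/v1.2.3`, encoded as `sub%2Fv1.2.3` in the URL.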


# Download URLs:

@@ -334,3 +413,4 @@ def build_github_download_url(purl):
version = f"{version_prefix}{version}"

return f"https://github.com/{namespace}/{name}/archive/refs/tags/{version}.zip"