-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single release CMake file that contains all required links/hashes. #4631
Conversation
This pull request has been linked to Shortcut Story #17775: Generate cmake file with artifact URLs and checksums for each release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is a great start that will simplify updates for downstream packages.
I am hesitant to have a tight coupling between all of FindTileDB_EP.cmake
and downstream packages. For now I think we should just provide the capability to get TileDB from github releases, but not aim to have users replace the whole FindTileDB_EP.cmake
(in case there are package-specific modifications, etc.)
I think we should limit this to a .cmake
file with:
- package URLs and SHAs
- create a CMake function for the code here, and include it in the file. Downstream packages can call the function inside of their own
FindTileDB_EP.cmake
.
See comment inline about using cmake configure_file to create this from a .in template.
.github/misc/FindTileDB_EP.cmake
Outdated
@@ -0,0 +1,141 @@ | |||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call this TileDB-<version>-packages.cmake.in
, and use cmake configure file to do the substitution: https://cmake.org/cmake/help/latest/command/configure_file.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it can live in https://github.com/TileDB-Inc/TileDB/tree/dev/cmake/inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I might not understand properly what you are trying to achieve. configure_file(...)
can be called from within another CMake file, so we would need something like generate.cmake
and TileDB-<version>-packages.cmake.in
that would be used to generate TileDB-<version>-packages.cmake
. In any case we still need to substitute all values in BASH script since it has the whole list-regex logic. Is this what you had in mind?:
- GitHub CI -> creates
generate.cmake
with all required values already in it. generate.cmake
uses these values to configure_fileTileDB-.....cmake.in
(This could have been done by CI in the first step)- upload the produced
TileDB-...cmake
with its SHA to release (We don't need to upload generate.cmake)
Do I understand it correctly?
If not could you perhaps explain your idea a bit more?
Love this. Having the information in one place (by release) is very good. But may I raise my hand and ask for a (complementary) non-cmake format? Maybe JSON? Maybe key:value pairs? R does use cmake and while I could shimmy it (with help) a simpler approach would be good too. |
I have no problem also adding a json format, however, do you really need release build outside cmake? I can read JSON from CMAKE without any issues since I can use |
Yes, good point @eddelbuettel
Yes, these builds are used by TileDB-R outside of CMake. |
To be perfectly plain, I would prefer non-JSON and reading JSON happens to not be batteries included. I currently use the 'Debian Control File' format R itself uses in its DESCRIPTION file, for that we have a ready-made parser included ( Of course, other formats work but at the cost of added dependencies. So JSON, YAML, TOML, ... are all equally (in)convenient with the need for other library. |
Yeah. Let's plan on a quick huddle this week and I show you a quick demo and more resources. In essence, R is driven by e.g. |
|
||
function(DOWNLOAD_TILEDB FROM_SOURCE) | ||
# Available builds | ||
set(URL_TILEDB_LINUX_X86_64 "{{URL_TILEDB_LINUX_X86_64}}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should set these in PARENT_SCOPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to pollute the parent scope with these variables, since noone else needs them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two quick comments, I will need some time to think of this design.
ExternalProject_Add(ep_tiledb | ||
PREFIX "externals" | ||
URL "${URL_TILEDB_SOURCE}" | ||
URL_HASH SHA1=${SHA_TILEDB_SOURCE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SHA-1 is insecure; we should be using at least SHA-256.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I would like to change it if everyone is OK with it
CMAKE_ARGS | ||
-DCMAKE_INSTALL_PREFIX=${EP_INSTALL_PREFIX} | ||
-DCMAKE_PREFIX_PATH=${EP_INSTALL_PREFIX} | ||
-DTILEDB_S3=${TILEDB_S3} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be passing more options here. The available set of options is at BuildOptions.cmake
. I'm not saying to pass all of them, but the user should be able to set some more like Azure, GCS and WebP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add these options to function's arguments
Ok so my next plan is to:
|
This might be a format that could be compatible with both R and CMake:
|
The item in the |
Here's how I am imagining this. Each release will have a JSON (or another format like CSV but I would prefer JSON IMHO) file named say {
// We could add this to help non-vcpkg builds from source.
"git-ref": "fa30a88abc5504ed8387e25937744afcd3802232",
"artifacts": {
"x64-windows": {
"filename": "tiledb-windows-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-linux": {
"filename": "tiledb-linux-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-linux-noavx2": {
"filename": "tiledb-linux-x86_64-noavx2-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-macos": {
"filename": "tiledb-macos-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"arm64-macos": {
"filename": "tiledb-macos-arm64-2.19.0-fa30a88a.zip",
"sha512": "…"
}
}
} By reading from the JSON file, the CMake script can be version-agnostic and be checked in once for every downstream repository. Here's how the script could be used: include(DownloadPrebuiltTileDB)
# Downloads prebuilt binaries of TileDB from GitHub
# Releases and calls find_package to import them.
find_prebuilt_tiledb(
# The version of TileDB to use. Required.
VERSION 2.19.0
# The hash of the artifacts.json file. Optional, will warn if not
# specified, maybe suggesting to at least enable CMAKE_TLS_VERIFY.
MANIFEST_SHA512 …
# The name of the release artifact to download.
# Optional, will try to autodetect if not specified.
ARTIFACT_NAME x64-linux-noavx2
# Flags that will also be passed to the find_package call.
REQUIRED QUIET
)
target_link_libraries(my_project PRIVATE TileDB::tiledb) Regarding building from source I am wondering if it would be a better idea to have this CMake script exclusively deal with downloading prebuilt binaries, and direct users that want more customizability to vcpkg (after we create a |
@teo-tsirpanis I really like your suggestion, but there is a requirement from the R team for a format that could be easily included without dependencies. So JSON is somewhat out of question. Other than that, the CMake script will contain two functions:
The question is that if we don't need |
The R build would need to load another package just to parse JSON, there is no built-in reader. Csv and alike work fine. Another idea, because we are not short those few bytes, would be to store both sha1 and sha256 checksums. |
Also, I am still not sure where to put this download logic, since it will not change between releases. Only the rellist changes. Maybe if I would hardcode the latest version into it it could be also distributed as release asset otherwise I am not sure what to do with it |
Ok, so what do you guys think about the latest changes and mostly about the Edit 1.: As always the produced CSV can be found here: https://github.com/dudoslav/TileDB/releases/download/t01/rellist.csv Edit 2: Fixed link Edit 2: Change link directly to CSV |
That's a cmake file, not a csv file. Edit two weeks later: Not sure what I was up to in that comment. Hm. |
4a40e0c
to
705260d
Compare
UPDATE: I had to rebase this branch on top of another branch that add CPACK releases, since those ones actually generate sha256 hashes. This means that this PR should be merged after that change: #4694 Here are proposed release assets: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
detect_artifact_name(FETCH_PREBUILT_TILEDB_ARTIFACT_NAME) | ||
endif() | ||
|
||
ExternalProject_Add(ep_tiledb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use FetchContent
or plain downloading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this PR should rework this download logic from the original approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am concerned that ExternalProject
might force downstream projects to use a superbuild architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExternalProject
Is this something we should address now in your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the issue is that they still use it right? Don't we want to switch to vcpkg style dependencies? (ports)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the plan is that all first-party downstream projects eventually get TileDB from vcpkg.
We might want to change ExternalProject
later, but it will be a breaking change which might or might not be OK to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought the use of ExternalProject
in this case will not require a superbuild project because we are using pre-built artifacts. I am not opposed to ExternalProject
then. There is still the question of what would happen with using this script to download source artifacts.
list(GET LINE 0 OS) | ||
list(GET LINE 1 ARCH) | ||
list(GET LINE 2 URL) | ||
list(GET LINE 3 SHA) | ||
|
||
if(${LENGTH} GREATER 4) | ||
list(GET LINE 4 NOAVX2) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have some reservations on how good this schema is. A more scalable and clean solution would be a PLATFORM,URL,SHA
, where PLATFORM
would be a string like x64-windows
, x64-linux-noavx2
, or source
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no problem reworking it that way, I assume that it would work for everyone.
@@ -74,7 +74,7 @@ endfunction() | |||
|
|||
function(detect_artifact_name OUT_VAR) | |||
if (WIN32) # Windows | |||
SET(${OUT_VAR} TILEDB_WINDOWS_X86_64 PARENT_SCOPE) | |||
SET(${OUT_VAR} TILEDB_WINDOWS-X86_64 PARENT_SCOPE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should fail on other architectures. Same in Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that CMAKE can only define these three variables: WIN32/APPLE/LINUX
I changed download logic to use FetchContent and add ReleaseListGeneration into Release workflow. This is the resulting artifacts: https://github.com/dudoslav/TileDB/releases/tag/t09 And this is the pipeline that produced it: https://github.com/dudoslav/TileDB/actions/runs/8110756605 I also downloaded |
This PR aims to autogenerate cmake file containing all required hash/url combinations so that other repositories can simply pull this file and forget about changing all url/hash occurences in their code. The output assets look like this now:
https://github.com/dudoslav/TileDB/releases/tag/t01
Please let me know if the produced CMake file is ok, or should be changed to a different format.
TYPE: BUILD
DESC: Single release CMake file that contains all required links/hashes.