add functions to build html reference manuals #15

Open
wants to merge 7 commits into
base: devel
from

Conversation

LiNk-NY
Contributor

@LiNk-NY LiNk-NY commented Nov 6, 2024

This is a work in progress. The functions are working but I am not sure how to make them run on the builders.

See here for the local setup step for testing:

https://github.com/r-devel/repos/blob/main/R/minibioc.R

cc: @jwokaty @hpages

@jwokaty
Contributor

jwokaty commented Nov 6, 2024

I will take a look at it tomorrow. I think the BBS has some Python functions that can wrap around R functions.

@jwokaty
Contributor

jwokaty commented Nov 11, 2024

While I can see that build_html_mans produces the manual with links, it's not clear to me how the other functions would be used with the BBS. Could you give a little more context?

For build_html_mans, I see that packages_dirs is a list of paths to the package repositories and that each path is passed to tools::pkg2HTML. Currently, the BBS uses biocViews::extractManuals to get the manuals from the source tarballs in a CRAN-style repository that will be propagated. Looking at tools::pkg2HTML, it sounds like there might be an option to build the manual from a tarball? I might be misunderstanding the documentation, and maybe this is what is meant by 'experimental'. I couldn't seem to make this work on a tarball. This is desirable because I don't have to change much when/where things happen in the BBS, but if that's not possible, it's okay.

Is src_base the same as repoRoot, which is referred to in other parts of biocViews? It's the path of the top of the CRAN-style repository. I might recommend the same language.

I haven't looked too deeply but I did try to follow the examples, which worked for me.

@LiNk-NY
Contributor Author

LiNk-NY commented Nov 12, 2024

Hi Jen, @jwokaty

I think Copilot did a pretty good job of summarizing the code:

The other functions in build_html_mans.R are used to create and update databases of aliases and cross-references for R packages, which can be helpful for creating searchable documentation and reference materials. Here's a brief overview of how each function fits into the overall process:

  1. build_db_from_source:

    • This function generates the aliases.rds and rdxrefs.rds files from the source files of an R package. It reads the documentation files (.Rd files) in the package directory and extracts aliases (alternative names) and cross-references.
    • These .rds files are then saved to the web directory of the package repository, making them available for other functions to use.
  2. build_meta_aliases_db:

    • This function creates or updates a meta-database of aliases for all packages in the specified web directory.
    • It reads the aliases.rds files generated by build_db_from_source and combines them into a single meta-database file (aliases_db_file).
    • The force parameter determines whether to update only the entries with newer aliases.rds files or to rebuild the entire database.
  3. build_meta_rdxrefs_db:

    • Similar to build_meta_aliases_db, this function creates or updates a meta-database of cross-references (rdxrefs.rds files) for all packages in the web directory.
    • It reads the rdxrefs.rds files generated by build_db_from_source and combines them into a single meta-database file (rdxrefs_db_file).
    • The force parameter works the same way, determining whether to update only the entries with newer rdxrefs.rds files or to rebuild the entire database.
  4. build_html_mans:

    • This function generates HTML manuals from the source directories of R packages, which are then saved to the specified directory.
    • It uses the pkg2HTML function to convert the documentation files into HTML format, making them easily accessible and readable online.

End copilot.
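To make the combining step in the summary above concrete, here is a minimal sketch in base R. It is not the code in this PR: combine_aliases_sketch, the <web_dir>/<pkg>/aliases.rds layout, and the mock alias tables are all assumptions made for illustration, and the real build_meta_aliases_db may store its data differently.

```r
# Hypothetical helper, NOT the PR's build_meta_aliases_db():
# combine per-package aliases.rds files into one named list.
combine_aliases_sketch <- function(web_dir, out_file) {
  # Assumed layout: <web_dir>/<pkg>/aliases.rds
  rds_files <- list.files(
    web_dir, pattern = "^aliases\\.rds$",
    recursive = TRUE, full.names = TRUE
  )
  # Name each entry after the package directory that holds the file
  names(rds_files) <- basename(dirname(rds_files))
  meta_db <- lapply(rds_files, readRDS)
  saveRDS(meta_db, out_file)
  invisible(meta_db)
}

# Usage with two mock packages in a temporary directory
web_dir <- tempfile("web")
for (pkg in c("pkgA", "pkgB")) {
  dir.create(file.path(web_dir, pkg), recursive = TRUE)
  # Mock alias table (assumed shape): Rd file name -> aliases
  saveRDS(
    list("foo.Rd" = c("foo", paste0("foo-", pkg))),
    file.path(web_dir, pkg, "aliases.rds")
  )
}
out_file <- tempfile(fileext = ".rds")
meta_db <- combine_aliases_sketch(web_dir, out_file)
names(meta_db)
# "pkgA" "pkgB"
```

The same pattern would apply to rdxrefs.rds by changing the file name pattern.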

I couldn't seem to make this work on a tarball. This is desirable because I don't have to change much when/where things happen in the BBS, but if that's not possible, it's okay.

This worked for me on a package tarball:

> tools::pkg2HTML("AnVIL_1.19.3.tar.gz")
Warning message:
In y[i] <- if (is.null(value)) NULL else as.person(value) :
  number of items to replace is not a multiple of replacement length
> file.exists("AnVIL.html")
[1] TRUE

What are the advantages of running it on the tarball? I think running the code on the source package directory allows one to run it at any time (e.g., after the package is updated). The tarballs are dependent on the build process. It may also take more time to untar and parse the tarballs than to work on the source directories.
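The two input styles discussed here (a source directory versus a source tarball) could be wrapped roughly as follows. This is a sketch under assumptions: manual_out_path and build_manual_sketch are hypothetical names, and the dir and out arguments of tools::pkg2HTML are used as I understand them from its help page, so check ?tools::pkg2HTML for your R version, since tarball input is marked experimental.

```r
# Hypothetical helpers; names are illustrative, not from the PR.
manual_out_path <- function(pkg_source, out_dir) {
  # "AnVIL_1.19.3.tar.gz" -> "AnVIL"; "path/to/AnVIL" -> "AnVIL"
  pkg_name <- sub("_.*$", "", basename(pkg_source))
  file.path(out_dir, paste0(pkg_name, ".html"))
}

build_manual_sketch <- function(pkg_source, out_dir = ".") {
  out <- manual_out_path(pkg_source, out_dir)
  if (dir.exists(pkg_source)) {
    # Source checkout: pass the directory via `dir` (assumed usage)
    tools::pkg2HTML(dir = pkg_source, out = out)
  } else {
    # Source tarball: pass the file path as `package` (experimental)
    tools::pkg2HTML(pkg_source, out = out)
  }
  invisible(out)
}

# Only the path helper is exercised here; no manual is built
manual_out_path("AnVIL_1.19.3.tar.gz", "manuals")
# "manuals/AnVIL.html"
```

Deriving the output name from the input keeps the result the same whichever input style the builders end up using.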

Is src_base the same as repoRoot, which is referred to in other parts of biocViews? It's the path of the top of the CRAN-style repository. I might recommend the same language.

I don't see where repoRoot is mentioned in the package. Can you point to the code that uses repoRoot?

Yes, it is the first part of the full package URL, e.g.,
https://bioconductor.org/packages/release/bioc/ in
https://bioconductor.org/packages/release/bioc/src/contrib/MultiAssayExperiment_1.32.0.tar.gz
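As a small illustration of how the repository root relates to the full tarball URL above, here is a sketch; tarball_url is a hypothetical helper, not a function in biocViews or this PR.

```r
# Hypothetical helper: compose a source tarball URL from the
# repository root, package name, and version.
tarball_url <- function(repos_root, pkg, version) {
  # Drop any trailing slash so the pieces join cleanly
  repos_root <- sub("/+$", "", repos_root)
  paste0(repos_root, "/src/contrib/", pkg, "_", version, ".tar.gz")
}

tarball_url(
  "https://bioconductor.org/packages/release/bioc/",
  "MultiAssayExperiment", "1.32.0"
)
# "https://bioconductor.org/packages/release/bioc/src/contrib/MultiAssayExperiment_1.32.0.tar.gz"
```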

Best regards,
Marcel

@jwokaty
Contributor

jwokaty commented Nov 18, 2024

The discussion last week helped clarify a lot of questions, including getting html from source versus tarballs. We'll get the HTML from source.

I'll include the other clarifications/todos here:

  • All aliases.rds files should be combined and placed in src/contrib/Meta, following CRAN's example: https://cran.r-project.org/src/contrib/Meta/. The rdxrefs.rds files should be handled the same way.
  • These two files should be available when generating the HTML files.
  • It's still not clear if we should do force = TRUE. I think yes unless the amount of time to do it is significant.
  • HTML manuals will not necessarily replace PDF manuals but may appear alongside them. (Need to follow up to determine how to make available on bioconductor.org.)

I misspelled: it's called reposRoot. See

Establish a top-level directory for the repository, we will refer to this
directory as reposRoot. Place your packages as follows:

@LiNk-NY
Contributor Author

LiNk-NY commented Nov 19, 2024

including getting html from source versus tarballs.

It would be less disruptive to generate the HTML from the source directories, but the result may not be in sync with the builds. OTOH, generating from tarballs will be in sync with the builds but will have to be run after the tarballs are generated. Feel free to chime in Hervé @hpages.

  • These two files should be available when generating the HTML files.

Not necessarily. These files are for CRAN to inspect / merge into their DB for linking to Bioconductor cross-references.

  • It's still not clear if we should do force = TRUE. I think yes unless the amount of time to do it is significant.

By default, an entry in the combined Rds file is updated only when the corresponding package's Rds file has a more recent mtime. I think we should keep this default behavior.
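For illustration only, an mtime comparison of this kind could look like the sketch below. stale_entries is a hypothetical helper and not the logic in build_meta_aliases_db or build_meta_rdxrefs_db; it assumes the comparison is made against the mtime of the combined database file.

```r
# Hypothetical check, NOT the PR's code: which per-package Rds files
# are newer than the combined database file?
stale_entries <- function(pkg_rds_files, db_file, force = FALSE) {
  # Rebuild everything when forced or when no database exists yet
  if (force || !file.exists(db_file)) {
    return(pkg_rds_files)
  }
  pkg_rds_files[file.mtime(pkg_rds_files) > file.mtime(db_file)]
}

# Usage with mock files and explicitly set modification times
old_rds <- tempfile(fileext = ".rds")
new_rds <- tempfile(fileext = ".rds")
db_file <- tempfile(fileext = ".rds")
for (f in c(old_rds, new_rds, db_file)) saveRDS(list(), f)
now <- Sys.time()
Sys.setFileTime(old_rds, now - 3600)  # older than the database
Sys.setFileTime(db_file, now - 1800)
Sys.setFileTime(new_rds, now)         # newer than the database

stale_entries(c(old_rds, new_rds), db_file)                # new_rds only
stale_entries(c(old_rds, new_rds), db_file, force = TRUE)  # both files
```

With force = FALSE only the changed package is reprocessed, which is the main argument for keeping the default if the full rebuild turns out to be slow.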

  • (Need to follow up to determine how to make available on bioconductor.org.)

This will likely be another cell in the table on the landing page, right next to the pdf link.

I misspelled: it's called reposRoot. See

I have changed src_destDir to reposRoot.

@LiNk-NY
Contributor Author

LiNk-NY commented Nov 19, 2024

Note. I've created a PR to update the landing pages with the HTML links:

Bioconductor/bioconductor.org#299

@jwokaty
Contributor

jwokaty commented Nov 19, 2024

For the BBS builds, the final repository structure is assembled close to the propagation step, when it's easier to point to tarballs than to source repositories. I will work with whatever you make available.

I misunderstood what you said about the 2 RDS files. Great, they don't need to be available for generating HTML. Just put them in the Meta directory. And we should use the defaults for build_meta_*_db. Got it.

@LiNk-NY LiNk-NY marked this pull request as ready for review November 20, 2024 18:59