Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implementation of substructure modules #87

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

sattwamo
Copy link
Collaborator

@sattwamo sattwamo commented Nov 5, 2024

Added functions for mass drop, soft drop, jet filtering and jet trimming in src/Substructure.jl
Added examples of using mass drop and jet trimming in examples/jetreco-substructure.jl

sattwamo and others added 3 commits November 1, 2024 17:24
Added some of the jet taggers and groomers (Mass Drop, Soft Drop, Jet Trim, Jet Filter).
Yet to add the JH Top Tagger.
Need to add documentation to the code.
Added suitable documentation to Substructure.jl
Will add more to jetreco-substructure.jl
@sattwamo sattwamo added the enhancement New feature or request label Nov 5, 2024
Copy link

codecov bot commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 98.79518% with 1 line in your changes missing coverage. Please review.

Project coverage is 73.49%. Comparing base (eb1e269) to head (094b34e).

Files with missing lines Patch % Lines
src/Substructure.jl 98.79% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #87      +/-   ##
==========================================
+ Coverage   71.52%   73.49%   +1.97%     
==========================================
  Files          17       18       +1     
  Lines        1166     1249      +83     
==========================================
+ Hits          834      918      +84     
+ Misses        332      331       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@graeme-a-stewart
Copy link
Member

graeme-a-stewart commented Nov 6, 2024

Thank you for this @sattwamo.

A few things jump out right away:

Formatting

You need to format the code according to our formatting guidelines. It's not hard to do this:

~/.julia/dev/JetReconstruction/ [substructure] julia 
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.1 (2024-10-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using JuliaFormatter

julia> format(".")
false

julia> format(".")
true

The format action returns true when the code conforms (and it fixes is when it isn't). Our preferred format is in .JuliaFormatter.toml.

Tests

We need some tests for the new code. Roughly this is to use FastJet to generate a reference output. For the reconstruction it is convenient to store this as gzipped JSON files (see test/data).

The new way to do tests is to have them factorised, e.g., add a test/test-substructure.jl file that runs the tests for this code. Then in the higher level test runner, runtests.jl, just have a line somewhere as include("test-substructure.jl").

You can look at test-{ee,pp}-reconstruction.jl as examples.

This allows one to run tests for one component, but still use the same tests in the everything running together in the CI.

Examples

The new way I like to do examples is to factorise them by functionality. So create an examples/substructure directory and put the examples in here - use a Project.toml in this directory so that you only load the dependencies required.

See https://github.com/JuliaHEP/JetReconstruction.jl/tree/main/examples/EDM4hep as the working example for how I want to do this (the current examples need to be factorised).

Documentation

Tagging and filtering need to be documented. I would suggest doing this in its own documentation section, e.g., docs/src/substructure.md. Then add "Substructure" => "substructure." to docs/make.jl.

Previewing the documentation is as usual in Documenter.jl, viz.

~/.julia/dev/JetReconstruction/docs/ [substructure*] julia --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.1 (2024-10-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("make.jl")
[ Info: SetupBuildDirectory: setting up build directory.
[ Info: Doctest: running doctests.
[ Info: ExpandTemplates: expanding markdown templates.
[ Info: CrossReferences: building cross-references.
[ Info: CheckDocument: running document checks.
[ Info: Populate: populating indices.
[ Info: RenderDocument: rendering document.
[ Info: HTMLWriter: rendering HTML pages.
┌ Warning: Generated HTML over size_threshold_warn limit: lib/internal.md
│     Generated file size: 128.16 (KiB)
│     size_threshold_warn: 100.0 (KiB)
│     size_threshold:      200.0 (KiB)
│     HTML file:           lib/internal/index.html
└ @ Documenter.HTMLWriter ~/.julia/packages/Documenter/C1XEF/src/html/HTMLWriter.jl:1828
[ Info: Automatic `version="0.4.3"` for inventory from ../Project.toml
┌ Warning: Documenter could not auto-detect the building environment. Skipping deployment.
└ @ Documenter ~/.julia/packages/Documenter/C1XEF/src/deployconfig.jl:76

julia> using LiveServer

julia> LiveServer.serve(dir="build/")
✓ LiveServer listening on http://localhost:8000/ ...
  (use CTRL+C to shut down)

I will leave any specific code comments separately.

@Moelf
Copy link
Member

Moelf commented Nov 6, 2024

The format action returns true when the code conforms (and it fixes is when it isn't). Our preferred format is in .JuliaFormatter.toml.

what's the status of making this inline comments of PR

@graeme-a-stewart
Copy link
Member

The format action returns true when the code conforms (and it fixes is when it isn't). Our preferred format is in .JuliaFormatter.toml.

what's the status of making this inline comments of PR

On the TODO list!

Copy link
Member

@graeme-a-stewart graeme-a-stewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great - thank you @sattwamo. This looks pretty solid.

I have left a bunch of comments, which are mostly about coding style and being consistent with the conventions in the rest of the module.

src/JetReconstruction.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/JetReconstruction.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
examples/jetreco-substructure.jl Outdated Show resolved Hide resolved
Added documentation, tests and formatted the code appropriately.
@graeme-a-stewart
Copy link
Member

graeme-a-stewart commented Nov 20, 2024

Hello @sattwamo!

Sorry it took some time to get back to this as I was travelling.

Overall things are looking in rather good shape now! More detailed comments to come, but a few quick things:

Tests

Input files

Can you gzip the reference files for the tests? This keeps them small for the package. As they are JSON lists of numbers they compress really, really well:

~/code/juliadev/JetReconstruction/test/data/ [substructure] ls -l fastjet*
-rw-r--r--  1 graemes  staff  196740 Nov 20 15:24 fastjet-jet-filter.json
-rw-r--r--  1 graemes  staff  184888 Nov 20 15:24 fastjet-jet-trim.json
-rw-r--r--  1 graemes  staff  189687 Nov 20 15:24 fastjet-mass-drop.json
-rw-r--r--  1 graemes  staff  158266 Nov 20 15:24 fastjet-soft-drop.json
 ~/code/juliadev/JetReconstruction/test/data/ [substructure] gzip -9 fastjet*~/code/juliadev/JetReconstruction/test/data/ [substructure*] ls -l fastjet*  
-rw-r--r--  1 graemes  staff  39751 Nov 20 15:24 fastjet-jet-filter.json.gz
-rw-r--r--  1 graemes  staff  30918 Nov 20 15:24 fastjet-jet-trim.json.gz
-rw-r--r--  1 graemes  staff  34650 Nov 20 15:24 fastjet-mass-drop.json.gz
-rw-r--r--  1 graemes  staff  17865 Nov 20 15:24 fastjet-soft-drop.json.gz

The read_fastjet_outputs will automatically decompress them for the test.

Documentation

Structures

You don't need, in fact you shouldn't, document the structures in substructure.md. The documentation you have written in Substructure.jl will be built and put into either the Public or Internal API pages automatically.

You reference it like this:

[`MassDropTagger`](@ref)

(there are a lot of examples in index.md).

What goes in substructure.md should be more descriptive of how to achieve an outcome, together with code snippets, and appropriate links to the API docs and examples (see, e.g., EDM4hep.md).

Copy link
Member

@graeme-a-stewart graeme-a-stewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only small changes needed now. Looking good!

src/Substructure.jl Outdated Show resolved Hide resolved
src/Substructure.jl Outdated Show resolved Hide resolved
@@ -0,0 +1,129 @@
# Jet Substructure

## Structures
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't just repeat the docstrings! Write a description of how to use these functions and refer to the API documentation by reference (e.g. [MassDropTagger](@ref)).

You can refer to the nice examples you wrote as well.


---

## Functions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above - don't repeat docstrings, show how to use these functions.

end

# Function to calculate kt distance between two pseudojets
function kt_distance(jet1::PseudoJet, jet2::PseudoJet)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be better with a proper docstring

Returns:
- PseudoJet: Groomed jet or zero-momentum PseudoJet if grooming fails
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove blank line

parent1, parent2 = parent2, parent1
end

pti = pt2(parent1)^0.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use pt(parent1) instead - it's defined for PseudoJets

end

pti = pt2(parent1)^0.5
ptj = pt2(parent2)^0.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

end

else
return PseudoJet(0.0, 0.0, 0.0, 0.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not completely convinced by this return type - it might be better in this case to return nothing. But we need to check this is ok from a performance point of view. Could you check that?

Returns:
- PseudoJet: Trimmed jet composed of retained subjets
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove blank

sattwamo and others added 2 commits December 11, 2024 16:46
Added new documentation on using substructure modules. Minor changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants