Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax type signature for result_type and allow it to act on numbers #192

Closed
wants to merge 5 commits into from
Closed

Relax type signature for result_type and allow it to act on numbers #192

wants to merge 5 commits into from

Conversation

ericphanson
Copy link

Redo of #185 along with defining result_type for Numbers so that result_type can act on objects generically as discussed in #190.

Closes #190

@codecov-io
Copy link

codecov-io commented Dec 4, 2020

Codecov Report

Merging #192 (ddeab36) into master (e50899d) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #192      +/-   ##
==========================================
+ Coverage   95.05%   95.07%   +0.01%     
==========================================
  Files           8        8              
  Lines         688      690       +2     
==========================================
+ Hits          654      656       +2     
  Misses         34       34              
Impacted Files Coverage Δ
src/generic.jl 97.95% <100.00%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e50899d...ddeab36. Read the comment docs.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for the little typo.

src/generic.jl Outdated Show resolved Hide resolved
Co-authored-by: Daniel Karrasch <[email protected]>
@ericphanson
Copy link
Author

Bump

@johnnychen94
Copy link
Contributor

I believe this one is covered by #194 now

@ericphanson
Copy link
Author

Ah indeed, I missed that. Ok, should I close this PR?

@dkarrasch
Copy link
Member

I didn't mean to steal anything, I was just anticipating these changes and copied them. We should try to get feedback by somebody else before merging this. @nalimilan?

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I find it a bit weird to add unexported definitions that aren't used at all by the package. Does #194 use them at least?

src/generic.jl Outdated Show resolved Hide resolved
result_type(::PreMetric, ::Type, ::Type) = Float64 # fallback
result_type(dist::PreMetric, a::AbstractArray, b::AbstractArray) = result_type(dist, eltype(a), eltype(b))
result_type(::PreMetric, ::Type, ::Type) = Float64 # fallback in Distances
result_type(f, a::Type, b::Type) = typeof(f(oneunit(a), oneunit(b))) # don't require `PreMetric` subtyping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to preserve units if inputs have them for all kinds of distances?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure; I guess that's a question for @johnnychen94, since the method came from #185. To me the "all kinds of distances" isn't super worrying since any distance could just provide a more specialized method, but to be honest I'm not super confident in the interplay between units and distances and maybe that could be tedious or painful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as we've discussed in the original issue #190, this change gives other packages the ability to reuse the name result_type without introducing type piracy. Because Distances defines result_type, the fallback should be defined in Distances.

Copy link
Contributor

@johnnychen94 johnnychen94 Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, result_type is exported by Distances.

export
# generic types/functions
PreMetric,
SemiMetric,
Metric,
# generic functions
result_type,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good point, it is exported!

I think @nalimilan's question here was about oneunit, any thoughts on that @johnnychen94?

Are we sure we want to preserve units if inputs have them for all kinds of distances?

Copy link
Contributor

@johnnychen94 johnnychen94 Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the question clearly and I don't know what that question intends to suggest.


I don't know why the fallback is Float64 for PreMetric, it's probably written down when result_type wasn't exported and the main consideration is probably arithmetic performance inside Distances only.

FixedPointNumbers has introduced a better strategy floattype(x) to rewrite that N0f8( normalized UInt8) should get promoted to Float32 instead of Float16. I think Distances should follow the same pattern and should not hard code the result type into Float64. People don't like changes, so it's okay to keep the internal Distances unchanged.

We're not changing Distances internal implementation here. As is said, we're adding a fallback that will be overridden by all the internal distances' implementation.

I think for any general function f (not limited to Distances itself), calling f on the units is a more generic fallback. Let me give two examples here:

For Unitful:

julia> using Unitful, Distances

julia> x = fill(999u"mm", 4, 4);

julia> y = fill(1u"m", 4, 4);

julia> cityblock(x, y)
2//125 m

julia> result_type(Cityblock(), x, y)
Quantity{Rational{Int64},𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}

julia> result_type(cityblock, x, y) # undefined but shouldn't be Float64

Also for JuliaImages, ColorVectorSpaces has defined a lot of basic operations for Gray and RGB, so it makes sense to let result_type be Gray/RGB if that's what they are supposed to be.

julia> using ImageCore, ColorVectorSpace

julia> x = rand(Gray{N0f8}, 4, 4);

julia> -(oneunit(eltype(x)), oneunit(eltype(x)))
Gray{N0f8}(0.0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Actually I hadn't realized that oneunit is already used by existing methods:

Distances.jl/src/metrics.jl

Lines 216 to 221 in e50899d

result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}) where {Ta,Tb} =
result_type(dist, Ta, Tb, parameters(dist))
result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}, ::Nothing) where {Ta,Tb} =
typeof(_evaluate(dist, oneunit(Ta), oneunit(Tb)))
result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}, p) where {Ta,Tb} =
typeof(_evaluate(dist, oneunit(Ta), oneunit(Tb), oneunit(eltype(p))))

Float64 is indeed not a great default. Do you know whether it's actually used in Distances ? Maybe we could remove the fallback method and always rely on oneunit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By removing this fallback, and observing the test errors, it tells me that only BhattacharyyaDist, HellingerDist and Bregman requires this fallback for its colwise implementation. Currently, it makes sense to me to remove this fallback (probably in another PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking. I think it would be better to try removing the fallback in this PR. Otherwise the new method wouldn't actually be used for PreMetric so it would be misleading to call that Distances.result_type!

Co-authored-by: Milan Bouchet-Valat <[email protected]>
@ericphanson
Copy link
Author

TBH I find it a bit weird to add unexported definitions that aren't used at all by the package

do you mean that you consider result_type internal and not part of the public API? The reason I want this change is because it's used in downstream code (ref #190). I can instead make a PR to NearestNeighborDescent if it should be using something other than Distances.result_type, e.g. its own vendored copy with this method.

(Relatedly, since Distances.jl doesn't have docs, it's not clear what constitutes the public API-- only exported functions, or functions with docstrings, or some mix thereof?)

@nalimilan
Copy link
Member

do you mean that you consider result_type internal and not part of the public API? The reason I want this change is because it's used in downstream code (ref #190). I can instead make a PR to NearestNeighborDescent if it should be using something other than Distances.result_type, e.g. its own vendored copy with this method.

Sorry, I hadn't realized it was exported. I considered it as the equivalent of Base.promote_op, which is something nobody is proud of but we haven't been able to replace yet. (In general, as ?Base.promote_op says, the recommended approach in Julia is to avoid relying on the eltype but instead of use the actual type of elements, and widen the type if necessary, like map and collect do. But in practice that's quite complex to implement, and Distances predates that approach.)

(Relatedly, since Distances.jl doesn't have docs, it's not clear what constitutes the public API-- only exported functions, or functions with docstrings, or some mix thereof?)

Exported functions are definitely part of the API. It's less clear for unexported functions that have docstrings: in that case I'd say it depends on what the manual/README says.

@ericphanson
Copy link
Author

Thanks all for the review and discussion; happily closing in favor of #194.

@ericphanson ericphanson deleted the eph/result_type branch December 18, 2020 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

result_type acts on objects for arrays but on types otherwise
5 participants