Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax type signature for result_type and allow it to act on numbers #192

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 7 additions & 4 deletions src/generic.jl
Original file line number Diff line number Diff line change
Expand Up @@ -26,15 +26,18 @@ evaluate(dist::PreMetric, a, b) = dist(a, b)
# Generic functions

"""
result_type(dist::PreMetric, Ta::Type, Tb::Type) -> T
result_type(dist::PreMetric, a::AbstractArray, b::AbstractArray) -> T
result_type(dist, Ta::Type, Tb::Type) -> T
result_type(dist, a::AbstractArray, b::AbstractArray) -> T
result_type(dist, a::Number, b::Number) -> T

Infer the result type of metric `dist` with input type `Ta` and `Tb`, or input
data `a` and `b`.
"""
result_type(::PreMetric, ::Type, ::Type) = Float64 # fallback
result_type(dist::PreMetric, a::AbstractArray, b::AbstractArray) = result_type(dist, eltype(a), eltype(b))
result_type(::PreMetric, ::Type, ::Type) = Float64 # fallback in Distances
result_type(f, a::Type, b::Type) = typeof(f(oneunit(a), oneunit(b))) # don't require `PreMetric` subtyping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to preserve units if inputs have them for all kinds of distances?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure; I guess that's a question for @johnnychen94, since the method came from #185. To me the "all kinds of distances" isn't super worrying since any distance could just provide a more specialized method, but to be honest I'm not super confident in the interplay between units and distances and maybe that could be tedious or painful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as we've discussed in the original issue #190, this change gives other packages the ability to reuse the name result_type without introducing type piracy. Because Distances defines result_type, the fallback should be defined in Distances.

Copy link
Contributor

@johnnychen94 johnnychen94 Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, result_type is exported by Distances.

export
# generic types/functions
PreMetric,
SemiMetric,
Metric,
# generic functions
result_type,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good point, it is exported!

I think @nalimilan's question here was about oneunit, any thoughts on that @johnnychen94?

Are we sure we want to preserve units if inputs have them for all kinds of distances?

Copy link
Contributor

@johnnychen94 johnnychen94 Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the question clearly and I don't know what that question intends to suggest.


I don't know why the fallback is Float64 for PreMetric, it's probably written down when result_type wasn't exported and the main consideration is probably arithmetic performance inside Distances only.

FixedPointNumbers has introduced a better strategy floattype(x) to rewrite that N0f8( normalized UInt8) should get promoted to Float32 instead of Float16. I think Distances should follow the same pattern and should not hard code the result type into Float64. People don't like changes, so it's okay to keep the internal Distances unchanged.

We're not changing Distances internal implementation here. As is said, we're adding a fallback that will be overridden by all the internal distances' implementation.

I think for any general function f (not limited to Distances itself), calling f on the units is a more generic fallback. Let me give two examples here:

For Unitful:

julia> using Unitful, Distances

julia> x = fill(999u"mm", 4, 4);

julia> y = fill(1u"m", 4, 4);

julia> cityblock(x, y)
2//125 m

julia> result_type(Cityblock(), x, y)
Quantity{Rational{Int64},𝐋,Unitful.FreeUnits{(m,),𝐋,nothing}}

julia> result_type(cityblock, x, y) # undefined but shouldn't be Float64

Also for JuliaImages, ColorVectorSpaces has defined a lot of basic operations for Gray and RGB, so it makes sense to let result_type be Gray/RGB if that's what they are supposed to be.

julia> using ImageCore, ColorVectorSpace

julia> x = rand(Gray{N0f8}, 4, 4);

julia> -(oneunit(eltype(x)), oneunit(eltype(x)))
Gray{N0f8}(0.0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Actually I hadn't realized that oneunit is already used by existing methods:

Distances.jl/src/metrics.jl

Lines 216 to 221 in e50899d

result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}) where {Ta,Tb} =
result_type(dist, Ta, Tb, parameters(dist))
result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}, ::Nothing) where {Ta,Tb} =
typeof(_evaluate(dist, oneunit(Ta), oneunit(Tb)))
result_type(dist::UnionMetrics, ::Type{Ta}, ::Type{Tb}, p) where {Ta,Tb} =
typeof(_evaluate(dist, oneunit(Ta), oneunit(Tb), oneunit(eltype(p))))

Float64 is indeed not a great default. Do you know whether it's actually used in Distances ? Maybe we could remove the fallback method and always rely on oneunit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By removing this fallback, and observing the test errors, it tells me that only BhattacharyyaDist, HellingerDist and Bregman requires this fallback for its colwise implementation. Currently, it makes sense to me to remove this fallback (probably in another PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking. I think it would be better to try removing the fallback in this PR. Otherwise the new method wouldn't actually be used for PreMetric so it would be misleading to call that Distances.result_type!


result_type(dist, a::AbstractArray, b::AbstractArray) = result_type(dist, eltype(a), eltype(b))
result_type(dist, a::Number, b::Number) = result_type(dist, typeof(a), typeof(b))

# Generic column-wise evaluation

Expand Down
20 changes: 20 additions & 0 deletions test/test_dists.jl
Original file line number Diff line number Diff line change
@@ -1,5 +1,25 @@
# Unit tests for Distances

struct FooDist <: PreMetric end # Julia 1.0 Compat: struct definition must be put in global scope

@testset "result_type" begin
foodist(a, b) = a + b
for (Ta, Tb) in [
(Int, Int),
(Int, Float64),
(Float32, Float32),
(Float32, Float64),
]
A, B = rand(Ta, 2, 3), rand(Tb, 2, 3)
@test result_type(FooDist(), A, B) == result_type(FooDist(), Ta, Tb) == Float64
@test result_type(foodist, A, B) == result_type(foodist, Ta, Tb) == typeof(foodist(oneunit(Ta), oneunit(Tb)))

a, b = rand(Ta), rand(Tb)
@test result_type(FooDist(), a, b) == result_type(FooDist(), Ta, Tb) == Float64
@test result_type(foodist, a, b) == result_type(foodist, Ta, Tb) == typeof(foodist(oneunit(Ta), oneunit(Tb)))
end
end

function test_metricity(dist, x, y, z)
@testset "Test metricity of $(typeof(dist))" begin
@test dist(x, y) == evaluate(dist, x, y)
Expand Down