-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Statistic.mean(f, A) calls f length(A)+1 times #49
Comments
Right, This behavior is new in Julia 1.5.0 (#25). But I'm not sure it's a problem: AFAIK we don't make any guarantees about how many times FWIW, the previous code was this: I'm not sure we can retain the new promotion behavior and at the same time avoid calling |
We could do something like:
in the common case where |
That looks reasonable, thanks. It's easy to work around but was a bit surprising. Guess I view an iterator with a visiting function to have an invariant of only calling the function once per element. |
Also ran into this as I was surprised why my very expensive function was getting called N+1 times instead of N (which really mattered, since my N was small). Having that special case or some other workaround would be great. |
Should be a straightforward PR to implement something like the suggestion above, if someone wants to take a crack at it? |
Unclear how to define first() when there's |
I wrote:
It would be nice to at least fix this simple common case. |
I must have overlooked the qualification because it feels unsatisfactory to me. |
Interesting rabbit hole...
@test (@inferred mean(Int[])) === 0/0
@test (@inferred mean(Float32[])) === 0.f0/0
@test (@inferred mean(Float64[])) === 0/0 seems like typeof(mean(Float32[]) == typeof(0.f0/0) is good enough I have no clue why the compiler is returning Any Julia> @inferred (mean([1.f0 3.f1 2; 1.0 3 missing],dims=1))
ERROR: return type Matrix{Union{Missing, Float64}} does not match inferred return type Any
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] top-level scope
@ REPL[27]:1
julia> mean([1.f0 3.f1 2; 1.0 3 missing],dims=1)
1×3 Matrix{Union{Missing, Float64}}:
1.0 16.5 missing Recommend case two to assist with issue #7. The first case this bug can be handled with documentation change. Code: function _mean(f, A::AbstractArray, dims::Dims=:) where Dims
isempty(A) && return sum(f, A, dims=dims)/0
if dims === (:)
x1 = f(first(A)) / 1
n = length(A)
if n == 1
return x1
end
AA = @view A[2:end]
result = x1 + sum(x -> _mean_promote(x1, f(x)), AA, dims=dims)
return result / n
elseif typeof(first(A)) <: AbstractFloat
x1 = convert(typeof(first(A)), 0.f0)
n = mapreduce(i -> size(A, i), *, unique(dims); init=1)
result = sum(x -> _mean_promote(x1, f(x)), A, dims=dims)
return result ./= n
else # note: f() will be called n+1 times
x1 = f(first(A)) / 1
n = mapreduce(i -> size(A, i), *, unique(dims); init=1)
result = sum(x -> _mean_promote(x1, f(x)), A, dims=dims)
return result ./= n
end
end |
(PS. please quote code in the future, @stefan911 … I edited your post to fix this.) |
Thanks. Since f() can change the return type it is difficult to remove the third case without unrolling into for loops. |
That also fails on vectors with one element, so it should be
|
Created PR I think Base.first() should be amended to accept |
Opinion: The general case of an iterator is to apply the function exactly once per element, to do otherwise should be documented in the help. foldr, foldl vs mapreduce. The expectation that f() is a pure function shouldn't be implied. Did issue #7 get addressed as well? |
As said elsewhere, it would be nice if Julia would develop a standard design pattern for this. In accumulation where there is no sensible typestable default value, you end up splitting off the first iteration, duplicate the entire loop body, and make a tiny change to it (eg += becomes =) and continue with a loop over the rest of the iterator. |
Actually I think I've found a super simple solution which works in general and luckily doesn't reduce performance. See #151. EDIT: that's more complex than I thought. I don't have a good solution. |
Not sure if it helps you but DNF gave me me a quirky snipped which initialises an accumulator in the loop in a way Julia can handle
|
Unfortunately the difficulty comes from the fact that we do not control the loop as we want to use pairwise summation just like |
Was expecting an iterator to only call the function once per element.
julia> versioninfo()
Julia Version 1.5.0-rc2.0
Commit 7f0ee122d7 (2020-07-27 15:24 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin19.5.0)
CPU: Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, ivybridge)
Environment:
JULIA_NUM_THREADS = 8
The text was updated successfully, but these errors were encountered: