StatsBase currently fails to calculate weighted sums and means of vectors of vectors/arrays (unweighted sums and means work fine).
Given
A = [[1.5, 2.5], [3.5, 4.5]]
w = Weights([0.5, 0.6])
The unweighted mean does what it should:
julia> mean(A)
2-element Array{Float64,1}:
2.5
3.5
However, the weighted mean fails:
julia> mean(A, w)
ERROR: DimensionMismatch("x and y are of different lengths!")
This happens because StatsBase defines mean(A, w) via sum(A, w) / sum(w), and sum(A, w) via dot(A, values(w)). However, dot operates recursively, so it tries to compute dot(A[i], values(w)[i]) for each pair of elements, where A[i] is a vector and values(w)[i] is a scalar. That inner call is what raises the DimensionMismatch.
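For reference, here is a minimal sketch of the result I would expect, written without dot so that each element is scaled by its scalar weight instead of being recursed into. The helper name weighted_mean is hypothetical and not part of StatsBase, and the sketch uses only Base with a plain vector of weight values (with a Weights object, I assume one would pass values(w) instead).

```julia
# Hypothetical helper, NOT part of StatsBase: weighted mean of a vector of arrays.
# Each element of A is multiplied by its scalar weight, avoiding the recursive dot.
function weighted_mean(A::AbstractVector, w::AbstractVector{<:Real})
    length(A) == length(w) ||
        throw(DimensionMismatch("A and w must have the same length"))
    # Weighted sum of the elements, normalized by the total weight
    return sum(a * wi for (a, wi) in zip(A, w)) / sum(w)
end

A = [[1.5, 2.5], [3.5, 4.5]]
w = [0.5, 0.6]  # plain weight values; assumed equivalent to values(Weights([0.5, 0.6]))
m = weighted_mean(A, w)
```

This is only meant to illustrate the intended semantics; a proper fix inside StatsBase would presumably need to handle the existing Weights types and dimension arguments as well.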
I believe this is a valid use case - @lmh91 and I encountered this while trying to calculate the weighted mean of 3D spatial positions (represented as a vector of static vectors).