Skip to content
This repository has been archived by the owner on Aug 29, 2018. It is now read-only.

immutable.Bag doesn't extend GenericTraversableTemplate[Bag[T]] #5

Open
Blaisorblade opened this issue May 26, 2015 · 7 comments
Open

Comments

@Blaisorblade
Copy link
Contributor

This means that the return type of e.g. flatten is just Iterable[T]. This should be hopefully easy to fix.
I haven't checked the other Bag variants yet, though I conjecture they need updating as well.

@nicolasstucki
Copy link
Owner

I'm not sure if the semantic of flatten would be correct for any custom equivalence. I'll have to analyse it depth. Could you tell me which is your specific use case.

@Blaisorblade
Copy link
Contributor Author

I'm using a standard equality (compact), and relying the equivalence b.flatMap(f).flatten == b.map(f). (More in particular, I'm saving the results of b.flatMap(f), to update it incrementally if b undergoes a change). I haven't fully understood semantics of other cases, I'm still exploring your library.

Your comment about semantics seems interesting, but since flatten needn't access implementation details, this seems likely to affect other client code. And indeed, if I consider bags of different sizes equivalent in my bb: Bag[Bag[T]], the size of bb.flatten will be unpredictable with some implementations of flatten.

I currently uglily reimplemented flatten like in the standard library, which is probably slower than needed.

  implicit def config[T] = Bag.configuration.compact[T]
  implicit class FlattenOps[T](b: Bag[Bag[T]]) {
    def flattenB: Bag[T] = {
      val builder = Bag.newBuilder[T]
      for (baglet <- b)
        builder ++= baglet
      builder.result
    }
  }

@nicolasstucki
Copy link
Owner

You are probably right, extending it will work correctly.

@Blaisorblade
Copy link
Contributor Author

I'm not sure I'm suggesting that. However, for standard equalities it'll work fine, and that's the usecase I'd like. For the rest, I'm not sure what correctly means.

Also, when I hacked together quickly a bag implementation, I actually had a potentially faster version, which multiplies multiplicities instead of readding elements:

case class Bag[A](contents: immutable.Map[A, Int] = immutable.HashMap())
//XXX: Not reusing collection infrastructure :-(
{
  //...
  def flatten[B](implicit p: A <:< Bag[B]): Bag[B] = {
    new Bag(for {
      (outerEl, count) <- contents
      innerBag = p(outerEl)
      (el, count2) <- innerBag.contents
    } yield (el, count * count2))
  }
}

Could such an implementation be supported in your library?

@nicolasstucki
Copy link
Owner

That may work for the compact representation but will fail on the others. But the implementation of flatten might be just a union over the bags. Which would internally do something similar to the code you wrote when using the compact representation.

@Blaisorblade
Copy link
Contributor Author

Thanks, I was indeed just supporting a compact representation.

I've tried out your suggestion. However, as currently implemented, that appears to take quadratic time, because union takes time linear in the size of both collections (see new issue #7), and the accumulator keeps growing. (In fact, I've not waited for my benchmark to terminate).

The complexity is thus equivalent to the one of the code below, which has more visibly quadratic cost:

var acc = Array.empty
for (bag <- bags) {
  val newAcc = Array.empty
  newAcc += acc   //cost: O( | acc | ), which keeps increasing
  newAcc += bag   //cost: O( | bag | )
  acc = newAcc
}

@Blaisorblade
Copy link
Contributor Author

I forgot to write the implementation of flattenB I tried out:

    def flattenB: Bag[T] =
      b.fold[Bag[T]](Bag.empty)(_ union _)

Apparently, that fold resolves to foldLeft, though foldRight wouldn't make a difference.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants