Lazily provides various "lazy" Enumerable operations, similar to:
- Clojure's sequences
- Haskell's lists
- Scala's "views"
- Ruby 2.0's Enumerator::Lazy
Lazy evaluation is triggered using Enumerable#lazily
:
[1,2,3].lazily #=> #<Lazily::Proxy: [1, 2, 3]>
The resulting object implements lazy versions of various Enumerable methods. Rather than returning Arrays, the lazy forms return new Enumerable objects, which generate results incrementally, on demand.
Consider the following code, which squares a bunch of numbers, then takes the first 4 results.
>> (1..10).collect { |x| p x; x*x }.take(4)
1
2
3
4
5
6
7
8
9
10
=> [1, 4, 9, 16]
See how it printed all the numbers from 1 to 10, indicating that the block given to collect
was run ten times. We can do the same thing lazily, like this:
>> (1..10).lazily.collect { |x| p x; x*x }.take(4).to_a
1
2
3
4
=> [1, 4, 9, 16]
Same result, but notice how the block was only evaluated four times.
By combining two or more lazy operations, you can create an efficient "pipeline", e.g.
User.to_enum(:find_each).lazily.select do |u|
u.first_name[0] == u.last_name[0]
end.collect(&:company).uniq.to_a
In case you missed it:
# enumerate all users
users = User.to_enum(:find_each)
# lazily select those matching "X... X..." (e.g. "Mickey Mouse")
users = users.lazily.select { |u| u.first_name[0] == u.last_name[0] }
# grab their company (weeding out duplicates)
companies = users.collect(&:company).uniq
# force resolution
companies.to_a #=> ["Disney"]
Because the steps in the pipeline operate in parallel, without creation of intermediate collections (Arrays), you can efficiently operate on large (or even infinite) Enumerable collections.
This is analogous to a Unix shell pipeline, though of course here we're talking about streams of objects, rather than streams of text.
The #in_threads
method is a multi-threaded version of #collect
, allowing multiple elements of a collection to be processed in parallel. It requires a numeric argument specifying the maximum number of Threads to use.
Benchmark.realtime do
[1,2,3,4].lazily.in_threads(10) do |x|
sleep 1
x * 2
end.to_a #=> [2,4,6,8]
end.to_i #=> 1
Outputs will be yielded in the expected order, making it a drop-in replacement for #collect
.
Unlike some other "parallel map" implementations, the output of #in_threads
is lazy (though it does need to pre-fetch elements from the source collection as required to start Threads).
Lazily.dequeue
makes a Queue look like a (lazy) Enumerable, making it easier to process data produced by other Threads, e.g.
q = Queue.new
Thread.new do
q << 1
q << 2
q << 3
end
Lazily.dequeue(q).take(2).to_a # => [1,2]
Lazily also provides some interesting ways to combine several Enumerable collections to create a new collection.
Lazily.zip
pulls elements from a number of collections in parallel, yielding each group.
array1 = [1,3,6]
array2 = [2,4,7]
Lazily.zip(array1, array2) #=> [1,2], [3,4], [6,7]
Lazily.concat
concatenates collections.
array1 = [1,3,6]
array2 = [2,4,7]
Lazily.concat(array1, array2) #=> [1,3,6,2,4,7]
Lazily.merge
merges multiple collections, preserving sort-order. The inputs are assumed to be sorted already.
array1 = [1,4,5]
array2 = [2,3,6]
Lazily.merge(array1, array2) #=> [1,2,3,4,5,6]
A block can be provided to determine the sort-order.
array1 = %w(a dd cccc)
array2 = %w(eee bbbbb)
Lazily.merge(array1, array2) { |x| x.length }
#=> %w(a dd eee cccc bbbbb)
Lazily.associate
matches up "like" elements from separate collections. Again, it assumes it's inputs sorted.
fruit = %w(apple banana orange)
nautical_terms = %w(anchor boat flag)
Lazily.associate(:fruit => fruit, :nautical => nautical_terms) do |word|
word.chars.first
end
#=> [
{ :fruit => ["apple"], :nautical => ["anchor"] },
{ :fruit => ["banana"], :nautical => ["boat"] },
{ :fruit => [], :nautical => ["flag"] },
{ :fruit => ["orange"], :nautical => [] }
]
There are numerous similar implementations of lazy operations on Enumerables.
Lazily supercedes "Enumerating". Whereas Enumerating mixed lazy operations directly onto Enumerable
, Lazily does not. Instead, it implements an API modelled on Ruby 2.0's Enumerable#lazy
.
Q: Why use Lazily, when Ruby 2.x has built-in lazy enumerations?
- Compatibility: Perhaps you haven't managed to migrate to Ruby 2.0 yet. Lazily provides the same benefits, but works in older versions of Ruby (most features work even in Ruby-1.8.7).
- Consistency: Being pure-Ruby, you can use the same implementation of lazy enumeration across Ruby versions and interpreters.
- Features: Lazily provides some extra features not present in Ruby 2.0, such as multi-threaded lazy enumeration.
- Speed: Despite being implemented in pure Ruby,
Enumerable#lazily
actually performs a little better thanEnumerable#lazy
.
See also:
- Greg Spurrier's gem "
lazing
" Enumerable#defer
from the Ruby Facets library- The "
backports
" gem, which implementsEnumerable#lazy
for Ruby pre-2.0.