Friday, October 30, 2009

Recipe 7.7. Writing Block Methods That Classify or Collect













Problem


The basic block methods that come with the Ruby standard library aren't enough for you. You want to define your own method that classifies the elements in an enumeration (like Enumerable#detect and Enumerable#find_all), or that transforms each element in an enumeration (like Enumerable#collect).




Solution


You can usually use inject to write a method that searches or classifies an enumeration of objects. With inject you can write your own versions of methods such as detect and find_all:



module Enumerable
  def find_no_more_than(limit)
    inject([]) do |a, e|
      a << e if yield e
      return a if a.size >= limit
      a
    end
  end
end



This code finds at most three of the even numbers in a list:



a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a.find_no_more_than(3) { |x| x % 2 == 0 } # => [2, 4, 6]



If you find yourself needing to write a method like collect, it's probably because, for your purposes, collect itself yields elements in the wrong order. You can't use inject, because that yields elements in the same order as collect.


You need to find or write an iterator that yields elements in the order you want. Once you've done that, you have two options: you can write a collect equivalent on top of the iterator method, or you can use the iterator method to build an Enumerable object, and call its collect method (as seen in Recipe 7.6).
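
As a rough illustration of those two options, the sketch below uses Array#reverse_each as the iterator that yields elements in the desired order. The method name reverse_collect is hypothetical, made up for this example, and to_enum is the modern spelling of the Enumerator construction shown in Recipe 7.6.

```ruby
# Option 1: a collect equivalent written on top of an iterator method.
# reverse_collect is a hypothetical name used only for this sketch.
def reverse_collect(array)
  results = []
  array.reverse_each { |e| results << yield(e) }
  results
end

letters = %w[a b c]
by_method = reverse_collect(letters) { |s| s.upcase }

# Option 2: wrap the iterator in an Enumerator and call its collect.
by_enumerator = letters.to_enum(:reverse_each).collect { |s| s.upcase }

p by_method       # ["C", "B", "A"]
p by_enumerator   # ["C", "B", "A"]
```

The second option is usually less code, because the Enumerator also supplies detect, inject, and the other Enumerable methods.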




Discussion


We discussed these block methods in more detail in Chapter 4, because arrays are the simplest and most common Enumerable data type. But almost any data structure can be enumerated, and a more complex data structure can be enumerated in more ways.


As you'll see in Recipe 9.4, the Enumerable methods, like detect and inject, are actually implemented in terms of each. The detect and inject methods yield to the code block every element that comes out of each. In a method like detect, the value of the yield statement is used to determine whether the element matches some criterion; in inject, it becomes the new accumulated value.
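
A minimal sketch of that relationship: the hypothetical Countdown class below defines only each, mixes in Enumerable, and gets working detect and inject methods as a result. The class and its behavior are invented for illustration.

```ruby
# Hypothetical class: it defines each and nothing else.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # Yields start, start - 1, ..., 1.
  def each
    @start.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(5)

# detect uses the block's value to decide whether an element matches.
first_even = countdown.detect { |n| n.even? }

# inject uses the block's value as the next accumulator.
total = countdown.inject(0) { |sum, n| sum + n }

p first_even   # 4
p total        # 15
```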


In a method like detect, the iteration may stop once it finds an element that matches. In a method like find_all, the iteration goes through all elements, collecting the ones that match.
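
One way to see the difference is to count how many times each method calls its block. The counter variables here are just scaffolding for the demonstration.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# detect stops yielding as soon as the block returns a true value.
detect_calls = 0
first_even = numbers.detect do |n|
  detect_calls += 1
  n.even?
end

# find_all yields every element, keeping the ones that match.
find_all_calls = 0
all_evens = numbers.find_all do |n|
  find_all_calls += 1
  n.even?
end

p [first_even, detect_calls]    # [2, 2]
p [all_evens, find_all_calls]   # [[2, 4, 6], 6]
```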


Methods like collect work the same way, but instead of returning a subset of elements based on what the code block says, they collect the values returned by the code block in a new data structure, and return the data structure once the iteration is completed.
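
A collect-like method can be sketched in a few lines on top of each. The name simple_collect is hypothetical, and this is a simplified stand-in rather than how Ruby's own collect is implemented.

```ruby
module Enumerable
  # Hypothetical method: gathers the block's return values in a new
  # array and returns the array once each has finished iterating.
  def simple_collect
    results = []
    each { |e| results << yield(e) }
    results
  end
end

squares = (1..4).simple_collect { |n| n * n }
p squares   # [1, 4, 9, 16]
```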


If you're using a particular object and you wish its collect method used a different iterator, then you should turn the object into an Enumerator and call its collect method. But if you're writing a class and you want to expose a new collect-like method, you'll have to define a new method.[4] In that case, the best solution is probably to expose a method that returns a custom Enumerator: that way, your users can use all the methods of Enumerable, not just collect.

[4] Of course, behind the scenes, your method could just create an appropriate Enumerator and call its collect implementation.
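
Here is one possible shape for a method that returns a custom Enumerator. The Playlist class and its each_reversed method are hypothetical names; the sketch assumes a Ruby version where Object#to_enum is available.

```ruby
# Hypothetical class used only to illustrate the pattern.
class Playlist
  def initialize(*songs)
    @songs = songs
  end

  def each
    @songs.each { |s| yield s }
  end

  # Called without a block, this returns an Enumerator over the
  # reversed iteration, so callers get all of Enumerable, not
  # just a single collect-like method.
  def each_reversed
    return to_enum(:each_reversed) unless block_given?
    @songs.reverse_each { |s| yield s }
  end
end

playlist = Playlist.new("intro", "verse", "outro")

titles = playlist.each_reversed.collect { |s| s.capitalize }
first_long = playlist.each_reversed.detect { |s| s.size > 4 }

p titles       # ["Outro", "Verse", "Intro"]
p first_long   # "outro"
```

Because each_reversed still accepts a block, existing callers that iterate directly keep working.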




See Also


  • Recipe 4.5, "Sorting an Array"

  • Recipe 4.11, "Getting the N Smallest Items of an Array"

  • Recipe 4.15, "Partitioning or Classifying a Set"

  • Recipe 7.6, "Changing the Way an Object Iterates"

  • If all you want is to make your custom data structure support the methods of Enumerable, see Recipe 9.4, "Implementing Enumerable: Write One Method, Get 22 Free"












