Saturday, November 7, 2009

Recipe 4.5. Sorting an Array










Recipe 4.5. Sorting an Array





Problem


You want to sort an array of objects, possibly according to some custom notion of what "sorting" means.




Solution


Homogeneous
arrays of common data types, like strings or numbers, can be sorted "naturally" by just calling
Array#sort
:



[5.01, -5, 0, 5].sort # => [-5, 0, 5, 5.01]
["Utahraptor", "Ankylosaur", "Maiasaur"].sort
# => ["Ankylosaur", "Maiasaur", "Utahraptor"]



To sort objects based on one of their data members, or by the results of a method call, use
Array#sort_by
. This code sorts an array of arrays by size, regardless of their contents:



arrays = [[1,2,3], [100], [10,20]]
arrays.sort_by { |x| x.size } # => [[100], [10, 20], [1, 2, 3]]



To do a more general sort, create a code block that compares the relevant aspect of any two given objects. Pass this block into the sort method of the array you want to sort.


This code sorts an array of numbers in ascending numeric order, except that the number 42 will always be at the end of the list:



[1, 100, 42, 23, 26, 10000].sort do |x, y|
x == 42 ? 1 : x <=> y
end
# => [1, 23, 26, 100, 10000, 42]





Discussion


If there is one "canonical" way to sort a particular class of object, then you can have that class implement the <=> comparison operator. This is how Ruby automatically knows how to sort numbers in ascending order and strings in ascending ASCII order: Numeric and String both implement the comparison operator.


The sort_by method sorts an array using a Schwartzian transform (see Recipe 4.6 for an in-depth discussion). This is the most useful customized sort, because it's fast and easy to define. In this example, we use sort_by to sort on any one of an object's fields.



class Animal
attr_reader :name, :eyes, :appendages

def initialize(name, eyes, appendages)
@name, @eyes, @appendages = name, eyes, appendages
end

def inspect
@name
end
end

animals = [Animal.new("octopus", 2, 8),
Animal.new("spider", 6, 8),
Animal.new("bee", 5, 6),
Animal.new("elephant", 2, 4),
Animal.new("crab", 2, 10)]

animals.sort_by { |x| x.eyes }
# => [octopus, elephant, crab, bee, spider]

animals.sort_by { |x| x.appendages }
# => [elephant, bee, octopus, spider, crab]



If you pass a block into sort, Ruby calls the block to make comparisons instead of using the comparison operator. This is the most general possible sort, and it's useful for cases where sort_by won't work.


The comparison operator and a sort code block both take one argument: an object against which to compare self. A call to <=> (or a sort code block) should return1 if self is "less than" the given object (and should therefore show up before it in a sorted list). It should return 1 if self is "greater than" the given object (and should show up after it in a sorted list), and 0 if the objects are "equal" (and it doesn't matter which one shows up first). You can usually avoid remembering this by delegating the return value to some other object's <=> implementation.




See Also


  • Recipe 4.6, "Ignoring Case When
    Sorting
    Strings," covers the workings of the Schwartzian Transform

  • Recipe 4.7, "Making Sure a Sorted Array Stays Sorted"

  • Recipe 4.10, "Shuffling an Array"

  • If you need to find the minimum or maximum item in a list according to some criteria, don't sort it just to save writing some code; see Recipe 4.11, "Getting the N Smallest Items of an Array," for other options













No comments: