
Difference between map and collect in Ruby?

I have Googled this and got patchy / contradictory opinions - is there actually any difference between doing a map and doing a collect on an array in Ruby/Rails?

The docs don't seem to suggest any, but are there perhaps differences in method or performance?

map is preferred at Code Golf.
As an explanation to why map is preferred at CodeGolf, which might not be obvious for all: it is only because collect is four characters longer than map, but the same in functionality.
Just to play devil's advocate, I personally find collect more readable and natural - the idea of 'collecting' records and doing X to them makes more natural sense to me than 'mapping' records and doing X to them.

Jakub Hampl

There's no difference. In fact, map is implemented in C as rb_ary_collect (for arrays) and enum_collect (for other enumerables). In other words, there is a difference between map on an array and map on any other enum, but no difference between map and collect.

Why do both map and collect exist in Ruby? The map function has many naming conventions in different languages. Wikipedia provides an overview:

The map function originated in functional programming languages but is today supported (or may be defined) in many procedural, object oriented, and multi-paradigm languages as well: In C++'s Standard Template Library, it is called transform, in C# (3.0)'s LINQ library, it is provided as an extension method called Select. Map is also a frequently used operation in high level languages such as Perl, Python and Ruby; the operation is called map in all three of these languages. A collect alias for map is also provided in Ruby (from Smalltalk) [emphasis mine]. Common Lisp provides a family of map-like functions; the one corresponding to the behavior described here is called mapcar (-car indicating access using the CAR operation).

Ruby provides an alias for programmers from the Smalltalk world to feel more at home.

Why is there a different implementation for arrays and enums? An enum is a generalized iteration structure, so Ruby cannot predict what the next element will be (you can define infinite enums, see Prime for an example). Therefore it must call a function to get each successive element (typically this will be the each method).
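To make the each-based path concrete, here is a minimal sketch. The Countdown class is hypothetical (it is not from the answer above): it defines only each and includes Enumerable, and both map and collect then work by pulling elements through each.

```ruby
# Hypothetical class for illustration: it defines only `each`
# and gets map/collect from Enumerable.
class Countdown
  include Enumerable

  attr_reader :each_calls

  def initialize(from)
    @from = from
    @each_calls = 0
  end

  # Enumerable's methods obtain every element by calling this method.
  def each
    return to_enum(:each) unless block_given?

    @each_calls += 1
    @from.downto(1) { |n| yield n }
    self
  end
end

countdown = Countdown.new(3)
p(countdown.map { |n| n * 10 })      # => [30, 20, 10]
p(countdown.collect { |n| n * 10 })  # => [30, 20, 10]
p(countdown.each_calls)              # => 2
```

Each of the two calls goes through each exactly once, which is why the counter ends at 2.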

Arrays are the most common collection, so it is reasonable to optimize their performance. Since Ruby knows a lot about how arrays work, it doesn't have to call each but can instead use simple pointer manipulation, which is significantly faster.

Similar optimizations exist for a number of Array methods like zip or count.
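One way to see which classes carry their own implementation is to ask each method for its owner. This is only a quick check, assuming a standard MRI Ruby; it shows where the method is defined, not how fast it is.

```ruby
# Array defines its own versions; Range and Hash inherit theirs from Enumerable.
p Array.instance_method(:map).owner      # => Array
p Array.instance_method(:collect).owner  # => Array
p Array.instance_method(:zip).owner      # => Array
p Array.instance_method(:count).owner    # => Array
p Range.instance_method(:map).owner      # => Enumerable
p Hash.instance_method(:collect).owner   # => Enumerable
```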


@Mark Reed but then, programmers not coming from Smalltalk would be confused by having two different functions which turn out to be just aliases. It causes questions like the OP's above.
@SasQ I don't disagree - I think it would be better overall if there were only one name. But there are plenty of other aliases in Ruby, and one feature of the aliasing is that there's a nice naming parallel among the operations collect, detect, inject, reject, and select (otherwise known as map, find, reduce, reject (no alias), and find_all).
Indeed. Apparently, Ruby uses aliases / synonyms in other places too. For example, the number of elements in an array can be retrieved with count, length, or size. These are different words for the same attribute of an array, and they let you pick the most appropriate word for your code: do you want the number of items you're collecting, the length of an array, or the current size of the structure? Essentially they're all the same, but picking the right word might make your code easier to read, which is a nice property of the language.
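The name pairs mentioned in these comments can be checked side by side. The sample array is made up for illustration. One caveat worth noting: count also accepts an argument or a block, so it overlaps with length and size only when called with neither.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# The "-ect" family next to the other names for the same operations.
p(numbers.collect { |n| n * 2 } == numbers.map { |n| n * 2 })                    # => true
p(numbers.detect(&:even?) == numbers.find(&:even?))                              # => true
p(numbers.inject(0) { |sum, n| sum + n } == numbers.reduce(0) { |sum, n| sum + n }) # => true
p(numbers.select(&:even?) == numbers.find_all(&:even?))                          # => true
p(numbers.reject(&:even?))                                                       # => [1, 3, 5]

# Three ways to ask "how many elements?"
p(numbers.length == numbers.size)  # => true
p(numbers.count)                   # => 6
p(numbers.count(&:even?))          # => 3 (count with a block counts matches)
```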
Kelvin

I've been told they are the same.

Actually they are documented in the same place under ruby-doc.org:

http://www.ruby-doc.org/core/classes/Array.html#M000249

ary.collect {|item| block } → new_ary
ary.map {|item| block } → new_ary
ary.collect → an_enumerator
ary.map → an_enumerator

Invokes block once for each element of self. Creates a new array containing the values returned by the block. See also Enumerable#collect.

If no block is given, an enumerator is returned instead.

a = [ "a", "b", "c", "d" ]
a.collect {|x| x + "!" }   #=> ["a!", "b!", "c!", "d!"]
a                          #=> ["a", "b", "c", "d"]
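The block-less form described in the documentation excerpt is mostly useful for chaining. A small sketch, using a made-up letters array, shows that both names return an Enumerator and can be combined with with_index:

```ruby
letters = %w[a b c d]

enum = letters.map
p enum.class                                        # => Enumerator

# Chaining with_index onto the enumerator still builds a new array.
p(enum.with_index { |x, i| "#{i}:#{x}" })           # => ["0:a", "1:b", "2:c", "3:d"]
p(letters.collect.with_index(1) { |x, i| x * i })   # => ["a", "bb", "ccc", "dddd"]

# The receiver is left untouched.
p letters                                           # => ["a", "b", "c", "d"]
```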


BrunoF

The collect and collect! methods are aliases to map and map!, so they can be used interchangeably. Here is an easy way to confirm that:

Array.instance_method(:map) == Array.instance_method(:collect)
 => true
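The same holds for the in-place variants. A short sketch with throwaway arrays, showing that map! and collect! both mutate the receiver while map returns a new array:

```ruby
a = [1, 2, 3]
b = [1, 2, 3]

a.map!     { |x| x * 2 }
b.collect! { |x| x * 2 }
p a  # => [2, 4, 6]
p b  # => [2, 4, 6]

# The bang versions return the receiver itself; the plain versions return a copy.
p a.equal?(a.collect! { |x| x })  # => true
p a.equal?(a.collect { |x| x })   # => false
```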

ktec

I ran a benchmark to try to answer this question, then found this post, so here are my findings (which differ slightly from the other answers).

Here is the benchmark code:

require 'benchmark'

h = { abc: 'hello', 'another_key' => 123, 4567 => 'third' }
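# Note: 1..10 is a Range, not an Array, so the "array" reports below
# actually go through Enumerable#collect / Enumerable#map.
a = 1..10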
many = 500_000

Benchmark.bm do |b|
  GC.start

  b.report("hash keys collect") do
    many.times do
      h.keys.collect(&:to_s)
    end
  end

  GC.start

  b.report("hash keys map") do
    many.times do
      h.keys.map(&:to_s)
    end
  end

  GC.start

  b.report("array collect") do
    many.times do
      a.collect(&:to_s)
    end
  end

  GC.start

  b.report("array map") do
    many.times do
      a.map(&:to_s)
    end
  end
end

And the results I got were:

                       user     system      total        real
hash keys collect  0.540000   0.000000   0.540000 (  0.570994)
hash keys map      0.500000   0.010000   0.510000 (  0.517126)
array collect      1.670000   0.020000   1.690000 (  1.731233)
array map          1.680000   0.020000   1.700000 (  1.744398)

Perhaps an alias isn't free?


I'm not sure whether these differences are significant. On a rerun I get different timings (and even in your results, hash collect seems slower while array collect seems faster).
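One way to judge whether such gaps are noise is to repeat each measurement several times, alternate the order, and compare the best run of each. The sketch below is an assumption-laden variation on the benchmark above, not the original author's method: SAMPLE, ITERATIONS, ROUNDS, and best_times are hypothetical names, the data is a real Array, and the iteration count is reduced so it finishes quickly. No timings are shown because they vary by machine and Ruby version.

```ruby
require 'benchmark'

# Assumptions for this sketch: a real Array and a small iteration count.
SAMPLE = (1..10).to_a
ITERATIONS = 20_000
ROUNDS = 5

# Time each variant ROUNDS times, alternating the order, and keep the
# minimum time per variant to reduce the influence of GC and warm-up.
def best_times(data, iterations, rounds)
  variants = {
    map:     -> { data.map(&:to_s) },
    collect: -> { data.collect(&:to_s) }
  }
  best = Hash.new(Float::INFINITY)

  rounds.times do |round|
    order = round.even? ? variants.keys : variants.keys.reverse
    order.each do |name|
      GC.start
      elapsed = Benchmark.realtime { iterations.times { variants[name].call } }
      best[name] = [best[name], elapsed].min
    end
  end
  best
end

BEST = best_times(SAMPLE, ITERATIONS, ROUNDS)
BEST.each { |name, seconds| puts format('%-8s %.4fs', name, seconds) }
```

If the two minimums swap places between runs, the difference is most likely measurement noise rather than a cost of the alias.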
jeton

Ruby aliases the method Array#map to Array#collect; they can be used interchangeably. (Ruby Monk)

In other words, they share the same source code:

static VALUE
rb_ary_collect(VALUE ary)
{
    long i;
    VALUE collect;

    RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
    collect = rb_ary_new2(RARRAY_LEN(ary));
    for (i = 0; i < RARRAY_LEN(ary); i++) {
        rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
    }
    return collect;
}

http://ruby-doc.org/core-2.2.0/Array.html#method-i-map
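For readers less familiar with the C API, the function above can be paraphrased roughly in Ruby. This is only an illustrative sketch: sketch_collect is a hypothetical helper, not part of Ruby, and it ignores details such as capacity preallocation.

```ruby
# Hypothetical helper, not part of Ruby: a rough rendering of rb_ary_collect.
def sketch_collect(ary)
  # RETURN_SIZED_ENUMERATOR: with no block, hand back a sized enumerator.
  return ary.to_enum(:map) { ary.length } unless block_given?

  collect = []                    # rb_ary_new2 would also reserve capacity
  i = 0
  while i < ary.length            # length is re-read on every pass, as in the C loop
    collect.push(yield(ary[i]))   # rb_ary_push(collect, rb_yield(...))
    i += 1
  end
  collect
end

doubled = sketch_collect([1, 2, 3]) { |x| x * 2 }
p doubled                               # => [2, 4, 6]
p sketch_collect(%w[a b]).class         # => Enumerator
```

Whichever name the caller uses, map or collect, this single routine is what runs for arrays.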


I wish the documentation stated explicitly that they are aliases. At the moment they simply reference each other, and both have slightly different descriptions.
Junaid Abbasi

#collect is actually an alias for #map. That means the two methods can be used interchangeably and produce the same behavior.