
Array versus List<T>: When to use which?

MyClass[] array;
List<MyClass> list;

What are the scenarios when one is preferable over the other? And why?

Arrays are rather obsolete, as seen in a popular discussion here. Also pointed out here, and by our host in the blog.
If I'm not mistaken, List<> has an array as its internal structure. Whenever the internal array is full, it simply copies the contents to an array that is double the size (or some other constant times the current size). en.wikipedia.org/wiki/Dynamic_array
Ykok: What you say seems about right, I found the source code of List<> here.
@gimel Arguing that arrays are obsolete is perhaps a bit bold

Community

It is rare, in reality, that you would want to use an array. Definitely use a List<T> any time you want to add/remove data, since resizing arrays is expensive. If you know the data is fixed length, and you want to micro-optimise for some very specific reason (after benchmarking), then an array may be useful.
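To make the resizing cost concrete, here is a minimal sketch that records how often a List<T> swaps its backing array while items are appended. The class and method names (CapacityDemo, ObserveCapacities) are made up for this illustration, and the exact growth policy is an implementation detail of the runtime, so treat the observed numbers as indicative only.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds `count` items one by one and records every capacity change,
    // i.e. every time the list replaced its backing array with a larger one.
    public static List<int> ObserveCapacities(int count)
    {
        var list = new List<int>();
        var capacities = new List<int> { list.Capacity };

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != capacities[capacities.Count - 1])
            {
                capacities.Add(list.Capacity);
            }
        }
        return capacities;
    }
}
```

Printing string.Join(", ", CapacityDemo.ObserveCapacities(1000)) shows only a handful of reallocations for 1000 appends, which is why Add is cheap on average. Growing a plain array by one element at a time would instead copy the whole array on every append.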

List<T> offers a lot more functionality than an array (although LINQ evens it up a bit), and is almost always the right choice. Except for params arguments, of course. ;-p

As a counter - List<T> is one-dimensional, whereas you can have rectangular (etc) arrays like int[,] or string[,,] - but there are other ways of modelling such data (if you need) in an object model.
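As a small sketch of the rectangular case, the following builds a two-dimensional int[,] table. GridDemo and MultiplicationTable are hypothetical names chosen for the example.

```csharp
using System;

public static class GridDemo
{
    // Builds a rows x cols multiplication table in a rectangular array.
    public static int[,] MultiplicationTable(int rows, int cols)
    {
        var table = new int[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                table[r, c] = (r + 1) * (c + 1);
            }
        }
        return table;
    }
}
```

GetLength(0) and GetLength(1) report the size of each dimension. A List<T> has no direct equivalent, so the closest list-based model would be something like a List<List<int>>, which does not guarantee that every row has the same length.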

See also:

How/When to abandon the use of Arrays in c#.net?

Arrays, What's the point?

That said, I make a lot of use of arrays in my protobuf-net project; entirely for performance:

it does a lot of bit-shifting, so a byte[] is pretty much essential for encoding;

I use a local rolling byte[] buffer which I fill before sending down to the underlying stream (and v.v.); quicker than BufferedStream etc;

it internally uses an array-based model of objects (Foo[] rather than List<Foo>), since the size is fixed once built, and needs to be very fast.

But this is definitely an exception; for general line-of-business processing, a List<T> wins every time.


The argument about resizing is totally valid. However people prefer Lists even when no resizing is needed. For this latter case, is there a solid, logical argument or is it nothing more than "arrays are out of fashion"?
"Definitely use a List any time you want to add/remove data, since resizing arrays is expensive." List uses an array internally. Were you thinking of LinkedList?
More features == more complex == not good, unless you need those features. This answer basically lists reasons why arrays are better, yet draws the opposite conclusion.
@EamonNerbonne if you're not using those features, I can pretty much guarantee that they aren't going to hurt you... but: the number of collections that never need mutation is much smaller, in my experience, than those that are mutated
@MarcGravell: that depends on your coding style. In my experience virtually no collections are ever mutated. That is; collections are retrieved from the database or constructed from some source, but further processing is always done by recreating a new collection (e.g. map/filter etc). Even where conceptual mutation is necessary, it tends to be simplest to just generate a new collection. I only ever mutate a collection as a performance optimization, and such optimizations tend to be highly local and not expose the mutation to API consumers.
Brian

Really just answering to add a link which I'm surprised hasn't been mentioned yet: Eric Lippert's blog entry on "Arrays considered somewhat harmful."

You can judge from the title that it's suggesting using collections wherever practical - but as Marc rightly points out, there are plenty of places where an array really is the only practical solution.


Finally got around to reading this over 3 years later haha. Good article then, good article now. :)
Alnitak

Notwithstanding the other answers recommending List<T>, you'll want to use arrays when handling:

image bitmap data

other low-level data structures (e.g. network protocols)


Why for network protocols? Wouldn't you rather use custom structures here and give them an special serializer or an explicit memory layout? Furthermore, what speaks against using a List<T> here rather than a byte array?
@Konrad - well, for starters, Stream.Read and Stream.Write work with byte[], as does Encoding etc...
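As a rough illustration of that point, this sketch reads a stream through a reusable byte[] buffer and decodes the result. StreamDemo and ReadAllAsUtf8 are hypothetical names. Stream.Read, MemoryStream and Encoding.UTF8.GetString are the standard .NET APIs, and they are all expressed in terms of byte[].

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamDemo
{
    // Reads everything from `source` through a fixed-size byte[] buffer
    // that is reused on every iteration, then decodes the bytes as UTF-8.
    public static string ReadAllAsUtf8(Stream source, int bufferSize = 16)
    {
        var buffer = new byte[bufferSize];
        using (var collected = new MemoryStream())
        {
            int read;
            // Stream.Read returns the number of bytes actually read; 0 means end of stream.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                collected.Write(buffer, 0, read);
            }
            // Decode once at the end so multi-byte characters are never split across reads.
            return Encoding.UTF8.GetString(collected.ToArray());
        }
    }
}
```

Passing a List<byte> here would mean copying into an array first, because these overloads accept byte[] and nothing else.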
Spencer Ruport

Unless you are really concerned with performance, and by that I mean, "Why are you using .Net instead of C++?" you should stick with List<>. It's easier to maintain and does all the dirty work of resizing an array behind the scenes for you. (List<> is pretty smart about choosing array sizes, so it usually doesn't need to resize.)


"Why are you using .Net instead of C++?" XNA
To elaborate on @Bengt comment, performance is not .NET's priority and this is by all means fair. But some bright minds thought of using .NET with game engines. In fact most game engines use C# nowadays. In this regard Unity 3D is the paradigm of "When a game engine was not meant for AAA".
@mireazma That hasn't been a valid statement (wrt Unity) for a long time. We're getting C++ performance out of C# via HPC# and Burst. A lot of the engine internals are being migrated to C#. Even if you're talking about game script in a non-DOTS project, IL2CPP does a very, very good job of producing performant code.
@3Dave Let's not start a polemic on this. "IL2CPP does a very, very good job of producing performant code". Indeed it does, to the extents of its reach. Contrary to jacksondunstan.com/articles/3001 (in 2015) I think that IL2CPP compared to mono does a very good job performance wise. But compared to mono. Compared to native C++ it's nonsense. I did an arithmetic comparative test in Unity with il2cpp in a released exe with all optimizations vs C++ and the times were 18745, 18487ms and more times around there in Unity 2020 vs 189ms in C++ (at times ~36ms). Do a test yourself.
@mireazma My comment was intended to be more focused on Burst and HPC#, where we are seeing performance as good as (and in some cases better than) equivalent native code. Your benchmarking numbers are interesting, though. I'll look into that. Cheers.
Herman Schoenfeld

Arrays should be used in preference to List when the immutability of the collection itself is part of the contract between the client & provider code (not necessarily immutability of the items within the collection) AND when IEnumerable is not suitable.

For example,

var str = "This is a string";
var strChars = str.ToCharArray();  // returns array

It is clear that modification of "strChars" will not mutate the original "str" object, irrespective of implementation-level knowledge of "str"'s underlying type.

But suppose that

var str = "This is a string";
var strChars = str.ToCharList();  // hypothetical method (not part of .NET); returns List<char>
strChars.Insert(0, 'X');

In this case, it's not clear from that code-snippet alone if the insert method will or will not mutate the original "str" object. It requires implementation level knowledge of String to make that determination, which breaks Design by Contract approach. In the case of String, it's not a big deal, but it can be a big deal in almost every other case. Setting the List to read-only does help but results in run-time errors, not compile-time.
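The point about read-only lists failing at run time rather than at compile time can be sketched as follows. ReadOnlyDemo and AddFailsAtRunTime are hypothetical names for the example. List<T>.AsReadOnly and the NotSupportedException thrown by the resulting wrapper are standard .NET behaviour.

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyDemo
{
    // Returns true if adding to the read-only wrapper was rejected at run time.
    public static bool AddFailsAtRunTime()
    {
        var chars = new List<char>("This is a string");

        // AsReadOnly returns a ReadOnlyCollection<char>, which still implements IList<char>.
        IList<char> readOnly = chars.AsReadOnly();
        try
        {
            readOnly.Add('X');   // compiles without complaint
            return false;
        }
        catch (NotSupportedException)
        {
            return true;         // the mutation is only rejected when executed
        }
    }
}
```

The compiler accepts the Add call because the static type is IList<char>, so the contract violation only surfaces when the code runs.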


I'm relatively new to C# but it's not clear to me why returning a list would suggest mutability of the original data in a way that returning an array wouldn't. I would have thought that a method whose name starts with To is going to create an object that has no ability to modify the original instance, as opposed to strChars as char[] which if valid would suggest you're now able to modify the original object.
@TimMB There's the immutability of the collection (cannot add or remove items) and immutability of the items in the collection. I was referring to the former, whereas you're probably conflating the two. Returning an array assures the client it cannot add/remove items. If it does, it re-allocates the array and is assured it will not affect the original. When returning a list, no such assurances are made and the original could be affected (depends on implementation). Changing items within the collection (whether array or list) could affect the original, if the item's type is not a struct.
Thank you for the clarification. I'm still confused (probably because I come from the C++ world). If str internally uses an array and ToCharArray returns a reference to this array then the client can mutate str by changing the elements of that array, even if the size remains fixed. Yet you write 'It is clear that modification of "strChars" will not mutate the original "str" object'. What am I missing here? From what I can see, in either case the client may have access to the internal representation and, regardless of the type, this would allow mutation of some kind.
smack0007

If I know exactly how many elements I'm going to need, say I need 5 elements and only ever 5 elements then I use an array. Otherwise I just use a List.


Why wouldn't you use a List in the case where you know the number of elements?
Sune Rievers

Most of the time, using a List would suffice. A List uses an internal array to handle its data, and automatically resizes the array when adding more elements to the List than its current capacity, which makes it easier to use than an array, where you need to know the capacity beforehand.

See http://msdn.microsoft.com/en-us/library/ms379570(v=vs.80).aspx#datastructures20_1_topic5 for more information about Lists in C# or just decompile System.Collections.Generic.List<T>.

If you need multidimensional data (for example using a matrix or in graphics programming), you would probably go with an array instead.

As always, if memory or performance is an issue, measure it! Otherwise you could be making false assumptions about the code.


Hi, could you explain why "A list's lookup time would be O(n)" is true? As far as I know List uses array behind the scenes.
@dragonfly you're totally right. Source. At the time, I assumed that the implementation used pointers, but I've since learned otherwise. From the link above: 'Retrieving the value of this property is an O(1) operation; setting the property is also an O(1) operation.'
Christian Findlay

Arrays vs. Lists is a classic maintainability vs. performance problem. The rule of thumb that nearly all developers follow is that you should shoot for both, but when they come into conflict, choose maintainability over performance. The exception to that rule is when performance has already proven to be an issue. If you carry this principle into Arrays vs. Lists, then what you get is this:

Use strongly typed lists until you hit performance problems. If you hit a performance problem, make a decision as to whether dropping out to arrays will benefit your solution with performance more than it will be a detriment to your solution in terms of maintenance.


This cuts to the core business decision that we must face: time is money. Arrays, especially multi-dimensional arrays, generally end up costing more time in development and maintenance than lists or collections of objects. But lists come with minor performance and memory hits in many cases.
supercat

Another situation not yet mentioned is when one will have a large number of items, each of which consists of a fixed bunch of related-but-independent variables stuck together (e.g. the coordinates of a point, or the vertices of a 3d triangle). An array of exposed-field structures will allow its elements to be efficiently modified "in place"--something which is not possible with any other collection type. Because an array of structures holds its elements consecutively in RAM, sequential accesses to array elements can be very fast. In situations where code will need to make many sequential passes through an array, an array of structures may outperform an array or other collection of class object references by a factor of 2:1; further, the ability to update elements in place may allow an array of structures to outperform any other kind of collection of structures.

Although arrays are not resizable, it is not difficult to have code store an array reference along with the number of elements that are in use, and replace the array with a larger one as required. Alternatively, one could easily write code for a type which behaved much like a List<T> but exposed its backing store, thus allowing one to say either MyPoints.Add(nextPoint); or MyPoints.Items[23].X += 5;. Note that the latter would not necessarily throw an exception if code tried to access beyond the end of the list, but usage would otherwise be conceptually quite similar to List<T>.
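The difference between updating a structure in place in an array and going through a List<T> can be sketched like this. DemoPoint and InPlaceDemo are hypothetical names introduced only for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exposed-field struct used only for this illustration.
public struct DemoPoint
{
    public int X;
    public int Y;
}

public static class InPlaceDemo
{
    public static int UpdateArray()
    {
        var points = new DemoPoint[4];
        points[3].X += 5;            // modifies the element stored inside the array
        return points[3].X;
    }

    public static int UpdateList()
    {
        var points = new List<DemoPoint>
        {
            new DemoPoint(), new DemoPoint(), new DemoPoint(), new DemoPoint()
        };

        // points[3].X += 5; would not compile (error CS1612),
        // because the List<T> indexer returns a copy of the struct.
        DemoPoint temp = points[3];  // copy out
        temp.X += 5;                 // modify the copy
        points[3] = temp;            // copy back
        return points[3].X;
    }
}
```

Both methods end with X equal to 5, but the list version copies the whole struct out and back in, which is the overhead being described. If DemoPoint were a class instead, points[3].X += 5 would compile for the list as well, since the indexer would return a reference to the object.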


What you described is a List<>. There's an indexer so you can access the underlying array directly, and the List<> will maintain the size for you.
@Carl: Given e.g. Point[] arr;, it's possible for code to say, e.g. arr[3].x+=q;. Using e.g. List<Point> list, it would be necessary to instead say Point temp=list[3]; temp.x+=q; list[3]=temp;. It would be helpful if List<T> had a method Update<TP>(int index, ActionByRefRef<T,TP> proc, ref TP params) and compilers could turn list[3].x+=q; into list.Update(3, (ref int value, ref int param)=>value+=param, ref q); but no such feature exists.
Good news. It works. list[0].X += 3; will add 3 to the X property of the first element of the list. And list is a List<Point> and Point is a class with X and Y properties
moarboilerplate

Rather than going through a comparison of the features of each data type, I think the most pragmatic answer is "the differences probably aren't that important for what you need to accomplish, especially since they both implement IEnumerable, so follow popular convention and use a List until you have a reason not to, at which point you probably will have your reason for using an array over a List."

Most of the time in managed code you're going to want to favor collections being as easy to work with as possible over worrying about micro-optimizations.


iliketocode

Lists in .NET are wrappers over arrays, and use an array internally. The time complexity of operations on lists is the same as would be with arrays, however there is a little more overhead with all the added functionality / ease of use of lists (such as automatic resizing and the methods that come with the list class). Pretty much, I would recommend using lists in all cases unless there is a compelling reason not to do so, such as if you need to write extremely optimized code, or are working with other code that is built around arrays.


snipsnipsnip

Since no one has mentioned it: In C#, an array is a list. MyClass[] and List<MyClass> both implement IList<MyClass>. (e.g. void Foo(IList<int> foo) can be called like Foo(new[] { 1, 2, 3 }) or Foo(new List<int> { 1, 2, 3 }) )

So, if you are writing a method that accepts a List<MyClass> as an argument, but uses only a subset of its features, you may want to declare the parameter as IList<MyClass> instead for callers' convenience.
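A minimal sketch of such a method, with hypothetical names (IListDemo, SumOfFirstAndLast), might look like this:

```csharp
using System;
using System.Collections.Generic;

public static class IListDemo
{
    // Accepts any IList<int>, so both int[] and List<int> can be passed.
    public static int SumOfFirstAndLast(IList<int> values)
    {
        if (values.Count == 0)
        {
            throw new ArgumentException("values must not be empty", nameof(values));
        }
        return values[0] + values[values.Count - 1];
    }
}
```

Both IListDemo.SumOfFirstAndLast(new[] { 1, 2, 3 }) and IListDemo.SumOfFirstAndLast(new List<int> { 1, 2, 3 }) return 4. One caveat: an array only partially supports the interface, so calling Add or Remove on an array through IList<int> throws NotSupportedException. A method typed this way should stick to indexing and Count unless it checks IsReadOnly first.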

Details:

Why array implements IList?

How do arrays in C# partially implement IList?


"In C#, an array is a list" That's not true; an array is not a List, it only implements the IList interface.
Anonymous

They may be unpopular, but I am a fan of Arrays in game projects.

- Iteration speed can be important in some cases; foreach on an Array has significantly less overhead if you are not doing much per element

- Adding and removing is not that hard with helper functions

- It's slower, but in cases where you only build it once it may not matter

- In most cases, less extra memory is wasted (only really significant with Arrays of structs)

- Slightly less garbage and pointers and pointer chasing

That being said, I use List far more often than Arrays in practice, but they each have their place.

It would be nice if List were a built-in type so that they could optimize out the wrapper and enumeration overhead.


Bimal Poudel

Populating a list is easier than populating an array. For arrays, you need to know the exact length of the data up front, but a list can hold any amount of data. And you can convert a list into an array.

List<URLDTO> urls = new List<URLDTO>();

urls.Add(new URLDTO() {
    key = "wiki",
    url = "https://...",
});

urls.Add(new URLDTO()
{
    key = "url",
    url = "http://...",
});

urls.Add(new URLDTO()
{
    key = "dir",
    url = "https://...",
});

// convert a list into an array: URLDTO[]
return urls.ToArray();

Alberto Costa

Keep in mind that with a List it is not possible to do this:

List<string> arr = new List<string>();

arr.Add("string a");
arr.Add("string b");
arr.Add("string c");
arr.Add("string d");

arr[10] = "new string";

It throws an ArgumentOutOfRangeException, because index 10 is beyond the list's Count of 4.

Instead with arrays:

string[] strArr = new string[20];

strArr[0] = "string a";
strArr[1] = "string b";
strArr[2] = "string c";
strArr[3] = "string d";

strArr[10] = "new string";

But arrays are not resized automatically. You have to manage that manually or with the Array.Resize method.
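For reference, a minimal sketch of the manual approach with Array.Resize could look like this. ResizeDemo and GrowAndSet are hypothetical names, and the doubling policy is just one possible choice.

```csharp
using System;

public static class ResizeDemo
{
    // Stores `value` at `index`, growing the array first if the index is out of range.
    // Returns the array that now holds the data (a new instance if it had to grow).
    public static string[] GrowAndSet(string[] items, int index, string value)
    {
        if (index >= items.Length)
        {
            // Array.Resize allocates a new array, copies the old elements across,
            // and stores the new array in the variable passed by ref.
            Array.Resize(ref items, Math.Max(index + 1, items.Length * 2));
        }
        items[index] = value;
        return items;
    }
}
```

Note that Array.Resize does not resize in place. The caller's original array is left untouched, so the returned reference has to be used from then on.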

One trick is to initialize a List from an array of empty (null) elements.

List<string> arr = new List<string>(new string[100]);

arr[10] = "new string";

But in this case, if you add a new element using the Add method, it will be appended to the end of the List (after the 100 existing slots).

List<string> arr = new List<string>(new string[100]);

arr[10] = "new string";

arr.Add("bla bla bla"); // this will be in the end of List

I'm not sure that this adds anything to the discussion. They are true statements, but not phrased in a way to help you choose one over the other. It only highlights a misinterpretation of the new List<T>(int) constructor.
I agree with you, it is not correct to use a list as if it were an array. Maybe I could explain the concept better. Lists are handy and grow without the programmer noticing (magic?), and you can only add items at the end. Arrays, on the other hand, do not scale automatically; the programmer has to make do by creating a new, larger array and copying the elements of the old array. Arrays use memory more efficiently than lists.
This is an old post and the accepted answer covers this scenario succinctly. If you are going to re-open a post for discussion then please bring something new to the table or give us a genuine reason to vote for your answer, you should directly comment about why your answer is superior. You have added some entry level info about the types in question, but haven't commented on the methodology that one might go through to select which of the types to use in their code.
sajidnizami

It completely depends on the context in which the data structure is needed. For example, if you are creating items to be used by other functions or services, using a List is the perfect way to accomplish it.

Now if you have a list of items and you just want to display them, say on a web page, an array is the container you need to use.


If you have a list of items and you just want to display them, then what is wrong with just using the list you already have? What would an array offer here?
And for "creating items to be used by other functions or services", actually, I'd prefer an iterator block with IEnumerable<T> - then I can stream objects rather than buffer them.
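A minimal sketch of the iterator-block approach mentioned in the last comment, with hypothetical names (StreamingDemo, Squares):

```csharp
using System;
using System.Collections.Generic;

public static class StreamingDemo
{
    // An iterator block: each value is produced only when the caller asks for it,
    // so nothing is buffered in an array or a list.
    public static IEnumerable<int> Squares(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            yield return i * i;
        }
    }
}
```

A caller can consume it with foreach (var s in StreamingDemo.Squares(3)), which yields 1, 4 and 9 one at a time. If a consumer really needs indexing or a Count, it can still materialise the sequence with ToList() or ToArray() from System.Linq.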