Given:
>>> d = {'a': 1, 'b': 2}
Which of the following is the best way to check if 'a' is in d?
>>> 'a' in d
True
>>> d.has_key('a')
True
'in' wins hands-down, not just in elegance (and not being deprecated ;-) but also in performance, e.g.:
$ python -mtimeit -s'd=dict.fromkeys(range(99))' '12 in d'
10000000 loops, best of 3: 0.0983 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'd.has_key(12)'
1000000 loops, best of 3: 0.21 usec per loop
While the following observation is not always true, you'll notice that usually, in Python, the faster solution is more elegant and Pythonic; that's why -mtimeit is SO helpful: it's not just about saving a hundred nanoseconds here and there! ;-)
has_key appears to be O(1) too.
According to the Python docs:
has_key() is deprecated in favor of key in d.
has_key() was removed in Python 3.
Use dict.has_key() if (and only if) your code is required to be runnable by Python versions earlier than 2.2 (when key in dict was introduced).
There is one example where 'in' actually kills your performance.
If you use 'in' on an O(1) container that only implements __getitem__ and has_key() but not __contains__, you will turn an O(1) search into an O(N) search (as 'in' falls back to a linear search via __getitem__).
The fix is trivial:
def __contains__(self, x):
    return self.has_key(x)
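The fallback described above can be made visible with a small sketch. IndexedBag and FixedBag are hypothetical classes invented for this illustration (they are not from any library), and the getitem_calls counter exists only to show how many lookups the 'in' operator performs. The sketch assumes a sequence-like container indexed from 0, since that is the protocol 'in' falls back to when neither __contains__ nor __iter__ is defined.

```python
class IndexedBag:
    """Hypothetical container: O(1) has_key(), but no __contains__."""

    def __init__(self, values):
        self._values = list(values)
        self._lookup = set(self._values)
        self.getitem_calls = 0  # instrumentation for this demo only

    def __getitem__(self, index):
        self.getitem_calls += 1
        return self._values[index]  # IndexError past the end stops the scan

    def has_key(self, value):
        return value in self._lookup  # O(1) on average


class FixedBag(IndexedBag):
    """Same container with the one-line fix applied."""

    def __contains__(self, value):
        return self.has_key(value)


slow = IndexedBag(range(1000))
fast = FixedBag(range(1000))

slow_found = 999 in slow  # scans slow[0], slow[1], ... via __getitem__
fast_found = 999 in fast  # a single has_key() call, no scan

print(slow_found, slow.getitem_calls)
print(fast_found, fast.getitem_calls)
```

Both tests find the value, but only the first one has to walk the container through __getitem__ to do so.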
has_key() is specific to Python 2 dictionaries. in / __contains__ is the correct API to use; for those containers where a full scan is unavoidable there is no has_key() method anyway, and if there is an O(1) approach then that'll be use-case specific and so up to the developer to pick the right data type for the problem.
Solution to dict.has_key() is deprecated, use 'in' -- sublime text editor 3
Here is an example using a dictionary named 'ages':
ages = {}
# Add a couple of names to the dictionary
ages['Sue'] = 23
ages['Peter'] = 19
ages['Andrew'] = 78
ages['Karren'] = 45
# Use 'in' in the if condition instead of dict_name.has_key(key_name).
if 'Sue' in ages:
    # print() with a single formatted string works on Python 2 and 3
    print("Sue is in the dictionary. She is %d years old" % ages['Sue'])
else:
    print("Sue is not in the dictionary")
Expanding on Alex Martelli's performance tests with Adam Parkin's comments...
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 301, in main
x = t.timeit(number)
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
d.has_key(12)
AttributeError: 'dict' object has no attribute 'has_key'
$ python2.7 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0872 usec per loop
$ python2.7 -mtimeit -s'd=dict.fromkeys(range(1999))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0858 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' '12 in d'
10000000 loops, best of 3: 0.031 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d'
10000000 loops, best of 3: 0.033 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' '12 in d.keys()'
10000000 loops, best of 3: 0.115 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d.keys()'
10000000 loops, best of 3: 0.117 usec per loop
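The same comparison can be run from a script instead of the shell. This is a minimal Python 3 sketch using the standard timeit module; the loop count and the results dictionary are arbitrary choices for illustration, and the absolute numbers will differ from machine to machine, so none are quoted here.

```python
import timeit

# Same setup as the shell commands above.
setup = "d = dict.fromkeys(range(99))"
loops = 100_000  # arbitrary; raise it for steadier numbers

results = {}
for stmt in ("12 in d", "12 in d.keys()"):
    # Best of 3 runs, converted to nanoseconds per loop.
    best = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=3))
    results[stmt] = best / loops * 1e9
    print("%-16s %.1f ns per loop" % (stmt, results[stmt]))
```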
has_key is a dictionary method, but 'in' will work on any collection, and even when __contains__ is missing, 'in' falls back on iterating over the collection to find out.
'in' tests on range objects. I'm not so sure about its efficiency on Python 2 xrange, though. ;)
__contains__ can trivially calculate if a value is in the range or not.
range instance each time. Using a single, pre-existing instance, the "integer in range" test is about 40% faster in my timings.
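To illustrate how a range membership test can be computed rather than searched, here is a minimal sketch. The in_range helper is hypothetical and only handles integer values with a positive step; it is meant to mirror the idea, not to reproduce CPython's actual implementation.

```python
def in_range(value, start, stop, step=1):
    """Arithmetic membership test (sketch: integers and positive step only)."""
    return start <= value < stop and (value - start) % step == 0


big = range(0, 10**15, 3)  # built once and reused for every test
probe = 999_999_999_999    # a multiple of 3 inside the range

builtin_answer = probe in big                    # no iteration for ints in Python 3
sketch_answer = in_range(probe, 0, 10**15, 3)
print(builtin_answer, sketch_answer)
```

Both answers agree, and neither needs to walk the range to get there.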
If you have something like this:
if d.has_key('a'):
change it to below for running on Python 3.X and above:
if 'a' in d:
t.has_key(ew) returns True if the value ew references is also a key in the dictionary. key not in t returns True if the value is not in the dictionary. Moreover, the key = ew alias is very, very redundant. The correct spelling is if ew in t. Which is what the accepted answer from 8 years prior already told you.
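The difference between the two spellings can be checked directly. In this short Python 3 sketch, the contents of t and the value of ew are made up for illustration; only the names come from the comment above.

```python
t = {"spam": 1, "eggs": 2}  # hypothetical dictionary
ew = "spam"                 # hypothetical key

present = ew in t      # what t.has_key(ew) returned in Python 2
absent = ew not in t   # the negation, i.e. not t.has_key(ew)
print(present, absent)
```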
keys() is just a set-like view into a dictionary rather than a copy, so x in d.keys() is O(1). Still, x in d is more Pythonic.
x in d.keys() must construct and destroy a temporary object, complete with the memory allocation that entails, whereas x in d is just doing an arithmetic operation (computing the hash) and doing a lookup. Note that d.keys() only takes about 10 times as long as this, which is still not long really. I haven't checked but I'm still pretty sure it's only O(1).
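A short Python 3 sketch of the "view, not a copy" behaviour described above; the dictionary contents are arbitrary.

```python
d = {"a": 1, "b": 2}
keys = d.keys()  # a view object; no keys are copied

d["c"] = 3       # mutate the dictionary after taking the view

view_sees_new_key = "c" in keys                  # the view reflects the insertion
same_answer = ("a" in d) == ("a" in keys)        # both spellings agree
common = keys & {"a", "z"}                       # views support set operations
print(view_sees_new_key, same_answer, common)
```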