I have two dictionaries, but for simplification, I will take these two:
>>> x = dict(a=1, b=2)
>>> y = dict(a=2, b=2)
Now, I want to compare whether each key, value
pair in x
has the same corresponding value in y
. So I wrote this:
>>> for x_values, y_values in zip(x.iteritems(), y.iteritems()):
if x_values == y_values:
print 'Ok', x_values, y_values
else:
print 'Not', x_values, y_values
And it works since a tuple
is returned and then compared for equality.
My questions:
Is this correct? Is there a better way to do this? Better not in speed, I am talking about code elegance.
UPDATE: I forgot to mention that I have to check how many key, value
pairs are equal.
x == y
should be true according to stackoverflow.com/a/5635309/186202
x == y
should be true according to the official documentation: "Dictionaries compare equal if and only if they have the same (key, value) pairs (regardless of ordering). Order comparisons (‘<’, ‘<=’, ‘>=’, ‘>’) raise TypeError."
If you want to know how many values match in both the dictionaries, you should have said that :)
Maybe something like this:
shared_items = {k: x[k] for k in x if k in y and x[k] == y[k]}
print(len(shared_items))
What you want to do is simply x==y
What you do is not a good idea, because the items in a dictionary are not supposed to have any order. You might be comparing [('a',1),('b',1)]
with [('b',1), ('a',1)]
(same dictionaries, different order).
For example, see this:
>>> x = dict(a=2, b=2,c=3, d=4)
>>> x
{'a': 2, 'c': 3, 'b': 2, 'd': 4}
>>> y = dict(b=2,c=3, d=4)
>>> y
{'c': 3, 'b': 2, 'd': 4}
>>> zip(x.iteritems(), y.iteritems())
[(('a', 2), ('c', 3)), (('c', 3), ('b', 2)), (('b', 2), ('d', 4))]
The difference is only one item, but your algorithm will see that all items are different
def dict_compare(d1, d2):
d1_keys = set(d1.keys())
d2_keys = set(d2.keys())
shared_keys = d1_keys.intersection(d2_keys)
added = d1_keys - d2_keys
removed = d2_keys - d1_keys
modified = {o : (d1[o], d2[o]) for o in shared_keys if d1[o] != d2[o]}
same = set(o for o in shared_keys if d1[o] == d2[o])
return added, removed, modified, same
x = dict(a=1, b=2)
y = dict(a=2, b=2)
added, removed, modified, same = dict_compare(x, y)
DataFrame
s by design don't allow truthy comparisons (unless it has a length of 1) as they inherit from numpy.ndarray
. -credit to stackoverflow.com/a/33307396/994076
dic1 == dic2
From python docs:
The following examples all return a dictionary equal to {"one": 1, "two": 2, "three": 3}: >>> a = dict(one=1, two=2, three=3) >>> b = {'one': 1, 'two': 2, 'three': 3} >>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3])) >>> d = dict([('two', 2), ('one', 1), ('three', 3)]) >>> e = dict({'three': 3, 'one': 1, 'two': 2}) >>> a == b == c == d == e True
Providing keyword arguments as in the first example only works for keys that are valid Python identifiers. Otherwise, any valid keys can be used.
Comparison is valid for python2
and python3
.
OrderedDict != dict
Since it seems nobody mentioned deepdiff
, I will add it here for completeness. I find it very convenient for getting diff of (nested) objects in general:
Installation
pip install deepdiff
Sample code
import deepdiff
import json
dict_1 = {
"a": 1,
"nested": {
"b": 1,
}
}
dict_2 = {
"a": 2,
"nested": {
"b": 2,
}
}
diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(diff, indent=4))
Output
{
"values_changed": {
"root['a']": {
"new_value": 2,
"old_value": 1
},
"root['nested']['b']": {
"new_value": 2,
"old_value": 1
}
}
}
Note about pretty-printing the result for inspection: The above code works if both dicts have the same attribute keys (with possibly different attribute values as in the example). However, if an "extra"
attribute is present is one of the dicts, json.dumps()
fails with
TypeError: Object of type PrettyOrderedSet is not JSON serializable
Solution: use diff.to_json()
and json.loads()
/ json.dumps()
to pretty-print:
import deepdiff
import json
dict_1 = {
"a": 1,
"nested": {
"b": 1,
},
"extra": 3
}
dict_2 = {
"a": 2,
"nested": {
"b": 2,
}
}
diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(json.loads(diff.to_json()), indent=4))
Output:
{
"dictionary_item_removed": [
"root['extra']"
],
"values_changed": {
"root['a']": {
"new_value": 2,
"old_value": 1
},
"root['nested']['b']": {
"new_value": 2,
"old_value": 1
}
}
}
Alternative: use pprint
, results in a different formatting:
import pprint
# same code as above
pprint.pprint(diff, indent=4)
Output:
{ 'dictionary_item_removed': [root['extra']],
'values_changed': { "root['a']": { 'new_value': 2,
'old_value': 1},
"root['nested']['b']": { 'new_value': 2,
'old_value': 1}}}
I'm new to python but I ended up doing something similar to @mouad
unmatched_item = set(dict_1.items()) ^ set(dict_2.items())
len(unmatched_item) # should be 0
The XOR operator (^
) should eliminate all elements of the dict when they are the same in both dicts.
{'a':{'b':1}}
gives TypeError: unhashable type: 'dict'
)
Just use:
assert cmp(dict1, dict2) == 0
dict1 == dict2
cmp
built in has been removed (and should be treated as removed before. An alternative they propose: (a > b) - (a < b) == cmp(a, b)
for a functional equivalent (or better __eq__
and __hash__
)
TypeError
: unorderable types: dict() < dict()
@mouad 's answer is nice if you assume that both dictionaries contain simple values only. However, if you have dictionaries that contain dictionaries you'll get an exception as dictionaries are not hashable.
Off the top of my head, something like this might work:
def compare_dictionaries(dict1, dict2):
if dict1 is None or dict2 is None:
print('Nones')
return False
if (not isinstance(dict1, dict)) or (not isinstance(dict2, dict)):
print('Not dict')
return False
shared_keys = set(dict1.keys()) & set(dict2.keys())
if not ( len(shared_keys) == len(dict1.keys()) and len(shared_keys) == len(dict2.keys())):
print('Not all keys are shared')
return False
dicts_are_equal = True
for key in dict1.keys():
if isinstance(dict1[key], dict) or isinstance(dict2[key], dict):
dicts_are_equal = dicts_are_equal and compare_dictionaries(dict1[key], dict2[key])
else:
dicts_are_equal = dicts_are_equal and all(atleast_1d(dict1[key] == dict2[key]))
return dicts_are_equal
not isinstance(dict1, dict)
instead of type(dict1) is not dict
, this will work on other classes based on dict. Also, instead of
(dict1[key] == dict2[key]), you can do
all(atleast_1d(dict1[key] == dict2[key]))` to handle arrays at least.
for loop
as soon as your dicts_are_equal
becomes false. There's no need to continue any further.
>>> dict2 = {"a": {"a": {"a": "b"}}} >>> dict1 = {"a": {"a": {"a": "b"}}} >>> dict1 == dict2 True >>> dict1 = {"a": {"a": {"a": "a"}}} >>> dict1 == dict2 False
Yet another possibility, up to the last note of the OP, is to compare the hashes (SHA
or MD
) of the dicts dumped as JSON. The way hashes are constructed guarantee that if they are equal, the source strings are equal as well. This is very fast and mathematically sound.
import json
import hashlib
def hash_dict(d):
return hashlib.sha1(json.dumps(d, sort_keys=True)).hexdigest()
x = dict(a=1, b=2)
y = dict(a=2, b=2)
z = dict(a=1, b=2)
print(hash_dict(x) == hash_dict(y))
print(hash_dict(x) == hash_dict(z))
json.dumps(d, sort_keys=True)
will give you canonical JSON so that you can be certain that both dict are equivalent. Also it depends what you are trying to achive. As soon as the value are not JSON serizalizable it will fail. For thus who say it is inefficient, have a look at the ujson project.
The function is fine IMO, clear and intuitive. But just to give you (another) answer, here is my go:
def compare_dict(dict1, dict2):
for x1 in dict1.keys():
z = dict1.get(x1) == dict2.get(x1)
if not z:
print('key', x1)
print('value A', dict1.get(x1), '\nvalue B', dict2.get(x1))
print('-----\n')
Can be useful for you or for anyone else..
EDIT:
I have created a recursive version of the one above.. Have not seen that in the other answers
def compare_dict(a, b):
# Compared two dictionaries..
# Posts things that are not equal..
res_compare = []
for k in set(list(a.keys()) + list(b.keys())):
if isinstance(a[k], dict):
z0 = compare_dict(a[k], b[k])
else:
z0 = a[k] == b[k]
z0_bool = np.all(z0)
res_compare.append(z0_bool)
if not z0_bool:
print(k, a[k], b[k])
return np.all(res_compare)
To test if two dicts are equal in keys and values:
def dicts_equal(d1,d2):
""" return True if all keys and values are the same """
return all(k in d2 and d1[k] == d2[k]
for k in d1) \
and all(k in d1 and d1[k] == d2[k]
for k in d2)
If you want to return the values which differ, write it differently:
def dict1_minus_d2(d1, d2):
""" return the subset of d1 where the keys don't exist in d2 or
the values in d2 are different, as a dict """
return {k,v for k,v in d1.items() if k in d2 and v == d2[k]}
You would have to call it twice i.e
dict1_minus_d2(d1,d2).extend(dict1_minus_d2(d2,d1))
A simple compare with == should be enough nowadays (python 3.8). Even when you compare the same dicts in a different order (last example). The best thing is, you don't need a third-party package to accomplish this.
a = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
b = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
c = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
d = {'one': 'dog', 'two': 'cat', 'three': 'mouse', 'four': 'fish'}
e = {'one': 'cat', 'two': 'dog', 'three': 'mouse'}
f = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
g = {'two': 'cat', 'one': 'dog', 'three': 'mouse'}
h = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
print(a == b) # True
print(c == d) # False
print(e == f) # False
print(g == h) # True
Code
def equal(a, b):
type_a = type(a)
type_b = type(b)
if type_a != type_b:
return False
if isinstance(a, dict):
if len(a) != len(b):
return False
for key in a:
if key not in b:
return False
if not equal(a[key], b[key]):
return False
return True
elif isinstance(a, list):
if len(a) != len(b):
return False
while len(a):
x = a.pop()
index = indexof(x, b)
if index == -1:
return False
del b[index]
return True
else:
return a == b
def indexof(x, a):
for i in range(len(a)):
if equal(x, a[i]):
return i
return -1
Test
>>> a = {
'number': 1,
'list': ['one', 'two']
}
>>> b = {
'list': ['two', 'one'],
'number': 1
}
>>> equal(a, b)
True
The easiest way (and one of the more robust at that) to do a deep comparison of two dictionaries is to serialize them in JSON format, sorting the keys, and compare the string results:
import json
if json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True):
... Do something ...
I am using this solution that works perfectly for me in Python 3
import logging
log = logging.getLogger(__name__)
...
def deep_compare(self,left, right, level=0):
if type(left) != type(right):
log.info("Exit 1 - Different types")
return False
elif type(left) is dict:
# Dict comparison
for key in left:
if key not in right:
log.info("Exit 2 - missing {} in right".format(key))
return False
else:
if not deep_compare(left[str(key)], right[str(key)], level +1 ):
log.info("Exit 3 - different children")
return False
return True
elif type(left) is list:
# List comparison
for key in left:
if key not in right:
log.info("Exit 4 - missing {} in right".format(key))
return False
else:
if not deep_compare(left[left.index(key)], right[right.index(key)], level +1 ):
log.info("Exit 5 - different children")
return False
return True
else:
# Other comparison
return left == right
return False
It compares dict, list and any other types that implements the "==" operator by themselves. If you need to compare something else different, you need to add a new branch in the "if tree".
Hope that helps.
for python3:
data_set_a = dict_a.items()
data_set_b = dict_b.items()
difference_set = data_set_a ^ data_set_b
Why not just iterate through one dictionary and check the other in the process (assuming both dictionaries have the same keys)?
x = dict(a=1, b=2)
y = dict(a=2, b=2)
for key, val in x.items():
if val == y[key]:
print ('Ok', val, y[key])
else:
print ('Not', val, y[key])
Output:
Not 1 2
Ok 2 2
In PyUnit there's a method which compares dictionaries beautifully. I tested it using the following two dictionaries, and it does exactly what you're looking for.
d1 = {1: "value1",
2: [{"subKey1":"subValue1",
"subKey2":"subValue2"}]}
d2 = {1: "value1",
2: [{"subKey2":"subValue2",
"subKey1": "subValue1"}]
}
def assertDictEqual(self, d1, d2, msg=None):
self.assertIsInstance(d1, dict, 'First argument is not a dictionary')
self.assertIsInstance(d2, dict, 'Second argument is not a dictionary')
if d1 != d2:
standardMsg = '%s != %s' % (safe_repr(d1, True), safe_repr(d2, True))
diff = ('\n' + '\n'.join(difflib.ndiff(
pprint.pformat(d1).splitlines(),
pprint.pformat(d2).splitlines())))
standardMsg = self._truncateMessage(standardMsg, diff)
self.fail(self._formatMessage(msg, standardMsg))
I'm not recommending importing unittest
into your production code. My thought is the source in PyUnit could be re-tooled to run in production. It uses pprint
which "pretty prints" the dictionaries. Seems pretty easy to adapt this code to be "production ready".
Being late in my response is better than never!
Compare Not_Equal is more efficient than comparing Equal. As such two dicts are not equal if any key values in one dict is not found in the other dict. The code below takes into consideration that you maybe comparing default dict and thus uses get instead of getitem [].
Using a kind of random value as default in the get call equal to the key being retrieved - just in case the dicts has a None as value in one dict and that key does not exist in the other. Also the get != condition is checked before the not in condition for efficiency because you are doing the check on the keys and values from both sides at the same time.
def Dicts_Not_Equal(first,second):
""" return True if both do not have same length or if any keys and values are not the same """
if len(first) == len(second):
for k in first:
if first.get(k) != second.get(k,k) or k not in second: return (True)
for k in second:
if first.get(k,k) != second.get(k) or k not in first: return (True)
return (False)
return (True)
>>> hash_1
{'a': 'foo', 'b': 'bar'}
>>> hash_2
{'a': 'foo', 'b': 'bar'}
>>> set_1 = set (hash_1.iteritems())
>>> set_1
set([('a', 'foo'), ('b', 'bar')])
>>> set_2 = set (hash_2.iteritems())
>>> set_2
set([('a', 'foo'), ('b', 'bar')])
>>> len (set_1.difference(set_2))
0
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
... print "The two hashes match."
...
The two hashes match.
>>> hash_2['c'] = 'baz'
>>> hash_2
{'a': 'foo', 'c': 'baz', 'b': 'bar'}
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
... print "The two hashes match."
...
>>>
>>> hash_2.pop('c')
'baz'
Here's another option:
>>> id(hash_1)
140640738806240
>>> id(hash_2)
140640738994848
So as you see the two id's are different. But the rich comparison operators seem to do the trick:
>>> hash_1 == hash_2
True
>>>
>>> hash_2
{'a': 'foo', 'b': 'bar'}
>>> set_2 = set (hash_2.iteritems())
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
... print "The two hashes match."
...
The two hashes match.
>>>
see dictionary view objects: https://docs.python.org/2/library/stdtypes.html#dict
This way you can subtract dictView2 from dictView1 and it will return a set of key/value pairs that are different in dictView2:
original = {'one':1,'two':2,'ACTION':'ADD'}
originalView=original.viewitems()
updatedDict = {'one':1,'two':2,'ACTION':'REPLACE'}
updatedDictView=updatedDict.viewitems()
delta=original | updatedDict
print delta
>>set([('ACTION', 'REPLACE')])
You can intersect, union, difference (shown above), symmetric difference these dictionary view objects. Better? Faster? - not sure, but part of the standard library - which makes it a big plus for portability
Here is my answer, use a recursize way:
def dict_equals(da, db):
if not isinstance(da, dict) or not isinstance(db, dict):
return False
if len(da) != len(db):
return False
for da_key in da:
if da_key not in db:
return False
if not isinstance(db[da_key], type(da[da_key])):
return False
if isinstance(da[da_key], dict):
res = dict_equals(da[da_key], db[da_key])
if res is False:
return False
elif da[da_key] != db[da_key]:
return False
return True
a = {1:{2:3, 'name': 'cc', "dd": {3:4, 21:"nm"}}}
b = {1:{2:3, 'name': 'cc', "dd": {3:4, 21:"nm"}}}
print dict_equals(a, b)
Hope that helps!
Below code will help you to compare list of dict in python
def compate_generic_types(object1, object2):
if isinstance(object1, str) and isinstance(object2, str):
return object1 == object2
elif isinstance(object1, unicode) and isinstance(object2, unicode):
return object1 == object2
elif isinstance(object1, bool) and isinstance(object2, bool):
return object1 == object2
elif isinstance(object1, int) and isinstance(object2, int):
return object1 == object2
elif isinstance(object1, float) and isinstance(object2, float):
return object1 == object2
elif isinstance(object1, float) and isinstance(object2, int):
return object1 == float(object2)
elif isinstance(object1, int) and isinstance(object2, float):
return float(object1) == object2
return True
def deep_list_compare(object1, object2):
retval = True
count = len(object1)
object1 = sorted(object1)
object2 = sorted(object2)
for x in range(count):
if isinstance(object1[x], dict) and isinstance(object2[x], dict):
retval = deep_dict_compare(object1[x], object2[x])
if retval is False:
print "Unable to match [{0}] element in list".format(x)
return False
elif isinstance(object1[x], list) and isinstance(object2[x], list):
retval = deep_list_compare(object1[x], object2[x])
if retval is False:
print "Unable to match [{0}] element in list".format(x)
return False
else:
retval = compate_generic_types(object1[x], object2[x])
if retval is False:
print "Unable to match [{0}] element in list".format(x)
return False
return retval
def deep_dict_compare(object1, object2):
retval = True
if len(object1) != len(object2):
return False
for k in object1.iterkeys():
obj1 = object1[k]
obj2 = object2[k]
if isinstance(obj1, list) and isinstance(obj2, list):
retval = deep_list_compare(obj1, obj2)
if retval is False:
print "Unable to match [{0}]".format(k)
return False
elif isinstance(obj1, dict) and isinstance(obj2, dict):
retval = deep_dict_compare(obj1, obj2)
if retval is False:
print "Unable to match [{0}]".format(k)
return False
else:
retval = compate_generic_types(obj1, obj2)
if retval is False:
print "Unable to match [{0}]".format(k)
return False
return retval
>>> x = {'a':1,'b':2,'c':3}
>>> x
{'a': 1, 'b': 2, 'c': 3}
>>> y = {'a':2,'b':4,'c':3}
>>> y
{'a': 2, 'b': 4, 'c': 3}
METHOD 1:
>>> common_item = x.items()&y.items() #using union,x.item()
>>> common_item
{('c', 3)}
METHOD 2:
>>> for i in x.items():
if i in y.items():
print('true')
else:
print('false')
false
false
true
In Python 3.6, It can be done as:-
if (len(dict_1)==len(dict_2):
for i in dict_1.items():
ret=bool(i in dict_2.items())
ret variable will be true if all the items of dict_1 in present in dict_2
You can find that out by writing your own function in the following way.
class Solution:
def find_if_dict_equal(self,dict1,dict2):
dict1_keys=list(dict1.keys())
dict2_keys=list(dict2.keys())
if len(dict1_keys)!=len(dict2_keys):
return False
for i in dict1_keys:
if i not in dict2 or dict2[i]!=dict1[i]:
return False
return True
def findAnagrams(self, s, p):
if len(s)<len(p):
return []
p_dict={}
for i in p:
if i not in p_dict:
p_dict[i]=0
p_dict[i]+=1
s_dict={}
final_list=[]
for i in s[:len(p)]:
if i not in s_dict:
s_dict[i]=0
s_dict[i]+=1
if self.find_if_dict_equal(s_dict,p_dict):
final_list.append(0)
for i in range(len(p),len(s)):
element_to_add=s[i]
element_to_remove=s[i-len(p)]
if element_to_add not in s_dict:
s_dict[element_to_add]=0
s_dict[element_to_add]+=1
s_dict[element_to_remove]-=1
if s_dict[element_to_remove]==0:
del s_dict[element_to_remove]
if self.find_if_dict_equal(s_dict,p_dict):
final_list.append(i-len(p)+1)
return final_list
I have a default/template dictionry that I want to update its values from a second given dictionary. Thus the update will happen on keys that exist in the default dictionary and if the related value is compatible with the default key/value type.
Somehow this is similar to the question above.
I wrote this solution:
CODE
def compDict(gDict, dDict):
gDictKeys = list(gDict.keys())
for gDictKey in gDictKeys:
try:
dDict[gDictKey]
except KeyError:
# Do the operation you wanted to do for "key not present in dict".
print(f'\nkey \'{gDictKey}\' does not exist! Dictionary key/value no set !!!\n')
else:
# check on type
if type(gDict[gDictKey]) == type(dDict[gDictKey]):
if type(dDict[gDictKey])==dict:
compDict(gDict[gDictKey],dDict[gDictKey])
else:
dDict[gDictKey] = gDict[gDictKey]
print('\n',dDict, 'update successful !!!\n')
else:
print(f'\nValue \'{gDict[gDictKey]}\' for \'{gDictKey}\' not a compatible data type !!!\n')
# default dictionary
dDict = {'A':str(),
'B':{'Ba':int(),'Bb':float()},
'C':list(),
}
# given dictionary
gDict = {'A':1234, 'a':'addio', 'C':['HELLO'], 'B':{'Ba':3,'Bb':'wrong'}}
compDict(gDict, dDict)
print('Updated default dictionry: ',dDict)
OUTPUT
Value '1234' for 'A' not a compatible data type !!!
key 'a' does not exist! Dictionary key/value no set !!!
{'A': '', 'B': {'Ba': 0, 'Bb': 0.0}, 'C': ['HELLO']} update successful !!!
{'Ba': 3, 'Bb': 0.0} update successful !!!
Value 'wrong' for 'Bb' not a compatible data type !!!
Updated default dictionry: {'A': '', 'B': {'Ba': 3, 'Bb': 0.0}, 'C': ['HELLO']}
import json
if json.dumps(dict1) == json.dumps(dict2):
print("Equal")
json.dumps
is deterministic with the default settings ).
Success story sharing
list
key in the first place.x = {[1,2]: 2}
will fail. The question already has validdicts
.list
keys is not valid python code - dict keys must be immutable. Therefore you are not comparing dictionaries. If you try and use a list as a dictionary key your code will not run. You have no objects for which to compare. This is like typingx = dict(23\;dfg&^*$^%$^$%^)
then complaining how the comparison does not work with the dictionary. Of course it will not work. Tim's comment on the other hand is about mutablevalues
, hence why I said that these are different issues.set
requires values to be hashable anddict
requires keys to be hashable.set(x.keys())
will always work because keys are required to be hashable, butset(x.values())
will fail on values that aren't hashable.