I know how to use both for loops and if statements on separate lines, such as:
>>> a = [2,3,4,5,6,7,8,9,0]
>>> xyz = [0,12,4,6,242,7,9]
>>> for x in xyz:
...     if x in a:
...         print(x)
...
0
4
6
7
9
And I know I can use a list comprehension to combine these when the statements are simple, such as:
print([x for x in xyz if x in a])
But what I can't find is a good example anywhere (to copy and learn from) demonstrating a complex set of commands (not just "print x") that occur following a combination of a for loop and some if statements. Something that I would expect looks like:
for x in xyz if x not in a:
    print(x...)
Is this just not the way python is supposed to work?
for loop and if statement.
x in a is slow if a is a list.
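The remark about list membership can be illustrated with a minimal sketch. It assumes the items are hashable, and the names a_set and missing are introduced here only for illustration:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# Build the set once; each "x not in a_set" is then a hash lookup
# instead of a scan over the whole list.
a_set = set(a)

missing = [x for x in xyz if x not in a_set]
print(missing)  # [12, 242]
```

For lists this small the difference is negligible; it starts to matter as a grows.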
You can use generator expressions like this:
gen = (x for x in xyz if x not in a)
for x in gen:
    print(x)
As per The Zen of Python (if you are wondering whether your code is "Pythonic", that's the place to go):
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Flat is better than nested.
Readability counts.
The Pythonic way of getting the sorted intersection of two sets is:
>>> sorted(set(a).intersection(xyz))
[0, 4, 6, 7, 9]
Or those elements that are in xyz but not in a:
>>> sorted(set(xyz).difference(a))
[12, 242]
But for a more complicated loop you may want to flatten it by iterating over a well-named generator expression and/or calling out to a well-named function. Trying to fit everything on one line is rarely "Pythonic".
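One way to read that advice, as a rough sketch: the filtering goes into a named generator expression and the multi-step work goes into a named function. The helper describe, its message text, and the names known, unknown_days and messages are all hypothetical, made up for this illustration:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]


def describe(day):
    """Hypothetical multi-step body: build a message for one item."""
    suffix = "th"  # deliberately naive, good enough for a sketch
    return "No entry for the {0}{1} day".format(day, suffix)


known = set(a)
# The filter lives in a well-named generator expression ...
unknown_days = (day for day in xyz if day not in known)

# ... so the loop body stays flat and can be as long as it needs to be.
messages = []
for day in unknown_days:
    message = describe(day)
    messages.append(message)
    print(message)
```

The loop header then reads almost like the syntax the question wishes for, without nesting an if inside the for.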
Update following additional comments on your question and the accepted answer
I'm not sure what you are trying to do with enumerate, but if a is a dictionary, you probably want to use the keys, like this:
>>> a = {
... 2: 'Turtle Doves',
... 3: 'French Hens',
... 4: 'Colly Birds',
... 5: 'Gold Rings',
... 6: 'Geese-a-Laying',
... 7: 'Swans-a-Swimming',
... 8: 'Maids-a-Milking',
... 9: 'Ladies Dancing',
... 0: 'Camel Books',
... }
>>>
>>> xyz = [0, 12, 4, 6, 242, 7, 9]
>>>
>>> known_things = sorted(set(a.keys()).intersection(xyz))
>>> unknown_things = sorted(set(xyz).difference(a.keys()))
>>>
>>> for thing in known_things:
...     print('I know about', a[thing])
...
I know about Camel Books
I know about Colly Birds
I know about Geese-a-Laying
I know about Swans-a-Swimming
I know about Ladies Dancing
>>> print('...but...')
...but...
>>>
>>> for thing in unknown_things:
...     print("I don't know what happened on the {0}th day of Christmas".format(thing))
...
I don't know what happened on the 12th day of Christmas
I don't know what happened on the 242th day of Christmas
The following is a simplification/one liner from the accepted answer:
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
for x in (x for x in xyz if x not in a):
    print(x)
12
242
Notice that the generator was kept inline. This was tested on python2.7 and python3.6 (notice the parens in the print ;) )
It is honestly cumbersome even so: the x is mentioned four times.
I personally think this is the prettiest version:
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
for x in filter(lambda w: w in a, xyz):
    print(x)
Edit
If you are very keen to avoid lambda, you can use partial function application together with the operator module (which provides function versions of most operators).
https://docs.python.org/2/library/operator.html#module-operator
from operator import contains
from functools import partial
print(list(filter(partial(contains, a), xyz)))
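To make the partial application less opaque, here is a small sketch of what it expands to. operator.contains(a, x) is the function form of x in a, so fixing its first argument gives a one-argument predicate. The names is_in_a, via_partial and via_lambda are illustrative only:

```python
from functools import partial
from operator import contains

a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# contains(a, x) means "x in a"; partial fixes the first argument.
is_in_a = partial(contains, a)

via_partial = list(filter(is_in_a, xyz))
via_lambda = list(filter(lambda w: w in a, xyz))

print(via_partial)                # [0, 4, 6, 7, 9]
print(via_partial == via_lambda)  # True
```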
filter(a.__contains__, xyz). Usually when people use lambda, they really need something much simpler.
__contains__ is a method like any other, only it is a special method, meaning it can be called indirectly by an operator (in in this case). But it can also be called directly, it is a part of the public API. Private names are specifically defined as having at most one trailing underscore, to provide exception for special method names - and they are subject to name mangling when lexically in class scopes. See docs.python.org/3/reference/datamodel.html#specialnames and docs.python.org/3.6/tutorial/classes.html#private-variables .
in is singly dispatched wrt right operand). Besides, note that operator also exports contains method under the name __contains__, so it surely is not a private name. I think you'll just have to learn to live with the fact that not every double underscore means "keep away". :-]
lambda needs fixing to include not: lambda w: not w in a, xyz
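For the negated case there is also a route that needs no lambda at all. This is a sketch assuming Python 3, where itertools.filterfalse exists (Python 2 calls it ifilterfalse); the names in_a and not_in_a are illustrative:

```python
from itertools import filterfalse

a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# filter keeps the items for which the predicate is true ...
in_a = list(filter(a.__contains__, xyz))
# ... filterfalse keeps the items for which it is false.
not_in_a = list(filterfalse(a.__contains__, xyz))

print(in_a)      # [0, 4, 6, 7, 9]
print(not_in_a)  # [12, 242]
```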
I would probably use:
for x in xyz:
    if x not in a:
        print(x...)
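When the body grows beyond a single print, a common variant of the plain loop is to skip unwanted items early with continue, so the rest of the body stays at one indentation level. A minimal sketch; the doubling step and the seen list are stand-ins for whatever the real body does:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

seen = []
for x in xyz:
    if x in a:
        continue  # skip early; everything below runs only when x is not in a

    # Placeholder for a longer body with several statements.
    doubled = x * 2
    seen.append((x, doubled))
    print(x, doubled)
```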
pythonic results. I can code functionally in every other language I use (scala, kotlin, javascript, R, swift, ..) but difficult/awkward in python
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
set(a) & set(xyz)
set([0, 9, 4, 6, 7])
import time
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
start = time.time()
print(set(a) & set(xyz))
print(time.time() - start)
if x in ignore: ...
if set(a) - set(ignore) == set([]): so perhaps that's why it was much slower than checking membership. I'll test this again in the future on a much simpler example than what I'm writing.
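A single time.time() difference on a seven-element list mostly measures noise, so a comparison like the one discussed here is probably easier with timeit. The following is only a sketch: the list sizes, the repeat count, and the function names with_list and with_set are arbitrary assumptions, and the timings will vary by machine.

```python
import timeit

# Larger inputs than in the question, so the difference is measurable.
a = list(range(1000))
xyz = list(range(500, 1500))


def with_list():
    # Each "x in a" scans the list.
    return [x for x in xyz if x in a]


def with_set():
    # The set is rebuilt on every call, so its construction cost is included.
    a_set = set(a)
    return [x for x in xyz if x in a_set]


same_result = with_list() == with_set()

list_time = timeit.timeit(with_list, number=10)
set_time = timeit.timeit(with_set, number=10)

print("same result:", same_result)
print("list membership: {:.4f}s".format(list_time))
print("set membership:  {:.4f}s".format(set_time))
```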
You can use generator functions too, if generator expressions become too involved or complex:
def gen():
    for x in xyz:
        if x in a:
            yield x
for x in gen():
    print(x)
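A slightly more reusable sketch of the same idea passes the inputs in as parameters instead of reading xyz and a from the enclosing scope. The name items_in and its signature are hypothetical, and the set conversion assumes hashable items:

```python
def items_in(candidates, allowed):
    """Yield each candidate that also appears in allowed (hypothetical helper)."""
    allowed = set(allowed)  # one-time conversion for fast membership tests
    for x in candidates:
        if x in allowed:
            yield x


a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

collected = []
for x in items_in(xyz, a):
    collected.append(x)
    print(x)
```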
I liked Alex's answer, because a filter is exactly an if applied to a list. So if you want to explore a subset of a list given a condition, this seems to be the most natural way:
mylist = [1,2,3,4,5]
another_list = [2,3,4]
wanted = lambda x:x in another_list
for x in filter(wanted, mylist):
    print(x)
This method is useful for separation of concerns: if the condition changes, the only code to touch is the condition function itself.
mylist = [1,2,3,4,5]
wanted = lambda x:(x**0.5) > 10**0.3
for x in filter(wanted, mylist):
    print(x)
The generator method seems better when you don't want members of the list, but a modification of said members, which seems more fit to a generator
mylist = [1,2,3,4,5]
wanted = lambda x:(x**0.5) > 10**0.3
generator = (x**0.5 for x in mylist if wanted(x))
for x in generator:
    print(x)
Also, filters work with generators, although in this case it isn't efficient
mylist = [1,2,3,4,5]
wanted = lambda x:(x**0.5) > 10**0.3
generator = (x**0.9 for x in mylist)
for x in filter(wanted, generator):
    print(x)
But of course, it would still be nice to write like this:
mylist = [1,2,3,4,5]
wanted = lambda x:(x**0.5) > 10**0.3
# for x in filter(wanted, mylist):
for x in mylist if wanted(x):
    print(x)
Use intersection or intersection_update
intersection:
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
ans = sorted(set(a).intersection(set(xyz)))
intersection_update:
a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
b = set(a)
b.intersection_update(xyz)
then b is your answer
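The practical difference between the two methods is that intersection returns a new set, while intersection_update modifies the set in place and returns None. A short sketch; the name result is introduced here only to show the return value:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# intersection builds a new set and leaves set(a) untouched.
ans = sorted(set(a).intersection(xyz))

# intersection_update changes b itself and returns None.
b = set(a)
result = b.intersection_update(xyz)

print(ans)        # [0, 4, 6, 7, 9]
print(result)     # None
print(sorted(b))  # [0, 4, 6, 7, 9]
```

Both methods accept any iterable, so wrapping xyz in set() first is optional.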
A simple way to find unique common elements of lists a and b:
a = [1,2,3]
b = [3,6,2]
for both in set(a) & set(b):
    print(both)
based on the article here: https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a I used the following code for the same reason and it worked just fine:
an_array = [x for x in xyz if x not in a]
This line is part of a larger program: xyz must be a list that has been defined and assigned beforehand, and so must the variable a.
Using a generator expression (as recommended in the selected answer) causes some difficulty because the result is not a list.
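The difficulty with generator expressions can be seen in a few lines: a generator has no length or indexing and can be consumed only once, while list() turns it into an ordinary list. A minimal sketch with illustrative names first_pass and second_pass:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

gen = (x for x in xyz if x not in a)

first_pass = list(gen)   # materialize the generator into a list
second_pass = list(gen)  # the generator is already exhausted

print(first_pass)   # [12, 242]
print(second_pass)  # []
```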
gen = (y for (x,y) in enumerate(xyz) if x not in a) returns >>> 12 when I type for x in gen: print x -- so why the unexpected behavior with enumerate?
for x in xyz if x:
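The behavior with enumerate follows from what it yields: (index, value) pairs. In the expression from the comment, x is therefore the index and y the value, so the filter tests positions rather than elements. A sketch with the same data; the names pairs, by_index and by_value are illustrative:

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

pairs = list(enumerate(xyz))
print(pairs[:3])  # [(0, 0), (1, 12), (2, 4)]

# Filters on the index x: only index 1 is missing from a, so only 12 survives.
by_index = [y for (x, y) in enumerate(xyz) if x not in a]

# Filters on the value itself, which is what the question is after.
by_value = [y for y in xyz if y not in a]

print(by_index)  # [12]
print(by_value)  # [12, 242]
```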
for x in (x for x in xyz if x not in a): works for me, but why you shouldn't just be able to do for x in xyz if x not in a:, I'm not sure...