I've very recently migrated to Python 3.5. This code was working properly in Python 2.7:
with open(fname, 'rb') as f:
    lines = [x.strip() for x in f.readlines()]

for line in lines:
    tmp = line.strip().lower()
    if 'some-pattern' in tmp: continue
    # ... code
After upgrading to 3.5, I'm getting this error:
TypeError: a bytes-like object is required, not 'str'
The error is on the last line (the pattern search code).
I've tried using the .decode() function on either side of the statement and also tried:
if tmp.find('some-pattern') != -1: continue
but to no avail.
I was able to resolve almost all Python 2-to-Python 3 issues quickly, but this little statement was bugging me.
result = requests.get and I attempt to x = result.content.split("\n"). I am a little confused by the error message because it seems to imply that result.content is a string and .split() is requiring a bytes-like object ("a bytes-like object is required, not 'str'").
'rb' option to 'r' to treat the file as a string
You opened the file in binary mode:
with open(fname, 'rb') as f:
This means that all data read from the file is returned as bytes objects, not str. You cannot then use a string in a containment test:
if 'some-pattern' in tmp: continue
You'd have to use a bytes object to test against tmp instead:
if b'some-pattern' in tmp: continue
or open the file as a text file instead by replacing the 'rb' mode with 'r'.
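A minimal sketch of both fixes side by side. It uses an in-memory io.BytesIO as a stand-in for the file opened with 'rb', and the sample data and variable names are made up for illustration:

```python
import io

# Hypothetical sample data standing in for the contents of fname.
raw = b"keep this line\nskip SOME-PATTERN here\nkeep this too\n"

# Fix 1: stay in binary mode and compare bytes with bytes.
kept_bytes = []
for line in io.BytesIO(raw):
    tmp = line.strip().lower()
    if b'some-pattern' in tmp:
        continue
    kept_bytes.append(tmp)

# Fix 2: read text (what open(fname, 'r') gives you) and compare str with str.
kept_text = []
for line in io.TextIOWrapper(io.BytesIO(raw), encoding='utf-8'):
    tmp = line.strip().lower()
    if 'some-pattern' in tmp:
        continue
    kept_text.append(tmp)

print(kept_bytes)
print(kept_text)
```

Either approach works; the point is that both sides of the `in` test must be the same type.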
You can encode your string by using .encode()
Example:
'Hello World'.encode()
As the error describes, in order to write a string to a file opened in binary mode you need to convert it to a bytes-like object first, and encode() does exactly that: it returns the string encoded as bytes.
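As a rough sketch of the point above, assuming a file opened with 'wb' (the file name and temporary directory are only there to make the example self-contained):

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')

error = None
with open(path, 'wb') as fd:
    try:
        fd.write('Hello World')          # str into a binary file: rejected
    except TypeError as exc:
        error = str(exc)
    fd.write('Hello World'.encode())     # bytes: accepted

with open(path, 'rb') as fd:
    data = fd.read()

print(error)
print(data)
```

Only the encoded write reaches the file; the first attempt raises TypeError before anything is written.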
fd.subprocess.Popen(); fd.communicate(...);
"Hello "+("World".encode()).decode() (same with join() obviously).
encode() method of a string, we get the encoded version of it in the default encoding, which is usually utf-8.
Like it has been already mentioned, you are reading the file in binary mode and then creating a list of bytes. In your following for loop you are comparing string to bytes and that is where the code is failing.
Decoding the bytes while adding to the list should work. The changed code should look as follows:
with open(fname, 'rb') as f:
    lines = [x.decode('utf8').strip() for x in f.readlines()]
A separate bytes type was introduced in Python 3, and that is why your code worked in Python 2. In Python 2 there was no distinct data type for bytes; bytes was simply an alias for str:
>>> s=bytes('hello')
>>> type(s)
<type 'str'>
str while the type for text strings is called unicode. In Python 3 they changed the meaning of str so that it was the same as the old unicode type, and renamed the old str to bytes. They also removed a bunch of cases where it would automatically try to convert from one to the other.
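To illustrate the missing automatic conversion in Python 3, here is a small sketch (the variable names are arbitrary). Mixing the two types raises TypeError, so the conversion has to be spelled out:

```python
text = 'Hello '
data = 'World'.encode('utf-8')     # a bytes object

try:
    text + data                    # str + bytes is not converted implicitly
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True

# Convert explicitly in whichever direction you need.
as_text = text + data.decode('utf-8')
as_bytes = text.encode('utf-8') + data

print(mixed_ok, as_text, as_bytes)
```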
You have to change from wb to w:
def __init__(self):
    self.myCsv = csv.writer(open('Item.csv', 'wb'))
    self.myCsv.writerow(['title', 'link'])
to
def __init__(self):
    self.myCsv = csv.writer(open('Item.csv', 'w'))
    self.myCsv.writerow(['title', 'link'])
After changing this, the error disappears, but you can't write to the file (in my case). So after all, I don't have an answer?
Source: How to remove ^M
Changing to 'rb' brings me the other error: io.UnsupportedOperation: write
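For what it's worth, here is a sketch of writing CSV in Python 3 with a text-mode file. The csv module documentation recommends passing newline='' when opening the file. Using a with block also makes sure the file is closed and flushed, which may or may not be related to the "can't write" problem described above (that part is a guess). The temporary path and the second row are made up for the example:

```python
import csv
import os
import tempfile

# Hypothetical location for Item.csv, so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'Item.csv')

# Text mode ('w'), with newline='' so the csv module controls line endings.
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['title', 'link'])
    writer.writerow(['example', 'http://example.com'])

# Read the file back to confirm the rows were written.
with open(path, newline='') as f:
    rows = list(csv.reader(f))

print(rows)
```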
For this small example, just adding a b prefix before 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n' solved my problem:
import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com', 80))
mysock.send(b'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n')

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data)

mysock.close()
What does the 'b' character do in front of a string literal?
Call the encode() method on the hard-coded string literal.
Example:
file.write(answers[i] + '\n'.encode())
Or
line.split(' +++$+++ '.encode())
You opened the file in binary mode:
The following code will throw a TypeError: a bytes-like object is required, not 'str'.
for line in lines:
    print(type(line))  # <class 'bytes'>
    if 'substring' in line:
        print('success')
The following code will work - you have to use the decode() function:
for line in lines:
    line = line.decode()
    print(type(line))  # <class 'str'>
    if 'substring' in line:
        print('success')
Try opening your file as text:
with open(fname, 'rt') as f:
    lines = [x.strip() for x in f.readlines()]
Additionally, here is a link for Python 3.x on the official page: io — Core tools for working with streams.
And this is the open function: open
If you are really trying to handle it as a binary then consider encoding your string.
I got this error when I was trying to convert a char (or string) to bytes. The code was something like this with Python 2.7:
# -*- coding: utf-8 -*-
print(bytes('ò'))
This is how Python 2.7 deals with Unicode characters.
This won't work with Python 3.6, since bytes() requires an extra encoding argument when given a string. This can be a little tricky, since different encodings may produce different results:
print(bytes('ò', 'iso_8859_1')) # prints: b'\xf2'
print(bytes('ò', 'utf-8')) # prints: b'\xc3\xb2'
In my case I had to use iso_8859_1 when encoding bytes in order to solve the issue.
coding comment at the top of the file doesn't affect the way bytes or encode works, it only changes the way characters in your Python source are interpreted.
'r' vs 'rb' too, switching between binary and text file behaviours (like translating newlines and on certain platforms, how the EOF marker is treated). That the io library (providing the default I/O functionality in Python 3 but also available in Python 2) now also decodes text files by default is the real change.
'b' flag when having to work with binary files on DOS/Windows (as binary is the POSIX default). It's good that there is a dual purpose when using io in 3.x for file access.
ZipFile.open() docs explicitly state that only binary mode is supported (Access a member of the archive as a binary file-like object). You can wrap the file object in io.TextIOWrapper() to achieve the same effect.
.readlines() when you can iterate over the file object directly. Especially when you only need info from a single line. Why read everything into memory when that info could be found in the first buffered block?
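The last two comments can be combined into one sketch: wrap the binary object returned by ZipFile.open() in io.TextIOWrapper() to get decoded text, and iterate over it directly instead of calling .readlines(). The archive is built in memory here, and the member name and contents are invented for the example:

```python
import io
import zipfile

# Build a small archive in memory; member name and contents are made up.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('notes.txt', 'first line\nsome-pattern here\nlast line\n')

found = None
with zipfile.ZipFile(buf) as zf:
    with zf.open('notes.txt') as member:      # binary file-like object
        # TextIOWrapper decodes lazily, one line at a time.
        for line in io.TextIOWrapper(member, encoding='utf-8'):
            if 'some-pattern' in line:        # str compared with str
                found = line.strip()
                break                         # stop reading once found

print(found)
```

Because the loop stops at the first match, the rest of the member is never read or decoded.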