Getting "write() argument must be str, not bytes" in simple Pillow module example - python-imaging-library

I'm pretty new to Python, and I'm trying to generate some images, so I was testing some Pillow functions.
This is one of the very first examples presented in Pillow's documentation:
from PIL import Image, ImageDraw
with Image.open("C:\\Users\\USER\\Desktop\\Quick_edits\\blank_square.jpg") as im:
draw = ImageDraw.Draw(im)
draw.line((0, 0) + im.size, fill=128)
draw.line((0, im.size[1], im.size[0], 0), fill=128)
# write to stdout
im.save(sys.stdout, "PNG")
And I keep getting this error:
Traceback (most recent call last):
File "<stdin>", line 5, in <module>
File "C:\Users\USER\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\Image.py", line 2158, in save
save_handler(self, fp, filename)
File "C:\Users\USER\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PIL\PngImagePlugin.py", line 1179, in _save
fp.write(_MAGIC)
TypeError: write() argument must be str, not bytes
Please, help me. I've seen seimilar errors been addressed, but this is supposedly a simple example and can't find how to fix this...

If your goal is to simply view the image, try using image.show() instead:
from PIL import Image, ImageDraw
with Image.open("C:\\Users\\USER\\Desktop\\Quick_edits\\blank_square.jpg") as im:
draw = ImageDraw.Draw(im)
draw.line((0, 0) + im.size, fill=128)
draw.line((0, im.size[1], im.size[0], 0), fill=128)
im.show()

Related

Automate Boring Stuff Ch13 PyPDF2: pdfReader is not defined

While trying to follow the instructions on the book, I encountered an error which I am afraid is due to my wrong understanding about the loop. My code is as below.
#! Python3
import PyPDF2, os
# Loop through all the PDF files
for filename in pdfFiles:
try:
pdfFileObj = open(filename, 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
except FileNotFoundError:
print('File not foungd ' + filename)
pass
# Read through all the PDF files.
for pageNum in range(1, pdfReader.numPages):
pageObj = pdfReader.getPage(pageNum)
pdfWriter.addPage(pageObj)
=====
and the result is:
Traceback (most recent call last):
File "C:/…automate_py/combinesPdf.py", line 24, in <module>
for pageNum in range(1, pdfReader.numPages):
NameError: name 'pdfReader' is not defined
Does anyone know why is pdfReader not found? Very appreciated.
I have tried adjust indentation but it didn't seem to work. :(
You have committed an indentation mistake. pdfReader is defined within the first loop of your piece of code, so the second one should be inside that block. This is the entire commented code of that sample on the book:
#! python3
# combinePdfs.py - Combines all the PDFs in the current working directory into
# a single PDF.
import PyPDF2, os
# Get all the PDF filenames.
pdfFiles = []
for filename in os.listdir('.'):
if filename.endswith('.pdf'):
pdfFiles.append(filename)
pdfFiles.sort()
pdfWriter = PyPDF2.PdfFileWriter()
# Loop through all the PDF files.
for filename in pdfFiles:
pdfFileObj = open(filename, 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
# Loop through all the pages (except the first) and add them.
for pageNum in range(1, pdfReader.numPages):
pageObj = pdfReader.getPage(pageNum)
pdfWriter.addPage(pageObj)
# Save the resulting PDF to a file.
pdfOutput = open('allminutes.pdf', 'wb')
pdfWriter.write(pdfOutput)
pdfOutput.close()
Had the same issue (Python 3.x, Windows 10). This worked for me:
from PyPDF2 import PdfFileReader
with open("C:/yourpdf.pdf", 'rb') as f:
pdf = PdfFileReader(f)
page = pdf.getPage(1)
text = page.extractText()
print(text)

Openpyxl Unicode decode error cannot remove \ufeff from cell value

I am parsing multiple worksheets of unicode data and creating a dictionary for specific cells in each sheet but I am having trouble decoding the unicode data. The small snippet of the code is below
for key in shtDict:
sht = wb[key]
for row in sht.iter_rows('A:A',row_offset = 1):
for cell in row:
if isinstance(cell.value,unicode):
if "INC" in cell.value:
shtDict[key] = cell.value
The output of this section is:
{'60071508': u'\ufeffReason: INC8595939', '60074426': u'\ufeffReason. Ref INC8610481', '60071539': u'\ufeffReason: INC8603621'}
I tried to properly decode the data based on u'\ufeff' in Python string, by changing the last line to:
shtDict[key] = cell.value.decode('utf-8-sig')
But I get the following error:
Traceback (most recent call last):
File "", line 55, in <module>
shtDict[key] = cell.value.decode('utf-8-sig')
File "C:\Python27\lib\encodings\utf_8_sig.py", line 22, in decode
(output, consumed) = codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)
Not sure what the issue is, I have also tried decoding with 'utf-16', but I get the same error. Can anyone help with this?
Just make it simpler: you can ignore BOF, so just ignore BOF characters.
shtDict[key] = cell.value.replace(u'\ufeff', '', 1)
Note: cell.value is already unicode type (you just checked it), so you cannot decode it again.

Reading and writing error pajek file in Networkx

I am receiving error when I write to a pajek file and then read back the same file using Networkx library python
>>> G=nx.read_pajek("eatRS.net")
>>> nx.write_pajek(G,"temp.net")
>>> G1=nx.read_pajek("temp.net")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2, in read_pajek
File "/usr/local/lib/python2.7/dist-packages/networkx/utils/decorators.py", line 193, in _open_file
result = func(*new_args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/networkx/readwrite/pajek.py", line 132, in read_pajek
return parse_pajek(lines)
File "/usr/local/lib/python2.7/dist-packages/networkx/readwrite/pajek.py", line 168, in parse_pajek
splitline=shlex.split(str(next(lines)))
File "/usr/lib/python2.7/shlex.py", line 279, in split
return list(lex)
File "/usr/lib/python2.7/shlex.py", line 269, in next
token = self.get_token()
File "/usr/lib/python2.7/shlex.py", line 96, in get_token
raw = self.read_token()
File "/usr/lib/python2.7/shlex.py", line 172, in read_token
r aise ValueError, "No closing quotation"
ValueError: No closing quotation
Creating a graph within networkx, writing in pajek format and then back again works fine for me. E.g. with gnm_random_graph:
import matplotlib.pyplot as np
n = 10
m = 20
G = nx.gnm_random_graph(n,m)
nx.write_pajek(G, "temp.net")
G1 = nx.read_pajek("temp.net")
Only if I edit the intermediate graph to have, say,
"vertex one 0.3456 0.1234 box ic White fos 20
do I get the ValueError: No closing quotation error you have. Node labels can be numeric or string, but if they include spaces, the name must be quoted.From the Pajek manual:
label - if label starts with character A..Z or 0..9 first blank determines end of the label
(example: vertex1), labels consisting of more words must be enclosed in pair of special
characters (example: "vertex 1")
Thus, I suggest that you inspect your input file "eatRS.net". Perhaps there is an issue with character encoding, mismatched quotes (e.g. opening with " and closing with '), or a line break within the node label?

Python with Gtk3 not setting unicode properly

I have some simple code that isn't working as expected. First, the docs say that Gtk.Clipboard.get(Gdk.SELECTION_PRIMARY).set_text() should be able to accept only one argument with the length argument option, but it doesn't work (see below). Finally, pasting a unicode ° symbol breaks setting the text when trying to retrieve it from the clipboard (and won't paste into other programs). It gives this warning:
Gdk-WARNING **: Error converting selection from UTF8_STRING
>>> from gi.repository.Gtk import Clipboard
>>> from gi.repository.Gdk import SELECTION_PRIMARY
>>> d='\u00B0'
>>> print(d)
°
>>> cb=Clipboard
Clipboard
>>> cb=Clipboard.get(SELECTION_PRIMARY)
>>> cb.set_text(d) #this should work
Traceback (most recent call last):
File "<ipython-input-6-b563adc3e800>", line 1, in <module>
cb.set_text(d)
File "/usr/lib/python3/dist-packages/gi/types.py", line 43, in function
return info.invoke(*args, **kwargs)
TypeError: set_text() takes exactly 3 arguments (2 given)
>>> cb.set_text(d, len(d))
>>> cb.wait_for_text()
(.:13153): Gdk-WARNING **: Error converting selection from UTF8_STRING
'\\Uffffffff\\Uffffffff'
From the documentation for Gtk.Clipboard
It looks like the method set_text needs a second argument. The first is the text, the second is the length of the text. Or if you don't want to provide the length, you can use -1 to let it calculate the length itself.
gtk.Clipboard.set_text
def set_text(text, len=-1)
text : a string.
len : the length of text, in bytes, or -1, to calculate the length.
I've tested it on Python 3 and it works with cb.set_text(d, -1).
Since GTK version 3.16 there is a easier way of getting the clipboard. You can get it with the get_default() method:
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, Gdk, GLib, Gio
display = Gdk.Display.get_default()
clipboard = Gtk.Clipboard.get_default(display)
clipboard.set_text(string, -1)
also for me it worked without
clipboard.store()
Reference: https://lazka.github.io/pgi-docs/Gtk-3.0/classes/Clipboard.html#Gtk.Clipboard.get_default
In Python 3.4. this is only needed for GtkEntryBuffers. In case of GtkTextBuffer set_text works without the second parameter.
example1 works as usual:
settinginfo = 'serveradres = ' + server + '\n poortnummer = ' + poort
GtkTextBuffer2.set_text(settinginfo)
example2 needs extra parameter len:
ErrorTextDate = 'choose earlier date'
GtkEntryBuffer1.set_text(ErrorTextDate, -1)

Python 3.2 lxml fill and submit form, select multiple, how to do it? value not working

Great page this one, coming from the perl world and after several years of doing nothing, I've re-started to program again (this web page didn't exist, how things change). And now, after a 2 full-days of searching, I play the last card of asking here for help.
Working under mac environment, with python 3.2 and lxml 2.3 (installed following www.jtmoon.com/?p=21), what I am trying to do:
web: http://biodbnet.abcc.ncifcrf.gov/db/db2db.php
to fill the form that you find there
to submit it
My code. I put several attempts and the output code.
from lxml.html import parse, submit_form, tostring
page = parse('http://biodbnet.abcc.ncifcrf.gov/db/db2db.php').getroot()
page.forms[0].fields['input'] = 'GI Number'
page.forms[0].inputs['outputs[]'].value = 'Gene ID'
page.forms[0].fields['hasComma'] = 'no'
page.forms[0].fields['removeDupValues'] = 'yes'
page.forms[0].fields['request'] = 'db2db'
page.forms[0].action = 'http://biodbnet.abcc.ncifcrf.gov/db/db2dbRes.php'
page.forms[0].fields['idList'] = '86439006'
submit_form(page.forms[0])
Output:
File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 30, in <module>
page.forms[0].inputs['outputs[]'].value = 'Gene ID'
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1058, in _value__set
"You must pass in a sequence")
TypeError: You must pass in a sequence
So, since that element is a multi-select element, I understand that I have to give a list
page.forms[0].inputs['outputs[]'].value = list('Gene ID')
Output:
File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 30, in <module>
page.forms[0].inputs['outputs[]'].value = list('Gene ID')
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1059, in _value__set
self.value.clear()
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/_setmixin.py", line 115, in clear
self.remove(item)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1159, in remove
"The option %r is not currently selected" % item)
ValueError: The option 'Affy ID' is not currently selected
'Affy ID' is the first option value of the list, and it is not selected. But what's the problem with it?
Surprisingly, if I instead put
page.forms[0].inputs['outputs[]'].multiple = list('Gene ID')
#page.forms[0].inputs['outputs[]'].value = list('Gene ID')
Then, somehow lxml likes it, and move on. However, the multiple attribute should be a boolean (actually it is if I print the value), I shouldn't touch it, and the "value" of the item should actually point to the selected items, according to the lxml docs.
The new output
File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 87, in <module>
submit_form(page.forms[0])
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 856, in submit_form
return open_http(form.method, url, values)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 876, in open_http_urllib
return urlopen(url, data)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 138, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 364, in open
req = meth(req)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 1052, in do_request_
raise TypeError("POST data should be bytes"
TypeError: POST data should be bytes or an iterable of bytes. It cannot be str.
So, what can be done?? I am sure that with python 2.6 I could use mecanize, or that perhaps lxml could work? But I really don't want to code in a sort-of deprecated version. I am enjoying a lot python, but I am starting to consider going back to perl. Perhaps this could be a smart movement??
Any help will be hugely appreciated
Gerard
Reading in this forum, I find pythonpaste.org, could it be a replacement for lxml?
Passing in a sequence to list() will generate a list from that sequence. 'Gene ID' is sequence (namely a sequence of characters). So list('Gene ID') will generate a list of characters, like so:
>>> list('Gene ID')
['G', 'e', 'n', 'e', ' ', 'I', 'D']
That's not what you want. Try this:
>>> ['Gene ID']
['Gene ID']
In other words:
page.forms[0].inputs['outputs[]'].value = ['Gene ID']
That should take you a bit forward.