Fix Python – Character reading from file in Python

In a text file, there is a string “I don‘t like this”.
However, when I read it into a string, it becomes “I don\xe2\x80\x98t like this”. I understand that \u2018 is the Unicode representation of “‘”. I use the following to do the reading:
f1 = open(file1, "r")
text = f1.read()

Now, is it possible to read the string in such a way that when it is read ….
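
One common way to get the character back instead of its raw bytes is to decode the file with an explicit encoding when opening it. The sketch below assumes Python 3 and assumes the file is UTF-8 encoded (the bytes \xe2\x80\x98 are the UTF-8 form of \u2018); `read_text` and the temporary demo file are hypothetical names used only for illustration.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a file and decode its bytes using the given encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Build a small demo file containing the string from the question.
sample = "I don\u2018t like this"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    demo_path = tmp.name

# Reading in binary mode shows the undecoded UTF-8 bytes.
with open(demo_path, "rb") as f:
    raw = f.read()

# Reading in text mode with an explicit encoding yields the character itself.
decoded = read_text(demo_path)

os.remove(demo_path)
```

If the file uses a different encoding, the `encoding` argument would need to match it; guessing wrong either raises a `UnicodeDecodeError` or silently produces the wrong characters.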

Fix Python – Why does Python print unicode characters when the default encoding is ASCII?

From the Python 2.6 shell:
>>> import sys
>>> print sys.getdefaultencoding()
ascii
>>> print u'\xe9'
é
>>>

I expected to have either some gibberish or an error after the print statement, since the “é” character isn’t part of ASCII and I haven’t specified an encoding. I guess I don’t understand what ASCII being the default encoding means.
EDIT
I ….
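
For context: in Python 2, `print` on a terminal encodes Unicode strings with `sys.stdout.encoding`, which is a separate setting from `sys.getdefaultencoding()`. The sketch below only illustrates that the two values are independent and that ASCII cannot represent “é”. It is written in Python 3 syntax (an assumption, since the question uses 2.6), and `can_encode` is a hypothetical helper.

```python
import sys


def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


e_acute = "\xe9"  # U+00E9, LATIN SMALL LETTER E WITH ACUTE

# These two settings are looked up independently of each other.
default_encoding = sys.getdefaultencoding()
stdout_encoding = getattr(sys.stdout, "encoding", None)

print("default encoding:", default_encoding)
print("stdout encoding:", stdout_encoding)
print("ASCII can encode U+00E9:", can_encode(e_acute, "ascii"))
print("UTF-8 can encode U+00E9:", can_encode(e_acute, "utf-8"))
```

The values printed for the two encodings depend on the interpreter version and on the environment the script runs in.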

Fix Python – Python string prints as [u'String']

This will surely be an easy one but it is really bugging me.
I have a script that reads in a webpage and uses Beautiful Soup to parse it. From the soup I extract all the links, as my final goal is to print out link.contents.
All of the text that I am parsing is ASCII. I know that Python treats strings as unicode, and I am sure this is very han….
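
The excerpt is cut off, so the cause here is an assumption: output shaped like `[u'String']` usually means a whole list was printed, which shows the repr of each element, rather than the string inside it. A minimal sketch in Python 3 syntax, where the `u` prefix no longer appears but the brackets and quotes still do (`links` is a hypothetical stand-in for the extracted contents):

```python
# A list holding one string, standing in for the parsed link contents.
links = ["String"]

# Converting the list itself to text includes brackets and quotes.
as_list = str(links)

# Joining (or indexing) the elements yields the plain text.
as_text = ", ".join(links)

print(as_list)
print(as_text)
```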

Fix Python – Convert Unicode to ASCII without errors in Python

My code just scrapes a web page, then converts it to Unicode.
html = urllib.urlopen(link).read()
html.encode("utf8", "ignore")
self.response.out.write(html)

But I get a UnicodeDecodeError:

Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/googl….
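
A likely explanation, given the code shown: in Python 2, calling `.encode()` on a byte string first decodes it implicitly as ASCII, and that implicit step is what raises `UnicodeDecodeError`. The usual pattern is to decode the bytes explicitly first and only then encode. The sketch below assumes Python 3 and assumes the page is UTF-8; `to_ascii` is a hypothetical helper, not part of urllib or App Engine.

```python
def to_ascii(raw_bytes, source_encoding="utf-8"):
    """Decode bytes to text, then drop every character ASCII cannot represent."""
    # Step 1: bytes -> text. Undecodable bytes become U+FFFD instead of raising.
    text = raw_bytes.decode(source_encoding, errors="replace")
    # Step 2: text -> ASCII bytes, silently skipping non-ASCII characters
    # (including any U+FFFD markers from step 1), then back to a str.
    return text.encode("ascii", errors="ignore").decode("ascii")


page = "caf\xe9 menu".encode("utf-8")  # stand-in for downloaded page bytes
cleaned = to_ascii(page)
print(cleaned)
```

Dropping characters loses information, so passing `errors="replace"` in the second step (which substitutes `?`) may be preferable when the gaps should stay visible.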

Fix Python – Replace non-ASCII characters with a single space

I need to replace all non-ASCII characters (anything outside \x00-\x7F) with a space. I’m surprised that this is not dead-easy in Python, unless I’m missing something. The following function simply removes all non-ASCII characters:
def remove_non_ascii_1(text):
    return ''.join(i for i in text if ord(i) < 128)

And this one replaces non-ASCII characters with the ....
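
One possible approach to the replacement, sketched under the assumption of Python 3 and of input that is already a decoded text string (so each non-ASCII character is one code point, not several bytes). The function names are hypothetical and are not the ones from the truncated question.

```python
import re

# Matches any single character outside the ASCII range \x00-\x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def replace_non_ascii(text, replacement=" "):
    """Replace each non-ASCII character with the replacement string."""
    return NON_ASCII.sub(replacement, text)


def remove_non_ascii(text):
    """Drop non-ASCII characters entirely, as the function in the question does."""
    return "".join(ch for ch in text if ord(ch) < 128)


print(replace_non_ascii("na\xefve"))
print(remove_non_ascii("na\xefve"))
```

To collapse a run of consecutive non-ASCII characters into one space rather than one space per character, the pattern could be changed to `[^\x00-\x7F]+`.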