Don't make the silly mistake that I made today. I improved my code to better support Unicode by replacing all plain strings with unicode strings. Along the way I had code that looked like this:
if type_ is 'textarea':
    do something
This was changed to:
if type_ is u'textarea':
    do something
And it no longer matched, since type_ was a normal ASCII string. The correct way to do these things is like this:
if type_ == u'textarea':
    do something
elif type_ is None:
    do something else
Remember:
>>> "peter" is u"peter"
False
>>> "peter" == u"peter"
True
>>> None is None
True
>>> None == None
True
Comments
You should really only use 'is' to check for object identity, and for any kind of value comparison, == is the way to go. Also, as you yourself point out, the changes you made make no difference at all, unless there are non-ascii characters in the string you are comparing the variable to, so in the case of 'textarea', I would have just left it as is. ;)
That ("peter" is "peter") works at all is an implementation quirk that shouldn't be relied upon, even for 8-bit strings.
Hi Peter,
You're mixing up identity with equality...easily done :-)
While it's true that...
"peter" == "peter" and
"peter" is "peter"
you'll note that:
"peter" is "peter1"[:-1] is not true while
"peter" == "peter1"[:-1] is
Hope this sheds some light :-)
The interpreter is obviously using two references to the same string "peter" in the first case, but is creating a new string in the second example.
Kevin
Yes, what you were doing was dangerous EVEN with plain 8-bit data; check this:
>>> a = "p" + "eter"
>>> b = "peter"
>>> a is b
False
>>> a
'peter'
>>> b
'peter'
>>>
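Whether two equal strings happen to share one object depends on the interpreter and on how the string was created, so a transcript like this may print something different on another Python version. A small sketch, assuming Python 3 (where intern lives in the sys module), that builds the string at run time and only relies on documented behaviour:

```python
import sys

a = "".join(["p", "eter"])  # assembled at run time
b = "peter"                 # literal

print(a == b)               # True: the values are equal

# `a is b` is deliberately not shown as a fact: whether the two names
# share one object is an implementation detail.

# sys.intern() returns one canonical object per distinct string value,
# so identity between interned strings is reliable.
print(sys.intern(a) is sys.intern(b))  # True
```

Even so, == remains the simpler choice unless you have a measured reason to intern.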
"peter" == u"peter" will raise a UnicodeDecodeError if you compare "müsli" instead of "peter" (Python 2.5 and later instead emit a UnicodeWarning and treat the two strings as unequal),
so you should use e.g.
"müsli" == u"müsli".encode("utf-8")
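For readers on Python 3, where byte strings and text strings are separate types, the same idea can be sketched like this. The variable names are illustrative, and UTF-8 is assumed as the encoding of the byte data:

```python
raw = b"m\xc3\xbcsli"   # UTF-8 encoded bytes, e.g. read from a file or socket
text = u"m\u00fcsli"    # text string with a u-umlaut as its second character

# Bring both sides to the same type before comparing.
decoded_match = raw.decode("utf-8") == text
encoded_match = raw == text.encode("utf-8")
print(decoded_match)    # True
print(encoded_match)    # True

# By default a cross-type comparison does not raise on Python 3,
# but bytes and text never compare equal.
print(raw == text)      # False
```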
Why would you ever use Unicode strings in the internal types?
You only need Unicode for strings displayed to the user.
This is the reason for loving python:
never ever try to do things more complicated than they need to be!
"peter" in u"peter" and vice versa will also do.
I don't know if this would be better (performance-wise) than using "==". I do know "==" has a little extra overhead versus "is". But in the case of "in" it would be good to see which one performs better.
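One caveat worth checking before switching to "in": for strings it is a substring test rather than an equality test, so it also matches partial values. A quick sketch, assuming Python 3 syntax; the timing lines use the standard timeit module and show no fixed numbers, since results vary by machine and version:

```python
import timeit

full_in = "peter" in u"peter"
partial_in = "pet" in u"peter"
partial_eq = "pet" == u"peter"

print(full_in)      # True
print(partial_in)   # True: substring match, not equality
print(partial_eq)   # False

# Rough timing comparison of the two operators on equal strings.
setup = "a = ''.join(['pe', 'ter']); b = 'peter'"
for expr in ("a == b", "a in b"):
    seconds = timeit.timeit(expr, setup=setup, number=100000)
    print(expr, seconds)
```

So even if "in" turned out to be fast, it answers a different question than "==".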