Often when working with lists of strings in Python you might want to deal with the strings in a case-insensitive manner. Today I had to fix an issue where I couldn't use somelist.remove(somestring) because the somestring variable might be in the list but with a different case spelling.
Here was the original code:
def ss(s):
    return s.lower().strip()
if ss(name) in names:
    foo(name + " was already in 'names'")
    names.remove(name)
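The problem there is that you get a ValueError if, for example, the name variable is "Peter" and the names variable is ["peter"]: the check passes because ss(name) is "peter", but names.remove("Peter") finds no exact match.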
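With the original code as written, the exception shows up whenever the normalised name is in the list but the exact spelling is not. A minimal reproduction, reusing the names from the post (the try/except and the error variable are only there to make the failure visible):

```python
def ss(s):
    return s.lower().strip()

names = ["peter"]
name = "Peter"

error = None
if ss(name) in names:          # True: "peter" is in the list
    try:
        names.remove(name)     # list.remove() needs an exact match
    except ValueError as exc:
        error = exc

print(repr(error))
```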
Here is my solution. Let me know what you think:
def ss(s):
    return s.lower().strip()

def ss_remove(list_, element):
    # Remove the first item that matches `element`, ignoring case and
    # surrounding whitespace. Does nothing if there is no match.
    element = ss(element)
    for item in list_:
        if ss(item) == element:
            list_.remove(item)
            break

L = list('ABC')
L.remove('B')
#L.remove('c') # will fail
ss_remove(L, 'c') # will work
print(L) # prints ['A']
Might there be a better way?
UPDATE: Check out "Case insensitive list remove call (part II)".
Comments
The following (based on a guess about how list.remove() might work) is perhaps a nicer solution:
class CaseInsensitiveString(object):
    def __init__(self, s):
        self.s = s
    def __cmp__(self, other):
        return cmp(self.s.lower(), other.lower())
L = list('ABC')
L.remove(CaseInsensitiveString("c"))
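Note that `__cmp__` and `cmp()` exist only in Python 2. On Python 3 the same idea could be sketched with `__eq__` instead. This is an adaptation, not the commenter's code, and it relies on Python falling back to the wrapper's `__eq__` when the plain string's own comparison gives up:

```python
class CaseInsensitiveString(object):
    """Wrapper that compares equal to strings differing only in case."""

    def __init__(self, s):
        self.s = s

    def __eq__(self, other):
        # Accept either a plain string or another wrapper.
        other_s = getattr(other, "s", other)
        if not isinstance(other_s, str):
            return NotImplemented
        return self.s.lower() == other_s.lower()


L = list("ABC")
L.remove(CaseInsensitiveString("c"))
print(L)  # ['A', 'B']
```

Like list.remove() itself, this removes only the first match and raises ValueError when nothing matches.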
L = list('ABC')
other = 'c'
L = [x for x in L if x.lower() != other.lower()]
# L == ['A', 'B']
Slim and fast, but as you can see in http://www.peterbe.com/plog/case-insensitive-list-remove-call-part-ii I made some necessary improvements.
How about this? Oops: I almost had time to be the first!
def ss_remove(lst,el):
    return [e for e in lst if ss(e) != ss(el)]
L = list('ABC')
L = ss_remove(L, 'c') # will work
print(L) # prints ['A', 'B']
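One difference worth keeping in mind: the list comprehension builds a new list and drops every match, whereas list.remove() works in place and drops only the first one. A small sketch of both behaviours side by side (ss_remove_first and ss_remove_all are hypothetical names chosen for this illustration):

```python
def ss(s):
    return s.lower().strip()

def ss_remove_first(list_, element):
    """Remove the first case-insensitive match in place, like list.remove()."""
    element = ss(element)
    for idx, item in enumerate(list_):
        if ss(item) == element:
            del list_[idx]
            return True
    return False

def ss_remove_all(lst, el):
    """Return a new list without any case-insensitive match."""
    el = ss(el)
    return [e for e in lst if ss(e) != el]

names = ["Peter", "Anna", "PETER"]
print(ss_remove_all(names, "peter"))  # ['Anna']
ss_remove_first(names, "peter")
print(names)                          # ['Anna', 'PETER']
```

Which one is right depends on whether duplicates that differ only in case can occur in the list.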
What about using a dictionary populated with (s.lower(), s)? Then your normal list is in .values(), and look-up is much faster than going through the whole list over and over. I guess it depends how often you need to look for strings disregarding case.
d = {}
d[ss(name)] = name
if ss(name) in d: print name + " there!"
Nice try but it doesn't preserve the order of the list.
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747 shows how to make an ordered dictionary.
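On newer Pythons the ordering objection is easier to get around. A rough sketch, assuming Python 3.7 or later, where plain dicts keep insertion order (on older versions collections.OrderedDict would play the same role); the sample names are made up:

```python
def ss(s):
    return s.lower().strip()

names = ["Peter", "Anna", "Zoe"]
# Assumes Python 3.7+, where plain dicts keep insertion order.
d = {ss(name): name for name in names}

d.pop(ss("ANNA"), None)   # remove regardless of case; no error if missing
print(list(d.values()))   # ['Peter', 'Zoe']
```

The catch is that entries differing only in case collapse into a single key, so this only suits lists without such duplicates.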
I prefer the list comp personally, but
for idx, item in enumerate(list_):
    if ss(item) == element:
        del list_[idx]  # safe only because we break right away
        break
# is a bit more efficient in the worst case.
See the test cases on http://www.peterbe.com/plog/case-insensitive-list-remove-call-part-ii/iremove.py and notice how unfortunately few of them are long lists. If they were really, really long, an early-exit method like this, which stops at the first match instead of always walking the whole list, might beat the list comprehension one.
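To check that hunch on a long list, something along these lines could be timed. It is only a sketch: the function names, the 100,000-item list and the target position are all made up for the illustration, and the numbers will differ from machine to machine.

```python
import timeit

def ss(s):
    return s.lower().strip()

def remove_loop(list_, element):
    # Early exit: stops scanning at the first match.
    element = ss(element)
    for idx, item in enumerate(list_):
        if ss(item) == element:
            del list_[idx]
            break
    return list_

def remove_comp(list_, element):
    # Always walks the whole list and builds a new one.
    element = ss(element)
    return [item for item in list_ if ss(item) != element]

# Hypothetical test data: 100,000 distinct strings, target near the front.
long_list = ["Name%d" % i for i in range(100000)]
target = "name10"

for func in (remove_loop, remove_comp):
    # Copy the list on every run so each call sees the same input.
    seconds = timeit.timeit(lambda: func(list(long_list), target), number=5)
    print(func.__name__, seconds)
```

Moving the target towards the end of the list, or using one that is not present at all, shows the worst case for the loop.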
You asked for a better way to do the removal. I interpreted it to mean 'shorter, more Pythonic' or something like that.
If we instead are talking about speed, I have a new one for you:
def f7(L,e):
    eLower = e.lower()
    eUpper = e.upper()
    L = [x for x in L if x != eLower and x != eUpper]
I don't know what kind of beast your machine is because my timings are much slower than yours. They tend to vary a little (GC?) but here's a typical run:
f1 1.59400010109
f2 2.65599989891
f3 2.23399996758
f4 0.921999931335
f5 0.905999898911
f6 1.73400020599
f7 0.43700003624
Please don't announce the winner before the race is over! :)
This solution works for lists of single characters only (a mixed-case string such as "Ab" equals neither e.lower() nor e.upper())! Sorry!
How about using dictionaries? I didn't expect improvement speedwise. I was wrong. That's nothing new: I've been wrong before!
def f8(L,e):
    d = {}
    # The value could be anything, really
    d[e.lower()] = d[e.upper()] = True
    L = [x for x in L if x not in d]
Timings on my machine:
f7 0.43 - 0.48
f8 0.28 - 0.33
This solution works for lists of single characters only! Sorry!
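To see the limitation with longer strings, compare the lower/upper lookup with a version that normalises every item. Both functions below are illustrative rewrites with made-up names, not the timed f7/f8 code:

```python
def remove_upper_lower_only(L, e):
    # Same idea as f8: only the all-lower and all-upper spellings are excluded.
    unwanted = {e.lower(), e.upper()}
    return [x for x in L if x not in unwanted]

def remove_any_case(L, e):
    # Normalise every item instead, so mixed-case spellings match too.
    e_lower = e.lower()
    return [x for x in L if x.lower() != e_lower]

names = ["Peter", "anna", "PETER"]
print(remove_upper_lower_only(names, "peter"))  # ['Peter', 'anna']
print(remove_any_case(names, "peter"))          # ['anna']
```

The second version pays for a lower() call on every item, which is presumably where much of the speed difference in the timings comes from.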