Yesterday I blogged about a simple algorithm for removing a string from a list of string case insensitively. I was happy to see that several people joined in with suggestions just a few hours after I published it. I have now thought about it a bit more and to honour those who commented I've done a little benchmark to find out which one is the best.

What both my own solution and some peoples suggestions forgot was to raise a ValueError if it fails just like this would: list("abc").remove("d")

So, I've tidied up the suggestions and where need be I've added the exception throw. This is not the first time list comprehension in python impresses me. The winner in terms of performance is Andrews list comprehension suggestion. Here are the timeing results:


f1 0.704859256744
f2 1.5358710289
f3 1.37636256218
f4 0.468783378601
f5 0.475452899933
f6 0.666154623032

Here's a listing of all the f[n] functions (or download it):


# ---------------------------------------------------
def f1(L, e):

   e = ss(e)
   for item in L:
       if ss(item) == e:
           L.remove(item)
           return
   raise ValueError, "%r not in list" % e

# ---------------------------------------------------
class CaseInsensitiveString(object):
   def __init__(self, s):
       self.s = s
   def __cmp__(self, other):
       return cmp(self.s.lower(), other.lower())

def f2(L, e):
   L.remove(CaseInsensitiveString(e))

# ---------------------------------------------------
class CaseInsensitiveString2(object):
   def __init__(self, s):
       self.s = s
       self.s_lower = s.lower()
   def __cmp__(self, other):
       return cmp(self.s_lower, other.lower())

def f3(L, e):
   L.remove(CaseInsensitiveString2(e))

# ---------------------------------------------------
def f4(L,e):
   e = e.lower()
   L = [x for x in L if x.lower() != e]

# ---------------------------------------------------
def f5(L,e):
   e_lower = e.lower()
   c = len(L)
   L = [x for x in L if x.lower() != e_lower]
   if not len(L) < c:
       raise ValueError, "%r not in list" % e

# ---------------------------------------------------
def f6(L, e):
   element = ss(e)
   for idx, item in enumerate(L):
       if ss(item) == e:
           del L[idx]
           return
   raise ValueError, "%r not in list" % element

Comments

Yair Chuchem

if you're looking for speed, here's another one:

def f7(L, e):
e_lower = e.lower()
e_len = len(e)
L[:] = [x for x in L if len(x) != e_len or x.lower() != e_lower]

at my silly test datasets it works about twice faster
not raising ValueError cause that ain't my point
also note that replacing the function's L as in f4 and f5 won't affect the L of outside

Your email will never ever be published.

Related posts