This has served me well over the last couple of years of using Django:
from django import forms


class _BaseForm(object):
    def clean(self):
        cleaned_data = super(_BaseForm, self).clean()
        for field in cleaned_data:
            # basestring is Python 2 only; on Python 3 this would be str
            if isinstance(cleaned_data[field], basestring):
                cleaned_data[field] = (
                    cleaned_data[field].replace('\r\n', '\n')
                    .replace(u'\u2018', "'").replace(u'\u2019', "'").strip())
        return cleaned_data


class BaseModelForm(_BaseForm, forms.ModelForm):
    pass


class BaseForm(_BaseForm, forms.Form):
    pass
So instead of doing...
class SignupForm(forms.Form):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False)
...you do:
class SignupForm(BaseForm):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False)
It makes sure that any form field that takes a string has all preceding and trailing whitespace stripped. It also turns Windows line endings (\r\n) into \n and replaces the "curved" single quotes (U+2018 and U+2019) that Microsoft Windows sometimes uses with a plain apostrophe.
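To make the cleanup rule concrete without Django, here is a minimal sketch of the same normalization as a plain function. The name `normalize_text` is made up for illustration and is not part of the form classes above; it assumes Python 3, where `str` takes the place of `basestring`.

```python
def normalize_text(value):
    """Apply the same cleanup as the form's clean() to a single value."""
    if not isinstance(value, str):
        # Non-string values (numbers, dates, None) pass through untouched.
        return value
    return (
        value.replace('\r\n', '\n')   # Windows line endings -> Unix
        .replace('\u2018', "'")       # left single quotation mark
        .replace('\u2019', "'")       # right single quotation mark
        .strip()                      # leading and trailing whitespace
    )


print(repr(normalize_text('  Peter\u2019s form\r\n')))  # "Peter's form"
```

Keeping the rule in one small function like this also makes it easy to test on its own, separately from any form.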
Yes, this might all seem trivial, and I'm sure there's something as good or better out there, but isn't it nice to never have to worry about doing things like this again:
class SignupForm(forms.Form):
    ...
    def clean_name(self):
        return self.cleaned_data['name'].strip()

# ...or...

form = SignupForm(request.POST)
if form.is_valid():
    name = form.cleaned_data['name'].strip()
UPDATE
This breaks some fields, like DateField.
>>> class F(BaseForm):
...     start_date = forms.DateField()
...     def clean_start_date(self):
...         return self.cleaned_data['start_date']
...
>>> f=F({'start_date': '2013-01-01'})
>>> f.is_valid()
True
>>> f.cleaned_data['start_date']
datetime.datetime(2013, 1, 1, 0, 0)
As you can see, it cleans up '2013-01-01' into datetime.datetime(2013, 1, 1, 0, 0) when it should become datetime.date(2013, 1, 1).
Not sure why yet.
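Whatever the cause, the mix-up is easy to miss: `datetime.datetime` is a subclass of `datetime.date`, so an `isinstance` check won't tell the two apart. Below is a minimal sketch of a stricter check plus a stopgap coercion. `as_plain_date` is a hypothetical helper name, not something from Django, and it only works around the symptom rather than fixing whatever produces the wrong type.

```python
import datetime


def as_plain_date(value):
    """Turn a datetime into a plain date; pass anything else through."""
    if isinstance(value, datetime.datetime):
        # datetime is a subclass of date, so test for it explicitly.
        return value.date()
    return value


wrong = datetime.datetime(2013, 1, 1, 0, 0)
print(isinstance(wrong, datetime.date))  # True, so this check is not enough
print(type(wrong) is datetime.date)      # False, the exact type differs
print(as_plain_date(wrong))              # 2013-01-01
```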
Comments
The “curved” apostrophe (U+2019 RIGHT SINGLE QUOTATION MARK) is actually a preferred symbol for apostrophe in many cases.
See http://unicode.org/Public/UNIDATA/NamesList.txt or https://en.wikipedia.org/wiki/Apostrophe#Unicode
So it's a standard Unicode character? I thought it was something incorrectly encoded because UTF-8 had troubles with it.
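Both curved quotes are ordinary Unicode characters with a well-defined UTF-8 encoding. A quick way to check, using only Python's standard library (Python 3 syntax assumed):

```python
import unicodedata

for ch in ('\u2018', '\u2019'):
    # Print the code point, its official name, and its UTF-8 bytes.
    print('U+%04X' % ord(ch), unicodedata.name(ch), ch.encode('utf-8'))
```

If either character shows up garbled, the likelier culprit is text being decoded with the wrong encoding somewhere along the way, not the character itself.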