This has served me well over the last couple of years of using Django:


from django import forms

class _BaseForm(object):
    def clean(self):
        cleaned_data = super(_BaseForm, self).clean()
        for field in cleaned_data:
            # `basestring` exists only on Python 2; on Python 3 use `str`.
            if isinstance(cleaned_data[field], str):
                cleaned_data[field] = (
                    cleaned_data[field].replace('\r\n', '\n')
                    .replace(u'\u2018', "'").replace(u'\u2019', "'").strip())

        return cleaned_data


class BaseModelForm(_BaseForm, forms.ModelForm):
    pass


class BaseForm(_BaseForm, forms.Form):
    pass

So instead of doing...


class SignupForm(forms.Form):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False)

...you do:


class SignupForm(BaseForm):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False)

This makes sure that any form field that produces a string has all preceding and trailing whitespace stripped. It also converts Windows line endings ('\r\n') to '\n' and replaces the "curved" single quotation marks (U+2018 and U+2019) that Microsoft Windows applications sometimes insert with a plain apostrophe.
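To see the clean-up on its own, here is a rough sketch of the same string handling outside Django. The helper names normalize_text and normalize_cleaned_data are made up for illustration, and the sketch assumes Python 3, where text is str. It is not part of the form classes above.

```python
def normalize_text(value):
    """Apply the same string clean-up as the form mixin to one value."""
    return (
        value.replace('\r\n', '\n')   # Windows line endings -> '\n'
        .replace('\u2018', "'")       # left curly single quote -> apostrophe
        .replace('\u2019', "'")       # right curly single quote -> apostrophe
        .strip()                      # drop leading and trailing whitespace
    )


def normalize_cleaned_data(cleaned_data):
    """Return a copy of a cleaned_data-style dict with its strings normalized.

    Non-string values (numbers, dates, None) are passed through untouched.
    """
    return {
        key: normalize_text(value) if isinstance(value, str) else value
        for key, value in cleaned_data.items()
    }


if __name__ == '__main__':
    # Hypothetical input resembling what a form's cleaned_data might hold.
    data = {'name': '  Peter\u2019s form\r\n', 'age': 33}
    print(normalize_cleaned_data(data))
```

The name value comes back as Peter's form with a plain apostrophe and no surrounding whitespace, while the integer is left alone because only str values are touched.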

Yes, this might all seem trivial, and I'm sure there's something as good or better out there, but isn't it nice to never have to worry about doing things like this again:



class SignupForm(forms.Form):
    ...

    def clean_name(self):
        return self.cleaned_data['name'].strip()

#...or...

form = SignupForm(request.POST)
if form.is_valid():
    name = form.cleaned_data['name'].strip()

UPDATE

This breaks some fields, like DateField.


>>> class F(BaseForm):
...     start_date = forms.DateField()
...     def clean_start_date(self):
...         return self.cleaned_data['start_date']
...
>>> f=F({'start_date': '2013-01-01'})
>>> f.is_valid()
True
>>> f.cleaned_data['start_date']
datetime.datetime(2013, 1, 1, 0, 0)

As you can see, it cleans up '2013-01-01' into datetime.datetime(2013, 1, 1, 0, 0) when it should become datetime.date(2013, 1, 1).

Not sure why yet.

Comments

Simonas

The “curved” apostrophe (U+2019 RIGHT SINGLE QUOTATION MARK) is actually a preferred symbol for apostrophe in many cases.

See http://unicode.org/Public/UNIDATA/NamesList.txt or https://en.wikipedia.org/wiki/Apostrophe#Unicode

Peter Bengtsson

So it's a standard Unicode character? I thought it was something incorrectly encoded because UTF-8 had troubles with it.
