I needed to write this little function because I had to add some parameters to a URL that I was going to open with urllib2. The benefit of this script is that it can combine any URL with some structured parameters, even if the URL already contains a query string (aka CGI parameters). Here's how to use it if it was placed in a file called 'urlfixer.py':
>>> from urlfixer import parametrize_url
>>> parametrize_url('https://www.peterbe.com?some=thing',
...                 any='one', tv="b b c")
'https://www.peterbe.com?some=thing&tv=b+b+c&any=one'
>>>
The function needed some extra attention (read: hack) if the starting URL was of the form http://foo.com?bar=xxx, which is non-standard. The standard way would be http://foo.com/?bar=xxx. You can download urlfixer.py or read it here:
from urlparse import urlparse, urlunparse
from urllib import urlencode

def parametrize_url(url, **params):
    """ don't just add the **params because the url
    itself might contain CGI variables embedded inside
    the string. """
    url_parsed = list(urlparse(url))
    encoded = urlencode(params)
    qs = url_parsed[4]
    if encoded:
        if qs:
            qs += '&'+encoded
        else:
            qs = encoded
    netloc = url_parsed[1]
    if netloc.find('?')>-1:
        url_parsed[1] = url_parsed[1][:netloc.find('?')]
        if qs:
            qs = netloc[netloc.find('?')+1:]+'&'+qs
        else:
            qs = netloc[netloc.find('?')+1:]
    url_parsed[4] = qs
    url = urlunparse(url_parsed)
    return url
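For anyone on Python 3, here is a rough sketch of the same idea. It is not the original urlfixer.py: it assumes the urllib.parse module (where urlparse and urlencode now live) and relies on the parser already splitting the host from the query string, so the netloc hack is left out.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def parametrize_url(url, **params):
    """Return url with params appended to any query string it already has."""
    parts = urlsplit(url)
    encoded = urlencode(params)
    # Join the existing query string and the new parameters,
    # skipping whichever of the two is empty.
    query = "&".join(piece for piece in (parts.query, encoded) if piece)
    return urlunsplit(parts._replace(query=query))


print(parametrize_url("https://www.peterbe.com?some=thing", any="one", tv="b b c"))
```

On Python 3.7 or later, keyword arguments keep their order, so the new parameters come out in the order they were passed.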
Comments
Look, I have a problem: I need to create a unique parameter, like an id, for a URL, and have that parameter passed on to a form. I don't know how to do it, so I need your help.
As '?' cannot be in url_parsed.netloc, 'netloc.find('?') > -1' is always false, so that block is useless.
Using '.find()' for membership tests is discouraged; the Pythonic idiom is 'if "?" in netloc'.
I guess the hack was necessary precisely because of the non-standard 'http://foo.com?bar=xx' form. As of Python 2.5 this is parsed correctly:
>>> u = urlparse('http://myfoo.com?a')
>>> u.netloc
'myfoo.com'
>>> u.query
'a'
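The same check can be sketched for Python 3, assuming the urllib.parse module (the standalone urlparse module was renamed). It also uses the "in" idiom in place of .find():

```python
from urllib.parse import urlparse

u = urlparse("http://myfoo.com?a")

print(u.netloc)         # the host part, without the query string
print(u.query)          # everything after the '?'
print("?" in u.netloc)  # membership test instead of .find('?') > -1
```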