
Dream: python bindings for squidclient

October 11, 2005
3 comments Python, This site

At the moment I'm not running Squid for this site, but if experimentation time permits I'll have it running again soon. One thing I feel uneasy about is how to "manually" purge cached pages that need to be updated. For example, if you read this page (and it's cached for one hour) and post a comment, then I'd like to purge the cached copy so the page gets re-cached. Setting an HTTP header could be one way, but I would only be able to do that on the page that has this in the URL:


?msg=Comment+added

which, because of the presence of a querystring, is not necessarily cached anyway. The effect is that as soon as the "?msg=Comment+added" is removed from the URL, the viewer will see the page as it was before she posted her comment. squidclient might be the solution. ...sort of.

squidclient is an executable program that you get when you install the Squid cache server. As described in this documentation, you can manually purge any cached page on a site, which would have the desired effect for the problem mentioned above. The only problem is that I get about 30-60 posted comments per day, and that would be a hell of a lot of command line calls to Squid. Secondly, they'd probably be quite slow, and the person posting a comment won't be prepared to wait that long. The code would be something like this:


import os

cmd = 'squidclient -m PURGE %s' % self.absolute_url()
# os.popen4() returns (child_stdin, child_stdout_and_stderr)
output = os.popen4(cmd)[1].read()
return output.find('200 OK') > -1

(obviously this would need to be wrapped in some security assertions)

An even better solution exists only as a dream. A Python binding for squidclient that I can use directly from my Zope Python code:


# <pseudocode>
import pysquidclient
server = pysquidclient.Server('localhost', 80)
r = server.purgeURL(self.absolute_url())
return r.isOK()
# </pseudocode>

Imagine that! You could create the server instance once, for the duration of the Zope server's life, and just call the purgeURL() function multiple times, thus saving looooads of time. I guess it might be worth testing the os.popen4() method to see if it works. If it doesn't work, then maybe it's time to start looking at ESI.

UPDATE

Thanks Kevin (and Seb) for pointing this out. The solution is something like this:


import urlparse
import httplib

scheme, host, path, params, query, fragment = urlparse.urlparse(objecturl)
h = httplib.HTTPConnection(host)
h.request('PURGE', path)
r = h.getresponse()
return r.status, r.reason
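
To avoid setting up a new connection for every purge, this could be wrapped into something resembling the dream API above. A rough, untested sketch (the PurgeClient class is my own invention, not an existing library), assuming Squid keeps the connection alive between requests:


import urlparse
import httplib

class PurgeClient:
    """Reusable helper that sends PURGE requests to one cache host."""

    def __init__(self, host):
        # One keep-alive connection, reused for every purge
        self.connection = httplib.HTTPConnection(host)

    def purge(self, objecturl):
        # Send PURGE for the path part of the URL, as in the snippet above
        path = urlparse.urlparse(objecturl)[2]
        self.connection.request('PURGE', path)
        response = self.connection.getresponse()
        response.read()  # drain the body so the connection can be reused
        return response.status == 200

Created once per Zope process (e.g. PurgeClient('localhost')), purge() can then be called every time a comment is posted, skipping the connection setup each time. If Squid closes the keep-alive connection, this simple version would need to reconnect, so treat it as a starting point only.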

I'm back! Peterbe.com has been renewed

June 5, 2005
1 comment This site

Finally I got my domain name back. What happened was that it expired without me being notified. The reason I wasn't notified was that the email address Network Solutions had on record is ancient and I don't check it anymore. What I had to do was send a signed fax with a photocopy of my driving license to Network Solutions in the States to tell them to change my email address. Once I'd changed my email address I was able to log in and renew the service for three years.

What confused the whole thing was that I thought I could transfer the domain name over to mydomain.com, which I use to administer the domain. The reason that didn't work was that the domain could not be transferred while it was pending deletion.

Long story short: I'm back. To all of you who have emailed me on mail @peterbe.com and have got a delivery-error message, do you want to resend that important piece of email now?

I'm back and awake!

October 19, 2004
1 comment This site

I'm back! My dear little website is back up and running. This time on a different computer on a different network.

What happened was that the poor little old laptop that this site was running on completely screwed itself up after a hard restart. Everything in its memory became totally random. When it managed to boot up I had several gigantic folders, some with identical names, that couldn't be opened. My friend Jan Kokoska helped me run a few disk-checking programs and eventually we could see my non-backed-up files again. With a Linux LiveCD we managed to copy the data across to another computer, and eventually it got up here on this server.

The culprit was faulty RAM. Jan ran lots of software tests on it and eventually we isolated the problem to the extra 128 MB memory module I had in the computer. We took it out and threw it in the bin.

Now this new server is one of Fry-IT's. It's a dual Xeon 2.4 GHz box, and it's not on some silly 256 kbit/s connection like I had before. Let's see how it goes.

PlogRank - my own PageRank application

May 21, 2004
2 comments This site, Web development

Now I've done something relatively useful with my PageRank algorithm written in Python that I'm actually quite proud of. It's not rocket science, but at least I've managed to understand the Google PageRank algorithm and apply it to my own setup. The application is very simple and not as useful as one could hope, but at least I've proved to myself that it can be done.

I call it PlogRank. As you might have noticed, most blog items here on this site have, on the left-hand side beneath the menu, a list of "Related blogs". These are from now on sorted by PlogRank! Cool, huh?

The "Related blogs" work by specific word matching. Every blog item has a list keywords that I define manually through the management interface. The selection of keywords is helped by another little database that filters out all typical words. E.g. "PageRank" is a particular word and "page" is not; so selecting these keywords is very easy for me.

Anyway. What I do now, once every week, is load a big matrix of all the links between pages. If this blog item has a link to PageRank in Python, then that page increases in PlogRank; it does not affect this page. I then feed this into the PageRanker program I've written, which calculates the corresponding PageRank for each blog item. Easy! The whole calculation takes only a couple of seconds with 30 iterations. The calculation itself is actually only a small part of that time, because reading from and writing to the database is the real bottleneck.
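
For the curious, the heart of such a calculation is just a power iteration over the link structure. This is not my actual PageRanker code, only a stripped-down illustration of the idea (the pagerank() function and its arguments are made up for this example):


def pagerank(links, damping=0.85, iterations=30):
    """links maps each blog item id to the list of item ids it links to.
    Every linked item is assumed to also be a key in links."""
    n = len(links)
    rank = dict((item, 1.0 / n) for item in links)
    for _ in range(iterations):
        # every item gets a base rank, plus shares from the items linking to it
        new_rank = dict((item, (1.0 - damping) / n) for item in links)
        for item, outlinks in links.items():
            if not outlinks:
                continue
            share = damping * rank[item] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

print(pagerank({'a': ['b'], 'b': ['a', 'c'], 'c': ['a']}))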

So, the end result is that every blog item that has related links will show these links in PlogRank-sorted order. Isn't that neat?

The importance of being findable

April 15, 2004
3 comments This site

Did a quick analysis of all the referers to my web site. A referer is recorded when a web user clicks a link to my site from another site instead of manually typing in the URL. The result is not surprising but quite sad. About 5% of all referer visits to my web site are from other normal web pages. All the rest are from search engine results such as Yahoo, Google etc., or from other obscure web services.

[Figure: What 5% might look like]

The sad truth is that very few people make a link to my site :(
The good thing is that my site must be very findable.

The most important conclusion is probably that people don't surf the web anymore; instead they search it. I for one trust Google so much that I sometimes search rather than dig up a URL I've written down somewhere. This shows the importance of being findable on the web. You have to make your pages findable, otherwise you don't get any hits. So redesign your sites so that Google can index them accurately, and avoid silly things like frames or images with text in them.

Optimized stylesheets

March 5, 2004
1 comment This site

I have been experimenting recently with HTML optimization but haven't applied it yet. I have, however, now applied it to my stylesheets. The size gain is 33%! (1577 bytes down to 1027 bytes.) The speed gain also has to account for the time it takes to perform the optimization, so it will obviously be less than 33%. But since the optimization takes only 0.004 seconds on this slow computer, in approximate terms the speed gain is also 33%. This is on a stylesheet file with only a few short comments.

The optimization script removes almost all unnecessary whitespace (newline characters included) and all comments. The code, for Python friends, looks like this:


import re

css_comments = re.compile(r'/\*.*?\*/', re.MULTILINE | re.DOTALL)

def _css_slimmer(css):
    css = css_comments.sub('', css)   # strip all /* ... */ comments
    css = re.sub(r'\s\s+', '', css)   # remove runs of two or more whitespace characters
    css = re.sub(r'\s+{', '{', css)   # no whitespace before {
    css = re.sub(r'\s}', '}', css)    # no whitespace before }
    css = re.sub(r'}', '}\n', css)    # one rule per line
    return css
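
By way of a small (made-up) example of what the function does:


css = """
/* layout */
body {
    margin: 0;
    color: #333;
}
"""
print(repr(_css_slimmer(css)))
# prints: 'body{margin: 0;color: #333;}\n\n'

Note that single spaces (like the one after each colon) survive, since only runs of two or more whitespace characters get stripped.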