Thursday, February 15. 2007
My C WSGI implementation is relatively decoupled from the transport code (AJP). It's conceivable that the transport layer could be replaced with something else. Like, say... a full HTTP 1.1 server. I guess there could be two directions to take that:
I suppose one could take an embeddable HTTP server (or write one from scratch, if so inclined) and glue the WSGI code into the request handling pipeline. Personally, it's not a project that interests me much. Leave web server writing to the web server experts, I say. However, suppose one wrote it as an extension module for one of the popular web servers...
How about an extension module for, say, Apache HTTPD or lighttpd? The problem I see with this is that, functionally, it would just be equivalent to mod_python. (After all, you're just embedding Python into the web server and patching into its request processing.) I think the only difference would be that the WSGI-adapter code (the layer that sits just above Python) would be written in C. Additionally, you inherit some of the more interesting problems of mod_python... namely conforming Python's process/threading model to that of the web server. In all likelihood, you would be running multi-process, not multi-threaded. I guess if you were clever, you could run the Python interpreter in only a single process, similar to how CGI works in modern versions of Apache HTTPD.
But then, if you do that, all it buys you is automatic process management versus using an external server model like AJP/SCGI. So why bother with yet another mod_somethingsomething?
Anyway, ajp-wsgi will remain ajp-wsgi for the foreseeable future. (If I'm sufficiently bored and curious, I may try creating an scgi-wsgi someday.) Though http-wsgi does pique my interest somewhat... I just have to find a suitably feature-laden embeddable C web server.
Wednesday, February 14. 2007
Well, not much has been happening. ajp-wsgi has been humming along, stable as a rock. It's been running all of my Python-based websites for a little over a month now (longer, really, since I had to reboot my server early January to update FreeBSD). Once I make a few documentation updates, I think it would be safe to declare a 1.0 release. (Seemingly a big milestone nowadays...)
Saturday, December 16. 2006
Well, after a week of coding, I decided to "release"
I found it highly educational to create, illuminating the mysteries of the Python C API. Plus I'm using it everywhere now. And after having closed ticket #4, the solution of which seems to have been a panacea to all current issues, it's pretty much complete as far as I envisioned it.
Now maybe I can get back to Flannel...
And I have thought about writing an SCGI version as well... but it's not something I would use. So what's the point? I don't really like supporting something I don't use regularly, though by virtue of being free software, I'm not really obligated to provide any support. (But I still do because I'm a nice guy.) Maybe if Apache HTTPD eventually adds a mod_proxy_scgi though...
Ah, no Google hits for mod_proxy_scgi. Oh well. At least this entry may eventually show up.
Thursday, December 14. 2006
Apparently, I did not read the AJP13 spec nor my original code very closely:
Note: The content-length header is extremely important. If it is present and non-zero, the container assumes that the request has a body (a POST request, for example), and immediately reads a separate packet off the input stream to get that body.
Here I was requesting the first block. Anyway, a quick and easy fix.
Trac seemed to be the most non-trivial to convert. It doesn't provide a ready-made application factory to create the WSGI app object. I basically had to mimic (using my config options) the operations that its main() method performed. Other applications (my own blog & shorten projects, moinmoin) had readily-available app objects though. And I'm also glad to say that Paste Deploy-based apps are easily deployed with
Anyhow, I went ahead and decoupled the C WSGI code from the AJP code today. Now the next time I'm bored, I think I'll write drivers for both ends of an SCGI connection. It would be interesting if I could write that FastCGI->SCGI adapter wholly in C (using the standard C FastCGI dev kit). Actually, I guess I should check if there's already a C SCGI implementation...
Tuesday, December 12. 2006
I moved all the WSGI stuff out of my AJP C library project into its own project: ajp-wsgi. I polished it up a bit, gave it a command-line interface, wrote a better build system, and even wrote a simple README for it. You can find it here.
Note that this is not a Python extension. Rather, it is a 100% C WSGI implementation... that executes the application in an embedded Python interpreter.
It's moved beyond a proof-of-concept and is quickly becoming more and more practical. (At this moment, my personal wiki is running atop it. Maybe I'll switch my Trac sites and shorten over to it as well.) But make no mistake, it is very much alpha-quality and untested.
Monday, December 11. 2006
A continuation from the weekend's entry... I actually finished the WSGI implementation in C and glued it to the AJP library I wrote. It was an interesting endeavor... programming for Python in C.
I was going to cop out and just implement wsgi.input in Python, but I went all the way and wrote that in C as well. And I'm glad too, because it's far more efficient. Data copies are greatly minimized. And data is streamed from the server. Assuming the application reads wsgi.input in decent-sized chunks, the memory usage will always remain manageable. (For example, I uploaded a 600+ MB file and hashed it. The server never used more than 2-3MB of memory.)
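A minimal sketch of what "reading wsgi.input in decent-sized chunks" can look like from the application side. The chunk size, the choice of SHA-1, and the name `hash_upload_app` are assumptions for illustration; the application actually used for the 600+ MB test is not shown here.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; any moderate value works


def hash_upload_app(environ, start_response):
    """Hash the request body without ever holding all of it in memory."""
    stream = environ["wsgi.input"]
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    digest = hashlib.sha1()
    while remaining > 0:
        # Never ask for more than the client said it would send.
        chunk = stream.read(min(CHUNK_SIZE, remaining))
        if not chunk:  # the client sent less than it promised
            break
        digest.update(chunk)
        remaining -= len(chunk)
    body = digest.hexdigest().encode("ascii") + b"\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Only one chunk is alive at a time, so the application's memory use stays roughly constant no matter how large the upload is.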
And from my braindead (i.e. "Hello World!") benchmarks, the server is capable of about 900 requests per second. This is a 10X improvement compared to the pure-Python server serving the same application. Not bad at all.
I'm glad to say that as far as I can tell, the server is pretty close to 100% WSGI compliance. At least,
Anyhow, still a bit of work to do. It would be nice if it was configurable somehow. Also, I should probably investigate if I can just turn it into a simple Python extension module (vs. being a C server that embeds a Python interpreter). I haven't looked how the hybrid FastCGI servers are packaged, but I'm sure it's something much more sane than the route I went.
Sunday, December 10. 2006
There don't seem to be many AJP C libraries. In fact, there don't seem to be any (according to Google, at least). There's at least one FastCGI C library, which is unsurprising given the ubiquity of FastCGI. So yesterday afternoon, I decided to "read spec, write code" yet again and began a C implementation of the "container" (app server) side of AJP. After not having touched C for over 2-3 years, it was a good feeling to muck around with C and BSD sockets again. (Procedural programming, how I missed you!)
I finished it up in a few hours and it is now fairly complete. It's actually a pretty simple protocol, I've realized. All the complexity comes from the way requests/responses are encoded and decoded. (Otherwise, it's a fairly straightforward 1:1 mapping.)
Of course there were the 3 undocumented spec additions, the first two I had to figure out through experimentation so long ago and the last was conveyed to me by someone who actually looked at the mod_jk/mod_proxy_ajp source. (As much as I believe in the whole "the best documentation is the source" thing, I don't really like looking at similar/related source code when implementing something.)
Anyhow, I don't think those 3 undocumented additions are documented anywhere (hah!) besides my source (ajp_base.py and ajp.c). So:
As far as request/response throughput is concerned, it looks promising. While the threaded pure-Python AJP server could only handle ~86 requests per second (with 100 parallel clients), the threaded C/Python hybrid version was nearly pushing 1000 requests per second. Of course it doesn't include WSGI overhead yet. But we shall see.
All this effort is, of course, inspired by the WSGI servers built upon the FastCGI C library: fcgiapp and python-fastcgi. If I actually used FastCGI, I'd probably be using one of those servers rather than flup's.
Wednesday, December 6. 2006
While thinking about how I'm going to implement security in flannel, I thought: it would be nice to be able to reuse all the WSGI auth middleware out there. (Really, coming up with my own auth scheme to fit within flannel's framework did not seem very appealing.)
But it seems, you either protect the entire application, or you don't. Nothing so far that I've seen lets you protect only certain URL patterns. So generalizing upon this, it would be nice to have:
A conditional filter-type middleware that will accept: a bunch of URL patterns and a middleware instance. If the PATH_INFO matches any of the URL patterns, the middleware is invoked... otherwise the request is passed directly to the application. One can build more complex conditional behavior by composing different instances of this filter middleware.
And it would be nice if the pattern language were something simple, like Ant-style path pattern matching. Other nice-to-have features would be pluggable pattern matchers (maybe people would rather use regexes... who knows?) and a caching decorator for pattern matchers.
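A rough sketch of such a filter, under a few assumptions: the class name `ConditionalMiddleware` and its parameters are hypothetical, and the standard library's `fnmatchcase` stands in for Ant-style matching (it is not the same thing, since its `*` also matches `/`). The `matcher` argument is where a pluggable pattern matcher would go.

```python
from fnmatch import fnmatchcase


class ConditionalMiddleware:
    """Route a request through `middleware` only if PATH_INFO matches."""

    def __init__(self, app, middleware, patterns, matcher=fnmatchcase):
        self.app = app                # the bare application
        self.middleware = middleware  # a middleware instance already wrapping app
        self.patterns = list(patterns)
        self.matcher = matcher        # callable(path, pattern) -> bool

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if any(self.matcher(path, pattern) for pattern in self.patterns):
            return self.middleware(environ, start_response)
        return self.app(environ, start_response)
```

Because the filter is itself a WSGI application, instances can be nested: one filter's `app` can be another filter, which gives the composition described above.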
Another wish-list item I had, which is somewhat also security-related:
"Remember Me" middleware: This would be similar to
A nice-to-have feature would be including the date the cookie was created into the signed data. This would allow some sort of "expire all persistent login data" feature.
Anyway, stuff to work on if I have time and it interests me enough. I haven't been doing much lately because I haven't been feeling well. A shame though, that a little illness would utterly stop all my Python/flannel momentum.
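For the "Remember Me" idea, the part about putting the creation date into the signed data could be sketched roughly as below. Everything here is an assumption for illustration: the function names, the `user|created|signature` layout, and the use of HMAC-SHA256. It covers only the token itself, not the cookie handling or the middleware around it.

```python
import hashlib
import hmac
import time


def _sign(payload, secret):
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_token(user, secret, now=None):
    """Build a 'user|created|signature' value suitable for a cookie."""
    created = int(time.time() if now is None else now)
    payload = f"{user}|{created}"
    return f"{payload}|{_sign(payload, secret)}"


def check_token(token, secret, not_before=0):
    """Return the user name, or None if the token is forged or too old."""
    try:
        user, created, sig = token.rsplit("|", 2)
        created = int(created)
    except ValueError:
        return None
    expected = _sign(f"{user}|{created}", secret)
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("ascii")):
        return None
    if created < not_before:  # issued before the "expire everything" cutoff
        return None
    return user
```

Raising `not_before` to the current time invalidates every token issued earlier, without the server having to track individual logins.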
Tuesday, December 5. 2006
In an effort to quell a warning from wsgiref's validator, QUERY_STRING will now default to an empty string if it doesn't exist in the environ. Despite not being required to always be present by the WSGI spec, it looks like the cgi module will assume sys.argv is the query string if QUERY_STRING isn't present in the environ.
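The same default can be expressed as a tiny piece of WSGI middleware. This is only a sketch: the change described above was made in the server itself, and `ensure_query_string` is a hypothetical name.

```python
def ensure_query_string(app):
    """Wrap a WSGI app so that QUERY_STRING is always present in environ."""
    def wrapper(environ, start_response):
        # Leave an existing value alone; only fill in the missing key.
        environ.setdefault("QUERY_STRING", "")
        return app(environ, start_response)
    return wrapper
```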
Also, I changed the keyword parameters in GzipMiddleware a bit:
And as far as
Wednesday, November 29. 2006
You know a project is getting serious when you create a Trac site and a blog category for it.
So I've been hammering away at this thing for the past 3-4 days. I'm surprised to say that it's actually in a working state now. Maybe functional enough to be classified as a 'toy' web framework? Well, I suppose I need to finish more components first, since having only text fields for form input could get quite frustrating.
But the basics are all there... component rendering, event triggers, parameter binding (both by value and by reference), automatic persistence of session data.
Time to polish it, add more components, refine the interfaces. But I wonder... do I ever really want to publish it? Make it widely available? Document and support it?
Hmmm... I don't know.
But for now, I'd rather write code.
Saturday, November 25. 2006
Related to a recent entry I've made, I started thinking about a Python port (at least in concept) of Tapestry. Rather than go for the template/spec/class approach of Tapestry 4, I opted to go for the wholly annotation-based approach of Tapestry 5 (still in development) thus eliminating the need for a spec file.
Annotating methods/functions in Python is pretty simple, just use decorators. Annotating member variables, on the other hand, seemed non-trivial. But then I remembered ORMs like SQLObject and SQLAlchemy and that gave me an idea. However, it would require metaclass magic...
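A rough sketch of that kind of metaclass trick, written in present-day Python 3 syntax. The names `Persistent`, `ComponentMeta` and `Counter` are hypothetical and are not taken from flannel; the point is only to show a marker object in the class body being swapped for a property when the class is created.

```python
class Persistent:
    """Hypothetical marker for a variable that should be tracked."""

    def __init__(self, default=None):
        self.default = default


def _make_property(name, default):
    key = "_" + name

    def getter(self):
        return self.__dict__.get(key, default)

    def setter(self, value):
        self.__dict__[key] = value
        # Remember which names changed so they could be saved later.
        self.__dict__.setdefault("_dirty", set()).add(name)

    return property(getter, setter)


class ComponentMeta(type):
    def __new__(mcls, clsname, bases, namespace):
        # Replace every marker in the class body with a real property.
        for name, value in list(namespace.items()):
            if isinstance(value, Persistent):
                namespace[name] = _make_property(name, value.default)
        return super().__new__(mcls, clsname, bases, namespace)


class Counter(metaclass=ComponentMeta):
    count = Persistent(default=0)
```

Assigning to `Counter().count` then goes through the generated setter, which is where session persistence could hook in.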
So when I woke up this morning, I grabbed my laptop and started hacking away at a proof-of-concept (while still in bed!) Many hours later (and I did get out of bed, eventually), I had most of the metaclass infrastructure for flannel done. Yes, flannel. (I'm entertaining ideas for a better name though.)
So now I can easily mark variables as persistent, transient, or parameter types. And the variables will be magically transformed into Python properties with my own setters and getters. Also, I can mark methods with certain "lifecycle decorators" which define when during the rendering cycle they should be called. Basically something like:
Transient variables don't really need to be annotated... but doing so allows them to be reset to an initial value at the start of a processing cycle. Speaking of which, if you aren't familiar with Tapestry, the abbreviated request/response cycle I'm aiming for looks something like:
Anyhow, a significant (and unfortunate, I think) part of this project will involve coming up with yet another templating engine. However, it should be a fairly simple tag-based system, since most of the functionality will actually be implemented by components.
It will be interesting. Though I don't expect to take this project very seriously. But maybe if/when I actually get it to where I want it to be, I'll change my mind.
Friday, November 24. 2006
I added FCGIApp and SCGIApp to the (new) flup.client package. I wrote FCGIApp a few months ago for Ian Bicking's WPHP. While attempting to debug a problem with flup's fcgi server on Dreamhost (vs. my old fcgi.py module), I thought of an interesting solution.
Since the problem seemed to stem from the app's long startup time, and since Dreamhost only allows dynamic (i.e. web server-launched) FastCGI apps, why not have the web server-started app be as simple as possible and have it forward requests to a static, manually-launched app server?
In other words:
httpd (mod_fastcgi) -> fcgi.WSGIServer -> FCGIApp -> fcgi.WSGIServer -> app
Yeah, it ends up doing some extra work... which is why I wrote SCGIApp (SCGI is a far simpler protocol):
httpd (mod_fastcgi) -> fcgi.WSGIServer -> SCGIApp -> scgi.WSGIServer -> app
Anyway, it's a trade off and I really don't know how significant it is. But I was never really a fan of web server-launched FastCGI apps. Unless you configure mod_fastcgi explicitly, it has a tendency to launch an application multiple times. And if your application isn't multi-process safe and aware, it can lead to problems.
Consider these two modules experimental for now. I'm not really sure where I want them to live. But they're there for the time being to play around with.
Tuesday, November 21. 2006
Nothing about Python per se, but applications/frameworks written in Python that I wish I could find. (And that weren't related to Zope/Plone!)
Poor flup. Reviled and yet seemingly a necessary evil because it was one of the first. (But thankfully, no longer the only choice as more FastCGI servers are released.)
Whatever. Winter break is coming up. I should have time to work on something.
Monday, November 20. 2006
I briefly toyed with Trac some time ago. Though I had to patch it to use WSGI, I found it pretty easy to use and maintain. My efforts eventually stalled, however, probably due to the way my public repository was derived from my real repository (i.e. lots of empty changesets).
Anyway, yesterday I updated to Trac 0.10.2 and was pleasantly surprised to find that it supported WSGI and flup's ajp and scgi servers out-of-the-box. The empty changeset issue is still there (which is my problem to fix). I just might have to split my repository up in the future.
Anyhow, flup has its own Trac site now:
The old flup site, http://www.saddi.com/software/flup/ simply redirects there now. (And will probably do so for the foreseeable future.) Please continue to use this URL when referencing flup.
I locked down all the default pages, but all the flup-related pages (save for the front page) are editable by anyone. So feel free to contribute examples, FAQs, etc.
Tuesday, November 14. 2006
I've worked on my Python "blog" project for a number of years now. If I remember correctly, under the hood, it started out using flat files for storage and CGI. Eventually I moved it to mod_python+Publisher, then to FastCGI+my own version of Publisher. Sometime before that, I switched to using PostgreSQL... eventually transitioning the site content and then the session data to the database. (The templating engine changed at least once. And of course eventually, came the switch of the whole webserver interface to WSGI.)
Anyhow, what I was aiming for was a user-customizable blog/forum site. Multi-column, a selection of premade components (such as polls and HTML boxes), user profiles, personal bookmarks, etc. It was pretty ambitious and it met most of those goals. Though little did I realize that if I just generalized my goals a bit, I would actually be creating a Content Management System.
I didn't come to that realization until I started playing around with CMS's recently. At first, being in my Java mood, I sought open source Java CMS's. I only stumbled across Lenya. I ran into some difficulty, and promptly tabled it. I figured there would be more Java CMS's out there. But I really didn't have much luck finding any more... (ok, so I didn't try that hard, either)
I briefly looked at Plone (based on Zope). It seemed a bit overkill for what I wanted... and overly complex. So I moved on to the PHP-based projects. (Note that I eventually did try Plone. But my gut feeling was correct, especially compared to the PHP CMS's.)
So I decided to look at Joomla! and Drupal, based on all the hype around those projects. (Some of it good, some of it bad... i.e. security vulnerabilities that made the front page of Slashdot.)
I tried Drupal first and immediately fell in love. It was utterly simple to use and maintain. It was a breeze to set up multiple instances using the same installation. It had a whole array of 3rd party modules that did what I wanted. (And I found, it could do most of what I wanted out of the box already - I just had to enable the module.) Anyhow, I switched 3 of my (static) sites to Drupal to try it out.
Unfortunately, at this point, I never gave Joomla! a chance. I hear many good things about Joomla! pertaining to the ease of use (and the eye candy). But it had one fatal flaw in my eyes: it only supported MySQL. Hence the reason that I tried Drupal first...
As an aside, it's probably obvious, but I'm not a big fan of the L-M-P in LAMP. 1) I am and have been an avid FreeBSD user for the past decade. 2) I prefer PostgreSQL over MySQL, speed be damned. I like my transactions, thank you. 3) Whether the P stands for Perl or PHP, I'd rather have my apps written in Python and to some extent, Java.
Anyway, the only thing really missing (so far) from Drupal is a good CAPTCHA module. Solving a simple addition problem seems a bit simplistic. I'd be surprised if there weren't bots that could answer those CAPTCHAs already. And the image CAPTCHA support seems a bit lame, especially compared to the last CAPTCHA library I worked with, JCaptcha. Anyhow, I bring this up because even though I normally turn off user registration, I did forget to turn it off for one of my sites. Overnight, some 3-4 "people" (all using rather fake looking yahoo addresses) registered for my site. I'm sure if I had some actual content that you could post comments to, I'd also have a bunch of spam to take care of.
Anyway, one of the sites that was converted happens to be saddi.com. If I get more familiar/comfortable with Drupal, I'll probably be subsuming this blog into its Drupal instance.