Some notes on my first hour with xhtml2pdf (a Python PDF processor):

This is an open-source tool built on Python for converting HTML documents to PDF. For those without access to commercial PDF processors like Antenna House or Prince, it could be a viable option. (Prince also offers a demo version of its tool, which adds a watermark to the first page of each document but is otherwise fully functional.)

The code is all up on GitHub. I tried following the instructions in the GitHub README first and failed (I'm not really great at installing command-line tools piecemeal), and only then decided to look through the docs, where I found these easy installation instructions:

To install: $ easy_install xhtml2pdf
To run: $ xhtml2pdf --css=/path/to/cssfile.css /path/to/htmlfile.html
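
If you'd rather drive that command from a script, here's a rough sketch in Python. It assumes the install put an xhtml2pdf executable on your PATH and that a nonzero exit status means the conversion failed (I haven't verified the latter). The helper names build_command and convert are my own, not part of the tool, and only the --css flag shown above is used.

```python
import shutil
import subprocess


def build_command(html_path, css_path):
    """Return the argument list for the xhtml2pdf command shown above."""
    # Hypothetical helper: mirrors `xhtml2pdf --css=... file.html`.
    return ["xhtml2pdf", "--css=" + css_path, html_path]


def convert(html_path, css_path):
    """Run the converter; return True if it exits with status 0."""
    if shutil.which("xhtml2pdf") is None:
        # The executable is not installed or not on PATH: nothing to run.
        return False
    result = subprocess.run(build_command(html_path, css_path))
    # Assumption: a nonzero exit status signals a failed conversion.
    return result.returncode == 0
```

Calling convert("/path/to/htmlfile.html", "/path/to/cssfile.css") then does the same thing as the "To run" line.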

Here's what I learned in my first hour:

* If the script starts to run and then errors out, you probably either have an error in your CSS or are using CSS that isn't supported.
* Use the test CSS and the HTML docs provided to get the hang of things. When you clone the repo, you can find them here:
xhtml2pdf/doc/pisa-en.html, pisa.css
* Define all your block elements with display: block; (for some reason it's not built in). This also allows your block-level formatting (e.g., borders) to work.
* Most of the CSS3 Paged Media spec is not supported, but there are extensions that duplicate a lot of the paged media functionality. For example, to set the page size, you'd use "-pdf-page-size: letter;" instead of "size: 8.5in 11in;".
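
To make the last two points concrete, here's a small Python sketch that generates a starter stylesheet. Treat it as a guess at a reasonable starting point, not a tested recipe: the function name starter_css and the list of elements are my own choices, and I'm assuming that -pdf-page-size goes inside an @page rule, in the spot where size would normally go.

```python
# Elements to force to display: block; this list is an illustrative choice.
BLOCK_ELEMENTS = ["div", "p", "h1", "h2", "h3", "ul", "ol", "blockquote"]


def starter_css(page_size="letter"):
    """Build a minimal stylesheet using the two tips above."""
    selectors = ", ".join(BLOCK_ELEMENTS)
    return (
        # Assumption: the extension property lives inside @page.
        "@page {\n"
        f"  -pdf-page-size: {page_size};\n"
        "}\n"
        # Declare block display explicitly, since it is not built in.
        f"{selectors} {{\n"
        "  display: block;\n"
        "}\n"
    )
```

Write the result of starter_css() to a .css file and pass that file to the --css option.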