Posted in Uncategorized by scarsonmsm on October 26, 2006

So here’s the end product of a process that has inspired quite a number of posts. A while back, we undertook an exhaustive review of possible publication formats for MIT OCW. We receive some criticism for our use of PDF as our basic publication format. Among educational technologists especially, PDF is viewed as relatively inflexible compared with XML-based approaches. And it is. But given our constraints (publishing existing materials authored in a variety of formats, and publishing all of MIT’s courses), it quickly becomes the logical choice.

The short version of why is that with PDF, our level of effort is basically linear in the number of documents we publish. The conversion from other formats is reliable, requires little QA, and is a single step for the entire document. There are no reliable tools that convert to XML and also convert formulas (and most of our materials have formulas). So even if you accept that all formulas will be images only, using an XML converter means our level of effort becomes linear in the number of pages, as there is still proofing and formatting required. If you want the formulas in XML as well (which is the really cool and useful part of XML), then the level of effort is linear in the number of formulas, because you have to hand-code them.
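
To make the scaling argument concrete, here is a rough sketch of the three cases as a toy cost model. The per-unit minutes, the `Corpus` type, and the `effort_minutes` function are all hypothetical and are not taken from the report; the sketch also assumes that hand-coding formulas comes on top of per-page proofing. Only the shape of the comparison matters, not the figures.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    documents: int
    pages: int
    formulas: int


# Hypothetical per-unit effort in minutes. These are illustrative
# placeholders, not measurements from the report.
MIN_PER_DOCUMENT_PDF = 2.0   # one conversion step per document
MIN_PER_PAGE_XML = 3.0       # proofing and formatting per converted page
MIN_PER_FORMULA_XML = 5.0    # hand-coding one formula in XML


def effort_minutes(corpus: Corpus) -> dict[str, float]:
    """Estimate effort for each publication approach described above."""
    pdf = MIN_PER_DOCUMENT_PDF * corpus.documents
    xml_formulas_as_images = MIN_PER_PAGE_XML * corpus.pages
    # Assumption: page proofing is still needed when formulas are hand-coded.
    xml_formulas_as_xml = (
        xml_formulas_as_images + MIN_PER_FORMULA_XML * corpus.formulas
    )
    return {
        "pdf": pdf,
        "xml_formulas_as_images": xml_formulas_as_images,
        "xml_formulas_as_xml": xml_formulas_as_xml,
    }


if __name__ == "__main__":
    # A made-up corpus: 100 documents, 10 pages each, 5 formulas per page.
    sample = Corpus(documents=100, pages=1000, formulas=5000)
    for approach, minutes in effort_minutes(sample).items():
        print(f"{approach}: {minutes / 60:.1f} hours")
```

Whatever unit costs you plug in, the count being multiplied grows from documents to pages to formulas, and that growth is what drives the gap between the approaches.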

Of course, having materials authored in XML to begin with would solve the problem, but it’ll be a while until that happens at MIT. Anyway, here’s the report, in MS Word (incidentally the format most preferred by MIT OCW visitors). Please do let me know if you have comments.


