Minor side project: PDFBoxLite

My latest contribution to GitHub is PDFBoxLite, which represents a few hours' work. Basically, all of my software efforts that involve creating chemical graphics are leaning toward pushing PDF as the format of choice for creating inline figures. Everyone knows PDF as the Portable Document Format: things you look at before sending to the printer. But what you may not know is that PDF files with just a single page, of arbitrary size, are a very well supported way to get pictures into a larger document (Word, Pages, PowerPoint, Keynote, Excel, etc.). This is particularly true within the Apple ecosystem. Unlike SVG, correct rendering is essentially universal; and unlike WMF/EMF, it works on platforms other than Windows. And unlike PNG, it isn’t a trainwreck when you try to print/zoom/convert your document.

While the Apple development platforms (iOS, Mac) support PDF natively, the back-end toolkit used by Molecular Materials Informatics is written in Java, and the most effective way to add PDF support was via the Apache PDFBox project. This is an excellent resource (like most of what the Apache Software Foundation turns out), but the one annoying part is that it is rather too large for my liking. The final .jar file grows by about 8 megabytes, which is several times the size of my own actual project. Increasing the deliverable size by a large multiple in order to add a leaf-node feature, which should really be a part of the core platform, is not something that sits well with the OCD tendencies of a professional software engineer.

So this morning I found the time to sit down and take a chainsaw to the PDFBox library and remove a lot of the functionality I didn’t want. No need to read existing PDFs, or add tables of contents, or hyperlinks, or forms, or signatures, or security. Don’t care about any of that: all I want is to be able to create static figures, and have them be equally beautiful at any resolution, on any platform. Thus far I have managed to knock the size down to under 1.5 megabytes, and no doubt a few more pieces could be lopped out, but that’s acceptable for now.

If you are looking for a way to add PDF support to your Java project, and output size is a factor, check it out. Also, if you are not familiar with the PDFBox library, you may want to look at the example RenderContext.java file, which shows how to draw basic primitives onto a PDF context (lines, rectangles, ovals, splines, simple text, etc.).
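
To make the idea of a single-page PDF figure of arbitrary size more concrete, here is a minimal sketch in plain Java. It does not use PDFBox or PDFBoxLite and is not taken from RenderContext.java; the class name MinimalPdfFigure and the build method are hypothetical, invented purely for illustration. It writes the raw PDF objects by hand: a page whose MediaBox matches the requested size, and a content stream that strokes a line, a rectangle and a Bézier curve. A library such as PDFBox generates much the same operators for you, along with fonts, compression and everything else this sketch leaves out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: a hand-rolled, uncompressed, one-page PDF.
class MinimalPdfFigure {

    // Builds a PDF whose single page is exactly width x height points.
    static byte[] build(int width, int height) {
        // Content stream: PDF drawing operators, origin at the bottom-left corner.
        String content =
            "1 w\n"                                                            // stroke width: 1 point
            + "10 10 m " + (width - 10) + " " + (height - 10) + " l S\n"       // diagonal line
            + "10 10 " + (width - 20) + " " + (height - 20) + " re S\n"        // rectangle outline
            + "10 10 m " + (width / 2) + " " + height + " "
            + (width / 2) + " 0 " + (width - 10) + " 10 c S\n";                // cubic Bezier curve

        // The four objects a minimal document needs: catalog, page tree, page, content.
        String[] objects = {
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 " + width + " " + height
                + "] /Contents 4 0 R /Resources << >> >>",
            "<< /Length " + content.length() + " >>\nstream\n" + content + "endstream"
        };

        // Everything is ASCII, so string length equals byte offset.
        StringBuilder out = new StringBuilder("%PDF-1.4\n");
        int[] offsets = new int[objects.length];
        for (int i = 0; i < objects.length; i++) {
            offsets[i] = out.length();
            out.append(i + 1).append(" 0 obj\n").append(objects[i]).append("\nendobj\n");
        }

        // Cross-reference table: each entry must be exactly 20 bytes long.
        int xrefOffset = out.length();
        out.append("xref\n0 ").append(objects.length + 1).append("\n");
        out.append("0000000000 65535 f \n");
        for (int offset : offsets) {
            out.append(String.format("%010d 00000 n \n", offset));
        }
        out.append("trailer\n<< /Size ").append(objects.length + 1)
           .append(" /Root 1 0 R >>\nstartxref\n").append(xrefOffset).append("\n%%EOF\n");

        return out.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```

Writing the returned bytes to a file, for example with java.nio.file.Files.write, should give a small figure that a PDF viewer can open. Because the page size is just the MediaBox, the figure can be any dimensions you like, and the vector operators scale cleanly at any zoom level.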