Thursday, December 22, 2011

A little adventure with the PDF part of the bill watcher. I am hoping to be able to attach comments to a section of the PDF, then have a script that compiles the comments and sends an email blast to the MPs.
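The "compile the comments" step could look something like the sketch below. Nothing here is from the bill watcher itself: the comment shape (`section`, `author`, `text`) and the `compileDigest` name are made up for illustration, and the actual emailing is left out.

```javascript
// Minimal sketch: group comments by section and build a plain-text digest.
// Assumed (hypothetical) comment shape: { section, author, text }.
function compileDigest(comments) {
  const bySection = new Map(); // a Map keeps sections in first-seen order
  for (const c of comments) {
    if (!bySection.has(c.section)) bySection.set(c.section, []);
    bySection.get(c.section).push(c);
  }

  const lines = [];
  for (const [section, items] of bySection) {
    lines.push(`Section ${section}`);
    for (const c of items) lines.push(`  - ${c.author}: ${c.text}`);
    lines.push(""); // blank line between sections
  }
  return lines.join("\n").trimEnd();
}
```

The returned string could then be used as the body of the email that goes out to the MPs.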

So here are a few things I am trying:

Loading the PDF into an iframe

This is actually a pretty standard way of handling it, except I am on Linux now, and I am too lazy to install a plugin.

Also, iframe height is a problem; it is hard to get it to work well across browsers.

I don't know of a way for JavaScript to access the text inside the PDF, so I don't know how to attach jQuery to it.

Use pdf.js

It works well on Firefox, but not on anything else. It is also a very new library.

While there is a text layer for the text, it doesn't line up well with the text in the bill document. I suspect this is because of a badly formed PDF.

JavaScript should work well with it though, as it has a text layer div.

Convert the PDF into HTML

Have jQuery pull it in with .load(). It works across browsers, until the layout goes wrong, which could be because of Bootstrap (the CSS library). pdftotext works very well, and the HTML generated can be processed pretty easily. The layout, on the other hand...

JavaScript should work well here too.
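As a rough sketch of how the extracted text could be processed so that comments have something to attach to, the function below wraps each paragraph of plain pdftotext-style output in a `<p>` with its own id. It assumes the text uses blank lines between paragraphs and a form feed between pages; the `textToHtml` name and the `page-N` / `para-N` ids are my own invention, not anything the converter produces.

```javascript
// Escape characters that would otherwise be read as HTML markup.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn plain extracted text into simple HTML with one id per paragraph.
// Assumption: "\f" separates pages, blank lines separate paragraphs.
function textToHtml(text) {
  const out = [];
  let n = 0; // running paragraph number across all pages
  text.split("\f").forEach((page, i) => {
    const paragraphs = page
      .split(/\n\s*\n/)
      .map((p) => p.trim())
      .filter((p) => p.length > 0);
    if (paragraphs.length === 0) return; // skip empty pages

    out.push(`<div class="page" id="page-${i + 1}">`);
    for (const p of paragraphs) {
      n += 1;
      // Join hard-wrapped lines inside a paragraph into one line.
      const joined = p.replace(/\s*\n\s*/g, " ");
      out.push(`<p id="para-${n}">${escapeHtml(joined)}</p>`);
    }
    out.push("</div>");
  });
  return out.join("\n");
}
```

With markup like this, a jQuery click handler on the `p` elements would have a stable id to store each comment against.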

Load the converted PDF's HTML into an iframe

It does not solve the iframe height problem, but JavaScript should be able to work with it. It is a bit of a pain though, because the PDF generated is pretty well formed. It did solve the issue of the layout going wrong because of the CSS library though.
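One common workaround for the height issue, when the iframe holds HTML served from the same origin as the page, is to size the iframe from the height of the document inside it. This is only a sketch and I have not tried it on the bill pages; `fitIframeToContent` is a made-up helper name, and it will not help for a PDF rendered by a plugin or for cross-origin content.

```javascript
// Resize an iframe to the height of the HTML document loaded inside it.
// Assumption: the content is same-origin HTML, so its document is readable.
function fitIframeToContent(iframe) {
  const doc =
    iframe.contentDocument ||
    (iframe.contentWindow && iframe.contentWindow.document);
  if (!doc || !doc.body) return null; // nothing loaded, or not accessible

  const rootHeight = doc.documentElement ? doc.documentElement.scrollHeight : 0;
  const height = Math.max(doc.body.scrollHeight, rootHeight);
  iframe.style.height = height + "px";
  return height;
}

// In the browser this would be hooked to the iframe's load event, e.g.:
//   iframe.addEventListener("load", () => fitIframeToContent(iframe));
```

It would need to run again whenever the content inside the iframe changes height, for example after comments are added.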
