This is mostly a note for my own use. I was recently sent a link to a special website containing documents related to the bankruptcy proceedings of a certain company. I thought “this document seems like something I should keep a copy of”, and of course logging in to said website is a pain. To prevent anyone “leaking” the documents, the creators of the website had used a heavily disabled variant of the pdf.js viewer application (pdf.js being a library created by Mozilla). I wasn’t planning on leaking said documents (and won’t), but the ability to dump data from a running JavaScript application is generally useful.
In Chrome, hit F12 to open the developer tools.
Look for some indication that pdf.js is in use. This could take the form of a reference to pdf.js in the webpage itself, in one of the resources loaded, or in the Network tab.
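If you'd rather check from the console than eyeball the page source, something like the following can be pasted in. It's only a rough sketch: findPdfJsHints is a name I made up, and it assumes the site either left the usual viewer globals (PDFViewerApplication, pdfjsLib) in place or loads scripts with "pdf" in the file name. A locked-down build may do neither, in which case an empty result proves nothing.

```javascript
// Hypothetical helper (not part of pdf.js): collects hints that a page uses pdf.js.
// `win` is the page's window object; pass `window` when running in the console.
function findPdfJsHints(win) {
  const hints = [];

  // The stock pdf.js viewer commonly exposes these globals; a customized build may hide them.
  for (const name of ['PDFViewerApplication', 'pdfjsLib']) {
    if (typeof win[name] !== 'undefined') {
      hints.push('global: ' + name);
    }
  }

  // Script URLs that mention pdf.js (e.g. pdf.js, pdf.min.js, pdf.worker.js, pdf.mjs).
  const scripts = win.document
    ? Array.from(win.document.querySelectorAll('script[src]'))
    : [];
  for (const script of scripts) {
    if (/pdf(\.min|\.worker)*\.m?js/i.test(script.src) || /pdfjs/i.test(script.src)) {
      hints.push('script: ' + script.src);
    }
  }
  return hints;
}

// Console usage: findPdfJsHints(window)
```

Passing the window in as an argument, instead of reading the global directly, just makes the function easy to try against a fake object.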
Go to the Memory tab and take a heap snapshot.
In the resulting profile, change the perspective to Containment.
The Containment view provides a hierarchical view of the application's memory, organised by which objects reference (and so retain) which. You’re looking for something big that isn’t “native”, or something PDF-related. In this case that was an object called PDFViewerApplication.
Chrome makes it easy to store this object as a global variable. Right-click on PDFViewerApplication and select Store as global variable. It’ll be stored as something like temp1.
You can now explore the object… there seems to be a cool method called getData(), which returns a promise.
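Before saving anything, it can be worth confirming that what getData() hands back really is a PDF. The sketch below assumes the promise resolves to a byte array (a Uint8Array or similar) and relies on the fact that PDF files normally begin with the signature "%PDF-". looksLikePdf is a hypothetical name of my own, not something pdf.js provides.

```javascript
// Hypothetical helper: true when the bytes start with the "%PDF-" file signature.
function looksLikePdf(bytes) {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (!bytes || bytes.length < signature.length) {
    return false;
  }
  return signature.every((value, index) => bytes[index] === value);
}

// Console usage, assuming temp1 is the stored object and getData() resolves to bytes:
// temp1.getData().then((bytes) => console.log(bytes.length, looksLikePdf(bytes)));
```

A false result doesn't necessarily mean the data is useless (some files carry a few junk bytes before the header), but it is a hint that you may have grabbed the wrong object.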
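Call it in the console and expand the returned Promise: its [[PromiseResult]] holds the data of the PDF. Right-click that data and store it as another global variable; it’ll be stored as something like temp2. Paste the following code in the console to create a function to save the data as a file: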
var downloadBlob, downloadURL;
// Wrap the raw data in a Blob and trigger a download of it.
downloadBlob = function(data, fileName, mimeType) {
  var blob, url;
  blob = new Blob([data], {
    type: mimeType
  });
  url = window.URL.createObjectURL(blob);
  downloadURL(url, fileName);
  // Release the object URL once the download has had time to start.
  setTimeout(function() {
    return window.URL.revokeObjectURL(url);
  }, 1000);
};
// Click a temporary, hidden link that points at the given URL.
downloadURL = function(data, fileName) {
  var a;
  a = document.createElement('a');
  a.href = data;
  a.download = fileName;
  document.body.appendChild(a);
  a.style.display = 'none';
  a.click();
  a.remove();
};
Finally, enter the following, assuming the data you stored as a global variable was called temp2:
downloadBlob(temp2, 'some-file.pdf', 'application/octet-stream');
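If you don't fancy digging the result out of the Promise by hand, the two steps can probably be chained. This is a sketch under the same assumptions as above: the stored object has a getData() method that resolves to the PDF bytes, and the save function takes the same (data, fileName, mimeType) arguments as downloadBlob. dumpPdf is a hypothetical name of my own.

```javascript
// Hypothetical helper: waits for getData() and passes the bytes to a save function.
// `app` is the stored viewer object (temp1 above); `save` is expected to accept
// (data, fileName, mimeType), the same arguments as downloadBlob.
async function dumpPdf(app, save, fileName = 'some-file.pdf') {
  if (!app || typeof app.getData !== 'function') {
    throw new TypeError('expected an object with a getData() method');
  }
  const bytes = await app.getData();
  save(bytes, fileName, 'application/octet-stream');
  return bytes.length; // number of bytes handed to the save function
}

// Console usage:
// dumpPdf(temp1, downloadBlob).then((n) => console.log(n + ' bytes saved'));
```

Taking the save function as a parameter keeps the helper independent of the page, so it can be tried out with a stand-in before pointing it at the real thing.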
And that’s it, you’ve got that copy-protected PDF in its original glory.