Jared Burrows Blog: Scribd

Monday, August 29, 2011

How to hack Scribd to download documents for free

How to download documents for free
I was looking at an online document that SOMEONE ELSE UPLOADED, and it was very helpful, so I wanted to download it. Scribd, however, wanted to charge a daily fee of around $5 to download the content, even though the page clearly says that someone else uploaded it.

Here was the document that I wanted: http://www.scribd.com/doc/90924585/The-Dark-Monk-Excerpt-by-Oliver-Potzsch
For Public IDs (2012):

*document ID* = 90924585

http://www.scribd.com/mobile/documents/*document ID*/
or
http://www.scribd.com/mobile/documents/*document ID*/download

Insert the numeric ID in place of *document ID* in the URLs above.
Sometimes the first link works better because it produces a page with a download button. Either way, it should show a download link.
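
The substitution above is plain string handling, so here is a minimal sketch of it in Python. It only builds the two URL strings from the post and does not fetch anything. The function names `document_id` and `mobile_urls` are made up for illustration, and the sketch assumes the document URL has the `/doc/<number>/...` shape shown in the example.

```python
import re

def document_id(url):
    """Return the numeric ID from a /doc/<number>/... URL, or None if absent."""
    match = re.search(r"/doc/(\d+)", url)
    return match.group(1) if match else None

def mobile_urls(doc_id):
    """Build the two mobile URL patterns described in the post."""
    base = "http://www.scribd.com/mobile/documents/" + doc_id + "/"
    return base, base + "download"

url = "http://www.scribd.com/doc/90924585/The-Dark-Monk-Excerpt-by-Oliver-Potzsch"
doc_id = document_id(url)
print(doc_id)
for candidate in mobile_urls(doc_id):
    print(candidate)
```

Whether these URL patterns still respond is a separate question; the sketch only shows where the ID goes.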

Update (2/18/12): For Private IDs:

Example URLs:
http://www.scribd.com/doc/39976170/Anthro-2B-Midterm-Study-Guide
http://www.scribd.com/doc/33840335

Right-click > View Page Source, then save the source to a file on your computer. *Make sure you do this so that you get the entire "Generated Source" (I used Mozilla Firefox).

I saved this as "doc.html". Open the file "doc.html" and search for "page.u". You should see something like "// page.uuid : 3kw800775slntll".

page.uuid = "3kw800775slntll"
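
Searching the saved file by hand can also be done with a short script. This is a minimal sketch, assuming the page source contains a line shaped like the "// page.uuid : ..." comment quoted above and that the value is a run of letters and digits; the function name `find_page_uuid` is hypothetical.

```python
import re

def find_page_uuid(html):
    """Return the value following 'page.uuid', or None if it is not present."""
    # Accept either ':' or '=' as the separator, with an optional opening quote.
    match = re.search(r'page\.uuid\s*[:=]\s*"?([0-9A-Za-z]+)', html)
    return match.group(1) if match else None

sample = "<script>\n// page.uuid : 3kw800775slntll\n</script>"
print(find_page_uuid(sample))
```

For a saved file, read its contents first, for example with `open("doc.html", encoding="utf-8").read()`, and pass the resulting string in.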

Now search for the page.uuid or "3kw800775slntll" and you should see something like: "http://html2.scribdassets.com/3kw800775slntll/pages/542-d2422da938.jsonp"

The URL can range from "html1.scribdassets.com" to "html4.scribdassets.com".

Here is a simple Linux Bash script to download the images:
Make SURE you change the *page.uuid* to your page.uuid.

cat "doc.html" | grep pages | cut -d"/" -f6 | cut -d. -f1 | grep -vi scrib | while read ID; do wget "http://htmlimg1.scribdassets.com/*page.uuid*/images/$ID.jpg"; done;

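Alternatively, once you have found the page.uuid, you can substitute it into the image URL by hand: "http://htmlimg1.scribdassets.com/*page.uuid*/images/$ID.jpg", where $ID is the page name taken from the ".jsonp" URL without its extension (for example "542-d2422da938").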

Update (2/25/12): For Protected (Preview) IDs:

When a document on Scribd is shown as a preview, the owner is trying to sell it and only allows users to view a few select pages. Downloading these would be illegal, and the script above only downloads the images of the "protected" documents.