How to repair a damaged or corrupt PDF not starting with %PDF-

Just this morning I received a PDF from an international supplier that we needed urgently: it was a document required to clear some goods arriving at a port. Because of the time difference, the supplier was asleep when we discovered there was a problem with the file. If you search online there are myriad programs that can fix your PDFs online or offline, but most of them cost money or leave a nasty watermark on your PDF. We needed the original intact, without modification, and I didn’t think I should have to pay $50 for the privilege.



The Error – Acrobat could not open xxxxxx.pdf because it is either not a supported file type or because the file has been damaged…

acrobat-error

So what did I do? These are the steps that worked for me.

Spoiler – My PDF file didn’t start with %PDF-. This will not work if you have a different problem!

1. Open the PDF with Notepad or WordPad

On Windows you can use either Notepad or WordPad to open any file and see its contents. Beware: if the file is large you may be waiting a while! To do this, find the file in Windows Explorer, or save it on your desktop while you are working with it. Right-click it and choose “Open with”, then click “Choose default program”. Make sure you untick ‘Always use the selected program…’ so you don’t associate all your PDFs with a text viewer!

choose-default-program

2. Search for %PDF-

All PDF documents should start with %PDF- to be valid. Sometimes the software creating or sending the PDF puts other data in front of it, but the rest of the PDF is valid and just needs to be modified a bit. After looking at the start of my document I could immediately see the problem. The file included part of the email, in plain text, ahead of the actual PDF data. The %PDF- marker was way down the file after a bunch of other stuff. Apparently Acrobat expects to find it within the first 1024 bytes of the file.
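If you would rather check this with a script than scroll through a text editor, here is a minimal Python sketch. It only reports where the first %PDF- marker sits in the file and changes nothing. The function names (`find_pdf_header`, `describe`) are made up for this example.

```python
from pathlib import Path

PDF_MARKER = b"%PDF-"


def find_pdf_header(data: bytes) -> int:
    """Return the byte offset of the first %PDF- marker, or -1 if absent."""
    return data.find(PDF_MARKER)


def describe(path: str) -> str:
    """Read a file in binary mode and summarise where its PDF header is."""
    offset = find_pdf_header(Path(path).read_bytes())
    if offset == -1:
        return "No %PDF- marker found; this is a different problem."
    if offset == 0:
        return "File already starts with %PDF-; this is a different problem."
    return f"%PDF- found at byte {offset}; {offset} bytes of junk precede it."
```

Calling `describe("xxxxxx.pdf")` on the broken file should tell you whether this guide applies to your problem at all.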

pdf-in-wordpad

search-for-%pdf

Based on this I decided that I would manually remove all the garbage before the %PDF- marker and save the file. To do this properly you have to use a binary-safe program, as all the special characters you see in the PDF are binary data that will be further corrupted if you edit the file with Notepad or WordPad.
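To see why a text editor is risky here, the Python sketch below imitates one way a text round trip can change bytes: decoding the data as Windows-1252 text and encoding it again. This is only an illustration under that assumption, not a description of exactly what Notepad or WordPad do internally, and `text_round_trip` is a hypothetical helper name.

```python
def text_round_trip(data: bytes) -> bytes:
    """Decode bytes as Windows-1252 text, then encode them back.

    Bytes with no Windows-1252 meaning are replaced during decoding,
    so they cannot be recovered when the text is written out again.
    """
    text = data.decode("cp1252", errors="replace")
    return text.encode("cp1252", errors="replace")


plain = b"%PDF-1.4\n"
binary = b"\x81\x8d"  # two byte values that Windows-1252 does not define

print(text_round_trip(plain) == plain)    # plain ASCII survives
print(text_round_trip(binary) == binary)  # these bytes do not survive
```

The readable parts of the file come through untouched, but some binary bytes are silently swapped for question marks, which is the kind of extra damage a binary-safe tool avoids.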

3. Download a Free Hex Editor

A hex editor is a program that allows you to see the contents of the file and change it without corrupting any of the other binary data in the file. We could have gone straight to this point, but if your file started with %PDF- when viewed with Notepad or WordPad, there is no need for a hex editor. There is something else wrong with your PDF!

I downloaded Free Hex Editor Neo, installed it and opened my troublesome PDF.



4. Search for %PDF- and delete all the garbage

Once you have the PDF open you should see a bunch of hexadecimal numbers on the left and a narrow column of characters on the right. From here you can search for %PDF- by pressing Ctrl + F. You should select everything before the % all the way to the top of the file and then press delete. Once you are finished the first character that appears in the right column should be the % in %PDF-. See the series of images below for the steps I took with my PDF.

hex-editor-neo-01

hex-editor-neo-02

hex-editor-neo-03

hex-editor-neo-04
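The same trim can be done with a few lines of Python if you don't want to install a hex editor. This is a minimal sketch, not the method I used above: it reads the file in binary mode, drops everything before the first %PDF-, and writes the result to a new file so the original stays untouched. The names `strip_before_header` and `repair_copy` and the file names are hypothetical.

```python
from pathlib import Path

PDF_MARKER = b"%PDF-"


def strip_before_header(data: bytes) -> bytes:
    """Return data with everything before the first %PDF- removed."""
    offset = data.find(PDF_MARKER)
    if offset == -1:
        raise ValueError("no %PDF- marker found; this fix does not apply")
    return data[offset:]


def repair_copy(src: str, dst: str) -> int:
    """Write a trimmed copy of src to dst; return the number of bytes removed."""
    data = Path(src).read_bytes()        # binary read, no text decoding
    fixed = strip_before_header(data)
    Path(dst).write_bytes(fixed)         # the original file is left as it was
    return len(data) - len(fixed)


if __name__ == "__main__":
    removed = repair_copy("broken.pdf", "fixed.pdf")
    print(f"Removed {removed} bytes before %PDF-")
```

Writing to a separate file means you can always go back to the original if the result still won't open.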

5. Save file and Test Opening it

Save the file in your hex editor, which will write your changes to the file.

Then cross your fingers and open the file with your usual PDF viewing program. If all went well you should see what I saw: a lovely PDF with its data intact.



fixed-pdf