IBM Bluemix with Watson is Awesome

I am always interested in the high availability cloud services offered by Google (Google Cloud), Microsoft (Azure), Amazon (AWS) and IBM (Bluemix). I recently checked all of them out and tested some of their features for a project. I eventually settled on AWS based on pricing, flexibility and performance. But the most interesting feature I came across was on IBM’s Bluemix platform.


There are several ways you can tap into IBM’s AI, Watson, and harness its power in your website or web applications. I was skeptical at first, assuming it would be garbage, but I was wrong. There is something called ‘personality insights’, where you enter a bunch of text written by someone and Watson attempts to tell you about that person’s personality based on the text. I copied and pasted some text from an old university assignment, which was the first thing I could find. It pinpointed a couple of things about me that I would agree with! Try it yourself: console.ng.bluemix.net/catalog/services/personality-insights


Personality profile

 

So I decided to explore more. Watson offers a heap of different services, such as interactive chat / help for users on your website, translation, tone and emotion analysis, and visual recognition tasks. There is a full list available here: www.ibm.com/cloud-computing/bluemix/watson. My problem when I last tested the services was that the API was only accessible from Node.js, and I know nothing about Node.js, but a lot about PHP. After giving up on my second Node.js tutorial (it just wasn’t sinking in) I threw in the towel. But now I see they are offering their services via cURL, which I know PHP supports!


I think this is a fantastic service, and the other major cloud service providers are going to try to offer their own alternatives; some already are. But IBM have a head start, as they have been developing Watson for a while now. I’m sure you’ve read about Microsoft’s recent failure, Tay, who lasted 16 hours before being corrupted into asking people for sex and becoming a fascist: en.wikipedia.org/wiki/Tay_(bot).

How to repair a damaged or corrupt PDF not starting with %PDF-

Just this morning I received a PDF from an international supplier, which we needed urgently to clear some goods arriving at a port. Because of the time difference, the supplier was asleep when we discovered there was a problem with the PDF. If you search online there are a myriad of programs that can fix your PDFs online or offline, but most of them cost money or leave some nasty watermark on your PDF. We needed the original intact without modification, and I didn’t think I should have to pay $50 for the privilege.



The Error – Acrobat could not open xxxxxx.pdf because it is either not a supported file type or because the file has been damaged…

acrobat-error

So what did I do? These are the steps that worked for me.

Spoiler – My PDF file didn’t start with %PDF-. This will not work if you have a different problem!

1. Open the PDF with Notepad or Wordpad

On Windows you can use either Notepad or Wordpad to open any file and see its contents. Beware: if the file is large you may be waiting a while! To do this, find the file in Windows Explorer, or save it on your desktop while you are working with it. Right click it and choose “Open with”, then click on “Choose default program”. Make sure you untick ‘Always use the selected program…’ so you don’t associate all your PDFs with a text viewer!

choose-default-program

2. Search for %PDF-

All PDF documents should start with %PDF- to be valid. Sometimes the software creating the PDF doesn’t do this but the rest of the PDF is valid and just needs to be modified a bit. After looking at the start of my document I could immediately see the problem. The PDF included part of the email inside the PDF file, in plain text. The %PDF- marker was way down the file after a bunch of stuff. Apparently this should be in the first 1024 bytes of the file to be valid.

pdf-in-wordpad

search-for-%pdf

Based on this I decided that I would manually remove all the garbage before the %PDF- marker and save the file. To do this properly you have to use a binary safe program, as all the special characters you see in the PDF are binary data that will be further corrupted if you edit and save the file with Notepad or Wordpad.
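If you would rather not scroll through the file in a text editor, the same check can be scripted. Here is a minimal Python sketch of the idea. The function names are made up for illustration, and the 1024 byte figure is simply the number quoted above, not something I have verified against every PDF reader. It only reads the file, so it cannot make things worse.

```python
from pathlib import Path

PDF_MARKER = b"%PDF-"
# Figure quoted in this post; treat it as an assumption, not a guarantee.
HEADER_WINDOW = 1024


def find_pdf_marker(data: bytes) -> int:
    """Return the byte offset of the first %PDF- marker, or -1 if there is none."""
    return data.find(PDF_MARKER)


def describe_pdf_header(path: str) -> str:
    """Read a file in binary mode and report where its %PDF- marker sits."""
    data = Path(path).read_bytes()
    offset = find_pdf_marker(data)
    if offset == -1:
        return "No %PDF- marker found. This is a different problem."
    if offset == 0:
        return "File starts with %PDF-. The problem is elsewhere."
    where = "inside" if offset < HEADER_WINDOW else "beyond"
    return (
        f"%PDF- found at byte offset {offset}, {where} the first "
        f"{HEADER_WINDOW} bytes. {offset} bytes of other data come before it."
    )


# Example usage (the file name is a placeholder):
# print(describe_pdf_header("xxxxxx.pdf"))
```

If the report says the marker is at offset 0, the steps below will not help you.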

3. Download a Free Hex Editor

A hex editor is a program that allows you to see the contents of a file and change it without corrupting any of the other binary data in the file. We could have gone straight to this point, but if your file started with %PDF- when viewed with Notepad or Wordpad, there is no need for a hex editor. There is something else wrong with your PDF!

I downloaded Free Hex Editor Neo. Installed it and opened my troublesome PDF.



4. Search for %PDF- and delete all the garbage

Once you have the PDF open you should see a bunch of hexadecimal numbers on the left and a narrow column of characters on the right. From here you can search for %PDF- by pressing Ctrl + F. You should select everything before the % all the way to the top of the file and then press delete. Once you are finished the first character that appears in the right column should be the % in %PDF-. See the series of images below for the steps I took with my PDF.

hex-editor-neo-01

hex-editor-neo-02

hex-editor-neo-03

hex-editor-neo-04

5. Save file and Test Opening it

Save the file in your hex editor, which will write your changes to the file.

Then cross your fingers and open the file up with your usual PDF viewing program. If all went well you should see what I saw, a lovely PDF with its data intact.
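For anyone who prefers a script to a hex editor, steps 4 and 5 could be sketched in Python like this. It is not what I actually did (I used Hex Editor Neo), the function names are hypothetical, and it only handles the one problem described in this post: extra data sitting in front of the %PDF- marker. It reads and writes in binary mode and saves to a new file, so the original stays untouched.

```python
from pathlib import Path

PDF_MARKER = b"%PDF-"


def strip_before_marker(data: bytes) -> bytes:
    """Drop every byte that comes before the first %PDF- marker."""
    offset = data.find(PDF_MARKER)
    if offset == -1:
        # Nothing to trim against, so refuse rather than guess.
        raise ValueError("no %PDF- marker found")
    return data[offset:]


def repair_pdf(src: str, dst: str) -> int:
    """Write a trimmed copy of src to dst and return the number of bytes removed."""
    data = Path(src).read_bytes()
    repaired = strip_before_marker(data)
    Path(dst).write_bytes(repaired)
    return len(data) - len(repaired)


# Example usage (both file names are placeholders):
# removed = repair_pdf("xxxxxx.pdf", "xxxxxx-fixed.pdf")
# print(f"Removed {removed} bytes of leading garbage")
```

If it reports 0 bytes removed, your file already started with %PDF- and something else is wrong with it.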



fixed-pdf