Extracting Javascript From PDF, and Replacing Javascript Into PDF

I could not find easy-to-use tools to extract the Javascript code contained in a PDF file as plain text, nor tools to update the Javascript in a PDF without changing anything else, so I wrote two small command-line Java tools using the excellent iText library.

PDF2JS will extract the Javascript from the PDF passed as argument. On Windows, an example way to use it is: “java -jar pdf2js.jar myinput.pdf”, and it will output the Javascript in myinput.pdf.ps.

JS2PDF will replace the Javascript in the input PDF file and write the result into the output PDF file. On Windows, an example way to use it is: “java -jar js2pdf.jar original.pdf javascript.js output.pdf”. The file output.pdf will contain the updated Javascript.

For those interested, the very simple code for each tool is provided within the archives.

Also, if the PDF file you are working with has usage restrictions, make sure you remove them using the excellent qpdf with the following command line: qpdf --decrypt protected.pdf output_unprotected.pdf

Using Eclipse as IDE and Debugger for Python

Having decided to clean up my machine, I had to find a solution for the Python mess. Since I already had Eclipse for Java and C++, I decided to give it a try for Python.

Here are some of the benefits I found:

1. You can reuse the dev environment without additional install
2. Eclipse with PyDev is a unified IDE and debugger
3. It is a familiar environment
4. You can use both Python 2.x and 3.x for different projects in your workspace
5. Download and disk footprint are smaller than the best alternatives
6. It is super easy to install and use

Here are the simple steps to install:

1. Run Eclipse, go to Help menu, and click “Install New Software…”
2. Click the “Add” button, enter “http://pydev.org/updates” in location and click OK
3. Select PyDev, click Next and follow the install steps accepting the licence

Here is how to make a simple Python test project:

1. Go to File menu, click New then PyDev Project
2. Enter your project name, select the grammar version, and interpreter. If you have not set up any yet, just click “Click here to configure an interpreter not listed”, click New, browse to your python.exe directory and click Ok.
3. Right click your project, then New, then Source folder
4. Right click your source folder, then New, then PyDev Module

Porting a VS project to Eclipse CDT (C++)

A few days ago I tested Eclipse C++ (see here) and was actually pleased with the results. Today at the office I needed to work with a C++ pipeline tool and decided to convert it to Eclipse C++ while I was working on it.

Overall it went really smoothly and I like Eclipse C++ more and more. Here are the few things I needed to do.

  1. To edit your project build properties, go into the “Project Explorer” panel on the left, right click your project and select “Properties”. Then unfold “C/C++ Build” and select “Settings”. The good thing is you can edit both debug and release configurations at once to give them identical settings, so in the “Configuration” drop down select “[ All Configurations ]”.
  2. There is no #pragma comment(lib,"mylib.lib") to link with libraries. So make sure you go to “MinGW C++ Linker” then “Libraries”. Add the name of the library without the .a or .lib suffix in “Libraries”, and add the path in “Library search path”. The specific project I worked on used DevIL, the image library, and the linker had no problem linking with the provided libraries. Again, the trick is to remove any suffix from the library name. If you enter Devil.lib into the settings, the linker will complain about not being able to find the library. Change that to Devil (without .lib) and the linker will find it.
  3. If you compile your project with default settings you will most likely need to ship it with libraries like libgcc_s_dw2-1.dll or libstdc++-6.dll, so again in the settings page, go to “MinGW C++ Linker” then “Miscellaneous” and in the linker flags box add -static. This will increase your executable size a bit, but it will make it easier to deploy and overall will be smaller than shipping with all the DLLs needed.

Overall I already made two small projects with Eclipse CDT this week and it went as smoothly as it should, and as smoothly as VS, so the next test will be a project with more Windows / DirectX specific code. But so far, Eclipse CDT seems like a perfectly viable free alternative to VS.

Javascript Performance Playground

Javascript is surprisingly fast in modern browsers, but as with every language you will need to optimize at some point. I found a very useful website called JSPerf, which provides a test framework, lets you create new tests and edit existing ones, and stores results online so you get a wider set of results than if you ran the tests only by yourself.

A simple example. I was wondering what the fastest way to convert a float to an int in Javascript was. I had several options, and the best one was not the one suggested by some colleagues. The way to know which is fastest is to benchmark, not to guess. The results of the benchmark are here; you can test on your machine by running the test. Your results will be stored and shared with everyone, enhancing the global knowledge. The fastest I found is the following. If you find something faster, please let me know.

myInt = myFloat | 0
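
As a small illustration of what the bitwise OR actually does (a sketch, not part of the original benchmark): it converts its operand to a signed 32-bit integer, so it truncates toward zero instead of rounding down, and it is only safe for values that fit in 32 bits. The helper name toInt is made up for this example.

```javascript
// Hypothetical helper wrapping the bitwise-OR trick from the benchmark.
function toInt(myFloat) {
  return myFloat | 0; // converts to a signed 32-bit integer, dropping the fraction
}

console.log(toInt(3.7));          // 3
console.log(toInt(-3.7));         // -3 (truncates toward zero)
console.log(Math.floor(-3.7));    // -4 (rounds toward negative infinity)
console.log(toInt(2147483648.5)); // -2147483648 (wraps outside the 32-bit range)
```

So the trick is a replacement for truncation, not for Math.floor, whenever negative or very large values are possible.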

A second example. What is the fastest typed array to write and read integers? The results are here and sadly there is no fastest answer for all browsers. So if you do not want to complicate your code with browser tests and everything that goes with it, the best solution is to use the following.

var myArray = new Int32Array( size );
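
A short sketch of writing and reading integers with that array type. The size and the values are arbitrary choices for the example. Note that elements start at zero and that stored values are converted to signed 32-bit integers.

```javascript
var size = 4; // arbitrary size for the example
var myArray = new Int32Array(size); // every element starts at 0

// Write integers
for (var i = 0; i < myArray.length; i++) {
  myArray[i] = i * 10;
}

// Read them back
var sum = 0;
for (var j = 0; j < myArray.length; j++) {
  sum += myArray[j];
}
console.log(sum); // 60

// Non-integer values lose their fractional part when stored
myArray[0] = 3.9;
console.log(myArray[0]); // 3
```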

A third interesting example is about the fastest way to clear a buffer and copy it to a canvas every frame. The results are here. The fastest way is to create a working buffer outside your frame loop, create another one initialized with the values you want, and copy the second into the first every frame using the set method.

At initialization time

var external_data_1024 = ctx.createImageData(1024, 1024);
var external_source_1024 = ctx.createImageData(1024, 1024);

During the frame processing

external_data_1024.data.set(external_source_1024.data);
ctx.putImageData(external_data_1024, 0, 0);
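
The same clear-by-copy idea can be tried outside a browser. This is a minimal sketch that uses plain Uint8ClampedArray buffers as stand-ins for the data arrays of the two ImageData objects (an assumption made so the snippet runs without a canvas). The names workingBuffer, clearedSource, and clearFrame are made up for the example.

```javascript
var width = 4, height = 4;           // tiny "canvas" for the example
var byteCount = width * height * 4;  // 4 bytes per pixel: R, G, B, A

// At initialization time: a working buffer and a pre-filled source.
var workingBuffer = new Uint8ClampedArray(byteCount);
var clearedSource = new Uint8ClampedArray(byteCount);
for (var i = 3; i < byteCount; i += 4) {
  clearedSource[i] = 255; // opaque black: RGB stay 0, alpha is 255
}

// During the frame processing: one bulk copy instead of a per-element loop.
function clearFrame() {
  workingBuffer.set(clearedSource);
}

workingBuffer[0] = 200; // pretend the previous frame drew something
clearFrame();
console.log(workingBuffer[0], workingBuffer[3]); // 0 255
```

In a browser, the working buffer would be the data property of the ImageData object that is then passed to putImageData.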

Finally, a word of caution: make sure you benchmark the right things.

  1. The initial float to int benchmark did not include the two solutions I added, |0 and >>0, so having something that looks fast does not mean you have found the optimal solution.
  2. The original array speed test included array creation for the write, so the test was measuring creation+write instead of just write, and the results were very different. So make sure to measure what really matters to you.
  3. The last example was a subtle example of points #1 and #2 above. The initial “fastest” function included unnecessary code in clear_canvas(), and a new test without that code proved to be faster. The loop test was looping with a double indirection, “data.data[i] = 0;”. Replacing it with “buffer[i] = 0;” removed that indirection, made the loop much faster, and measured the real filling time instead of the sum of filling time and overhead.
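
To make point #2 concrete, here is a rough sketch of a hand-rolled timing helper that keeps setup out of the measured region. It is not how JSPerf works internally. The helper name timeIt, the buffer size, and the run count are all assumptions for the example, and Date.now() is used only because it is available everywhere.

```javascript
// Hypothetical helper: runs setup() untimed, then times only fn(state).
function timeIt(setup, fn, runs) {
  var state = setup();
  var start = Date.now();
  for (var r = 0; r < runs; r++) {
    fn(state);
  }
  return { elapsedMs: Date.now() - start, state: state };
}

var size = 100000; // arbitrary buffer size

// Measures write only: the array is created in setup, outside the timer.
var writeOnly = timeIt(
  function () { return new Int32Array(size); },
  function (buffer) {
    for (var i = 0; i < buffer.length; i++) { buffer[i] = i; }
  },
  50
);

// Measures creation + write: a different question with a different answer.
var createAndWrite = timeIt(
  function () { return null; },
  function () {
    var buffer = new Int32Array(size);
    for (var i = 0; i < buffer.length; i++) { buffer[i] = i; }
  },
  50
);

console.log("write only:", writeOnly.elapsedMs, "ms");
console.log("create + write:", createAndWrite.elapsedMs, "ms");
```

The absolute numbers will vary by machine and browser. The point is only that the two tests answer different questions.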

Testing the performance of Javascript

I had an interesting conversation last Sunday with a friend about the different technologies available on the web, and we talked about benchmarking them with simple things. So I decided to give Javascript a try, and here is what I came up with. You can get the source code here.

To give credit where it is due, I looked at the code of this sample to get a quick start and learn the basics. I will share my learnings in a future blog post.

Trying Eclipse for C++

I have been doing a lot of Java and Flash programming recently, and both development environments have been relatively enjoyable with the Eclipse framework, so I decided to give it a try for C++. Yes, I know there are other alternatives for C++ like Netbeans, Codeblocks, Codelite, and even MonoDevelop now. But only having to be familiar with a single GUI is actually quite enjoyable.

  1. First download and install MinGW; it will be your compiler. Untick everything except C compiler and C++ compiler.
  2. Get the latest Eclipse CDT; it is the Eclipse environment customized for C++.
  3. Launch Eclipse, open the “File” menu, then “New”, then “C++ project”
  4. Expand the Executable box (click on the left of it, sometimes the triangle does not show up on Windows 7). Select “Hello World C++ Project”, put a name, select “MinGW GCC”, and click Finish.
  5. Go to the “Project” menu and select “Build Project”.
  6. You should now be able to run the project and see “Hello world” in the console.

This all seems good, but when you try the simplest things like printf, the environment will have many issues, like text not showing up. If you google for it you will find pseudo solutions recommending setvbuf( stdout, NULL, _IONBF, 0 ); but this makes things worse and creates a whole lot of issues. The best is really to emulate the VS behavior and to have a separate console if you need one.

  1. Create gdbinit.txt in C:\MinGW\bin and put “set new-console” without the quotes in it.
  2. Go to the “Run” menu, then the “Debugger” tab, put “C:\MinGW\bin\gdbinit.txt” without the quotes in “GDB Command Line”, and click “Apply”.

Now your printf calls will output to the external console as they do in VS, and OutputDebugString will output text to the Eclipse console.

Overall, although the printf issue is kind of bad, in the end Eclipse seems like a good environment. I like things like CTRL+SHIFT+N to automatically add a missing include file. Next time I need to write a small C++ tool, I will give it a more serious try.