Monday, September 19, 2011

How to Batch Convert Documents at the Command Line

There comes a time when you need to convert one or more files to another format. Say, for example, you have a bunch of .rtf files that you want to turn into OpenDocument files. Chances are you don’t need to do that too often, but when the time comes, opening each file in a word processor and saving it can be a real chore.

If you use OpenOffice.org or LibreOffice, then you can save a lot of time by letting a command line utility called JODConverter do the work for you. Don’t let the fact that it’s a command line tool scare you. JODConverter is easy to use and fast. It’s also very effective.

Let’s take a look at how to use it.

Getting Started

Before you go and grab a copy of JODConverter, you’ll need OpenOffice.org or LibreOffice and the Java Runtime Environment for your operating system installed on your computer. Because it’s written in Java, JODConverter runs on Linux, Mac OS, and Windows.

Once you’ve made sure all of that is installed, download a copy of JODConverter. There are two versions available: 2.x and 3.0. The main difference between the two versions is that version 3.0 supports both LibreOffice and OpenOffice.org. It also seems, from my unscientific observations, to be a bit faster than version 2.x.

Extract the contents of the archive you downloaded to somewhere on your hard drive. For example, if you’re using Linux and downloaded version 3.0 of JODConverter, you can extract the archive to the directory /opt. JODConverter will be installed in /opt/jodconverter-core-3.0-beta-4.

Supported Conversions

JODConverter is very flexible. It can convert between the following formats:

  • Word (.doc) to or from OpenDocument Text (.odt)
  • Excel (.xls) to or from OpenDocument Spreadsheet (.ods)
  • PowerPoint (.ppt) to or from OpenDocument Presentation (.odp)

The utility can also convert RTF, WordPerfect, and older OpenOffice.org files, and it can even convert all supported formats to PDF.

Doing a Basic Conversion

Now that everything is installed, you’re ready to go. The first thing you need to do is start OpenOffice.org in the background. To do that, open a terminal window and then type the following command:

soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" &

Stay in the terminal window and change to the directory containing the files that you want to convert. Then, run the following command:



java -jar [path to JODConverter]/lib/[.jar file] source_file output_file

The path and the name of the .jar file will be different depending on the version of JODConverter that you’re using. If, for example, you’re using version 3.0, here’s a sample path:



java -jar /opt/jodconverter-core-3.0-beta-4/lib/jodconverter-core-3.0-beta-4.jar source_file output_file

Let’s say you have a Word file named Secure_Email_Report.doc that you want to convert to OpenDocument format. Just run the following command:



java -jar [path to JODConverter]/lib/[.jar file] Secure_Email_Report.doc Secure_Email_Report.odt

Doing Batch Conversions


Using JODConverter to convert a single file is a waste. You can just as easily do the deed in OpenOffice.org or LibreOffice. But converting large numbers of files is where JODConverter shines. And that’s easy to do.


How? Just run this command:



java -jar [path to JODConverter]/lib/[.jar file] *.input_type -o output_type

In the command above, input_type is the extension of the files that you want to convert and output_type is the extension of the target format. For example:



java -jar [path to JODConverter]/lib/[.jar file] *.doc -o pdf

This converts all Word files (*.doc) in a directory to PDF files. All you need to do is substitute the extensions of the file types that you want to convert from and to.
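After a big batch run, it can be handy to confirm that every input file actually got a converted counterpart. Here’s a minimal sketch of one way to check. The function name missing_outputs is my own invention, and it assumes the converted files land next to the originals with the same base name, so treat it as a starting point rather than anything that ships with JODConverter.

```shell
# Hypothetical helper: prints every input file that has no converted counterpart.
# Assumes outputs sit next to the inputs and share the same base name.
missing_outputs() {
    in_ext="$1"
    out_ext="$2"
    for src in *."$in_ext"; do
        # Skip the literal pattern when no files match at all.
        [ -e "$src" ] || continue
        # Swap the extension and report the input if the target is absent.
        [ -e "${src%.*}.$out_ext" ] || echo "$src"
    done
}
```

Running missing_outputs doc pdf in the directory you just converted lists any .doc files that still lack a .pdf. No output means nothing was skipped.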


Scripting the Conversion


Typing that long string at the command line can be a chore. And if you only use JODConverter infrequently, it’s easy to forget the path and name of the .jar file that you need to include in the command.


To get around that problem, you can write a script or a batch file. If you’re running Linux, for example, here’s a good tutorial on shell scripting.
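As a rough idea of what such a script might look like, here’s a small sketch of a shell function that remembers the path for you. The names jodconvert, JOD_JAR, and JAVA_CMD are made up for illustration, and the default path simply reuses the version 3.0 example from earlier, so change it to match your own installation.

```shell
# Hypothetical wrapper: adjust JOD_JAR to match where you extracted JODConverter.
JOD_JAR="${JOD_JAR:-/opt/jodconverter-core-3.0-beta-4/lib/jodconverter-core-3.0-beta-4.jar}"
# JAVA_CMD can be overridden (for example with "echo") to preview the command.
JAVA_CMD="${JAVA_CMD:-java}"

jodconvert() {
    # Both the single-file and batch forms need at least two arguments.
    if [ "$#" -lt 2 ]; then
        echo "usage: jodconvert source_file output_file" >&2
        return 1
    fi
    # Pass every argument through to JODConverter unchanged.
    "$JAVA_CMD" -jar "$JOD_JAR" "$@"
}
```

Put that in a file that your shell loads at startup and the earlier example shrinks to jodconvert Secure_Email_Report.doc Secure_Email_Report.odt. Because the arguments are passed through untouched, jodconvert *.doc -o pdf should work for batches too.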


But you don’t even need to do that. If you have Python (a popular scripting language and interpreter) installed on your computer, you can download and use a script named DocumentConverter.py.


To use the script, just type:



python DocumentConverter.py input_file output_file

The script doesn’t seem to work with batch conversion, though.
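One way around that is to let the shell do the looping and call the script once per file. Here’s a minimal sketch. The function name convert_each and the PYTHON_CMD variable are my own, and it assumes that DocumentConverter.py sits in the current directory and that the office suite is already listening in the background.

```shell
# Hypothetical helper: calls DocumentConverter.py once for each matching file.
# Assumes DocumentConverter.py is in the current directory.
PYTHON_CMD="${PYTHON_CMD:-python}"

convert_each() {
    in_ext="$1"
    out_ext="$2"
    for src in *."$in_ext"; do
        # Skip the literal pattern when no files match.
        [ -e "$src" ] || continue
        # Strip the old extension and append the new one for the output name.
        "$PYTHON_CMD" DocumentConverter.py "$src" "${src%.*}.$out_ext"
    done
}
```

For example, convert_each rtf odt runs the script on every .rtf file in the current directory. It starts Python once per file, so expect it to be slower than JODConverter’s own batch mode.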


Final Thoughts


If you’re looking for a way to convert multiple documents to another format, then JODConverter is definitely worth a look. It’s fast, it’s efficient, and it’s flexible. On top of that, it’s fairly easy to use. Even if you only convert documents once in a blue moon, JODConverter is a good addition to your toolkit.
