Straighten text in scanned documents with Deskew


Deskew is an open source package which can detect skewed text in scanned text documents, and output a straightened version. It’s a command line tool, which is, well, inconvenient, but don’t let that put you off entirely -- it’s still probably easier than you’d expect.

The program doesn’t force you to install Tesseract or any other bulky components, for instance. The single 4MB download includes Windows, OS X and Linux binaries, and you can run any of them right away.

You don’t have to find or generate any sample scans, either, because there are several provided.

Deskew also has a RunTests.bat to pass these sample files through the program, and show you how well it works.

Having that batch file also gives you practical examples of how Deskew can be used. Here’s the first example.

deskew -t a -a 5 -o TestOut/Outa1.tif ../TestImages/1.tif

The -t switch sets a threshold value; here it’s "a" for automatic, so you don’t need to pick a number yourself. The -a switch sets the maximal skew angle to consider, 5 degrees in this case, which again can probably be ignored most of the time. The -o switch names the output file, and the final argument is the input file.
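If you’d rather drive Deskew from a script than type that line by hand, the same call can be assembled in Python. This is only a sketch: the function names build_deskew_command and run_deskew are made up for illustration, it assumes the deskew binary is on your PATH, and it uses only the three switches shown in the example above.

```python
import subprocess


def build_deskew_command(input_path, output_path, max_angle=5, threshold="a", exe="deskew"):
    """Return the argument list for one Deskew run (hypothetical helper).

    Mirrors the RunTests.bat example: deskew -t a -a 5 -o <output> <input>
    """
    return [
        exe,
        "-t", str(threshold),    # threshold value; "a" means automatic
        "-a", str(max_angle),    # maximal skew angle to consider, in degrees
        "-o", str(output_path),  # output file name
        str(input_path),         # input file name comes last
    ]


def run_deskew(input_path, output_path, **options):
    """Run Deskew once; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_deskew_command(input_path, output_path, **options), check=True)
```

Calling run_deskew("../TestImages/1.tif", "TestOut/Outa1.tif") should then behave like the batch file line, provided the TestOut folder already exists.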

If you’re happy building batch files, that probably won’t take long to master, and wide file format support -- BMP, JPG, PNG, GIF, TGA, PSD, TIF, more -- means your scripts will work on just about anything.
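As a rough idea of what such a script might look like, here’s a Python sketch that straightens every supported image in a folder. The names plan_batch and run_batch are invented for this example, the extension list simply follows the formats named above, and it assumes deskew is on your PATH and that writing each result under its original file name suits you.

```python
from pathlib import Path
import subprocess

# Extensions based on the formats named in the article; adjust to taste.
EXTENSIONS = {".bmp", ".jpg", ".png", ".gif", ".tga", ".psd", ".tif"}


def plan_batch(input_dir, output_dir, exe="deskew"):
    """Return one Deskew argument list per supported image in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for src in sorted(input_dir.iterdir()):
        # Skip folders and any file whose extension isn't in the list.
        if src.is_file() and src.suffix.lower() in EXTENSIONS:
            dst = output_dir / src.name  # keep the original file name
            jobs.append([exe, "-t", "a", "-a", "5", "-o", str(dst), str(src)])
    return jobs


def run_batch(input_dir, output_dir):
    """Create the output folder, then run Deskew on each planned job."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in plan_batch(input_dir, output_dir):
        subprocess.run(cmd, check=True)  # stop at the first failure
```

Splitting the planning from the running means you can print the commands first and check them before anything touches your scans.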

Deskew still isn’t convenient to use, but it worked well in our tests, and if you need a document straightening function then it’s worth a closer look.

Deskew is available now for Windows, Linux and Mac.


© 1998-2024 BetaNews, Inc. All Rights Reserved.