
Click2Scan Ltd

The best way to batch convert a folder of TIFF or PDF files to searchable PDF

Searchable PDF is fast becoming the preferred format for reproducing scanned pages. A standard TIFF or PDF file is in effect a photograph of the page. A searchable PDF adds a text layer beneath the page image, which can be searched to help you jump to a keyword or phrase.

It is possible to create searchable PDFs as you scan, and Canon Capture Perfect is a great program for this. However, this becomes more difficult when you are scanning at speeds over 50 pages per minute: the PC driving the scanner will struggle to keep up and will slow down the scanning process. The answer is to batch convert your scanned files as a post-scanning operation.

Below we review the best ways to take a folder of scanned images and convert them to searchable PDF. This can be a slow process because each scanned page has to be read by OCR (optical character recognition). We have tried a number of applications and tested their reliability. Our findings are as follows:

 

 

Desktop Programs - £100 +

 

Adobe Acrobat Professional - £170 approx
 

The professional version of Acrobat has the ability to Batch Process files. Not many people realise that you can quickly create a bespoke processing job to convert a whole folder of files to searchable PDF. This works for both PC & Mac.

What we like about Acrobat OCR & PDF processing is: 

1) Reliability - gets on with the job and can be left unattended. Errors are handled and the process continues. 

2) Visual - you can see how you are progressing and the results at the end show any errors experienced. 

3) Accurate - the OCR engine is accurate and text is read well. 

4) Page Orientation - Acrobat very cleverly rotates pages to the correct orientation. If you are processing as part of a post-scanning job, this tidies up your output: the person scanning doesn't have to spend time rotating pages afterwards, and you don't need to add unnecessary overhead to the scanner driver.

 

 

eCopy Paperworks - £100 approx

eCopy Paperworks is a neat application used in enterprise environments to read and annotate PDF files. A little-known feature, found under the File menu, is a "Convert" program that converts TIFF or PDF files to searchable PDF. eCopy has recently been taken over by Nuance, and the recognition engine has changed from IRIS to Nuance.

What we like about eCopy Paperworks is:

1) A whole folder structure, including sub folders, can be read and the structure replicated for the sPDF version. This works really well if you need a complete carbon copy of a folder and all its sub folders.

2) Easy to set up - point to a source folder, select a target path for the converted files, and tick whether you want the original files deleted.
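
For readers who script this kind of job themselves, the folder-mirroring workflow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how eCopy Paperworks works internally: the function names `plan_conversions` and `convert_folder` are our own, and the `ocr` argument is a hypothetical stand-in for whichever OCR tool you actually call. The sketch assumes the target folder sits outside the source folder.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# File types treated as scanned input (an assumption for this sketch).
SCAN_SUFFIXES = {".tif", ".tiff", ".pdf"}


def plan_conversions(source: Path, target: Path) -> List[Tuple[Path, Path]]:
    """Pair each scanned file under source with a mirrored .pdf path under target."""
    jobs = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.suffix.lower() in SCAN_SUFFIXES:
            relative = path.relative_to(source)
            jobs.append((path, (target / relative).with_suffix(".pdf")))
    return jobs


def convert_folder(
    source: Path,
    target: Path,
    ocr: Callable[[Path, Path], None],  # hypothetical: ocr(input_file, output_pdf)
    delete_originals: bool = False,
) -> List[Tuple[Path, str]]:
    """Convert every scanned file, keep going on errors, and return the failures."""
    failures = []
    for src, dst in plan_conversions(source, target):
        # Recreate the sub folder in the target before writing into it.
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            ocr(src, dst)
        except Exception as exc:
            # Record the problem and carry on, so the job can run unattended.
            failures.append((src, str(exc)))
            continue
        if delete_originals:
            src.unlink()
    return failures
```

Note that a TIFF and a PDF with the same base name in the same folder would map to the same output file in this sketch, so a real job would need a naming rule for that case.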

 

 

Abbyy FineReader Corporate Edition - £300 +

This edition of the award-winning FineReader has a batch conversion facility classified under "Scheduled Jobs". From here you can select a folder and have its contents converted. It is not as easy to use as we would like, and there is no flexibility for sub folders.

 

 

Server Programs

 

Abbyy Recognition Server


This server product is a robust document server program, designed to handle significant volumes of information and to process 24/7. Whether you are creating searchable PDFs or extracting barcode values into an XML file, Abbyy will handle your volumes in an efficient and reliable way.

Pros: Robust & Reliable. Use of multiple cores.

Cons: Complex pricing structure based on volumes.


 

Iris Document Server 9

The program is relatively easy to set up and use. It operates by setting up "jobs", which can either monitor hot folders or process a specific folder. There is a wealth of options here for the user, with various ways of receiving the extracted data: Word, Excel, XML, sPDF etc. You can also use this program to separate your documents using barcodes or blank pages, and for the purpose of creating simple searchable PDF it will reproduce a folder structure of identical images.
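
To illustrate what monitoring a hot folder means in practice, here is a minimal polling sketch in Python. It is not taken from Iris Document Server and says nothing about how that product is implemented: `poll_once`, `watch`, and the `handle` callback are hypothetical names of our own, and the five-second interval is an arbitrary choice.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def poll_once(hot_folder: Path, seen: Set[Path]) -> List[Path]:
    """Return files that have appeared in hot_folder since the previous poll."""
    new_files = [
        p for p in sorted(hot_folder.iterdir())
        if p.is_file() and p not in seen
    ]
    seen.update(new_files)
    return new_files


def watch(
    hot_folder: Path,
    handle: Callable[[Path], None],  # hypothetical: what to do with each new file
    interval_seconds: float = 5.0,
    max_polls: Optional[int] = None,  # None means run until interrupted
) -> None:
    """Poll the folder at a fixed interval and hand each new file to handle."""
    seen: Set[Path] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in poll_once(hot_folder, seen):
            handle(path)
        polls += 1
        time.sleep(interval_seconds)
```

A real job would also need to check that the scanner has finished writing a file before picking it up, which this sketch does not attempt.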


 

Maestro Recognition Server 

Utilising the Nuance recognition engine, this solution works in the same way as the other two server products above and is again a robust and reliable solution for bulk conversion of scanned documents.

It is very simple to set up, and the recognition process is user friendly, with an ongoing log file showing you what is being recognised and the progress of the task.

This solution has a flexible pricing model based on volume and is well worth considering for those with significant volumes. 
