Thursday, August 2, 2007

Scanning Documents to Word

Ginnie Gessford (Tutoring/LRC) asked for assistance scanning a 20-page document into Microsoft Word so she could edit it.

In LR 110 we have just the right tool for this type of task. The Epson GT-2500 document scanner will scan single-sided pages into an Adobe Acrobat (PDF) document, a TIFF or a JPEG. The image formats can be helpful, but the real power is PDF. We purchased this scanner because it is network-aware, so any computer in our lab can access it.

Our first try was to use the Epson Scanner software on a Windows computer. The scanner signaled that it understood the request, but the pages did not feed into the scan bed (I'll have to investigate further). Our next try was to use Adobe Acrobat Pro v8 and the Create a PDF > From Scanner option. This time a dialog box asked us to select the device. TWAIN did not work ... but Epson GT2500 did. The other options in the dialog box were left at their original settings and ... voilà! The scanner responded by feeding pages.

After the last page, Acrobat began automatically processing the IMAGE into LIVE TEXT by OCR (optical character recognition). Now, how does this Acrobat (PDF) document get to be a Word file? Under the Acrobat > File > Export menu are three options of interest: 1) Microsoft Word, 2) Rich Text, 3) Text > Plain Text.

The Word export created a faithful rendition of the pages, except that the text sits in individual boxes on the page. What a pain to edit. The Rich Text version did the same. The best option for words that flow into paragraphs and can easily be selected is Plain Text. With the plain text in Word, it is easy to select text, set paragraph styles and format text as needed.
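If a Plain Text export ever comes out with a hard line break at the end of every scanned line, a few lines of Python can tidy it up before it goes into Word. This is only a rough sketch, not something we used that day. It assumes that blank lines separate paragraphs in the export, and the function name reflow is made up for illustration.

```python
import re


def reflow(text):
    """Join hard-wrapped lines into paragraphs.

    Assumption: one or more blank lines separate paragraphs,
    and every other line break is just a leftover from the scan.
    """
    # Normalize Windows line endings so the split below works.
    text = text.replace("\r\n", "\n")
    # Split into paragraph blocks on blank lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Inside each block, collapse line breaks and extra spaces to one space.
    return "\n\n".join(" ".join(block.split()) for block in blocks)


# Made-up sample standing in for an exported page.
sample = "Scanning is\r\neasy.\r\n\r\nSecond  para\r\nhere.\r\n"
print(reflow(sample))
```

The result can be pasted into Word, where each paragraph arrives as a single block ready for styles. The sketch does not try to rejoin words that were hyphenated across lines, so those would still need fixing by hand.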

Our experimentation, learning and end result took less than 20 minutes. Ginnie opened Outlook Web Access and emailed herself the final Word document as an attachment.