THE SQL Server Blog Spot on the Web

John Paul Cook

Free OCR

Today I saw text on the screen that I needed as text, but I couldn’t select it. OCR to the rescue! If you have OneNote, there’s an easy way to do OCR without any extra software. There are also free websites that can convert screen captures into text.

But wait, there’s more! You can make the text in the image searchable in OneNote.

Let’s begin with this sample screen capture. Once it is in your copy buffer, paste it into OneNote.

Figure 1. Screen capture of text.

After pasting the image into OneNote, right-click the image and select Copy Text from Picture.

Figure 2. Invoking OneNote’s OCR capability with a right-click. Notice the menu option to make the text searchable.

When you paste the text into Word, you’ll notice that it doesn’t have any formatting. Although the text in the Word document below says “This is not text”, it really is text. This is yet another example of what you observe empirically not matching the truth. What appears below really was text that was pasted into Word.

Figure 3. Screen capture converted to raw text pasted into Word.

If you select the OneNote option to Make Text in Image Searchable, you can add some really useful functionality to your document. Notice that OneNote’s search starts finding matches before you finish typing your search string. If you are going to rely on the search feature in OneNote and it doesn’t work quite as you expect, extract the text and see if OneNote accurately converted it. Sometimes the image can’t be accurately converted to text, which can explain suboptimal search results.

Figure 4. Searching text embedded within an image in OneNote.
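
If you want to put a number on how accurately a conversion went, here is a minimal sketch in Python. It assumes you have the expected text and the text OneNote extracted as two strings. The function name ocr_similarity and the misread sample are hypothetical, made up for illustration. It uses difflib from the Python standard library and is not anything OneNote does internally.

```python
import difflib


def normalize(text):
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def ocr_similarity(expected, extracted):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(extracted))
    return matcher.ratio()


expected = "This is not text"
extracted = "Th1s is not text"  # hypothetical OCR misread of "i" as "1"

score = ocr_similarity(expected, extracted)

# A plain substring search for "this" misses the misread word,
# which is how a bad conversion turns into a failed search.
found = "this" in normalize(extracted)
```

A score close to 1.0 with a failed search, as in this sample, suggests that a single misread character is the culprit rather than a wholesale conversion failure.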

There are limitations to what OneNote OCR can do. Light text against a dark background may result in no text being recognized at all. Look at the following example.

Figure 5. Light text and dark background causes difficulties for OCR in OneNote.
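
One way to guess ahead of time whether a capture is the light-on-dark kind is to look at its average brightness. The following is only a sketch under assumptions: the image is represented as a plain grid of 8-bit RGB tuples, the names mean_luma and looks_dark are hypothetical, and the threshold of 128 is an arbitrary midpoint, not a value taken from OneNote. The weights are the standard Rec. 601 luma coefficients.

```python
def mean_luma(pixels):
    """Average Rec. 601 luma (0 to 255) over a row-major grid of RGB tuples."""
    total = 0.0
    count = 0
    for row in pixels:
        for r, g, b in row:
            total += 0.299 * r + 0.587 * g + 0.114 * b
            count += 1
    return total / count


def looks_dark(pixels, threshold=128):
    """Heuristic: a mostly dark image has a mean luma below the threshold."""
    return mean_luma(pixels) < threshold


# Hypothetical tiny captures: two background pixels and one text pixel each.
dark_capture = [[(0, 0, 0), (0, 0, 0), (255, 255, 255)]]
light_capture = [[(255, 255, 255), (255, 255, 255), (0, 0, 0)]]

dark_flag = looks_dark(dark_capture)
light_flag = looks_dark(light_capture)
```

This assumes the background covers more of the image than the text does, which is usually true of a screen capture but is not guaranteed.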

There is a workaround that enables OCR to work. Change the color scheme, which can be done easily with Paint. Paste the screen capture into Paint and right-click to bring up the action menu, which ends with Invert color. This transforms light to dark and dark to light. Copy the inverted image and paste it into OneNote to do the conversion.

Figure 6. Inverting colors to enable OneNote to perform OCR on light text with a dark background.
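
For the curious, color inversion itself is simple arithmetic. The sketch below shows the general idea for 8-bit RGB pixels, where each channel value v becomes 255 - v. It assumes the image is a plain grid of RGB tuples, and the function names are hypothetical. I am not claiming this is how Paint implements Invert color, only that it produces the same light-to-dark swap.

```python
def invert_pixel(rgb):
    """Invert one 8-bit RGB pixel: each channel v becomes 255 - v."""
    r, g, b = rgb
    return (255 - r, 255 - g, 255 - b)


def invert_image(pixels):
    """Invert every pixel in a row-major grid of RGB tuples."""
    return [[invert_pixel(p) for p in row] for row in pixels]


# Hypothetical 1x2 capture: one white text pixel next to one black background pixel.
capture = [[(255, 255, 255), (0, 0, 0)]]
inverted = invert_image(capture)

# Inverting twice gives back the original image.
restored = invert_image(inverted)
```

Because inverting twice restores the original, nothing is lost by inverting a capture just for the OCR step.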


On the Insert tab, OneNote has a built-in screen clipping tool that makes it easy to take screen captures without leaving OneNote.

Figure 7. Screen Clipping feature in OneNote.

If you don’t have OneNote, there are several websites that will convert image files into text. I like http://www.free-ocr.com/ for converting images when I don’t have access to OneNote. You should not upload any sensitive or confidential screen captures to any website offering OCR services.

Published Friday, March 08, 2013 8:58 PM by John Paul Cook

Comments

Sunil said:

Thank you, John. I got a chance to learn about a hidden feature provided by Microsoft (OneNote) that will be very useful at work.

March 11, 2013 11:10 AM

About John Paul Cook

John Paul Cook is both a Registered Nurse and a Microsoft SQL Server MVP experienced in Microsoft SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in the oil and gas, financial, manufacturing, and healthcare industries. Experienced in systems integration and workflow analysis, John is passionate about combining his IT experience with his nursing background to solve difficult problems in healthcare. He sees opportunities in using business intelligence and Big Data to satisfy healthcare meaningful use requirements and improve patient outcomes. John graduated from Vanderbilt University with a Master of Science in Nursing Informatics and is an active member of the Sigma Theta Tau nursing honor society. He is a contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2.
