How to Extract Text from an Image or PDF File

Have you ever wondered if there was a way to extract text from an image or an ebook?  When would someone actually need to do this?

Well, there are many different reasons, but I mainly use it when I’m taking notes. If you’ve noticed, some ebooks won’t let you copy and paste text from them because of their copy protection. This can be really annoying when you’re simply trying to take proper notes from an ebook that you are reading. The same goes for checklists or other study materials that are actually images.

I personally like to take notes on all the study material I purchase, whether it be ebooks, video courses, or anything else. In this post, I want to share the methods I use to extract text from an image or PDF file. Now, I am sure there are many ways to do this, probably much better than mine, but the approach I’ve found works really well for me and doesn’t take very long, so I figured I’d share it with you.


First Step: Screenshot

Before you can extract text, you’ll need to grab a screenshot of the text (and ONLY the text), without any images in the background or elsewhere. This works best when the text is on a white background. To take a screenshot, you can press the “Print Screen” button on your keyboard, open up a graphics editor like MS Paint, and paste it in there (Edit > Paste). Then you can just crop off what you don’t need and save the image when you’re finished. I’ve used that method for YEARS, but it’s a pain in the butt, so I’m going to show you a much faster way as well.

I use a program called “Snagit,” made by TechSmith (the same people who make Camtasia), and it makes everything much easier. It does cost around $50 though, so if you can’t afford it, just go with a free option. However, I highly recommend it, as it’s made my life a lot easier! They have versions for both Windows and Mac here: Snagit Software. And yes, they do offer a free trial ;) With Snagit, all you have to do is press the Print Screen button and highlight your text, and it will save the image for you instantly!

There are a few free screen capture plugins for Firefox as well, and you can check them out here: Firefox Screen Capture Add-ons. (I personally only use Snagit, so you’ll have to test some out yourself. If you find one that works, please share it in the comments below for others to use!)

Second Step: Extract the Text!

Now, this is the part that took me forever to figure out.  Snagit, as much as I recommend it, has a feature that is supposed to do this, but I can’t seem to get it to work.   Luckily, I found two really cool websites that help extract text from images.

The first one is: http://www.free-ocr.com/ .  This site is really easy to use, but I have about a 25% success rate with extracting text.   It’s good to have more than one site just in case one of them is giving you problems.

The second site I use seems to be more reliable, and that website is: http://www.onlineocr.net/default.aspx .

This is the one I use the most. The text that it spits out is almost spot on, but you will likely have to fix a few words that it didn’t read properly. You will notice that when one site doesn’t work, the other one probably will. Sometimes neither will work (pretty rare for me), but then I just type out what was missed by hand. No big deal.
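For anyone who would rather script the “try one, then the other, then type it by hand” routine, here is a minimal sketch in Python. It only shows the fallback logic. The names extract_text, flaky_engine, and reliable_engine are made up for illustration, and the two stand-in engines do not talk to either website above; a real engine would be whatever function you write to send an image to an OCR service or library and return its text.

```python
from typing import Callable, Iterable, Optional

# Assumption: an "engine" is any function that takes an image path
# and returns the recognized text as a string.
OcrEngine = Callable[[str], str]


def extract_text(image_path: str, engines: Iterable[OcrEngine]) -> Optional[str]:
    """Try each OCR engine in order and return the first non-empty result."""
    for engine in engines:
        try:
            text = engine(image_path).strip()
        except Exception:
            # This engine choked on the image, so move on to the next one.
            continue
        if text:
            return text
    # Nothing worked: the caller falls back to typing the text by hand.
    return None


# Stand-in engines for illustration only (hypothetical, not real services).
def flaky_engine(image_path: str) -> str:
    raise RuntimeError("could not read " + image_path)


def reliable_engine(image_path: str) -> str:
    return "Text recognized from " + image_path


if __name__ == "__main__":
    result = extract_text("notes.png", [flaky_engine, reliable_engine])
    print(result if result is not None else "No luck, type it out by hand.")
```

Listing the engines in order of preference keeps your favorite one first while still giving you a backup when it has a bad day.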

That’s really all there is to it.  I know it’s simple, but I also know that at one time, I didn’t have a clue how to do this, so I hope this will help some of you out!

Leave me a comment and let me know what you think, or if you have another solution you’d like to share with the rest of us.

About Ben C.

Ben is the co-founder of ProfitBlog.com and a regular writer and contributor to the site. He has been involved in internet marketing since 2005, and has created many profitable websites in several different niche markets. While many of his sites are still consistently making money (and a few have been sold off for a nice payday), he's recently grown a passion for sharing his knowledge of online marketing with others through his various training courses, as well as through blogging.


15 thoughts on “How to Extract Text from an Image or PDF File”

  1. Rojish

    Thanks for sharing, this is something really useful in my daily life :) I have the habit of taking photos of important notes using my mobile phone, the ability to extract text from it will make my tasks easier.

  2. Ileane

    Hi Ben, I like these tips! Most of the time, if I have a PDF, I just upload it to Google Docs and make my notes there.
    If I want to doc a screen capture I use Control + Print Screen and paste the image into Microsoft Word. Once I do that I can drag that image into Google Docs and edit it. Your method sounds faster, so I’ll give it a try. Thanks.

    • Ben Post author

      Yeah, there are many different ways to do it, but onlineocr.net seems to work best for me. I love Google Docs, by the way. I use it for just about everything!

  3. Sean

    I’m not sure how well it works, but you could save the file and upload it to your Evernote account. Evernote runs OCR on all documents you upload so you can search through them.

  4. Alec Cairns

    Hi Ben, thanks for the tips. I’ve enjoyed reading your blog and profile. I’m a new kid on the block when it comes to online businesses, but I really like what I’m learning so far. In your profile you mention product creation. What do you mean by that?
