How to Extract Text from an Image or PDF File
Have you ever wondered if there was a way to extract text from an image or an ebook? When would someone actually need to do this?
Well, there are many different reasons, but I mainly use it when I am taking notes. You may have noticed that some ebooks won’t let you copy and paste text from them because of the security features they have. This can be really annoying when you’re simply trying to take proper notes from an ebook that you are reading. The same goes for checklists and other study materials that are actually images.
I personally like to take notes on all the study material I purchase, whether it’s ebooks, video courses, or anything else. In this post, I want to share the method I use to extract text from an image or PDF file. I am sure there are many ways to do this, some probably much better than mine, but I have found that this one works really well for me and doesn’t take very long, so I figured I’d share it with you.
First Step: Screenshot
Before you can extract text, you’ll need to grab a screenshot of the text (and ONLY the text), without any images in the background or elsewhere. This works best when the text is on a white background. To take a screenshot of your screen, press the “Print Screen” button on your keyboard, open a graphics editor like MS Paint, and paste it in there with EDIT > PASTE. Then you can crop off what you don’t need and save the image when you’re finished. I’ve used that method for YEARS, but it’s a pain in the butt, so I’m going to show you a much faster way as well.
I use a program called “Snagit” that’s made by TechSmith (the same people who made Camtasia), and it makes everything much easier. It does cost around $50 though, so if you can’t afford it, just go with a free option. However, I highly recommend it, as it’s made my life a lot easier! They have versions for both Windows and Mac here: http://www.techsmith.com/snagit.html. And yes, they do offer a free trial. With Snagit, all you have to do is press the Print Screen button and highlight your text, and it will save the image for you instantly!
There are a few free screen capture add-ons for Firefox as well, and you can check them out here: Firefox Screen Capture Addons. (I personally only use Snagit, so you’ll have to test some out yourself. If you find one that works, please share it in the comments below for others to use!)
Second Step: Extract the Text!
Now, this is the part that took me forever to figure out. Snagit, as much as I recommend it, has a feature that is supposed to do this, but I can’t seem to get it to work. Luckily, I found two really cool websites that help extract text from images.
The first one is: http://www.free-ocr.com/ . This site is really easy to use, but I have about a 25% success rate with extracting text. It’s good to have more than one site just in case one of them is giving you problems.
The second site I use seems to be more reliable, and that website is: http://www.onlineocr.net/default.aspx .
This one is the one that I use the most. The text that it spits out is almost spot on, but you will likely have to fix a few words that it didn’t read properly. You will notice that when one doesn’t work, the other one probably will. Sometimes neither will work (pretty rare for me), but then I just type out what was missed by hand. No big deal.
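If you run the same screenshot through both sites, you don’t have to proofread every word by eye. Here is a rough Python sketch (my own illustration, not something either site provides) that compares the two results word by word and lists only the spots where they disagree, since those are the words most likely to need fixing. The function name `flag_disagreements` and the sample text are made up for the example, and it assumes you have already pasted each site’s output into a string.

```python
import difflib


def flag_disagreements(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Return (from_a, from_b) pairs of word runs where two OCR results differ."""
    words_a = text_a.split()
    words_b = text_b.split()
    # Compare word lists rather than characters so each mismatch is a whole word.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return disagreements


if __name__ == "__main__":
    # Hypothetical outputs from two different OCR sites for the same screenshot.
    first_site = "Take proper n0tes from every ebook you read"
    second_site = "Take proper notes from every eb00k you read"
    for from_a, from_b in flag_disagreements(first_site, second_site):
        print(f"check: {from_a!r} vs {from_b!r}")
```

Keep in mind this only catches words where the two results differ. If both sites misread a word the same way, it won’t show up, so a quick read-through is still worth doing.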
That’s really all there is to it. I know it’s simple, but I also know that at one time, I didn’t have a clue how to do this, so I hope this will help some of you out!
Leave me a comment and let me know what you think, or if you have another solution you’d like to share with the rest of us.