What is OCR?
If you’re just joining us, Optical Character Recognition (OCR) is an automated system that translates an image of text into encoded, selectable text. Google uses OCR to scan your pictures and PDF files, then turns the scan into an editable Google Doc. Over the past 2 years, Google has been using human input from reCAPTCHA puzzles to improve its success at identifying complex words.
What languages were added?
Along with the additional languages, Google also improved OCR quality for the 5 previously implemented languages: English, Italian, German, Spanish, and French. The 29 new languages that have been added are the following:
1. Bulgarian
2. Catalan
3. Chinese (Simplified Han)
4. Croatian
5. Czech
6. Danish
7. Dutch
8. Filipino
9. Finnish
10. Greek
11. Hungarian
12. Indonesian
13. Japanese
14. Korean
15. Latvian
16. Lithuanian
17. Norwegian
18. Polish
19. Portuguese
20. Romanian
21. Russian
22. Serbian
23. Slovak
24. Slovenian
25. Swedish
26. Thai
27. Turkish
28. Ukrainian
29. Vietnamese
When uploading images or PDF files to Google Docs, be sure to select the language that the text in your file is written in! To do so, put your file in the queue to be uploaded, then check the box for Convert text from PDF or image files to Google Docs documents. A Document Language drop-down menu will appear, where you can select your language.
Have you tried out Google’s OCR technology for scanning old family journals, books, or whatever else you have lying around the house? You can also try it out on your iPhone or Android phone if you have the Google Goggles app!