This tech seems easy to duplicate. Google must want to implement their OCR capabilities ASAP. Maybe one day I’ll be able to upload my own documents and get them OCR’ed by Google. I have plenty of one-of-a-kind real estate books that I’d love to get digitally transcribed and hosted on my website.
Good luck with the acquisition, Google. |
I'm happy, but I hope the improvements they make are released for use by other parties... |
tech is easy to duplicate-- userbase is not. also, potential patent violations or something? ::shrug:: |
Google is only wasting its money :P :D |
they can afford to. but also, no, you're wrong, i don't think they are. they always have a plan! ....except with lively...what the fuck, google? |
I hope they will TRANSLATE the words :) |
I think Lively was kind of like... Niniane Wang comes in and does her thing really well for Google, and since she had roots in MS Flight Sim they let her do her own thing with Lively, expecting great results... |
Related rumor: Google will acquire Brightcove too.
<<Google is in talks to buy Web video provider Brightcove for $500 million to $700 million, PBS MediaShift editor Mark Glaser says on Twitter, citing a source with knowledge of the deal.>>
http://www.businessinsider.com/google-to-buy-brightcove-2009-9 |
i think that goes in a new thread websonic... |
Did Google acquire a company specializing in OCR a few years ago? |
It's important to remember that reCAPTCHA already has a lot of users worldwide. They didn't buy the technology, but the community (once again!). Of course they can make a strong OCR in a couple of weeks, but this is easier for them, because it's already up and working. |
According to reCAPTCHA.net, most of the software is open source. reCAPTCHA originated as part of Carnegie Mellon University's CAPTCHA project.
"reCAPTCHA is mostly powered by open source software." http://recaptcha.net/aboutus.html |
CMU's CAPTCHA page has several links. http://www.captcha.net
Three types of CAPTCHA can be tried.
reCAPTCHA http://recaptcha.net/learnmore.html
SQUIGL-PIX (SQ-PIX) requires identifying images. http://server251.theory.cs.cmu.edu/cgi-bin/sq-pix (flash) The CAPTCHA test asks you to outline all of a particular type of object in three images. The way it is organized, if the server only knows some of the correct answers, the answers to the rest could be used to have humans do a task that is too hard for a computer.
ESP-PIX requires naming what several images have in common. http://server251.theory.cs.cmu.edu/cgi-bin/esp-pix/esp-pix
There is also a link to a page with games. http://gwap.com/ "Our new site, GWAP.com contains many addictive games that help computers learn to think more like humans. You play the games, computers get smarter!"
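To make the "server only knows some of the correct answers" idea concrete, here is a rough Python sketch. It is only my guess at the general pattern, not CMU's or reCAPTCHA's actual code: the item names, the `KNOWN` table, and the `grade` and `consensus` functions are all made up for illustration. The user is graded only on the items the server already knows, and on a pass, the answers to the other items are stored as votes.

```python
from collections import Counter, defaultdict

# Hypothetical control items: labels the server already knows.
KNOWN = {"img_a": "cat", "img_b": "dog"}

# Votes collected from humans for items the server cannot label itself.
votes = defaultdict(Counter)


def normalize(answer):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return answer.strip().lower()


def grade(challenge, responses):
    """Grade a challenge (list of item ids) against a dict of responses.

    The user passes only if every known item in the challenge is answered
    correctly. On a pass, answers to the unknown items are kept as votes.
    """
    controls = [item for item in challenge if item in KNOWN]
    if not controls:
        raise ValueError("challenge needs at least one known item")

    passed = all(
        normalize(responses.get(item, "")) == KNOWN[item] for item in controls
    )
    if passed:
        for item in challenge:
            if item not in KNOWN and item in responses:
                votes[item][normalize(responses[item])] += 1
    return passed


def consensus(item, min_votes=3):
    """Return the most common label once enough humans agree, else None."""
    counts = votes.get(item)
    if not counts:
        return None
    label, count = counts.most_common(1)[0]
    return label if count >= min_votes else None
```

Here `min_votes=3` is an arbitrary threshold I picked; a real system would presumably tune how many agreeing answers it needs before trusting a label.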
|
Is it really that useful for OCR? It seems it would just become some "reCAPTCHA format" decipherer, that is, a specialist in just one specific task (here, one specific font type). |
Just to point out: reCAPTCHA's Prof. von Ahn also licensed his ESP Game to Google. It is now Google Image Labeler.
|
Here is a link to a Carnegie Mellon University announcement of this. http://www.cmu.edu/homepage/computing/2009/summer/perfect-fit.shtml |