Google Scribe - Get autocomplete suggestions as you type (googlelabs.com)
93 points by abraham on Sept 7, 2010 | 64 comments


Interesting. It can also be used as a Markov chain text generator. Type one word, then just accept every suggestion.

"In the case of these two types of information that is not appropriate for all users of the catalogue should also be noted that there is anything you would not believe how much I loved them."
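For anyone curious what "accept every suggestion" amounts to, here is a minimal sketch, assuming a plain bigram model built from a toy corpus. The function names and the corpus are made up for illustration; Scribe's real model and data are not public in this thread and are surely larger and use more context.

```python
from collections import Counter, defaultdict

def build_bigram_model(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def accept_every_suggestion(model, seed, max_words=12):
    """Start from `seed` and always take the most frequent next word."""
    output = [seed]
    for _ in range(max_words - 1):
        candidates = model.get(output[-1])
        if not candidates:
            break  # dead end: no known continuation for this word
        # most_common(1) yields the top suggestion; ties go to the first one seen
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

# Toy corpus (hypothetical), standing in for web-scale n-gram counts.
corpus = (
    "the cat sat on the mat . the cat ate the fish . "
    "the dog sat on the rug ."
)
model = build_bigram_model(corpus)
print(accept_every_suggestion(model, "dog"))
```

Because the top suggestion depends only on the previous word here, the output is fully determined by the seed, which is why the generated text feels like a Markov chain with the randomness removed.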


You can even get it to repeat itself after a while:

    I have been able to find anything in these search results from RT on your
    Google searches by subscribing to the feed via email to state their case
    and their ownership of their owners and are strictly for viewing and printing
    of these books are nothing but another form of therapy for these patients
    is not known whether these are the only ones who can not afford to pay for
    their own users and groups to their Friends / Favorites list yet, so I'ma
    keep popping up in their own right and do not want to be related to their
    particular field or industry in which they are attached to their respective
    owners and are strictly for viewing and printing of these books are nothing
    but another form of therapy for these patients is not known whether these are
    the only ones who can not afford to pay for their own users and groups to
    their Friends / Favorites list yet, so I'ma keep popping up in their own
    right and do not want to be related to their particular field or industry in
    which they are attached to their respective owners...


Congratulations, you just found a fixed point attractor.


I think a fixed point attractor would be a word or phrase where it goes naturally but then doesn't have any more suggestions: a dead end.

This is more of a periodic-solution attractor.
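The distinction can be made concrete with a small sketch. It assumes the top suggestion depends only on the current word (in Scribe the state is probably the last few words, but the argument is the same): a deterministic map on a finite set of states must either hit a dead end or eventually revisit a state and loop. The table below is hypothetical, not real Scribe data.

```python
def find_loop(next_word, seed):
    """Follow a deterministic word -> next-word map until it stops or repeats.

    Returns (lead_in, loop). `loop` is empty if the chain hits a dead end.
    """
    seen_at = {}   # word -> index in `path` where it first appeared
    path = []
    word = seed
    while word is not None and word not in seen_at:
        seen_at[word] = len(path)
        path.append(word)
        word = next_word.get(word)  # None when there is no suggestion
    if word is None:
        return path, []             # dead end: the "fixed point" case
    start = seen_at[word]
    return path[:start], path[start:]  # periodic case: path[start:] repeats forever

# Hypothetical top-suggestion table for illustration only.
top_suggestion = {
    "books": "are", "are": "nothing", "nothing": "but", "but": "another",
    "another": "form", "form": "of", "of": "therapy", "therapy": "for",
    "for": "these", "these": "patients", "patients": "are",
    "hello": "world",
}

lead_in, loop = find_loop(top_suggestion, "books")
print("lead-in:", " ".join(lead_in))
print("loop:   ", " ".join(loop))
```

Any seed whose chain reaches a word inside the loop ends up in the same cycle, which would explain why unrelated starting phrases converge on the same text.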


"Facebook helps you connect and share with the people in your life and your business could subject you to lawsuits and leave your operating systems without patches"


Typing in "Larry and Sergey" produces

Larry and Sergey Dump Shares of the Company and their respective mascots, events and info in the top right corner of the world's largest international multimedia news agency

but after a few more phrases the story gets into a loop.


"It was a dark and stormy night in the city of New York City Area Directory of All Stores International Sites & Affiliates Forum & Resources for Teachers and Students at the University of California at Berkeley and the University of California at Berkeley and the University of California at..."


The cadence reminds me of a certain character from BSG. You tell me:

Kara Thrace and here is their website and buy this product again and again and I'ma let you finish but Beyonce had one of these days I'll bet your life on the road today and they are nothing but another form of therapy for these patients is not known whether these are the only ones who can not afford to pay for their own users and groups to their Friends / Favorites list yet.

That was... trippy.


I did this until it suggested "I'ma let you finish but Beyonce had one of the..."

I guess this particular phrase has been on the internet so much that it skews the machine learning algorithm.


I have another question for yourself at their website and buy this product again and again and I'ma let you finish but Beyonce had one of these days I'll bet your life on the road today and


"Hacker News Mashups: Deliver Your Project Faster with Virtualized Data Services Across Internal & External Sources of Funding for these projects"


Barack Obama is a master at grabbing and keeping his audience's attention to the fact that the two are not the only one who can not afford to pay for the cost of the project is to develop a new generation of protein database search pro tab by The Red Jumpsuit Apparatus lyrics are property and copyright of the article is that I have no idea what to do.


Indeed. The first sentence it generated for me was not a Markovian nonsense phrase, but rather a phrase that appears on a very large number of web pages: "The following content has been identified by the YouTube community as being potentially offensive or inappropriate." This gives some insight into their seed data.


Their seed data is probably the n-gram dataset that they released (or maybe a more recent one):

http://googleresearch.blogspot.com/2006/08/all-our-n-gram-ar...

Microsoft also has an n-gram dataset that you can access through a web service:

http://web-ngram.research.microsoft.com/info/


"to their libraries and their users are not likely to become an editor... and really good food and good service is our number one priority is to provide and... they are nothing but another form of therapy for these patients is not known whether these are the only ones."


To prevent loops you can vary the selection, not just picking the top item, but perhaps doing something like 1-2-3-2-1-2-3

This actually produces less intelligible sentences, but more randomness

"dogs and azithromycin zithromax one dose chlamydia and gonorrhea are the most common form of dementia"
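A rough sketch of the 1-2-3-2-1-2-3 idea, assuming a simple bigram model where suggestions are ranked by follower count. The names and the toy corpus are invented for illustration, and the rank is clamped when a word has fewer suggestions than the pattern asks for.

```python
from collections import Counter, defaultdict
from itertools import cycle

def build_bigram_model(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def generate_with_rank_pattern(model, seed, pattern=(1, 2, 3, 2), max_words=10):
    """Pick the 1st, 2nd, 3rd, 2nd, 1st, ... ranked suggestion in turn."""
    output = [seed]
    ranks = cycle(pattern)  # (1, 2, 3, 2) repeated gives 1-2-3-2-1-2-3-...
    for _ in range(max_words - 1):
        candidates = model.get(output[-1])
        if not candidates:
            break  # no suggestions for this word
        ranked = [word for word, count in candidates.most_common()]
        rank = min(next(ranks), len(ranked))  # clamp when few suggestions exist
        output.append(ranked[rank - 1])
    return " ".join(output)

# Toy corpus (hypothetical).
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat ate the fish . a dog ate a bone ."
)
model = build_bigram_model(corpus)
print(generate_with_rank_pattern(model, "the"))
```

Note that a fixed pattern is still deterministic, so it can loop too; the loop just has to repeat the (word, position in pattern) pair rather than the word alone. Sampling the rank at random would be the way to break loops for good.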


Just a few presses of spacebar and enter wrote this sentence: "What the hel* is this guy on their team and their fans are the best in the world of their owners and are strictly for viewing and printing of these books."


hello my name is brenda and I am anxiously awaiting their arrival in the United States and Canada


I'd actually like to see the exact opposite: an editor that warns me every time I use a too common and worn out phrase or sequence of words.

That shouldn't be too hard to do, should it? :)
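One way it might work, as a minimal sketch: count word n-grams in some reference corpus and flag any phrase in the draft whose count passes a threshold. The reference text, the trigram size, and the threshold below are all assumptions picked for illustration; a real tool would need web-scale counts (something like the n-gram datasets mentioned elsewhere in this thread).

```python
from collections import Counter

def ngrams(words, n):
    """Return all runs of n consecutive words as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def worn_out_phrases(draft, reference_counts, n=3, threshold=2):
    """Return n-word phrases in `draft` seen at least `threshold` times in the reference."""
    words = draft.lower().split()
    flagged = []
    for gram in ngrams(words, n):
        # Counter returns 0 for unseen phrases, so no KeyError here
        if reference_counts[gram] >= threshold and gram not in flagged:
            flagged.append(gram)
    return [" ".join(gram) for gram in flagged]

# Tiny stand-in reference corpus (hypothetical).
reference = "at the end of the day it works . at the end of the day we ship ."
reference_counts = Counter(ngrams(reference.lower().split(), 3))

print(worn_out_phrases("At the end of the day I prefer tea", reference_counts))
```

The hard part is choosing the threshold: with real counts, perfectly ordinary phrases like "of the" are also extremely frequent, so some normalization against the frequency of the individual words would likely be needed.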


1 I'd actually like to see

2 the exact opposite

3 worn out phrase

4 that shouldn't be too hard

5 should it?


The other thing it needs (which I also wish I could use more precisely in Google search) is some concept of punctuation beyond the occasional comma; playing with it seems to result in endless run-on sentences. But hey, early days.

I wonder if/how it selects for linguistic quality? While it won't be too useful to the general public if it defaults to the dense academic language of textbooks, I too was a bit startled by the frequency with which 'I'ma let you finish but Beyonce had the best syntactic parser of all time' keeps popping up.


I shared a similar app here on HN a while ago that used Google's own suggest feature to guess the next word or two: http://chir.ag/projects/ktype/

KType was a demo for something like Scribe that I wanted to make for disabled users.


http://imgur.com/8YJIB.png

Looks like it includes someone's source code, probably that of the page itself?


I thought perhaps someone was trying to seed Scribe with XSS exploits.


I tried it with the entire alphabet.. you can see it here: http://blog.cankoklu.com/google-scribe-abcs-nothing-but-anot...

The attractor does seem to be "nothing but another form of therapy"..

Almost all letters end in this loop..


There are so many things to say about this. Thanks to Google for creating this tool which I must say still seems a bit random in its objectives though.

I think the main issue here is that the logic behind Scribe is based on statistical frequency of co-occurrence of strings rather than any semantic logic. Granted, automating the semantics here could be tough and very expensive with the amounts of data available to Google. However, there are some ways to do that if the index can be structured more efficiently and by using low-level semantic logic like taxonomies - I wrote a white paper about this if anyone is interested: http://www.exorbyte.com/index.php/White-Papers/ajax-incremen...


Interesting, thanks. I'm not sure if I agree, though; I think our natural understanding of language is acquired mainly through statistical inference, even though it may be encoded in the brain using a semantic taxonomy for efficiency.

I doubt that a lot of people will use this, although the obvious place to deploy it is in mobile communications, where per-word prediction is still very slow and text entry is very inefficient. This seems 'good enough' for most functional communications. But I would imagine that it is or will be running in the background soon on all Google pages featuring text entry, like docs or mail, because that will provide a huge flow of data to refine their models against.


From my very limited knowledge of language acquisition, my guess is you might be wrong (although I'm happy to be corrected). Don't human children acquire language much faster, more accurately and from poorer signals than would be possible via statistical inference?


I could be - I'm no expert on this either. My understanding of academic linguistics is that there are two main schools of thought: on one side you have Noam Chomsky as the most famous exponent, saying that all languages have a sort of universal grammar that somehow reflects our brain structure, and on the other the behaviorism of BF Skinner, that languages are largely arbitrary, syntactic processing is learned behavior, and complexity is both a result of and a selector for intelligence as an evolutionary trait.

I don't have a really strong opinion, though I lean a bit towards the behaviorist approach. There have been some suggestions from ethnomusicologists that small children everywhere sing the same sort of melody on the same scale, which inclined me to the opposite view for a while, but I think that was more of a theory than a finding and still needs evidence.


I must say I quite disagree too. I know a thing or two about Computational Linguistics, and Statistical Inference is really a small portion of the corpus of work there. But the tool is funny and maybe useful nonetheless. Far from baked in my opinion though...


Agree partially. Statistical inference is at the heart of language acquisition but language processing is far more complex. Semantics always existed though. Long before we built computers, and long before we found out that computer programs and the hardware running them were somewhat limited in their ability to process language as fast and cleverly as our brains do. I look forward to the day this happens. In the meantime, I believe that the very nonsense that this nifty scribe puts out reminds us of how limited the automation of language still is.


Neat! So is this just a fun project or does Google plan to use this technology in any of their products?

Edit: You can use Scribe on any web page. From the help section: "Google Scribe can be used anywhere, on any web page, using the Google Scribe Bookmarklet.

From the Google Scribe home page, drag the Google Scribe Bookmarklet (located below the text box) to Bookmarks toolbar (or Favorites toolbar depending on your browser). To use Google Scribe on a web page, click on the Google Scribe Bookmarklet. Google Scribe will then enable itself on the active text field on the webpage. Enabled text fields display the icon at top end corner of the active field."


So is this just a fun project or does Google plan to use this technology in any of their products?

It might make sense to use this as one predictor for transcribed text in Google Voice. Then again, once the results begin to diverge, this would amplify the problem.


Seems as if they blacklisted inappropriate suggestions - the most outrageous thing suggested so far was "why do atheists gravitate towards stuffed animals"


"Homosexuality is a" yields interesting results


Be sure to check out the bookmarklet. It works on any text field (I'm using it now).


Start with a word, hit tab 10 times, paste your result.

> Moneywatch MovieTome mySimon NCAA Showtime SmartPlanet TechRepublic The Insider on TV and radio.


Hacker News is an online business directory that has the most complete and up to date with the latest active trackers for this torrent download


Not sure if it's the UI, but the auto-complete is actually more obtrusive than what I would expect from a text editor. If it's gonna pop a suggestion box for every keystroke, it hinders my writing flow. It would be better if the editor could suggest better phrases (than the one I already wrote), or better words to make the text more formal.


it predicts what you're going to type next. this is really nothing new. perhaps the most interesting use of predicting input in this fashion is dasher:

http://www.inference.phy.cam.ac.uk/dasher/

it uses this prediction to make it efficient for people who cannot type, type.


predicting text and predicting text well are two very different things.


what's the relevance of that comment?

is your claim that dasher does not predict well?

or rather, that google scribe predicts better than dasher?

it would surprise me: dasher adapts to the linguistic habits of the particular user. it learns in real time.

google scribe doesn't appear to do this. it doesn't know who i am.


my point is that saying "nothing new here" often dismisses what really matters, the quality of something.

I've never used dasher so I can't comment on it


It's too slow to be useful -- suggestions don't keep up with typing at a reasonable speed. Plus, the fact that 10 suggestions need to be scanned before you can proceed slows the process even further. Maybe it's useful as a way to work around writer's block or to remember a word that's slipping off your mind.


If only it auto-inserted a space after words it would be so much more efficient.


Neat technology. If it was just slightly more responsive, I am sure it could be interesting for some use cases such as helping out non-native English speakers write with proper grammar and diversified vocabulary.


Hmm, doing this seems to be slower than actually writing a text.


I would imagine the appeal is less towards native/fluent speakers at this point, and more for people who are learning or uncomfortable with english grammar.

I have several friends who use Google as a first editor of their grammar by making sure that phrases they aren't sure are correct return lots of search results. This seems like a convenient way for them to streamline that process.


That's really a clever trick, though my initial reaction was horror at the idea of checking one's grammar against the Internet.

Looks like Scribe is only looking one or two words ahead, though - it's not actually checking what you've written, just throwing the Google Suggest box into your writing.


I use it all the time to check difficult words. It works well. Google usually even suggests "did you mean" the right spelling when you get it wrong.


This works especially well for words you can’t find in the dictionary or when you want to know whether the context in which you use a word is appropriate.


Most of Google's users don't know how to touch type. For hunt-and-peck folk, this is a huge time-saver.


On a desktop keyboard maybe, but I'd much prefer this method of input on a phone with a touch-screen keyboard


Try Swype.


I would be interested to see if they can improve the results speed to the point where choosing the results would be faster than typing.


If you type a semi-colon, you get ";:::;" - Where did they mine that from, it looks emacsy!


Funnily enough, the auto-completion censors profanity.


once upon a time in the world of technology we were not significantly different from those that have already passed their driving test



Reminds me of Clippy.


Today


From


today and from


Osama bin Laden dead or alive xtreme beach volleyball.



