The Sentences Computers Can't Understand, But Humans Can

The Winograd schema is a language test for intelligent computers. So far, they're not doing well. MORE LANGUAGE FILES: ruvid.net/group/PL96C35uN7xGLDEnHuhD7CTZES3KXFnwm0

Written with Gretchen McCulloch and Molly Ruhl. Gretchen's podcast Lingthusiasm is at lingthusiasm.com/ - and Gretchen's new book, BECAUSE INTERNET, is available:
Tom Scott
5 months ago
And that's the last in this run of the Language Files! There may well be more later in the year, but for now: thank you so much to co-authors Gretchen and Molly. Pull down the description for more about Gretchen's podcast and book!
I would actually like to offer a major counterpoint here: the interpretation has nothing to do with language experience and could be solved quite easily with brute-force data dumping and a little bit of neural network trickery.
A) Context. Languages without gendered pronouns - most languages - resolve unclear referents with context. The same can be done in English in some cases.
B) Messing around a bit with multiplexing and demultiplexing - something that neural networks desperately need advancement in anyway for all sorts of applications - would allow contextual clues to be picked up more easily.
C) Context is also important because the natural unit of a single sentence is optimized for our human way of processing and understanding. There is a reason we don't often see sentences like Victor Hugo's famous 810-word sentence. Telephone numbers are capped at 7 digits in part because that's the most that human short-term memory can handle.
D) Computers are incredibly strong at lexical knowledge and, with access to the internet, stronger at determining the meanings of words than we ourselves are. While writing this I just said "ok google define suitcase" and it came back with a much clearer and more concise definition than I could ever have given. Cross-checking is something neural networks already excel at, so determining that something associated with "carrying/containing" is often importantly associated with size qualifiers such as big or small should be trivial. Meanwhile, the cup shape of trophies is the only thing about them associated with carrying something, so a trophy is less likely to be associated with language about that. Word-space distance has been successfully used to track the development of human consciousness and to predict mental problems. Very interesting TED talk about this, btw.
E) Finally, if the context doesn't help, if raw data dumping doesn't help, if looking at the sentence as a whole doesn't help, and if word-space distance and using definitions as added input don't help... try introducing basic logic. Google had a NOT operator in its search engine for almost two decades. As someone else in the comments stated, this is not yet implemented properly in the language processing software. It is also the crux of the statement in your example. "Not fit" is a negation, and from that point on the only thing required is an understanding of what "fit" means and that it can only happen if the thing that fits is smaller than the container, which is a simple definition that can either be programmed in or be distilled from text trawling.
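A minimal sketch of the "basic logic" idea in point E, assuming the sentence follows one fixed template. The pattern and the function name `resolve_fit` are hypothetical illustrations, not anything from the video or from a real language-processing library, and the rule is hard-coded rather than distilled from text.

```python
import re

# Hypothetical template: "The X would not fit in the (adjective) Y because it was too big/small."
PATTERN = re.compile(
    r"the (?:\w+ )*?(\w+) would not fit in the (?:\w+ )*?(\w+) because it was too (big|small)",
    re.IGNORECASE,
)


def resolve_fit(sentence):
    """Return the noun that "it" refers to, or None if the template does not match."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    item = match.group(1).lower()
    container = match.group(2).lower()
    size = match.group(3).lower()
    # Hard-coded rule: something fits only if it is smaller than its container,
    # so "too big" blames the item and "too small" blames the container.
    return item if size == "big" else container


print(resolve_fit("The trophy would not fit in the brown suitcase because it was too big."))
print(resolve_fit("The trophy would not fit in the brown suitcase because it was too small."))
```

This only covers one sentence shape and one verb. Every other schema would need its own rule, which is the part the comment assumes can be programmed in or learned.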
I'm making the smartest AI that can solve those sentences.
This is one thing I like about the Swedish language. They have a pronoun for the subject of a sentence as well as the object.
Matrix Mirage So you use souvenir and satchel?
Could this be solved by somehow teaching context to code? The cloth and table sentence still remains ambiguous, because a table could be used to protect a cloth, and the same goes the other way around.
finally, an advantage point for humans
Update on that AI. It now uses different models, but it's been enhanced massively within the past few months. The details that AI Dungeon can now keep track of and use effectively are quite incredible.
Myotis Welwitschii
That is the reason for languages like Lojban. Lojban (and similar languages) are not like that, because the grammar is more linear and the problem doesn't appear at all. If humans agreed on using one of those languages, it would make things a lot easier.
Ahh, back in the '70s, when people thought "teaching computers how to identify objects in a picture is a good exercise for the semester break" and "language is clearly defined and it is easy to teach computers about language". Sometimes I feel we are only half an inch closer to "intelligent" computers than we were 50 years ago.
The trophy, would not fit in the brown suitcase, because it, was too big
Josh Combs 9 days ago
I don't understand computers, so someone smarter than me please explain this. Why can't we teach a computer what a suitcase and a trophy are without the computer needing experience with those items? It seems all you'd have to do is create a dictionary of all nouns, but with definitions that would be useless to humans and adequate for a computer. For instance, the definition of Suitcase can be Container. The definition of Table can be Surface. Then, when trying to decide which noun the "it" in the sentence is, just look at the preposition before it: if it's the word "on" or "off", look for the noun that means surface; if it's the word "in" or "out", look for the noun that means container. If the words "on" and "off" were intended as "connected or not connected", as in "the light is off", it would recognize that there's only one noun in the sentence and the definition of "off" would change.
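The lookup described in this comment can be sketched in a few lines of Python. The two dictionaries and the function name `resolve_it` are hypothetical, with only a handful of hand-picked entries; this is an illustration of the proposed rule, not a working language system.

```python
# Hypothetical mini-dictionary: each noun maps to a coarse category.
NOUN_CATEGORY = {
    "suitcase": "container",
    "table": "surface",
    "trophy": "object",
    "cloth": "object",
}

# Hypothetical mapping from a preposition to the category it points at.
PREPOSITION_CATEGORY = {
    "in": "container",
    "out": "container",
    "on": "surface",
    "off": "surface",
}


def resolve_it(sentence):
    """Apply the comment's rule: pick the noun whose category matches the preposition."""
    words = [w.strip(".,").lower() for w in sentence.split()]
    nouns = [w for w in words if w in NOUN_CATEGORY]
    for word in words:
        wanted = PREPOSITION_CATEGORY.get(word)
        if wanted is None:
            continue
        for noun in nouns:
            if NOUN_CATEGORY[noun] == wanted:
                return noun
    return None


print(resolve_it("The trophy would not fit in the brown suitcase because it was too big."))
print(resolve_it("The trophy would not fit in the brown suitcase because it was too small."))
```

Both calls give the same answer, because the rule looks only at the preposition. A Winograd schema flips the correct referent by changing a single word ("big" versus "small"), so a lookup that ignores that word is right for one version of the sentence and wrong for the other.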
dmdz 12 days ago
I feel like training an AI to properly recognise context and syntax would have to be a grueling, manual process. In fact, you could probably crowdsource it. I'd be totally down to help in crowdsourced AI language training.
Replace 'AI' and 'computers' with 'kids' and you get the life of a parent
Arterexius 28 days ago
The solution seems rather simple, although incredibly long-winded to do. "Just" define what each and every word means and run the schema on it again. It should - logically speaking - now know that a trophy won't fit inside the suitcase if it's too big. There are lots of holes in this, of course, and it needs a lot of defining everything in order to actually understand what you're trying to tell it.
Nedko066 Month ago
GPT-3 came out just recently and it seems to be a big improvement, maybe in this regard too. They have not yet released the trained models, but they have a long list of generated examples. They claim humans are fooled without cherry-picking. Of course we should take it with a grain of salt until we have actual models, but it is worth noting.
So basically we need an AI with its own body that can grow and experience the _real world_ for some years before it'll be any good at understanding things like a human?
Harry Giles Month ago
Ben Bruland Month ago
Glider Fan Month ago
Most of the examples in the video would not work in Polish. "I put the cloth on the table to protect it" - in Polish "it" refers to the closest preceding noun, no exception, so you protect the table with zero doubt. To protect the cloth you should say something like "to protect the cloth, I spread it on the table". Same for "the trophy would not fit in the brown suitcase because it was too big" - but the mechanism is a bit different. We have grammatical genders that help. "Puchar nie wejdzie do walizki, bo jest za duży" - the form "duży" refers to the trophy; for the suitcase it would be "duża". So we can say it the other way round, "Puchar nie wejdzie do walizki, bo jest za duża" - a bit silly, the suitcase is too big for the trophy to fit in. If you need a perfect fit, that may be the case. This system allows us to put the words in almost any order in the sentence. "Do walizki nie wejdzie puchar, bo za duży jest" = "Za duży jest puchar, nie wejdzie do walizki" = "Puchar za duży jest, do walizki nie wejdzie" etc.
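The gender agreement described in this comment can be sketched as a simple lookup. The two small lexicons and the function name `agreeing_noun` are hypothetical and cover only the words from the example sentences; a real system would need a full morphological dictionary.

```python
# Hypothetical mini-lexicon: grammatical gender of the nouns in the example.
NOUN_GENDER = {"puchar": "masculine", "walizki": "feminine"}

# Hypothetical mini-lexicon: gender marked by each adjective form.
ADJECTIVE_GENDER = {"duży": "masculine", "duża": "feminine"}


def agreeing_noun(sentence):
    """Return the noun whose gender matches the adjective, ignoring word order."""
    words = [w.strip(".,").lower() for w in sentence.split()]
    genders = {ADJECTIVE_GENDER[w] for w in words if w in ADJECTIVE_GENDER}
    if len(genders) != 1:
        return None  # no adjective found, or conflicting adjective forms
    gender = genders.pop()
    matches = [w for w in words if NOUN_GENDER.get(w) == gender]
    # Agreement only disambiguates when exactly one noun has that gender.
    return matches[0] if len(matches) == 1 else None


print(agreeing_noun("Puchar nie wejdzie do walizki, bo jest za duży"))
print(agreeing_noun("Puchar nie wejdzie do walizki, bo jest za duża"))
print(agreeing_noun("Do walizki nie wejdzie puchar, bo za duży jest"))
```

Because the function never looks at word order, the reordered sentences from the comment resolve the same way. The trick stops working as soon as both nouns share a gender, which is when the ambiguity from the video comes back.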
Stefano Siclari Month ago
This should be what Captcha is. Easy for humans, impossible for computers.
T Trep4 Month ago
Instead of changing computer programming so it can understand human languages, perhaps we should change our languages to be simpler, consistent, and have fewer rules that are full of exceptions.
I don't think "sentient" computers are a threat of any kind, let alone possible. Any thoughts on this?
