Think of a language like French or Russian, where words agree grammatically with each other and the form of a determiner or an adjective necessarily constrains the words that follow. Writing in these languages with no clear idea of where you’re going is extremely hard, and often results in extremely unnatural output, or incoherent language that “jumps around” from one concept to another just to keep the sentences grammatical. In fact, when I write in these languages myself, I often find myself backtracking because I realize I’ve written myself into a corner, by not using the right gender for instance. ChatGPT writes these languages quasi-perfectly and naturally.
Stephen Wolfram oversimplifies things when he says that ChatGPT works one word at a time. He glosses over the complexity here (emphasis mine):
So let’s say we’ve got the text “*The best thing about AI is its ability to*”. Imagine scanning billions of pages of human-written text (say on the web and in digitized books) and finding all instances of this text—then seeing what word comes next what fraction of the time. ChatGPT effectively does something like this, except that (as I’ll explain) it doesn’t look at literal text; it looks for things that in a certain sense “match in meaning”.
That’s pretty hand-wavey. Isn’t “matching things in meaning” what I’m doing while I write this text?
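To make the quoted mechanism concrete, here is a minimal sketch, in Python with a made-up toy corpus, of the “literal text” version Wolfram describes: find every occurrence of the prefix, count what word comes next, and sample a continuation in proportion to those counts. This is only the baseline he says ChatGPT does “something like”; the real model generalizes over things that “match in meaning” rather than literal strings.

```python
from collections import Counter
import random

def next_word_by_frequency(corpus_sentences, prefix):
    """Toy illustration of the 'literal text' baseline Wolfram describes:
    find every occurrence of the prefix, count which word follows it,
    and sample a continuation proportionally to those counts.
    (ChatGPT does NOT do this literally; it generalizes by 'meaning'.)"""
    counts = Counter()
    prefix_tokens = prefix.split()
    n = len(prefix_tokens)
    for sentence in corpus_sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == prefix_tokens:
                counts[tokens[i + n]] += 1
    if not counts:
        return None
    words, freqs = zip(*counts.items())
    return random.choices(words, weights=freqs, k=1)[0]

# Hypothetical usage with a two-sentence toy corpus:
corpus = [
    "the best thing about AI is its ability to learn",
    "the best thing about AI is its ability to scale",
]
print(next_word_by_frequency(corpus, "its ability to"))  # 'learn' or 'scale'
```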
Later in the same article Wolfram says:
If you had a big enough neural net then, yes, you might be able to do whatever humans can readily do. But you wouldn’t capture what the natural world in general can do—or that the tools that we’ve fashioned from the natural world can do. And it’s the use of those tools—both practical and conceptual—that have allowed us in recent centuries to transcend the boundaries of what’s accessible to “pure unaided human thought”, and capture for human purposes more of what’s out there in the physical and computational universe.
Basically, if I interpret this correctly, he’s saying that the main limitation of neural nets is that they can’t create or use external tools. AIs can’t (currently) build other AIs, for instance, or modify themselves.
And a few paragraphs before that:
(For ChatGPT as it currently is, the situation is actually much more extreme, because the neural net used to generate each token of output is a pure “feed-forward” network, without loops, and therefore has no ability to do any kind of computation with nontrivial “control flow”.)
I think in the end that’s the part we’re all talking about: this “feed-forward” network that cannot really reflect on what it’s outputting. I just feel that explaining this by saying that “ChatGPT only generates one word at a time” is needlessly reductive and doesn’t explain the model’s current limitations well.
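To show where the “one word at a time” actually lives, here is a minimal sketch of autoregressive decoding, with hypothetical `forward_pass` and `sample` functions standing in for the model. Each token comes out of a single fixed-depth, loop-free pass; the only iteration is the outer loop that appends the chosen token and runs the whole pass again, and nothing in that loop lets the model go back and revise what it already emitted.

```python
def generate(prompt_tokens, forward_pass, sample, max_new_tokens=50):
    """Toy sketch of autoregressive decoding, assuming:
      - forward_pass(tokens) -> a probability distribution over the next
        token, computed by a fixed-depth, loop-free ('feed-forward') pass;
      - sample(distribution) -> one token drawn from that distribution.
    The only loop is *outside* the network: append the chosen token and
    rerun the whole feed-forward pass on the now-longer sequence."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        distribution = forward_pass(tokens)  # one fixed-size computation, no internal control flow
        tokens.append(sample(distribution))  # commit to the token; no backtracking or revision
    return tokens
```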
