How the quest to type Chinese on a QWERTY keyboard created autocomplete


If Huang Zhenyu’s mastery of a complex alphanumeric code weren’t impressive enough, consider the staggering speed of his performance. He transcribed the first 31 Chinese characters of Hu Jintao’s speech in roughly 5 seconds, for an extrapolated speed of 372 Chinese characters per minute. By the close of the grueling 20-minute contest, one extending over thousands of characters, he crossed the finish line with an almost unbelievable speed of 221.9 characters per minute.

That’s 3.7 Chinese characters every second.

In the context of English, Huang’s opening 5 seconds would have been the equivalent of around 375 English words-per-minute, with his overall competition speed easily surpassing 200 WPM—a blistering pace unmatched by anyone in the Anglophone world (using QWERTY, at least). In 1985, Barbara Blackburn achieved a Guinness Book of World Records–verified performance of 170 English words-per-minute (on a typewriter, no less). Speed demon Sean Wrona later bested Blackburn’s score with a performance of 174 WPM (on a computer keyboard, it should be noted). As impressive as these milestones are, the fact remains: had Huang’s performance taken place in the Anglophone world, it would be his name enshrined in the Guinness Book of World Records as the new benchmark to beat.

Huang’s speed carried special historical significance as well.

For a person living between the years 1850 and 1950—the period examined in the book The Chinese Typewriter—the idea of producing Chinese by mechanical means at a rate of over two hundred characters per minute would have been virtually unimaginable. Throughout the history of Chinese telegraphy, dating back to the 1870s, operators maxed out at perhaps a few dozen characters per minute. In the heyday of mechanical Chinese typewriting, from the 1920s to the 1970s, the fastest speeds on record were just shy of eighty characters per minute (with the majority of typists operating at far slower rates). When it came to modern information technologies, that is to say, Chinese was consistently one of the slowest writing systems in the world.

What changed? How did a script so long disparaged as cumbersome and hopelessly complex suddenly rival, and even exceed, computational typing speeds clocked in other parts of the world? Even if we accept that Chinese computer users are somehow able to engage in “real time” coding, shouldn’t Chinese input method editors (IMEs) result in a lower overall “ceiling” for Chinese text processing as compared to English? Chinese computer users have to jump through so many more hoops, after all, over the course of a cumbersome, multistep process: the IME has to intercept a user’s keystrokes, search in memory for a match, present potential candidates, and wait for the user’s confirmation. Meanwhile, English-language computer users need only depress whichever key they wish to see printed on screen. What could be simpler than the “immediacy” of “Q equals Q,” “W equals W,” and so on?
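The multistep process described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the lookup table, the pinyin-style key sequences, and the function names `lookup`, `ime_input`, and `direct_input` are all hypothetical, chosen for brevity, and do not represent the input code Huang used or the internals of any real IME.

```python
# Toy lookup table (an assumption for illustration, not a real IME dictionary).
# Each key sequence maps to a ranked list of candidate characters.
CANDIDATES = {
    "ni": ["你", "泥"],
    "hao": ["好", "号"],
    "ma": ["吗", "妈", "马"],
}


def lookup(keys):
    """Search memory for characters matching the intercepted keystrokes."""
    return CANDIDATES.get(keys, [])


def ime_input(keys, choice=0):
    """One pass of the mediated process: intercept, search, present, confirm.

    `choice` stands in for the user's confirmation of a candidate.
    """
    candidates = lookup(keys)      # search in memory for a match
    if not candidates:
        return keys                # no match: fall back to the raw keystrokes
    return candidates[choice]      # the confirmed candidate goes on screen


def direct_input(key):
    """The 'Q equals Q' case: the key pressed is the symbol printed."""
    return key


if __name__ == "__main__":
    # Seven keystrokes (plus confirmations) yield three characters.
    print("".join(ime_input(k) for k in ["ni", "hao", "ma"]))
    print(direct_input("q"))
```

In this sketch every character costs a lookup and a confirmation that direct input does not, which is exactly the overhead the questions above point to.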

Tom Mullaney


To unravel this seeming paradox, we will examine the first Chinese computer ever designed: the Sinotype, also known as the Ideographic Composing Machine. Unveiled in 1959 by MIT professor Samuel Hawks Caldwell and the Graphic Arts Research Foundation, this machine featured a QWERTY keyboard, which the operator used to input not the phonetic values of Chinese characters but the brushstrokes out of which Chinese characters are composed. The objective of Sinotype was not to “build up” Chinese characters on the page, though, the way a user builds up English words through the successive addition of letters. Instead, each stroke “spelling” served as an electronic address that Sinotype’s logical circuit used to retrieve a Chinese character from memory. In other words, the first Chinese computer in history was premised on the same kind of “additional steps” as seen in Huang Zhenyu’s prizewinning 2013 performance.

During his research, Caldwell discovered unexpected benefits of all these additional steps, benefits entirely unheard of in the context of Anglophone human-machine interaction at that time. The Sinotype, he found, needed far fewer keystrokes to find a Chinese character in memory than to compose one through conventional means of inscription. By way of analogy, to “spell” a nine-letter word like “crocodile” (c-r-o-c-o-d-i-l-e) takes far more time than to retrieve that same word from memory: “c-r-o-c-o-d” would be enough for a computer to make an unambiguous match, given the absence of other words that begin with the same six letters. Caldwell called his discovery “minimum spelling,” making it a core part of the first Chinese computer ever built.
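The “crocodile” analogy can be made concrete with a short Python sketch. It is an illustration of the general idea only, not a reconstruction of Sinotype’s circuitry: the function name `minimum_spelling` and the five-word vocabulary are assumptions made for this example, and the real machine worked with stroke sequences rather than English letters.

```python
# Toy vocabulary (an assumption for illustration): words sharing the
# opening letters "croc", so that short prefixes are ambiguous.
VOCABULARY = ["crochet", "crockery", "crocodile", "crocoite", "crocus"]


def minimum_spelling(word, vocabulary):
    """Return the shortest prefix of `word` that matches no other entry.

    If no prefix is unambiguous (for example, when `word` is itself the
    beginning of a longer entry), the full word is returned.
    """
    entries = set(vocabulary)
    for length in range(1, len(word) + 1):
        prefix = word[:length]
        matches = {w for w in entries if w.startswith(prefix)}
        if matches == {word}:
            return prefix
    return word


if __name__ == "__main__":
    prefix = minimum_spelling("crocodile", VOCABULARY)
    saved = len("crocodile") - len(prefix)
    print(f"{prefix!r} retrieves 'crocodile', saving {saved} keystrokes")
```

With this particular vocabulary, six letters are needed because “crocoite” shares the first five; a different word list would give a different minimum, which is why the saving depends on what else is stored in memory.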

Today, we know this technique by a different name: “autocompletion,” a strategy of human-computer interaction in which additional layers of mediation result in faster textual input than the “unmediated” act of typing. Decades before its rediscovery in the Anglophone world, then, autocompletion was first invented in the arena of Chinese computing.
