Big scary blocks of Chinese text. They are intimidating as heck. They make learning and reading Chinese seem harder than it is. Sometimes just the sight of blocks of Chinese is enough to make one's head hurt.

Even for me. I'll gladly watch a movie or TV show with Chinese subtitles that use short sentences, but when presented with a long letter or article to read, I'll hesitate (even though I can read it perfectly fine).

Over the past few years, a lot of work has gone into making learning Chinese characters and learning to read Chinese easier. Besides better tools like Chinese dictionaries, Chinese readers, annotators, and SRS programs, there have been a few novel ideas that might really make a difference.

While working on 3000 Hanzi's Chinese reader, I decided to add a feature that Chinese language purists might find offensive: spaces between words. I did so because a lack of spaces (and word boundaries) makes reading Chinese harder for Chinese learners.

Obviously, I'm not a purist. I'm of the opinion that adding spaces between words is a potential killer feature for Chinese learners. I've even started adding spaces between Chinese words on all the posts I create.

Benefits of Spaces in Chinese

  1. Easier Readability: Adding word boundaries lowers the cognitive load for reading Chinese. Without spaces, one must figure out where words begin and end at the same time as one figures out what the text means. It makes reading--already a difficult task--harder.
  2. More accurate Natural Language Processing (NLP): Before starting any NLP with Chinese text, you have to segment the text first. It's a hard problem to solve well: there are fast methods that aren't very accurate (~90%), and there are slower methods that are more accurate (~94-97%), but no method is perfect. If spaces were built into Chinese, there would be no need for segmenters and NLP software would instantly become more accurate.
  3. Easier for students: Spaces make understanding words and sentences easier. When I just started learning Chinese, I spent lots of time looking up "words" that didn't exist because I didn't fully understand a sentence. Spaces would prevent these types of mistakes and allow learners to more efficiently acquire vocabulary.
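
To make the segmentation problem in point 2 concrete, here is a minimal sketch of one of the fast-but-imperfect approaches, greedy forward maximum matching. It is not how 3000 Hanzi's reader works; the function names, the tiny lexicons, and the `max_len` default are all made up for illustration, and a real segmenter would use a large dictionary plus statistics.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest string found in `lexicon`
    (a hypothetical word set); fall back to a single character.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def add_spaces(text, lexicon, max_len=4):
    """Return `text` with a space between each segmented word."""
    return " ".join(segment(text, lexicon, max_len))


# Toy lexicons, assumed for this example only.
EASY = {"我", "喜欢", "学习", "中文"}
TRICKY = {"研究", "研究生", "生命", "起源"}

print(add_spaces("我喜欢学习中文", EASY))    # 我 喜欢 学习 中文
print(add_spaces("研究生命起源", TRICKY))    # 研究生 命 起源
```

The second line shows why this kind of method tops out below 100%: the intended reading is 研究 生命 起源 ("research the origin of life"), but the greedy match grabs 研究生 ("graduate student") first and leaves 命 stranded. Text written with spaces would never need this guess.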

Why Chinese shouldn't have spaces

  1. What's a word? The concept of a word is very fuzzy in Chinese. Where does a word begin and end? In English, it's a pretty straightforward question. But if you asked a bunch of Chinese people what constitutes a word, they'd probably have trouble telling you. Part of their difficulty comes from the fact that Chinese people aren't used to seeing their words separated by spaces.
  2. Tradition: Chinese has done well without spaces for over 3000 years. If it didn't need them before, why does it need them now? Then again, keeping something the same just because it's the way it's always been isn't a good reason.
  3. It looks nicer without spaces. Maybe. Maybe not. Personally, I could easily get used to Chinese with spaces.

How to go about adding spaces to Chinese?

IMEs need to take charge. If Sogou's IME started adding spaces to Chinese tomorrow, spaces would be commonplace by the end of the week. After a few months, even native Chinese speakers would be used to the spaces.

It might seem impossible, but such changes have occurred before. 150 years ago, Chinese didn't have punctuation. Period. (bad pun?) Now Chinese would be strange without it. If Chinese added spaces, I believe the same thing would happen.

Do you think Chinese should have spaces? Why or why not? Let me know in the comments.
