Pīnyīn News

news and discussions mainly related to Chinese characters and romanization

Taoyuan International Airport to adopt new style for signs

Thu, 06/23/2011 - 04:26

Taoyuan International Airport (or “Taiwan Taoyuan International Airport” as it is called in Taiwan’s official Chinglish form) will be replacing its signage, adopting a new color scheme and typeface.

Currently, the signs in the airport have a black background and yellow or white letters.

The new signs will be modeled after those in the Hong Kong International Airport, with white letters on a blue background. But signs for facilities such as restrooms and restaurants will have white letters on a dark red background. (Perhaps like these?)

Taiwan will also duplicate Hong Kong’s choice of font face: Fang Song (fǎng-Sòngtǐ / 仿宋體). One of the reasons for this is that some Chinese characters — such as for yuán (園) and guó (國) — appear similar if viewed from a distance, according to the president of the Taoyuan International Airport Corp. “Passengers can clearly see the words on the [new] signs even if they view them from 30 meters away,” he added.

The new signs will start to go up in August, with the change scheduled to be complete by the end of 2012.

I’ve made some samples (which, by the way, contain both 園 and 國) in three typefaces to help illustrate the look of Fang Song. Sorry not to have the right color scheme.

DF Fang Song:

DF Kai Sho:

DF Ming:



font samples:

additional material:

By the way, the contrast between the traditional and simplified versions of fǎng-Sòngtǐ (仿宋體 / 仿宋体) is a good illustration that, to the untrained eye, the conversion from one system to another is not necessarily self-apparent.

體 vs. 体

Simplified Chinese characters being purged from Taiwan government sites

Thu, 06/16/2011 - 15:50

Taiwan’s government Web sites have begun removing versions of their content in simplified Chinese characters at the instruction of President Ma Ying-jeou (Mǎ Yīngjiǔ).

This isn’t just a matter of, say, writing “臺灣” (Taiwan) instead of “台灣” (which, yes, the government here is encouraging). This is much bigger. Entire pages, entire Web sites even, written in simplified Chinese characters are being eliminated.

The Tourism Bureau, for example, removed the version of its site in simplified Chinese characters from the Web on Wednesday. This comes at a time when the government’s further lifting of restrictions against individual Chinese tourists is aimed at bringing in more travelers from China.

The Presidential Office’s spokesman quoted Ma as saying “To maintain our role as the pioneer in Chinese culture, all government bodies should use traditional Chinese in official documents and on their Web sites, so that people around the world can learn about the beauty of traditional characters.” (Is that what pioneers do? I’ll try to find the original Mandarin-language quote later if I get a chance.)

It’s one thing to urge businesses not to remove traditional Chinese characters and replace them with simplified Chinese characters (as the government did on Tuesday). It’s quite another to remove alternate versions in another script — one that a very sizable target audience would have an easier time with.

During the administration of President Chen Shui-bian the government began adding versions in simplified Chinese characters of the Mandarin texts of official Web sites. The Office of the President was one such site. Now the simplified version is gone. That’s happening across government sites.

Here, for example, are some screen shots I took.

This was the language/script selection at the National Palace Museum’s Web site as of Thursday morning. (Click to see an image of the entire front page.)

“简体中文” (jiǎntǐ Zhōngwén) is brighter because I had my mouse over it to highlight that text.

And here is the language/script selection at the National Palace Museum’s Web site as of Thursday evening:

As you can see, the choice of viewing the site in simplified Chinese characters has been removed.

Here at Pinyin.Info I often have material in Hanyu Pinyin. So I’m certainly not unsympathetic to the idea that sometimes the medium really is a major part of the message. But I doubt that President Ma’s tough-love approach in this area will accomplish anything useful for Taiwan or the survival of traditional Chinese characters; indeed, I believe it will be counter-productive.

To be more blunt about this, this seems like a really, really bad idea.

some sources:

Google Translate and romaji revisited

Tue, 06/07/2011 - 09:28

OK, Google has improved its Pinyin converter some, though it still fails in important areas. So that’s the present situation for Google and Mandarin.

How about for Google and Japanese?

Professor J. Marshall Unger of the Ohio State University’s Department of East Asian Languages and Literatures generously agreed to reexamine Google’s performance in conversions to rōmaji (Japanese written in romanization).

Below is his latest evaluation.

For his initial analysis (in December 2009), see Google Translate and rōmaji.

I ran the test passage through Google Translate again. There’s some improvement, but it’s still pretty mediocre.

Original: 6日午後4時35分ごろ、東京都千代田区皇居外苑の都道(内堀通り)の二重橋前交差点で、中国からの観光客の40代の男性が乗用車にはねられ、全身を強く打って間もなく死亡した。車は歩道に乗り上げて歩いていた男性(69)もはね、男性は頭を強く打って意識不明の重体。丸の内署は、運転していた東京都港区白金3丁目、会社役員高橋延拓容疑者(24)を自動車運転過失傷害の疑いで現行犯逮捕し、容疑を同致死に切り替えて調べている。
Google Translate: 6-Nichi gogo 4-ji 35-fun-goro, Tōkyō-to Chiyoda-ku Kōkyogaien no todō (uchibori-dōri) no Nijūbashi zen kōsaten de, Chūgoku kara no kankō kyaku no 40-dai no dansei ga jōyōsha ni hane rare, zenshin o tsuyoku Utte mamonaku shibō shita. Kuruma wa hodō ni noriagete aruite ita dansei (69) mo hane, dansei wa atama o tsuyoku utte ishiki fumei no jūtai. Marunouchi-sho wa, unten shite ita Tōkyō-to Minato-ku hakkin 3-chōme, kaisha yakuin Takahashi nobe Tsubuse yōgi-sha (24) o jidōsha unten kashitsu shōgai no utagai de genkō-han taiho shi, yōgi o dō chishi ni kirikaete shirabete iru.

Original: 同署によると、死亡した男性は横断歩道を歩いて渡っていたところを直進してきた車にはねられた。車は左に急ハンドルを切り、車道と歩道の境に置かれた仮設のさくをはね上げ、歩道に乗り上げたという。さくは歩道でランニングをしていた男性(34)に当たり、男性は両足に軽いけが。
Google Translate: Dōsho ni yoru to, shibō shita dansei wa ōdan hodō o aruite watatte ita tokoro o chokushin shite kita kuruma ni hane rareta. Kuruma wa hidari ni kyū handoru o kiri, shadō to hodō no sakai ni oka reta kasetsu no saku o haneage, hodō ni noriageta toyuu. Saku wa hodō de ran’ningu o shite ita dansei (34) niatari, dansei wa ryōashi ni karui kega.

Original: 同署は、死亡した男性の身元確認を進めるとともに、当時の交差点の信号の状況を調べている。
Google Translate: Dōsho wa, shibō shita dansei no mimoto kakunin o susumeru totomoni, tōji no kōsaten no shingō no jōkyō o shirabete iru.

Original: 現場周辺は東京観光のスポットの一つだが、最近はジョギングを楽しむ人も増えている。
Google Translate: Genba shūhen wa Tōkyō kankō no supotto no hitotsudaga, saikin wa jogingu o tanoshimu hito mo fuete iru.


  • The use of numerals dodges a plethora of errors, but “6-Nichi” is still wrong for Muika.
  • Lots of correct capitalizations have been added, but “uchibori” was missed and “Utte” capitalized by mistake.
  • Some false spaces or lack of spaces persist: “hane rare”, “oka reta”; “hitotsudaga” and “niatari” were correctly hitotsu da ga and ni atari in the original test.
  • Names still get butchered (“hakkin” for Shirogane, “nobe Tsubuse” for Nobuhiro).
  • The needless apostrophe in “ran’ningu” is still there.
  • Interestingly, “toyuu” is a new error: it should be to iu.
  • There’s evidence of some attempt to use hyphens, but why not in “kankō kyaku” or “Nijūbashi zen”?

So, to update: Google gets kudos for conscientiousness, but I stick by my original comments.

For more by Prof. Unger, see Pinyin.info’s recommended readings, which includes selections from The Fifth Generation Fallacy: Why Japan Is Betting Its Future on Artificial Intelligence, Literacy and Script Reform in Occupation Japan: Reading Between the Lines, and Ideogram: Chinese Characters and the Myth of Disembodied Meaning.

Google Translate’s Pinyin converter revisited

Thu, 06/02/2011 - 10:49

When Google Translate’s Pinyin converter was first released about a year and a half ago, it sucked. Wow, did it ever suck. Since then, however, Google has instituted some changes. So it seems about time this was reexamined.

Fortunately, Google’s Pinyin converter is now much better than before.

Here’s the sort of FUBAR romanization — it certainly doesn’t deserve to be called Hanyu Pinyin — Google used to produce:

tán zhōng guó de“yǔ“hé” wén” de wèn tí, wǒ jué de zuì hǎo néng xiān liǎo jiè yī xià zài zhōng guó tōng yòng de yǔ yán。… rú guǒ nǐ shǐ yòng zhōng guó de gòng tóng yǔ yán pǔ tōng huà, nǐ liǎo jiě zhè ge yǔ yán de yǔ fǎ(bǐ rú“de, de, de“ hé“le” de bù tóng yòng fǎ) ma?zhī dào zhè ge yǔ yán de jī běn yīn jié(bù bāo kuò shēng diào) zhǐ yǒu408gè ma?

Now the same passage will look like this:

Tán zhōngguó de “yǔ” hé “wén” de wèntí, wǒ juéde zuì hǎo néng xiān liǎo jiè yīxià zài zhōngguó tōngyòng de yǔyán…. Rúguǒ nǐ shǐyòng zhōngguó de gòngtóng yǔyán pǔtōnghuà, nǐ liǎojiě zhège yǔyán de yǔfǎ (bǐrú “de, de, de “hé “le” de bùtóng yòngfǎ) ma? Zhīdào zhège yǔyán de jīběn yīnjié (bù bāokuò shēngdiào) zhǐyǒu 408 gè ma?

At last! Capitalization at the beginning of a sentence and word parsing! But — you knew there was going to be a but, didn’t you? — Google’s Pinyin converter falls significantly short because it still fails completely in two fundamental areas: capitalization of proper nouns and proper use of the apostrophe.

1. Proper Nouns

Google’s Pinyin converter fails to follow the basic point of capitalizing proper nouns. For example, here are some well-known place names. I have prefixed the names with “在” because Google automatically capitalizes the first word in a line; so to see how it handles capitalization of place names something other than the name must go first.

Google Translate gets these right, other than the odd truncation of Chang’an. But the Pinyin converter (see the gray text at the bottom of the image above) fails to capitalize these, even though it correctly parses them as units and thus must “know” their meanings.

The same thing happens with personal names.

Input this:


Google Translate provides this:

Is Ma Ying-jeou
Mao Zedong
Chen Shui-bian

Those are correct, if the missing instances of “Is” are discounted.

But the Pinyin appears as “Shì mǎyīngjiǔ Shì máozédōng Shì chénshuǐbiǎn”. So even though the software understands that these names are units, the capitalization and word parsing are still wrong and they are still not rendered as they should be in Pinyin: “Mǎ Yīngjiǔ,” “Máo Zédōng,” “Chén Shuǐbiǎn.”

There is nothing obscure about capitalizing proper nouns. How did this get missed?
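To make the expected output concrete, here is a minimal Python sketch of how a personal name should come out. It is only an illustration, not a description of how Google's converter works. It assumes the syllables are already segmented and that the split between family name and given name is already known; the function name `format_personal_name` is hypothetical. Each part is written solid and capitalized, with a space between the two.

```python
def format_personal_name(surname_syllables, given_syllables):
    """Join pre-segmented Pinyin syllables into 'Surname Givenname'.

    Assumption: the caller already knows which syllables belong to the
    family name and which to the given name. Apostrophes inside a part
    are not handled here.
    """
    def solid_and_capitalized(syllables):
        # Write the syllables of one part together, then capitalize
        # only the first letter.
        word = "".join(syllables)
        return word[0].upper() + word[1:]

    return (solid_and_capitalized(surname_syllables)
            + " "
            + solid_and_capitalized(given_syllables))


print(format_personal_name(["mǎ"], ["yīng", "jiǔ"]))      # Mǎ Yīngjiǔ
print(format_personal_name(["máo"], ["zé", "dōng"]))      # Máo Zédōng
print(format_personal_name(["chén"], ["shuǐ", "biǎn"]))   # Chén Shuǐbiǎn
```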

2. Apostrophes

The cases of Xi’an and Chang’an above already demonstrate apostrophe omission. Let’s try a few more tests, including some words that are not proper nouns.

Input this:


The Pinyin is rendered as “Āěrbāníyǎ Ránér Rénài Liánǒu” rather than the correct forms of Ā’ěrbāníyǎ, rán’ér, rén’ài, and lián’ǒu.

As always I want to stress that, whatever you might have heard elsewhere, apostrophes are not optional. But the rules for their use are easy — so easy that I suspect a fairly simple computer script could fix this problem quickly and simply. (Only about 2 percent of Mandarin words, as written in Hanyu Pinyin, have apostrophes.)
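As a rough illustration of how simple such a script could be, here is a minimal Python sketch. It assumes the converter already has each word split into syllables (an assumption on my part, though the converter's word parsing suggests it does), and it applies the basic rule: put an apostrophe before any syllable that begins with a, e, or o and is not the first syllable of the word. The function names are hypothetical.

```python
import unicodedata


def starts_with_a_e_o(syllable):
    """True if the syllable begins with a, e, or o, ignoring tone marks."""
    # NFD decomposition separates a base letter from its tone diacritic,
    # so "ě" becomes "e" followed by a combining caron.
    first = unicodedata.normalize("NFD", syllable[0])[0]
    return first.lower() in "aeo"


def join_syllables(syllables, apostrophe="\u2019"):
    """Join the syllables of one word, adding apostrophes where required."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if starts_with_a_e_o(syllable):
            word += apostrophe
        word += syllable
    return word


print(join_syllables(["Ā", "ěr", "bā", "ní", "yǎ"]))  # Ā’ěrbāníyǎ
print(join_syllables(["rán", "ér"]))                  # rán’ér
print(join_syllables(["lián", "ǒu"]))                 # lián’ǒu
print(join_syllables(["pǔ", "tōng", "huà"]))          # pǔtōnghuà
```

The sketch leaves word segmentation and capitalization to other steps; it only shows that the apostrophe rule itself takes a few lines.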

As is the case with the mistakes with proper nouns, these apostrophe errors are all the more puzzling because Google Translate does not appear to share them. Fortunately, these problems should not be particularly difficult to fix, especially if the Pinyin converter can make better use of Google Translate’s database.

Although Google’s failures to implement capitalization of proper nouns and apostrophe use are significant problems, they could likely be corrected quickly and easily. (I strongly suspect this would take considerably less time than it has taken for me to write this post.) The result would be a vastly improved converter. So I am hopeful that Google will work on this soon.

3. Additional work

Once Google gets those basics fixed, it should focus on the simple matter of correcting spacing before and after some quotations (which would surely take just a few minutes to take care of) and any other such spacing errors, and fixing its word parsing related to numbers (which is a bit more complicated, though the basics are easy: everything from 1 to 100 is written solid).

Next would come something requiring a bit more care: the proper handling of Mandarin’s three aspect-marking particles: zhe, guo, and le.

And Google should attach the pluralizing suffix -men to the word it modifies rather than leaving it separate (e.g., háizimen, not háizi men).

Then, with all of those taken care of, Google would have a pretty good Pinyin converter that I would be happy to praise. Of course even then it could still use other improvements; but those would most likely deal more with particulars than the fundamentals of how Pinyin is meant to be written.

A separate post, to be written soon, will compare the performance of several Pinyin converters (including Google’s). Stay tuned.

Oxford Chinese Dictionary goes online

Wed, 06/01/2011 - 15:07

Oxford University Press has just announced that its massive Oxford Chinese Dictionary is now available through its Oxford Language Dictionaries Online subscription service.

I haven’t seen the online version yet myself; but from the publisher’s description it appears to be largely the same as the published edition, whose paucity of Pinyin is disappointing. The publisher, however, is promising that “Pinyin will be added to all Chinese translations” in November, which should be a major step forward.

Perhaps some of you at universities have institutional access. I would welcome reports.

source: What’s New, Oxford Language Dictionaries Online, May 2011.

old fashioned

Wed, 05/25/2011 - 14:16

Here’s a shot of some Hanzified, Mandarinized English I recently came across. Qiǎokèlì (巧克力) is of course a well-established loan word, from the English “chocolate” (though here the English is given in the more Japanese-English form of choco, as befits a Japanese donut chain store in Taiwan). Ōufēixiāng (歐菲香) is a rendering of “old fashioned.” Although the “old” is missing from the English above, it can be seen in both of the tags pictured below.

Bái kěkě ōufēixiāng (白可可歐菲香) and yuán wèi ōufēixiāng (原味歐菲香).

And if that’s not enough to fill you up with Hanzified English, perhaps try a piece of Bōshìdùn pài (波士頓派), i.e., “Boston [cream] pie.”

A clang on the Taipei MRT announcements

Tue, 04/26/2011 - 11:05

People generally don’t listen carefully to the announcements on the Taipei MRT, a subway/elevated train mass-transit system. With four languages to get through (Mandarin, Taiwanese, Hakka, and English), that’s a lot of talking. And the cars can be so full that it’s hard to hear such things clearly over all the background noise anyway. Still, you’d think that at least the people who make the recordings would be paying attention.

Below is a link to a recording of a relatively new announcement, advising people on the Danshui line that Minquan West Road is the place to change trains for the Luzhou line, which opened late last year: “Mínquán West Road Station. Attention: passengers transferring to Sānchóng, Lúzhōu, or Zhōngxiào-Xīnshēng please change trains at this station.”

Or at least what I typed above is what the announcement is supposed to give. As you may have noticed, however, “Zhōngxiào-Xīnshēng” is rendered “Zhongxiao-Xinshang,” with a very un-Mandarin shang that rhymes with the English words clang, pang, hang, and sang. And that’s without getting into the matter of tones.

I pointed out this error to Taipei City Hall and the authorities in charge of the MRT. As usual, I had to spend some time repeatedly explaining: “No, Xinshang is not the English pronunciation of Xīnshēng. Xīnshēng isn’t English. It’s Mandarin. What the announcement gives is simply an error….” I was pleasantly surprised, however, that the main person I spoke to at TRTS did not require the usual explanations. He understood the problem and said it would be fixed.

This, however, was a couple of months ago. The recordings have not yet been changed. I haven’t been holding my breath over this, though, because the official with the MRT system warned that it would take time to run a public bid notice for a new recording, make the new recording, and then install the recording in the front and back cars of some 100 trains. Still, the system has been known to move fairly quickly; unfortunately, this usually happens only when the change is for the worse, such as renaming Xindian City Hall as Xindian City Office (now Xindian District Office), or renaming the whole Muzha line because some superstitious nitwits thought that a joking, non-official nickname was bringing the system bad luck.

For longtime residents of Taipei, the shang mispronunciation will likely bring back memories of the bad old days when the MRT system first opened. Back then the signage was predominantly in bastardized Wade-Giles, with the pronunciations in the English announcements matching what a clueless Westerner might say when shown names like Kuting and Nanking (properly: Gǔtíng and Nánjīng, respectively). Perhaps the most offensive pronunciation on the system then was given to Dànshuǐ, which at the time was [mis]spelled Tamshui on the MRT system. This was pronounced as three syllables: Tam (rhymes with the English word “dam”) + shu (“shoe”) + i (as in “machine”).

By the way, the Xinbei City Government has been changing signs around Danshui from Danshui to the old Taiwanese spelling of Tamsui (note: not Tamshui). But more about that in a different post.

Feichang nankan!

Tue, 04/12/2011 - 11:09

The sign in the photo below has been up for years; but only recently did I finally get a chance to take a halfway decent photo of it. It’s just outside the second terminal of Taiwan’s main international airport and thus is the first example of road signage that many visitors to Taiwan see.

The atrocious typography displayed in how “Nankan” (南崁/Nánkàn) is written is certainly a good introduction to the chabuduo world of Taiwan’s signage.

Truly nánkàn (ugly)!

Conferences in Hawaii

Thu, 03/31/2011 - 06:50

Tomorrow morning I’m off to Honolulu for the Zhang Liqing Memorial International Conference on Hanyu Pinyin. This promises to be a tremendously exciting event, with select scholars from throughout the United States, Asia, and Oceania participating. I’ll have more to say about this after the gathering.

While I’m in Hawaii I may drop in on the joint conference of the Association for Asian Studies and the International Convention of Asia Scholars (March 31–April 3). You might think, though, that with nearly 800 sessions on just about everything under the sun, at least a few of them would discuss romanization. (But nooo.) Still, session 282, “Beyond Cultural Essentialism: Neo-Orientalism in Chinese Studies” (Friday morning, 10:15-12:15), sounds interesting, especially Edward McDonald’s talk on character fetishization in Chinese studies. McDonald’s new book, Learning Chinese, Turning Chinese: Challenges to Becoming Sinophone in a Globalised World, also covers this topic.

If you know of anything else particularly interesting going on at the AAS-ICAS conference or in Honolulu at large, please let me know. (For example, what’s the best bookstore there?)

Spreading the good news

Wed, 03/23/2011 - 13:44

Behold, I bring you good tidings.

As I keep having to note, most of the things that are supposedly in Pinyin are terrible. This is not because Pinyin itself is inherently poor or difficult. It’s because most people who produce such things have a fundamental lack of understanding of Pinyin as a system. (And, yes, that includes most users in China.) So it is with amazement that I report today on a journal that not only offers dozens of pages in Hanyu Pinyin — good Hanyu Pinyin — but does so twice every month. It’s also well worth noting that the journal is aimed primarily at adult native speakers of Mandarin, not foreigners trying to pick up the language, though certainly it could also be read by people in the latter group.

From what I’ve seen so far, this journal gets right the things most commonly written incorrectly elsewhere, including:

And it doesn’t use the atrocious ɑ that some people mistakenly believe is required either.

Unfortunately, punctuation and alphanumerics are not included in the Pinyin. But other than that there’s very little that doesn’t follow standard Pinyin orthography, the main exception being the indication of the tone sandhi related to the special cases of yī and bù (e.g., the journal gives “bú shì” and “búdà” instead of the standard “bù shì” and “bùdà,” and “yìhuíshì” and “yí wèi” instead of the standard “yīhuíshì” and “yī wèi”). That said, though, tone changes related to yi and bu can be something of a pain. So although this isn’t standard, I can see why it was done and am not entirely unsympathetic to this approach.

Here are a few sample lines (click to enlarge):

It would be nice if this were in Unicode, to help aid searches and cutting and pasting. The text, however, appears to have been made in a system devised years ago by the people at the journal. Regardless, I’m happy to see the Pinyin.

Overall, despite the lamentable absence of punctuation and Arabic numerals in the Pinyin, this is quality work, which is perhaps all the more remarkable in that the Pinyin and simplified Hanzi edition of this journal is not truly free to circulate in the land of its target audience. That’s because its publishers are Jehovah’s Witnesses, a group suppressed by the PRC (though it appears that at least at the moment their sites are not blocked by the great firewall). The journal, Shǒuwàngtái, may be more familiar to you by its English name: Watchtower. Whatever you might think of Jehovah’s Witnesses, I hope you’ll recognize the considerable accomplishment of those who put together this publication.

Getting to the Jehovah’s Witnesses Web pages that link to Shǒuwàngtái can be tricky. (Go to the magazines page, select “Chinese (Simplified)” for the language; then choose the month and file with Pinyin.) So I’m providing direct links to some documents below:

I haven’t found any Pinyin editions other than those. Perhaps old ones are taken offline.

With thanks to Victor Mair.

Weishenme Zhongwen zheme TM nan?

Tue, 03/15/2011 - 06:37

David Moser’s essay Why Chinese Is So Damn Hard — which is one of the most popular readings here on Pinyin Info, with perhaps half a million page views to date (nothing to dǎ pēntì at!) — has been translated into Mandarin: Wèishénme Zhōngwén zhème TM nán? (为什么中文这么TM难?). (Gotta love the use of Roman letters there.)

Although the translation has been online for only 24 hours or so, it has already received more than 150 comments.

A suggestion for readers and translators looking for something similar: Moser’s Some Things Chinese Characters Can’t Do-Be-Do-Be-Do.

Ni neibian ji dian?

Sat, 03/12/2011 - 05:04

Here are some photos of a large, elaborate, and no-doubt expensive sundial outside the Nangang high-speed rail station (next door to the Nangang train station and Nangang MRT station).

These were taken at 11 a.m. (The one of the sundial itself was taken on a different day.) But as you can see below, the sundial certainly isn’t indicating the time is 11:00. Rather, it’s pointing toward 9:20 or so.

The disc labeled IX is actually XI (11). I took the photo from a reverse vantage point, so the number is upside down in the photo.

Perhaps whoever erected the main part of the sundial doesn’t know Roman numerals. (Sorry: that’s about as close as this post gets to talking about scripts.) But that wouldn’t account for the dial indicating 9:20 instead of 9:00.

I contacted the Taipei City Government about this. They said to contact the Taiwan High Speed Rail Corporation, which I did. They, in turn, responded that I’d reached the wrong office and should write a different office; but they didn’t forward the message or provide me with the correct e-mail address. Once I’d tracked down another office I e-mailed the folk there. That was more than a week ago. There has been no response.

I spoke with someone at the site who appeared to be in a position of authority. He told me that the sundial hadn’t been adjusted yet and that they would get to it next year. He was too busy to answer any more questions though, such as “Next year?” Also, I suspect that it won’t be easy to rotate that huge thingamajig, so why didn’t they get it right the first time?

Still, at least someone in authority seems to understand there’s a problem.

*For anyone who doesn’t recognize the title of this post, it’s an allusion to the 2001 movie Nǐ nèibiān jǐ diǎn (《你那边几点》/ What Time Is It There?).

Bing Maps for Taiwan

Wed, 03/02/2011 - 11:18

The maps of Taiwan put out by GooGle are plagued with errors in their use of Pinyin. But what about that other big company with deep pockets? You know: Microsoft. How good a job does Microsoft’s Bing do with its maps of Taiwan?

I won’t keep y’all waiting: After examining Bing’s maps of Taiwan the two words that came first to mind were incompetent and atrocious.

The country-level map is odd, offering Wade-Giles. And although the use of the hyphen is irregular, I will give Bing points for getting at least Wade-Giles’ apostrophes right. So, although some place names on the map are decades out of date (e.g., Hsin-chuang, Chungli, Chunan, Kuang-fu), at least they’re not horribly misspelled within that system.

It’s at the street level that Bing’s weirdness becomes most apparent. For example, below is part of Bing’s map of Banqiao.

I added the highlighting.

This tiny but representative fragment of the map has not one but four romanization systems:

  • MPS2: Gung Guang, Min Chiuan, Shin Fu (Even within MPS2, none of those should have spaces or extra capital letters.)
  • Hanyu Pinyin: Banqiao (This is the only properly written place name on this map fragment.)
  • Tongyong Pinyin: Jhancian, Sianmin, Sin Jhan
  • Gwoyeu Romatzyh(!): Shinjann (This is the same road as the one marked “Sin Jhan”. In Hanyu Pinyin, which is what officially should be used here, this is written “Xinzhan”.)

A few more points about this small fragment of the map:

  • Wen Hua could be either MPS2 or Hanyu Pinyin, but not Tongyong Pinyin. And it should be Wenhua.
  • Minan is missing an apostrophe. (It should be Min’an.)
  • Banchiao is just wrong, regardless of the system. They were probably going for MPS2 but erroneously used an o instead of a u: Banchiau.
  • Sec 1 Rd should be Rd Sec. 1.
  • Mrt should be MRT.

So that’s four systems, plus additional errors.

There’s much, much more that’s wrong with this than is right. That’s even more evident on a larger map — and that’s without me bothering to mark orthographic problems in the Pinyin (e.g., Wen Hua instead of the correct Wenhua).

Here bastardized Wade-Giles (e.g., “Mrt-Hsinpu” at top, center — and, FWIW, in the wrong location) has been added to the mix, making a total of five different romanization systems, as well as some weird spellings, e.g., U Nung, Win De, Bah De, Ying Sh — and that’s without including my favorite, JRLE, because that one is correct in MPS2 (“Zhile” in Hanyu Pinyin).

The main point is that the vast majority of names are spelled wrong. And among the few that are spelled correctly, those that are written with correct orthography can be counted on one hand. So, to the words above (incompetent and atrocious) let me add FUBAR.

The copyright statement lists not only Microsoft but also Navteq. The Taiwan maps on the latter company’s site, however, are different from those on Bing. Navteq’s are generally in Hanyu Pinyin, though almost invariably improperly written (e.g., Tai bei Shi, Ban Qiao Shi). And despite the prevalence of Hanyu Pinyin, they still contain other romanization systems (e.g., Jhong Shan) and outright errors (e.g., Shin Jahn).

So an update from Navteq wouldn’t be nearly enough to fix Bing’s problems, which are fundamental.

US grad enrollments in Mandarin fall

Wed, 02/16/2011 - 11:23

Although the number of people studying Mandarin in the United States has continued to rise (more about that in a later post), enrollments there in graduate courses in Mandarin have declined.

No. of U.S. Graduate School Enrollments in Mandarin from 1998 to 2009

Grad School Enrollments in Mandarin as a Percentage of Total U.S. Post-Secondary Enrollments in Mandarin

Here’s something I wrote the last time I addressed this topic.

The much-ballyhooed but also much-deserved increase in students studying Mandarin has all been at the undergraduate level. Given that the grad enrollment as a percentage of total enrollment for Mandarin is about the same as that for French (2.63 percent and 2.73 percent, respectively) it might appear that Mandarin has simply reached a “normal” ratio in this regard. But native speakers of English generally need much more time to master Mandarin than to master French. Simply put, four years, say, of post-secondary study of French provides students with a much greater level of fluency than four years of post-secondary study of Mandarin.

Also, there is a great deal more work that needs to be done in terms of translations from Mandarin. I do not at all mean to belittle the work being done in French — or in any other language…. I just mean that Mandarin has historically been underrepresented in U.S. universities given the number of speakers it has and its body of texts that have not yet been translated into English. U.S. universities need to be producing many more qualified grad students who can handle this specialized work. And right now, unfortunately, that’s not happening.

That still holds, except that grad enrollment as a percentage of total enrollment for Mandarin is even lower than before (1.96% vs. 2.37% for French, 1.99% for Spanish, and an impressive 4.68% for Korean).


Banqiao — the Xinbei ways

Wed, 02/09/2011 - 11:09

Xinbei, formerly known as Taipei County and now officially bearing the atrocious English name of “New Taipei City,” has made available an online map of its territory.

Interestingly, the map is available not just in Mandarin with traditional Chinese characters and English with Hanyu Pinyin (most of the time — but more on that soon) but also in Mandarin with simplified Chinese characters. A Japanese interface is also available.

The interface for all versions opens to a map centered on Xinbei City Hall. What struck me upon seeing this for the first time was that, in just one small section, Banqiao is spelled four different ways:

  • Banqiao (Hanyu Pinyin)
  • Panchiao (bastardized Wade-Giles)
  • Ban-Chiau (MPS2, with an added hyphen)
  • Banciao (Tongyong Pinyin)

Click the map to see an enlargement.

I want to stress that these are not typos. These are the result of an inattention to detail that is all too common here.

The spelling for the city, er, district is also wrong in the interface, with Tongyong used. Since Banqiao is the seat of the Xinbei City Government and has more than half a million inhabitants,* it’s not exactly so obscure that spelling its name correctly should be much of a challenge. Tongyong and other systems also crop up in some other names outside the interface.

It should be admitted, however, that the Xinbei map’s romanization is still better overall than the error-filled mess issued by GooGle.

*: including me

China and U.S. study abroad programs

Mon, 02/07/2011 - 09:25

China remained the fifth most popular destination for U.S. students studying abroad during the 2008/09 school year, and it continued to account for 5 percent of U.S. study abroad.

In the previous academic year, study in the PRC increased 19.0 percent, while study abroad as a whole increased 8.5 percent. But for 2008/09 growth for China was a much smaller 3.9 percent, while the total worldwide figure declined 0.8 percent. Figures for the top four destinations also dropped.

The order of the top 10 remained the same as in the previous year, except Mexico and Germany switched places.

Top 10 destinations for study abroad by U.S. students in the 2006-07, 2007-08, and 2008-09 school years

Some other figures of possible interest:

  • Japan was in 11th place with 5,784 students, a 1.3 percent increase over the previous year.
  • Taiwan’s total grew 3.3 percent to 597.
  • Hong Kong grew 5.7 percent to 1,155.
  • South Korea grew a dramatic 29.1 percent to 2,062.
  • Singapore grew 7.7 percent to 612.
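For readers who want a rough sense of the prior-year totals behind these percentages, here is a minimal sketch of the arithmetic. The function name is hypothetical, and because the growth rates above are rounded to one decimal place, the results are only approximations, not figures from the Open Doors data.

```python
def implied_previous_total(current, growth_percent):
    """Back out last year's total from this year's total and the growth rate."""
    return current / (1 + growth_percent / 100)


# Totals and growth rates as given in the list above.
figures = {
    "Japan": (5784, 1.3),
    "Taiwan": (597, 3.3),
    "Hong Kong": (1155, 5.7),
    "South Korea": (2062, 29.1),
    "Singapore": (612, 7.7),
}

for place, (total, growth) in figures.items():
    previous = implied_previous_total(total, growth)
    print(f"{place}: roughly {previous:.0f} students the year before")
```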

Study in Asia increased slightly.

Percent of study abroad performed in Asia

source: Open Doors data portal

Going south with official Taiwan map

Tue, 02/01/2011 - 14:48

In the past, when I found romanization errors in official government documents I often contacted the agencies in charge so they could make improvements. But as those who live in Taiwan may have noted, this practice has had limited success. And in the process I’ve built up a great deal of bile from encountering bureaucratic roadblocks to fixing mistakes. So is it any wonder that when I see things like this map, I often think, “Wǒ hǎo xiǎng tù.” Maybe now it’s time to start going with that feeling — metaphorically speaking. And what could be more appropriate, given that we are about to have a tùnián? (I know, I know: That pun’s probably not going to make any of the New Year cards.)

So today I’ll post in public about one such mess. I recently looked over a map of southern Taiwan issued by Taiwan’s official Tourism Bureau and was not surprised to find errors — a lot of errors. (This particular map was published in June 2010 and is, as far as I know, the most recent edition.)

Most of the errors are cases of remnants of Tongyong Pinyin (e.g., Cingshuei for what is written Qingshui in Hanyu Pinyin). Oddly, on this map Tongyong Pinyin is often seen in only part of a name (e.g., what is written 豐丘 in Chinese characters is given as Fengciou, which has Hanyu Pinyin’s Feng rather than Tongyong’s Fong but Tongyong’s ciou rather than Hanyu’s qiu).

What at first glance would appear to be another example of this mixing is Xizih, a bay next to Gaoxiong. There being no xi in Tongyong Pinyin and no zih in Hanyu Pinyin, one might guess this should be Xizi. But in fact this should be Sizi (written Sihzih in Tongyong). Or is there also a typo in the Chinese characters (四子灣), so that it should be something else?
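This kind of mixing can be checked mechanically once a name has been split into syllables. What follows is a minimal sketch, not a converter: the function names are hypothetical, the marker lists cover only some of the spellings that differ between the two systems, and the syllable boundaries are assumed to be supplied by hand.

```python
def syllable_system(syllable):
    """Guess which system one syllable belongs to.

    Returns "hanyu", "tongyong", or "either". The markers are a partial,
    illustrative list, not a full description of either system.
    """
    s = syllable.lower()
    # Tongyong Pinyin has no x, q, or zh, and writes fong where Hanyu has feng.
    if s.startswith(("x", "q", "zh")) or s == "feng":
        return "hanyu"
    # Spellings found only in Tongyong: jh-, sy-/cy-/jy-, -ih, -iou, -uei, fong.
    if (s.startswith(("jh", "sy", "cy", "jy"))
            or s.endswith(("ih", "iou", "uei"))
            or s == "fong"):
        return "tongyong"
    return "either"


def name_system(syllables):
    """Classify a whole name as "hanyu", "tongyong", "mixed", or "either"."""
    systems = {syllable_system(s) for s in syllables} - {"either"}
    if len(systems) == 2:
        return "mixed"
    return systems.pop() if systems else "either"


# Names taken from the map discussion, pre-split into syllables.
for name in (["xi", "zih"], ["feng", "ciou"], ["qing", "shui"],
             ["cing", "shuei"], ["ken", "ding"]):
    print("".join(name).capitalize(), "->", name_system(name))
```

Working on pre-split syllables is deliberate: a plain substring search would misfire on names such as Xihu, where the letters i and h merely meet at a syllable boundary.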

Other errors are even more mysterious, such as Tainan’s “Eternal For Cves” for 億載金城 (yì zǎi jīnchéng). I suspect they were going for “Eternal Fortress” but got lost somewhere along the way.

I estimate the map has about 100 errors. Of course, here I’m referring to just the map side itself and not the text on the reverse, which is filled with similar mistakes. Also, it’s just for southern Taiwan. The other two or three maps needed to cover most of the country likely each have just as many mistakes or more.

Turning back to the map at hand, here are some errors in just the area covering the southern tip of Taiwan (map sections C8 and C9).

On the map | Should be
--- | ---
Haikau Desert | Haikou Desert
Kenting National Forest Recreation Area | Kending National Forest Recreation Area
Kenting National Park | Kending National Park
Kenting National Park Administration | Kending National Park Administration
Natural Center | Nature Center
Ping-e | Ping’e
(Shizih) | (Shizi)
Shuangliou | Shuangliu
Sihchongxi | Sichongxi
Sihchong River | Sichong River
Sihchongxi Hot Springs | Sichongxi Hot Springs
Syuhai | Xuhai
Syuhai Hot Springs | Xuhai Hot Springs
Syuhai Prairie | Xuhai Prairie

Keep in mind that more than half of the area in the sections above is water and thus lacking in any place names that could be misspelled.

I should note that Kenting for what should be Kending appears to be what might be labeled an official error — another case of the government mistakenly believing that using old, misleading spellings from the days of bastardized Wade-Giles is necessary lest foreigners be confused. (The worst examples of this are the names of counties and many cities, such as Taichung rather than Taizhong, Pingtung rather than Pingdong, Hualien rather than Hualian, and Chiayi rather than Jiayi.) But if Kenting somehow ended up being official, then the map is still wrong, because the correct Hanyu Pinyin spelling “Kending” (which is also the correct spelling in Tongyong Pinyin) is also seen.

In short, this map is, regrettably, another example of the Taiwan government’s failure to maintain quality control in its use of romanization. It’s been said before but perhaps it needs to be said again: It’s a sad state of affairs when a country can’t manage even the simple task of correctly spelling the names of its own towns and special attractions on its own maps — not that anyone else has managed to get their maps of Taiwan correct either; and some that should be good remain awful. (Yeah, I’m talking about you, GooGle.)

Wenlin releases major upgrade (4.0)

Mon, 01/31/2011 - 11:04

One of my favorite programs, Wenlin (which bills itself as “software for learning Chinese”), has just released a major upgrade for both Mac and Windows versions. This doesn’t happen often; it has been three-and-a-half years since the most recent big change was issued (Wenlin 3.4) and heaven only knows how long since 3.0 came out. So, yes, this release has many substantial improvements.

One of the features nearest and dearest to my heart is that Wenlin 4.0 features greatly improved handling of Pinyin. I was among the field testers for the new version, so I’ve already spent a lot of time examining this feature. Here are a few important aspects of this:

  • Conversions from Chinese characters follow Hanyu Pinyin orthography much more closely than before. This is a major change for the better. (There’s still some room for improvement. But I don’t think we’ll have to wait years for this.)
  • In the past, using Wenlin to convert long texts in Chinese characters into Pinyin could be a real chore, with users having to examine example after example of Chinese characters with multiple pronunciations in order to select the proper pronunciation for that particular context. But now users may, if they so desire, tell Wenlin not to ask users for disambiguation input. Of course, that doesn’t mean that Wenlin will always guess right; but many users will be happy that this trade-off allows them to skip the frustration of, for example, having to tell the program over and over and over that, yes, in this case 說 is pronounced shuō rather than shuì.
  • Relative newcomers to Mandarin may appreciate that for common words tone sandhi is indicated in Wenlin with additional marks (a dot or line below the vowel). This feature can also be turned off, for those who want standard Pinyin.

There are, of course, many improvements beyond the area of Pinyin. Here are a few:

  • One limitation of Wenlin 3.x was that its English dictionary wasn’t very large. But Wenlin 4.0 includes not only the ABC Chinese-English Comprehensive Dictionary but also the excellent new ABC English-Chinese, Chinese-English Dictionary (now finally in stock in the printed version).
  • The flashcards are now set up to handle not just individual characters but polysyllabic words.
  • There’s full Unicode Unihan 6.0 support for more than 75,000 Chinese characters.
  • And for those who think 75,000 just isn’t enough, users can now access Wenlin’s CDL technology. Through this, users can create new, variant, and rare characters; moreover, these can be published and shared with other Wenlin users or CDL-friendly devices.
  • Seal script versions of more than 11,000 characters are provided.
  • Wenlin contains an e-edition of the Shuowen Jiezi (Shuōwén Jiězì / 說文解字 / 说文解字).
  • Coders will be interested to know that Wenlin appears to be headed toward becoming open-source.
  • Both Mandarin and English entries are marked with grade levels, which aids learners by indicating relative frequency of use. The levels for Mandarin words are based on the Hanyu Shuiping Kaoshi (Hànyǔ Shuǐpíng Kǎoshì / 汉语水平考试 / 漢語水平考試 / HSK).

The full version (i.e., the CD with the program comes in a box and is likely packaged with a hard copy of the manual) is US$199, or US$179 if you download it from the Wenlin Web store. Upgrades from 3.x cost US$49.

For more information, see the summary of features and outline of what’s new in Wenlin 4.0.

Xin Tang no. 1: articles in Gwoyeu Romatzyh

Sun, 12/19/2010 - 12:47

I’ve just put up another issue of Xin Tang.

As you may have noticed already, the name on the cover is given not as Xin Tang but as Shin Tarng. That’s because the journal started out being published in the Gwoyeu Romatzyh romanization system. But using the Hanyu Pinyin spelling here helps me keep track of these better.

Almost all of this issue is in Mandarin written in Gwoyeu Romatzyh. One article also has an en face translation into English. And as is the case with the other issues of Xin Tang, a variety of topics are covered.

Shin Tarng no. 1 (September/Jiǔyuè 1982)

Hanyu Pinyin Cihui

Thu, 12/02/2010 - 15:39

Today, for all you orthography junkies (Hello? Hello? Anybody there?), I have added a selection from the 1963 edition of Hanyu Pinyin Cihui (汉语拼音词汇 / Hànyǔ Pīnyīn Cíhuì).

The book, which is fully alphabetized by Hanyu Pinyin (i.e., like the ABC dictionary series, not like the Hanzi-by-Hanzi Pinyin ordering seen in most dictionaries published in the PRC), is a long list of Mandarin words as written in Hanyu Pinyin and Chinese characters. It’s meant as a reference for word division and other such orthographic concerns. It’s the sort of thing that just cried out to have been made into a full dictionary (especially since that’s what it looks like, minus definitions); but, unfortunately, it never was. But it was an important influence on the ABC series.
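The difference between the two orderings is easy to see with a handful of words. The sketch below is a simplification under stated assumptions: tones are ignored, words are given as hand-split syllables, and the function names are hypothetical. Real dictionaries of either kind apply further tie-breaking rules.

```python
# Toneless syllables for 安全 (anquan), 阿姨 (ayi), 爱人 (airen), 爱国 (aiguo).
words = [("an", "quan"), ("a", "yi"), ("ai", "ren"), ("ai", "guo")]


def fully_alphabetical(entries):
    """Order by the whole word's spelling, letter by letter."""
    return sorted(entries, key=lambda syllables: "".join(syllables))


def head_syllable_first(entries):
    """Order by the first syllable, then by the following syllables.

    This approximates grouping words under their head character.
    """
    return sorted(entries)  # tuples compare syllable by syllable


print(["".join(w) for w in fully_alphabetical(words)])
print(["".join(w) for w in head_syllable_first(words)])
```

In the first ordering ayi comes last, because y follows i and n; in the second it comes first, because the syllable a sorts ahead of ai and an.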

One can see some interesting instances of differences between Pinyin orthography then and now. For example, in this old edition of Hanyu Pinyin Cihui de tends to be appended to words and written as d, e.g. ái’áid, rather than the current ái’ái de (皚皚的). Similarly, zi is written z at the end of a word, e.g. ǎigèz, rather than the current ǎigèzi (矮个子).

Also interesting is the mixed use of simplified and traditional Chinese characters. (It will be easier to see what I’m referring to if you open the PDF file of the introduction and A’s of Hanyu Pinyin Cihui.) The title on the cover is given as 汉语拼音词汇 in Chinese characters — perfectly standard. But below this is 增訂稿 (zēngdìng gǎo / revised edition); note how dìng is written as 訂 rather than as 订.

More striking, though, for the modern reader is the script in the foreword. Here, what was written 汉语拼音词汇 on the cover is written 汉語拼音詞汇, mixing traditional and simplified forms. The full traditional version of this would be written 漢語拼音詞彙. The text of the introduction is similarly mixed. This is because this was published before many simplified forms that are now standard were fully accepted officially.
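Spotting this sort of mixing by machine takes nothing more than a table of character pairs. The following is a minimal sketch: the table is a tiny hand-made sample limited to characters that come up on this site, and the function name is hypothetical. A real check would need a full conversion table and would have to deal with one-to-many mappings (汇, for instance, can correspond to more than one traditional character).

```python
# Illustrative sample only: simplified form -> traditional form.
SIMPLIFIED_TO_TRADITIONAL = {
    "汉": "漢",
    "语": "語",
    "词": "詞",
    "汇": "彙",
    "订": "訂",
    "体": "體",
}
TRADITIONAL = set(SIMPLIFIED_TO_TRADITIONAL.values())


def script_of(text):
    """Report which character set the text draws on, per the sample table."""
    has_simplified = any(ch in SIMPLIFIED_TO_TRADITIONAL for ch in text)
    has_traditional = any(ch in TRADITIONAL for ch in text)
    if has_simplified and has_traditional:
        return "mixed"
    if has_simplified:
        return "simplified"
    if has_traditional:
        return "traditional"
    return "undetermined"  # only characters shared by both sets, or unknown


# 汉語 is a made-up mixed sample; the others appear in the post.
for sample in ("汉语拼音词汇", "增訂稿", "漢語拼音詞彙", "汉語"):
    print(sample, "->", script_of(sample))
```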

The selection from this book here on Pinyin.info comprises the introduction and all of the entries beginning with the letter a.