Pīnyīn News

news and discussions mainly related to Chinese characters and romanization

Google Translate’s Pinyin converter: now with apostrophes

Sat, 10/01/2011 - 08:40

Google has taken another major step toward making Google Translate’s Pinyin converter decent. Finally, apostrophes.

Not long ago “阿爾巴尼亞然而仁愛蓮藕普洱茶” would have yielded “Āěrbāníyǎ ránér rénài liánǒu pǔěr chá.” But now Google produces the correct “Ā’ěrbāníyǎ rán’ér rén’ài lián’ǒu pǔ’ěr chá.” (Well, one could debate whether that last one should be pǔ’ěr chá, pǔ’ěrchá, Pǔ’ěr chá, Pǔ’ěr Chá, or Pǔ’ěrchá. But the apostrophe is undoubtedly correct regardless.)
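The apostrophe rule at work in those examples can be sketched in a few lines of Python. This is only an illustration, not a description of how Google's converter works: it assumes the text has already been split into words and syllables (the genuinely hard part), and the function name `join_syllables` is made up for this example.

```python
import unicodedata

APOSTROPHE = "\u2019"  # the typographic apostrophe used in the examples above


def join_syllables(syllables):
    """Join the syllables of one word, adding an apostrophe before any
    non-initial syllable that begins with a, e, or o.

    Assumes each syllable is a non-empty string.
    """
    parts = []
    for index, syllable in enumerate(syllables):
        # NFD splits a tone-marked vowel into base letter + combining mark,
        # so the first code point is the bare letter.
        first = unicodedata.normalize("NFD", syllable)[0].lower()
        if index > 0 and first in "aeo":
            parts.append(APOSTROPHE)
        parts.append(syllable)
    return "".join(parts)


print(join_syllables(["rán", "ér"]))
print(join_syllables(["péng", "you", "men"]))
```

Only syllables after the first one are checked, which is why a word-initial Ā gets no apostrophe in front of it.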

Also, the -men suffix is now solid with words (e.g., 朋友們 –> péngyoumen and 孩子們 –> háizimen). This is a small thing but nonetheless welcome.

The most significant remaining fundamental problem is the capitalization and parsing of proper nouns.

And numbers are still wrong, with everything being written separately. For example, “七千九百四十三萬五千六百五十八” should be rendered as “qīqiān jiǔbǎi sìshísān wàn wǔqiān liùbǎi wǔshíbā.” But Google is still giving this as “qī qiān jiǔ bǎi sì shí sān wàn wǔ qiān liù bǎi wǔ shí bā.”

On the other hand, Google is starting to deal with “le”, with it being appended to verbs. This is a relatively tricky thing to get right, so I’m not surprised Google doesn’t have the details down yet.

So there’s still a lot of work to be done. But at least progress is being made in areas of fundamental importance. I’m heartened by the progress.


Kindles and Pinyin

Fri, 09/30/2011 - 04:37

Sure, Amazon Kindles can store thousands of books, play mp3 files, provide Web access, and allow one to spend money on books with alarming ease. But can they handle Pinyin?

Yes!

This test was made on a Kindle 3 purchased at a U.S. retail store. All three typefaces — regular, condensed, and sans serif — worked well.

Yes, Kindles can display Hanzi as well — though there may be some problems with those appearing correctly in book titles in the device’s index.

Below are links to my files, in case you want to test this yourself. I’d appreciate hearing about how Nook and other devices handle this. Thanks.

Script font for Pinyin

Wed, 09/21/2011 - 15:30

Unfortunately, relatively few fonts support Hanyu Pinyin (with tone marks, that is). So I was surprised to come across Pecita, by Philippe Cochy. This is the first script typeface I recall seeing that covers Pinyin … and a lot more.

It might be too individualistic for much Pinyin use. But I’m very glad to know it exists and hope to see many more creations like it.

Pecita is licensed under the SIL Open Font License, Version 1.1.


Google Web fonts and Hanyu Pinyin

Wed, 09/21/2011 - 11:30

Back in the last century, getting Web browsers to correctly display Pinyin was such a troublesome task that I remember once even employing GIFs of first- and third-tone letters to get those to look right. So there were a whole lotta IMG tags in my text. Sure, I put the necessary info in ALT tags (e.g., “alt='a3'”), just in case. But, still, I shudder to recall having to resort to that particular hack.

Things are better now, though still far from ideal. Something that promises to considerably improve the situation of website viewers not all having the same font you may wish to use is CSS3’s @font-face, which allows those creating Web pages to employ fonts that are provided online. Google is helping with this through its Google Web Fonts. (Current count: 252 font families.)

But is anything in Google’s collection capable of dealing with Hanyu Pinyin? Armed with a handy-dandy Pinyin pangram, I had a look at what Google has made available.

Not surprisingly, most of the 29 font families marked as offering the “Latin Extended” character set failed to handle the entire Hanyu Pinyin set. The ǘǚǜ group is the most likely to be unsupported at present, with third-tone vowels also frequently missing.
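For those who would rather test a font programmatically than by eyeballing a pangram, here is a minimal Python sketch. It assumes you already have the set of Unicode code points the font covers (for example, the keys of a font's character map as read by a font library); the names `PINYIN_CHARS` and `missing_pinyin_chars` are invented for this example, and only lowercase letters are checked.

```python
# Lowercase letters needed for Hanyu Pinyin with tone marks:
# a-z minus v, plus u-umlaut, plus every tone-marked vowel.
PINYIN_CHARS = (
    "abcdefghijklmnopqrstu\u00fcwxyz"          # base letters (u00fc = u-umlaut)
    "\u0101\u00e1\u01ce\u00e0"                 # a with tones 1-4
    "\u0113\u00e9\u011b\u00e8"                 # e with tones 1-4
    "\u012b\u00ed\u01d0\u00ec"                 # i with tones 1-4
    "\u014d\u00f3\u01d2\u00f2"                 # o with tones 1-4
    "\u016b\u00fa\u01d4\u00f9"                 # u with tones 1-4
    "\u01d6\u01d8\u01da\u01dc"                 # u-umlaut with tones 1-4
)


def missing_pinyin_chars(supported_codepoints):
    """Return the Pinyin letters (sorted by code point) that are absent
    from the given collection of supported code points."""
    supported = set(supported_codepoints)
    return sorted(ch for ch in PINYIN_CHARS if ord(ch) not in supported)


# Hypothetical font that covers everything except three u-umlaut forms.
coverage = {ord(ch) for ch in PINYIN_CHARS} - {0x01D8, 0x01DA, 0x01DC}
print(missing_pinyin_chars(coverage))
```

An empty result means the font has a glyph slot for every letter; it says nothing about whether the tone marks are well drawn, so a visual check is still worthwhile.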

Here are the Google Web fonts that do support Hanyu Pinyin with tone marks:

Serifs

  • EB Garamond (227 KB)
  • Gentium Basic (263 KB — and about the same for each of the three accompanying styles: italic, bold, bold italic)
  • Gentium Book Basic (267 KB — and about the same for each of the three accompanying styles: italic, bold, bold italic)
  • Neuton (56 KB — and about the same for each of the five accompanying styles: italic, bold, light, extra light, extra bold)

Note:

  • Neuton has relatively weak tone marks, so I wouldn’t recommend it for Web pages aimed at beginning students of Mandarin.

Sans Serifs

  • Andika (1.4 MB)
  • Ubuntu (350 KB) — available in eight styles

Some Ubuntu sample PDFs: Ubuntu regular, Ubuntu italic, Ubuntu bold, Ubuntu bold italic, Ubuntu light, Ubuntu light italic, Ubuntu medium, Ubuntu medium italic.

Andika sample PDF.

Note:

  • Andika’s relatively large size (1.4 MB) makes it unsuitable for @font-face use because of download time. (Its license, however, would permit someone with the time and energy to crack it open and remove lots of the glyphs not needed for Pinyin, thus reducing the size.) More fundamentally, though, I don’t much like the look of it; but YMMV.

Since Google is likely to expand the number of fonts it offers, I’m including the list of all 29 faces I tried for this experiment, which should make it easier for those wanting to test only new fonts. (It is possible, however, that Pinyin support will be added later to some fonts that fail in this area now. If anyone hears of any such changes, please let me know.) Use of bold indicates Pinyin support; everything else failed.

Display Faces with Latin Extended (all fail)

  • Abril Fatface
  • Forum
  • Kelly Slab
  • Lobster
  • MedievalSharp
  • Modern Antiqua
  • Ruslan Display
  • Tenor Sans

Handwriting Faces with Latin Extended (all fail)

  • Patrick Hand

Serif Faces with Latin Extended

  • Cardo
  • Caudex
  • EB Garamond
  • Gentium Basic
  • Gentium Book Basic
  • Neuton
  • Playfair Display
  • Sorts Mill Goudy

Sans Serif Faces with Latin Extended

  • Andika
  • Anonymous Pro
  • Anton
  • Didact Gothic
  • Francois One
  • Istok Web
  • Jura
  • Open Sans
  • Open Sans Condensed
  • Play
  • Ubuntu
  • Varela

Additional resource: SIL Fonts for downloading (including the full versions of Andika and Gentium).

Taiwanese romanization used for Hanzi input method

Wed, 09/14/2011 - 12:40

Since I just posted about the new Hakka-based Chinese character input method, I would be remiss not to note as well the release early this year of a different Chinese character input method based on Taiwanese romanization.

This one is available in Windows, Mac, and Linux flavors.

See the FAQ and documents below for more information (Mandarin only).

Táiwān Mǐnnányǔ Hànzì shūrùfǎ 2.0 bǎn xiàzài (臺灣閩南語漢字輸入法 2.0版下載) [Readers may wish to note the use of Minnan, which is generally preferred among unificationists and some advocates of Hakka and the languages of Taiwan's tribes.]

source: Jiàoyùbù Táiwān Mǐnnányǔ Hànzì shūrùfǎ (教育部臺灣閩南語漢字輸入法); Ministry of Education, Taiwan; June 16, 2010(?) / February 14, 2011(?) [Perhaps the Windows and Linux versions came first, with the Mac version following in 2011.]

Hakka romanization used for new Hanzi input method

Wed, 09/14/2011 - 11:34

Taiwan’s Ministry of Education has released software for Windows and Linux systems that uses Hakka romanization for the inputting of Chinese characters.

This appears to be aimed mainly at those who wish to input Hanzi used primarily in writing Hakka, such as that shown here.

See also Taiwanese romanization used for Hanzi input method.


Pinyin pangram challenge

Thu, 09/08/2011 - 10:03

One of the many things I plan to do eventually is to put up some graphics of how Pinyin looks in various font faces. A Pinyin pangram would do nicely for a sample text. You know: a short Mandarin sentence in Hanyu Pinyin that uses all of the following 26 letters: abcdefghijklmnopqrstuüwxyz (i.e., the English alphabet’s a-z, minus v but plus ü).

But then I couldn’t find one. So I put the question out to some people I know and quickly got back two Pinyin pangrams.

Ruanwo bushi yingzuo; putongfan bushi xican; maibuqi lüde kan jusede. (57 letters)

and

Zuotian wo bang wo de pengyou Lü Xisheng qu chengli mai yi wan doufuru he ban zhi kaoji. (70 letters)

from Robert Sanders and Cynthia Ning, respectively.

James Dew weighed in with some helpful advice. And, with some additional help from the original two contributors and my wife, I made some additional modifications, eventually resulting in a variant reduced to 48 letters:

Zuotian wo bang nü’er qu yi jia chaoshi mai kele, xifan, doupi.

With tone marks, that’s “Zuótiān wǒ bāng nǚ’ér qù yī jiā chāoshì mǎi kělè, xīfàn, dòupí.”

I suppose xīfàn is not really the sort of thing one buys at a chāoshì. On the other hand, people probably don’t worry much about whether jackdaws really do love someone’s big sphinx of quartz, so I think we’re OK. Still, something shorter than 48 letters should be possible — though pangram-friendly brevity is more easily accomplished in English than in Mandarin as spelled in Hanyu Pinyin. As one correspondent noted:

Most of the “excess” letters are vowels. Trouble is that Chinese doesn’t pile up the consonants much. Brown, for example, takes care of b, r, w, and n, while only expending one little o…. There’s no word like string in Chinese (5 consonants; one vowel). Chinese piles up vowels: zuotian and chaoshi and doufu and kaoji all use more vowels than consonants.

I’m challenging readers to come up with more Pinyin pangrams.

But I don’t want this to be a reversed shi shi shi stunt, so let’s stay away from Literary Sinitic. And I’d prefer the equivalent of “The quick brown fox jumps over the lazy dog” to that of “Cwm fjord veg balks nth pyx quiz.” In other words, wherever possible this should be in real-world, sayable Mandarin.

One possible variant on this would be to use “abcdefghijklmnopqrstuüwxyz” plus all the forms with diacritics: “āáǎàēéěèīíǐìōóǒòūúǔùǘǚǜ.” (No ǖ is necessary.) But that would be even more work.
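For anyone taking up the challenge, a small checker may save some counting. The Python sketch below tests a sentence against the 26-letter Pinyin alphabet described above and counts its letters; tone marks are stripped first, so it works on text with or without them. The function names are my own invention for this example.

```python
import unicodedata

# a-z minus v, plus u-umlaut (written here as the escape \u00fc).
PINYIN_ALPHABET = set("abcdefghijklmnopqrstu\u00fcwxyz")

# Combining macron, acute, caron, and grave: the four tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text):
    """Remove tone marks but keep the diaeresis on u-umlaut."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def pinyin_letters(text):
    """Return the Pinyin letters in text, ignoring spaces and punctuation."""
    return [ch for ch in strip_tones(text).lower() if ch in PINYIN_ALPHABET]


def is_pinyin_pangram(text):
    """True if text uses every letter of the Pinyin alphabet at least once."""
    return set(pinyin_letters(text)) == PINYIN_ALPHABET


sentence = "Zuótiān wǒ bāng nǚ’ér qù yī jiā chāoshì mǎi kělè, xīfàn, dòupí."
print(is_pinyin_pangram(sentence), len(pinyin_letters(sentence)))
```

Note that this checks only the base alphabet. Testing the harder variant with every tone-marked form would mean comparing against the tone-marked letters before stripping anything.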

Those who devise good pangrams will be covered in róngyào, or something like that.

Happy hunting.

DPP position on romanization

Wed, 08/31/2011 - 16:20

With Taiwan’s presidential election less than six months away and various position papers being issued, perhaps it’s time to take a look at where the opposition stands on romanization.

Sure, various politicians rant from time to time. But they may or may not be taken seriously. What about the party itself and its candidate?

Google doesn’t find any instances of “拼音” (“pīnyīn”) on the official Web site of the Democratic Progressive Party’s presidential candidate, Tsai Ing-wen (Cài Yīngwén / 蔡英文). But searching for “拼音” on the DPP’s official Web site does yield at least a few results. (See the “sources” at the end of this piece.) It’s probably no surprise that none of them contain anything but bad news for those who support Taiwan’s continued use of Hanyu Pinyin.

Typical is the “e-paper” piece from 2008 that states the change to Hanyu Pinyin will cost NT$7 billion (about US$240 million). (If the DPP candidate wins, will the DPP follow its own assertions and logic and say that it would be far too expensive for Taiwan to change from the existing Hanyu Pinyin to Tongyong Pinyin?) I have no more faith in that inflated figure than I have in the other claims there, such as that the use of Hanyu Pinyin would not be convenient for foreigners and that there is no relationship between internationalization and using the world’s one and only significant romanization system for Mandarin (Hanyu Pinyin).

Then there’s the delicious irony that the image of a Tongyong Pinyin street sign the DPP chose to use in that anti-Hanyu Pinyin message has a typo! The sign, shown at top right, should read Guancian, not Guanciao. (In Hanyu Pinyin it would be “Guanqian.”) That’s right: The DPP says Taiwan needs to use Tongyong, but the supposed expert who put together that very argument apparently doesn’t know the difference between Tongyong Pinyin and a hole in the wall.

That document is a few years old, though. What about something more recent? Just three months ago the DPP spokesman, Chen Qimai (Chen Chi-mai / 陳其邁), complained that the Ma Ying-jeou administration had replaced Tongyong Pinyin with Hanyu Pinyin, calling this an example of removing Taiwan culture and abandoning Taiwan’s sovereignty. So there’s nothing to indicate a change in position over time.

It’s worth remembering that there’s a lot of blame to go around for the inconsistencies and sloppiness that characterize Taiwan’s romanization situation. Historically speaking, the KMT is certainly responsible for much of the mess. And the Ma administration’s willingness to go along with “New Taipei City” instead of “Xinbei,” “Tamsui” instead of “Danshui,” and “Lukang” instead of “Lugang” demonstrates that it is OK with cutting back its own policy in favor of Hanyu Pinyin. Nevertheless, it’s now the DPP — or at least some very loud and opinionated people within it — that represents the main force for screwing up perfectly good signage, etc.

Back when I was more often around DPP politicians, I would occasionally ask them privately about their opinions of Hanyu Pinyin. For the most part, they had no opposition to Taiwan’s use of it, regarding this as simply a practical matter. But they would not say so publicly because President Chen Shui-bian’s dumping of Ovid Tzeng made it clear what fate would meet those who opposed Chen on this issue.

Even though Chen is no longer in the picture, I fear that many in the DPP have come to believe their own propaganda on this issue.

I urge individuals (esp. those with known pro-green sentiments) and organizations (Hey, ECCT and AmCham: that means you especially!) that want to avoid a return to the national embarrassment that is Tongyong Pinyin to tell Cai Yingwen and the DPP now that Taiwan’s continued use of Hanyu Pinyin is simply good policy and is supported by the vast majority of the foreign community here, including pro-green foreigners.

sources:

Key Chinese updated, adding new Pinyin features

Tue, 08/30/2011 - 12:03

The program Key, which offers probably the best support for Hanyu Pinyin of any software and thus deserves praise for this alone, has just come out with an update with even more Pinyin features: Key 5.2 (build: August 21, 2011 — earlier builds of 5.2 do not offer all the latest features).

Those of you who already have the program should get the update, as it’s free. But note that if you update from the site, the installer will ask you to uninstall your current version prior to putting in the update, so make sure you have your validation code handy or you’ll end up with no version at all.

(If you don’t already have Key, I recommend that you try it out. A 30-day free trial version can be downloaded from the site.)

Anyway, here’s some of what the latest version offers:

  • Hanzi-with-Pinyin horizontal layout gets preserved when copied into MS Word documents (RT setting), as well as in .html and .pdf files created from such documents.
  • Pinyin Proofing (PP) assistance: with pinyin text displayed, pressing the PP button on the toolbar will colour the background of ambiguous pinyin passages blue; right-clicking on such a blue-background pinyin passage will display the available options.
  • Copy Special: a highlighted Chinese character passage can be copied & pasted automatically in various permutations.
  • Improved number-measureword system: it now works with Chinese-character, pinyin and Arabic numerals.
  • Showing different tones through coloured characters (Language menu under Preferences).
  • Chengyu (fixed four character expression) spacing logic: automatic spacing according to the pinyin standard (Language menu under Preferences).
  • Option to show tone sandhi on grey background (Language menu under Preferences).
  • Full support of standard pinyin orthography in capitalization and spacing.
  • Automatic glossary building.

Some programs, such as Popup Chinese’s “Chinese converter,” will take Chinese characters and then produce pinyin-annotated versions, with the Pinyin appearing on mouseover. Key, however, offers something extra: the ability to produce Hanzi-annotated orthographically correct Pinyin texts (i.e., the reverse of the above). If you have a text in Key in Chinese characters, all you have to do is go to File --> Export to get Key to save your text in HTML format.

Here’s a sample of what this looks like.

Běn biāozhǔn guīdìngle yòng《 Zhōngwén pīnyīn fāng’àn》 pīnxiě xiàndài Hànyǔ de guīzé。 Nèiróng bāokuò fēncí liánxiě fǎ、 chéngyǔ pīnxiěfǎ、 wàiláicí pīnxiěfǎ、 rénmíng dìmíng pīnxiěfǎ、 biāodiào fǎ、 yíháng guīzé děng。

Basically, this is a “digraphia export” feature — terrific!

If you want something like the above, you do not have to convert the Hanzi to orthographically correct Pinyin first; Key will do it for you automatically. (I hope, though, that they’ll fix those double-width punctuation marks one of these days.)

Let’s say, though, that you want a document with properly word-parsed interlinear Hanzi and Pinyin. Key will do this too. To do this, input a Hanzi text in Key, then highlight the text (CTRL + A) and choose Format --> Hanzi with Pinyin / Kanji-Kana with Romaji.

In the window that pops up, choose Hanzi with Pinyin / Kanji-kana with Romaji / Hangul with Romanization from the Two-Line Mode section and Show all non-Hanzi symbols in Pinyin line from Options. The results will look something like this:

This can be extremely useful for those authoring teaching materials.

Furthermore, such interlinear texts can be copied and pasted into Word. For the interlinear-formatted copy-and-paste into Word to work properly, Key must be set to rich text format, so before selecting the text you wish to use click on the button labeled RT. (Note yellow-highlighted area in the image below.)

back to Tamsui

Mon, 08/29/2011 - 15:44

It’s time for another installment of Government in Action.

What you see to the right is something the Taipei County Government (now the Xinbei City Government, a.k.a. the New Taipei City Government) set into action: the Hanyu Pinyin spelling of “Danshui” is being replaced on official signage, including in the MRT system, by the old Taiwanese spelling of “Tamsui.” I briefly touched upon the plans for “Tamsui” a few months ago. (See my additional notes in the comments there.)

I have mixed feelings about this move. On the one hand, I’m pleased to see a representation of a language other than Mandarin or English on Taiwan’s signage. “Tamsui” is the traditional spelling of the Taiwanese name for the city. And it hardly seems too much for at least one place in Taiwan to be represented by a Taiwanese name rather than a Mandarin one.

On the other hand, the current move unfortunately doesn’t really have anything to do with promoting or even particularly accepting the Taiwanese language. It’s not going to be labeled “Taiwanese,” just “English,” which is simply wrong. It’s just vaguely history-themed marketing aimed at foreigners and no one else. But which foreigners, exactly, is this supposed to appeal to? Perhaps Taiwan is going after those old enough to remember the “Tamsui” spelling, though I wonder just how large the demographic bracket is for centenarian tourists … and just how mobile most of them might be.

So it’s basically another example — retroactively applied! — of a spelling that breaks the standard of Hanyu Pinyin and substitutes something that foreigners aren’t going to know how to pronounce (and the government will probably not help with that either): i.e., it’s another “Keelung” (instead of using “Jilong”), “Kinmen” instead of “Jinmen,” and “Taitung” instead of “Taidong.”

A key point will be how “Tamsui” is pronounced on the MRT’s announcement system. (I haven’t heard any changes yet; but I haven’t taken the line all the way out to Danshui lately.) The only correct way to do this would be exactly the same as it is pronounced in Taiwanese. And if the government is really serious about renaming Danshui as Tamsui, the Taiwanese pronunciation will be the one given in the Mandarin and Hakka announcements as well as the English one. Moreover, public officials and announcers at TV and radio stations will be instructed to say Tām-súi rather than Dànshuǐ, even when speaking in Mandarin.

Fat chance.

But, as years of painful experience in this area have led me to expect, my guess would be that the announcements will not do that. Instead, it will be another SNAFU, with a mispronunciation (yes, it is almost certain to be mispronounced by officialdom and those in the media) being labeled as “English”.

Of course, there’s nothing wrong about saying “Tām-súi.” But it’s a pretty safe bet that isn’t going to happen: the name will likely be given a pronunciation that a random clueless English speaker might use as a first attempt; then that will be called English. This sort of patronizing attitude toward foreigners really makes my blood boil. So I’m going to leave it at that for the moment lest my blood pressure go up too much.

So, once again, the MRT system is taking something that was perfectly fine and changing it to something that will be less useful — and all the while continuing to ignore miswritten station names, stupidly chosen station names, mispronunciations, and Chinglish-filled promotional material.

Please keep your ears as well as eyes open for instances of “Tamsui” and let me know what you observe. The city, by the way, has already started using “Tamsui” instead of “Danshui” on lots of official road signs, as I started seeing several months ago and which I noticed in increasing use just last week when I passed through that way.

I probably should have taken a more active stance on this months ago; but I was too busy working against the bigger and even more ridiculous anti-Pinyin change of “Xinbei” to “New Taipei City.” Fat lot of good that did.

A quarter century of Sino-Platonic Papers

Thu, 08/18/2011 - 12:24

These days, with the click of a mouse one can publish something that can instantly be seen by people around the world. But despite this ease it can still feel like a major accomplishment if someone has the tenacity to keep even a blog going past its first few years.

Consider, then, the days long before user-friendly blogging software, the days before blogs even. The days before desktop publishing was in the hands of more than a few, before most people had the ability to send or receive files electronically, before most people had even heard of the Internet. The days when typewriters were still common.

So these were also the days so long before Unicode that including Chinese characters or even common diacritics in a manuscript usually meant writing them in by hand.

The days when small-scale publishing meant trips to the copy shop and long sessions spent photocopying and stapling. When the international correspondence needed to issue a small journal meant trip after trip to the post office, paying postage to send something to what might well be the other side of the globe, and having to wait weeks, months, for a reply.

The days when receiving payment for issues meant paper checks sent through the regular mail and then taken during certain hours to the bank, where you would wait in line for a teller. And heaven help you with the endless paperwork and waiting if the check was not in U.S. dollars but a foreign currency.

The days when long-distance phone calls really cost something. And international calls? Ouch!

And all that’s on top of all of the other many challenges involved in running a peer-reviewed academic journal.

Those are just some of the situations Victor Mair had to deal with when his journal, Sino-Platonic Papers, was getting off the ground. And there have certainly been many challenges since.

So I think it’s worth noting that Sino-Platonic Papers has reached the age of twenty-five and is still going strong.

There are now more than two hundred issues, the majority of which are available in full for free on Sino-Platonic Papers’ Web site. The shortest issue is just four pages, while the longest to date stretches over three volumes and comprises approximately one thousand pages.

That this journal has published all manner of authors, from internationally renowned scholars to unaffiliated researchers out in the boondocks, helps demonstrate its willingness to take risks. (But, as Cameron Crowe reminds us, that’s how you become great.)

Sino-Platonic Papers has just released its thirteenth volume of book reviews (many of which are particular favorites of mine). But what is especially notable is that it marks the twenty-fifth anniversary of the beginning of this wide-ranging journal.

I congratulate SPP’s editor, Victor Mair, on this milestone.

Here’s what the anniversary issue covers.

  • Preface
  • Ancient China and Its Enemies: The Rise of Nomadic Power in East Asian History by Nicola Di Cosmo
  • The Prehistory of the Silk Road by E. E. Kuzmina, ed. Victor H. Mair
  • Mozi: A Complete Translation by Ian Johnston
  • Envisioning Eternal Empire: Chinese Political Thought of the Warring States Era by Yuri Pines
  • The Politics of Mourning in Early China by Miranda Brown
  • The Revelation of the Magi: The Lost Tale of the Wise Men’s Journey to Bethlehem by Brent Landau
  • A Story Waiting to Pierce You: Mongolia, Tibet and the Destiny of the Western World by Peter Kingsley
  • Rome and China: Comparative Perspectives on Ancient World Empires, ed. Walter Scheidel
  • The Camel’s Load in Life and Death: Iconography and Ideology of Chinese Pottery Figurines from Han to Tang and Their Relevance to Trade along the Silk Routes by Elfriede Regina Knauer
  • Ethnic Identity in Tang China by Marc Abramson
  • Mélange tantriques à la mémoire de Hélène Brunner/Tantric Studies in Memory of Hélène Brunner, ed. Dominic Goodall and André Padoux
  • Imperial China, 900-1800 by F. W. Mote
  • Local Religion in North China in the Twentieth Century: The Structure and Organization of Community Rituals and Beliefs by Daniel L. Overmyer
  • Tibetan Market Participation in China by Wang Shiyong
  • Chinese as It Is: A 3D Sound Atlas with First 1000 Characters by Conal Boyce
  • Language Choice and Identity Politics in Taiwan by Jennifer M. Wei
  • ABC English-Chinese, Chinese-English Dictionary, ed. John DeFrancis and Zhang Yanyin
  • Learning Chinese, Turning Chinese: Challenges to Becoming Sinophone in a Globalized World, by Edward McDonald

Disclaimer: I volunteer as SPP’s technical editor and maintain its Web site. But I certainly didn’t have any such position twenty-five years ago!

How to write adjectives in Hanyu Pinyin

Thu, 08/11/2011 - 05:05

Today’s selection from Yin Binyong’s Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》) deals with how to write Mandarin’s adjectives.

This reading is available in two versions:

QIM becomes freeware

Wed, 08/10/2011 - 11:29

I’m back from abroad now and starting to catch up on this and that. So here’s another post for you Mac users.

QIM, a popular pinyin-based Hanzi-input method, has become freeware. It was formerly US$20.

iOS app for writing Pinyin with tone marks

Sat, 07/16/2011 - 03:55

Those of you who, unlike me, own an iPhone, an iPad, or an iPod Touch may find the new Pinyin Typist iOS application of use.

Taffy of Tailingua had a look at this for me.

I’ve had a play with the Pinyin application and I’m generally quite positive about it. It’s clean, unfussy, and gets the job done. The automatic positioning looks to be flawless (i.e. typing zhuang1 gives you zhuāng, not zhūang)…. Overall though I like it, as it does what it set out to do without any showboating or unnecessary steps (excepting apostrophes).

Although I wish the apostrophe and hyphen were right there on the main screen instead of on a secondary one, the program allows people to do what they need to do: type Pinyin with tone marks.
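The "automatic positioning" Taffy mentions follows a simple placement convention, which can be sketched in Python as below. This is an illustration of the general rule, not the app's actual code: the function name is hypothetical, and treating "v" as a stand-in for ü and "5" or "0" as the neutral tone are input conventions I am assuming.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is u-umlaut


def numbered_to_marked(syllable):
    """Convert a tone-numbered syllable such as 'zhuang1' to tone-marked form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1].replace("v", "\u00fc"), int(syllable[-1])
    if tone not in TONE_MARKS:
        return base  # neutral tone (assumed to be written 5 or 0): no mark
    lower = base.lower()
    if "a" in lower:            # a always takes the mark
        index = lower.index("a")
    elif "e" in lower:          # otherwise e does
        index = lower.index("e")
    elif "ou" in lower:         # in "ou" the mark goes on the o
        index = lower.index("o")
    else:                       # otherwise the last vowel takes it
        positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not positions:
            return base
        index = positions[-1]
    marked = unicodedata.normalize("NFC", base[index] + TONE_MARKS[tone])
    return base[:index] + marked + base[index + 1:]


print(numbered_to_marked("zhuang1"))
print(numbered_to_marked("nv3"))
```

Because a is checked first, the mark in zhuang1 lands on the a rather than the u, which is the behavior described in the quotation above.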

It sells for US$2.99 (reduced from US$3.99).

[Headline changed from "Mac app for writing Pinyin with tone marks"]

The where and why of missing second tones

Wed, 07/13/2011 - 04:09

My previous post mentioned that not all tonal permutations exist in the real world. For example, modern standard Mandarin has zhōng, zhǒng, and zhòng, but doesn’t have zhóng. I did not, however, get into any of the reasons for the absence of second-tone zhong.

Fortunately, my friend James E. Dew, who is much more qualified than I to discuss such fine points of linguistics, was kind enough to send in the explanation below. Jim used to teach the Chinese language and linguistics at the University of Michigan; and for many years he directed the Inter-University Program (a.k.a. the Stanford Center) in Taipei. He is also the author of 6000 Chinese Words: A Vocabulary Frequency Handbook and coauthor of Classical Chinese: A Functional Approach.

Most simply stated, Mandarin syllable shapes with unaspirated occlusive initials and nasal finals don’t occur in second tone. This can be restated a bit less opaquely for those who have not studied Chinese historical phonology, as follows:

Syllables that begin with unaspirated stops b, d, g, or affricates j, zh, z, and end in a nasal n or ng, as a rule don’t have second-tone forms. There are a few exceptions, such as béng (甭 / “needn’t”) and zán (咱 / “we”), which were new words formed by contraction — from búyòng and zámén, respectively — after the tone class split described below took place.

This came about because when Middle Chinese (of Sui-Tang times) píngshēng 平声/平聲 split into yīnpíng 阴平/陰平 (modern Mandarin “first tone”) and yángpíng 阳平/陽平 (M “second tone”), syllables with aspirated initials went into the new yángpíng class, while those with unaspirated initials all fell into the yīnpíng (M first tone) group, thus leaving no unaspirated syllables with nasal finals in the modern Mandarin second tone class.

An interesting corollary to this rule is that among Mandarin “open” syllables (those that end in a vowel) with the above-listed initials, almost all of the second-tone syllables derive from Middle Chinese rùshēng 入声/入聲, and their cognates have stop endings in the southern dialects that preserve rùshēng, as illustrated by the Cantonese examples given below.

For those who like to pronounce what they read, Cantonese rùshēng syllables have level tones, either high, mid or low. In the Yale romanization used here, high tone is marked with a macron (e.g., dāk), mid tone is unmarked, and low tone is signified by an h following the vowel. A double “aa” sounds like the “a” in “father,” while a single “a” is a mid central vowel. Thus baht sounds like English “but” and dāk sounds like English “duck.”

Hanzi    Mandarin    Cantonese
拔       bá          baht
白       bái         baahk
薄       báo         bohk
別/别    bié         biht
伯       bó          baak
博       bó          bok
答       dá          daap
德       dé          dāk
敵/敌    dí          dihk
毒       dú          duhk
格       gé          gaak
閣/阁    gé          gok
國/国    guó         gwok
急       jí          gāp
極/极    jí          gihk
集       jí          jaahp
夾/夹    jiá         gaap
結/结    jié         git
節/节    jié         jit
菊       jú          gūk
覺/觉    jué         gok
決/决    jué         kyut
雜/杂    zá          jaahp
澤/泽    zé          jaahk
閘/闸    zhá         jaahp
宅       zhái        jaahk
哲       zhé         jit
執/执    zhí         jāp
直       zhí         jihk
竹       zhú         jūk
濁/浊    zhuó        juhk
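The rule of thumb in Jim's explanation (unaspirated stop or affricate initial plus nasal final means no second tone) is easy to express as a test. The Python sketch below is just that rule applied to toneless Pinyin syllables; the function name is made up for this example, and it predicts gaps "as a rule" only, so known exceptions such as béng and zán will be flagged as expected gaps even though they exist.

```python
# Unaspirated stops (b, d, g) and affricates (j, zh, z), per the rule above.
UNASPIRATED_INITIALS = ("zh", "b", "d", "g", "j", "z")
NASAL_ENDINGS = ("n", "ng")


def second_tone_gap_expected(syllable):
    """Return True if the rule predicts that this toneless Pinyin
    syllable has no second-tone form in modern standard Mandarin."""
    s = syllable.lower()
    return s.startswith(UNASPIRATED_INITIALS) and s.endswith(NASAL_ENDINGS)


for s in ["zhong", "peng", "guo", "beng"]:
    print(s, second_tone_gap_expected(s))
```

Open syllables such as guo fall outside the rule, which fits the corollary about second-tone open syllables deriving from rùshēng.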

Pinyin’s never-used letter?

Tue, 07/05/2011 - 11:29

As most people reading this blog know, Mandarin has about 1,300 syllables (interjections and loan words complicate the count a little). If tones, a basic part of the language, are disregarded, the number drops to 400-something syllables.

Given 410 or so basic syllables and 4 tones — one of these days I need to write something more on the wrongful neglect of the so-called neutral tone — some people might expect there to be more like 1,640 syllables instead of about 1,300. The reason for the lower number is that not all syllables exist in all four tones. For example, quite clearly the official language of Zhōngguó does not lack zhōng … or zhǒng or zhòng. But zhóng is another matter.

So not all possible tonal variations of those 400-something syllables appear in modern standard Mandarin. But what about letters?

If you look at the official alphabet for Hanyu Pinyin, it’s exactly the same as that for English (other than in pronunciation, of course), which is a bit odd, especially considering that Pinyin doesn’t use the letter v (or at least isn’t supposed to for Mandarin words).

So in this case, I’m excluding v but otherwise being expansionist about the glyphs I’m calling letters. To be specific: I’m referring to a-z, minus v, but including ā, á, ǎ, à, ē, é, ě, è, ī, í, ǐ, ì, ō, ó, ǒ, ò, ū, ú, ǔ, ù, ü, ǖ, ǘ, ǚ, and ǜ. (Even though Ī, Í, Ǐ, Ì, Ū, Ú, Ǔ, Ù, Ü, Ǖ, Ǘ, Ǚ, and Ǜ never come at the beginning of a word, let’s not automatically eliminate them, because there is an occasional need for ALL CAPS.)
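To make that inventory concrete, here is a rough Python sketch that builds the lowercase set just described and reports which letters never show up in a given word list. The function names are invented for this example, and the three-word sample is only a stand-in; it is not the dictionary data.

```python
import string
import unicodedata

TONE_MARKS = ("\u0304", "\u0301", "\u030C", "\u0300")  # tones 1-4
DIAERESIS = "\u0308"


def pinyin_letters():
    """a-z minus v, plus toned a/e/i/o/u, plus ü with and without tone marks."""
    plain = [c for c in string.ascii_lowercase if c != "v"]
    toned = [
        unicodedata.normalize("NFC", vowel + mark)
        for vowel in "aeiou"
        for mark in TONE_MARKS
    ]
    umlauts = [unicodedata.normalize("NFC", "u" + DIAERESIS)]
    umlauts += [
        unicodedata.normalize("NFC", "u" + DIAERESIS + mark) for mark in TONE_MARKS
    ]
    return plain + toned + umlauts


def unused_letters(words):
    """Letters from the inventory that appear in none of the words."""
    seen = set(unicodedata.normalize("NFC", " ".join(words)).lower())
    return [c for c in pinyin_letters() if c not in seen]


print(len(pinyin_letters()))  # 50
print(unused_letters(["nǚ", "lǜsè", "lǘ"]))
```

Lowercasing before the comparison means ALL CAPS text is counted against the same 50 letters rather than needing a separate uppercase inventory.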

Are there any of those possible glyphs that don’t appear at all — at least as given in the large ABC Comprehensive Chinese-English Dictionary?

The answer, perhaps surprisingly, is yes.

Which letter is it?

a. ǖ
b. ǘ
c. ǚ
d. ǜ

Have you made your choice?

It doesn’t take much thought to eliminate C as the answer. “Nǚ” (woman) is one of those first-couple-of-Mandarin-lessons vocabulary terms. And the word for green (lǜsè) is hardly obscure either. It might be harder to think of a word with the letter ǘ; but there are some. Donkey (lǘ) is probably the most common. So the answer is A: ǖ.

It’s important to note that the lack of ǖ is in appearance only. The sound ǖ occurs in plenty of Mandarin words; it’s just that Pinyin’s simplified orthography calls for writing “u” instead where ǖ follows j, q, x, or y.
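That spelling rule is simple enough to sketch in Python. The spell function below is a hypothetical helper, and it takes a shortcut by treating y as an ordinary initial, as the wording above does. It strips the two dots from the final after j, q, x, or y and leaves any tone mark in place.

```python
import unicodedata

DOTLESS_AFTER = {"j", "q", "x", "y"}
DIAERESIS = "\u0308"


def spell(initial, final):
    """Join initial and final, dropping the two dots of ü after j, q, x, y."""
    if initial in DOTLESS_AFTER:
        decomposed = unicodedata.normalize("NFD", final)
        decomposed = decomposed.replace(DIAERESIS, "")
        final = unicodedata.normalize("NFC", decomposed)
    return initial + final


print(spell("j", "ǖ"))  # jū
print(spell("l", "ǘ"))  # lǘ
print(spell("n", "ǚ"))  # nǚ
```

With the first-tone ǖ, the dots survive only after n and l, and that is exactly where the dictionary turned up no examples.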

But even though I didn’t find an example of ǖ, I’d encourage font designers not to scratch it from their list of must-have glyphs for Pinyin faces, especially since teachers will no doubt want to continue giving tone-pattern drills based on four tones for all vowels, regardless. Also, someone with a searchable edition of the Hanyu Da Cidian or maybe the new Oxford online edition is probably about to use the comments to point me to some obscure entry there….

How to handle ‘de’ and interjections in Hanyu Pinyin

Fri, 07/01/2011 - 11:41

Today’s selection from Yin Binyong’s Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》) deals with how to write Mandarin’s various de’s, mood particles, and interjections.

This reading is available in two versions:

I’ve already written about the principles in previous posts. For example, see

How to write numbers and measure words in Hanyu Pinyin

Thu, 06/30/2011 - 05:40

Today’s selection from Yin Binyong’s Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》) is about writing numbers and measure words.

This reading is available in two versions:

For more on this, see these posts and the PDFs linked to therein.

How to write verbs in Hanyu Pinyin (Mandarin text)

Wed, 06/29/2011 - 15:18

Here’s the first of several selected readings from Yin Binyong’s Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》). It covers the writing of verbs.

This reading is available in two versions:

For those who would like to read about this in English, see

important book on Pinyin to be excerpted on this site

Sun, 06/26/2011 - 15:33

Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》) is the second of Yin Binyong’s two books on Pinyin orthography. The first, Chinese Romanization: Pronunciation and Orthography, is in English and Mandarin; much of it is already available here on Pinyin.Info.

Although Xinhua Pinxie Cidian is only in Mandarin, the large number of examples makes it easy to get the point even if you may not read Mandarin in Chinese characters very well.

This week I will begin posting some excerpts from this invaluable work. What’s more, I have made a version in traditional Chinese characters, which I hope that readers in Taiwan, Hong Kong, and elsewhere will take advantage of. So those not used to reading simplified Chinese characters will have a choice (which is more than the government of Taiwan is providing these days).

I’m extremely happy to be able to bring you this information and wish to acknowledge the generosity of the Commercial Press. Stay tuned.