を: “wo” or “o” (and others)

Is there a preference for the way を is pronounced? “wo” or “o”? I’ve heard both.

I could also ask about ん: “n” or “m”?

Or ひ: “hī” or “shī”? Particularly when used at the beginning of a word.



With the disclaimer that I might be wrong about some of this…

For を: generally, in modern speech, “o”. The “wo” pronunciation can sometimes be heard in songs etc., but I was under the impression that it’s more of an affectation or a way to add emphasis.

ひ isn’t pronounced as “shi”. Instead, the “h” before the “i” becomes a [ç], the same sound that is used e.g. in Standard German in the word “ich”. If you can’t produce that sound, I think “h” is preferable to “sh”, which is pronounced at a completely different place in the mouth.

ん depends on the context. It will usually change its place of articulation to match the following consonant. For example, before a labial sound (a sound produced by closing your lips) such as b or p, it becomes an “m” sound (which is also labial). Before “k”, it becomes “ng”, etc.

Also, when word-final or before vowels, it tends to almost disappear while the preceding vowel is spoken “through your nose” (a bit like in French).
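
If it helps to see the ん rules above written out, here is a rough Python sketch. The function name `realize_n` and the sound groupings are my own for illustration, and it only covers the cases mentioned above (I'm assuming “m” and “g” behave like the other labial and velar sounds); anything else just falls back to a plain “n”, which is a simplification.

```python
from typing import Optional

# Simplified sound classes, written in romaji (assumed groupings for this sketch).
LABIALS = {"b", "p", "m"}   # sounds made by closing the lips
VELARS = {"k", "g"}         # sounds made at the back of the mouth
VOWELS = {"a", "i", "u", "e", "o"}


def realize_n(next_sound: Optional[str]) -> str:
    """Return a rough description of how ん sounds before `next_sound`.

    `next_sound` is the romaji sound right after ん, or None if ん ends the word.
    """
    if next_sound is None or next_sound in VOWELS:
        # Word-final or before a vowel: ん mostly shows up as nasalization.
        return "nasalized vowel"
    if next_sound in LABIALS:
        return "m"
    if next_sound in VELARS:
        return "ng"
    # Everything else is lumped together as a plain "n" in this sketch.
    return "n"


print(realize_n("p"))   # さんぽ (sanpo)
print(realize_n("k"))   # ぎんこう (ginkou)
print(realize_n(None))  # ほん (hon)
```

It's only meant as a memory aid; listening to real audio will tell you more than any lookup table.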


Thanks! I’ve been predominantly using “o” for を so glad to hear that.

And thanks for clarifying the rest, in particular, ひ. I really have trouble finding where to make that sound in my mouth when pronouncing 人 or 女の人, for example (例えば, thank you WK). My wife speaks some German so maybe she can help me with that. :slight_smile:


It’s similar to the Spanish J, just made a little further forward in the mouth.
To me, it’s almost like pronouncing a K but letting air pass through.


Cool, changing that K sound with my tongue helps a lot!

As I start to experiment with it, I think the other thing that is happening is that the “i” sound in ひ is being dropped when pronouncing 人. Similar to dropping other kana vowels like でした (desh-ta) or -ます (mas-).

That’s referred to as devoicing; it happens very commonly with い and う syllables.


Regarding the ん, there are actually about 8 different ways it gets pronounced. I think it depends on where it falls in the word, and what syllables come next. There are a number of videos out there about this, but the best advice I can give is to just listen to the audio for each vocab and try to imitate that.


Thanks, I’m really trying to pay attention to all of the nuances of pronunciation: pitch accent, “devoicing” (per CodingFox), the “ng” sound of g kana (e.g. 映画, not sure what that is called), as well as the others I’ve mentioned. I’m trying to have good pronunciation from the get-go. I don’t need to sound native, just good enough to be understood well. Surprisingly, a lot of these aren’t stressed much in textbooks. I would say that of all of these, pitch accent would be the most important one to start with. The last thing might be the rhythm of sentences and phrases. I really love discovering all of these intricacies of a language.



I feel like putting emphasis on pitch accent is not that important, despite the tons of people online who will tell you otherwise. Just listen to the words and try and copy that.

Perhaps 40+ years ago when access to native materials was difficult and exposure to native audio was a challenge, pitch accent would have been more important, but now you have a million options to listen to native content and naturally pick up the rhythm of the words and language.

I think the most important thing to realize is that romaji is not phonetically accurate; it’s just a decent approximation. Nothing beats actively listening to a native speaker.


I definitely think just being aware of these things is super helpful! For pitch accent, like importance-wise, I’d probably compare it to stress in English. It’s functionally the closest equivalent (though they definitely have plenty of differences), and really if you think of how stress works in English, pitch is part of that too. For example if you say “syllable” the first syllable is stressed, and it’s also the highest in pitch. So from that standpoint pitch accent isn’t really a crazy concept, it’s just a different way in which those sorts of distinctions are made.

For the “ng” thing, I’d say that’s really not something you have to worry about. It’s not like a set rule that all speakers do, it’s just a variation that some people do in all cases where it’s appropriate, some sometimes, and some not at all. It’s totally fine to just use “g”, as far as I know no one would bat an eye. It’s good to be aware of when listening to other people though so you can figure out what they’re saying.

(I’m not an expert or anything so feel free to call out if I’m wrong somewhere, I’m just a linguistics student trying to learn Japanese haha)


Some never use it, sure, but those who do don’t do it all the time. There are rules. There are situations where no one will ever use nasal g, so you would sound strange using it there.


For sure, I think it’s generally safer not to unless you specifically have an awareness of when it does and doesn’t happen. “All of the time” is definitely too broad, you’re right. I more meant “all of the time where it is possible”, but that’s a totally different statement.

If you haven’t already found it I would recommend the Tofugu podcast about pronunciation (there may also be an article, but for obvious reasons the podcast is better for this!). It covers the variation with ん pretty well.

If you listen to Kenichi throughout Wanikani, you can get a good start on trying to figure out the rules.

Honestly, that’s all the reason you need to learn pitch accent or at least listen out for it as you study.

There’s a saying in MMOs that applies here too: sometimes you can optimize all the fun out of something.

Optimal is good and maybe leaving pitch accent out until the end is the most optimal way to learn Japanese, but I’d say suboptimal but interesting is better. :wink:


Had a dude in my class in Japan that would hard pronounce it like WOAH.

Teachers kept telling him over and over in the beginning but they just stopped trying.


I’m not very good with pronunciation, so these are the rules that my wife (native Japanese) told me:
を → o (in case of the object)
へ → e (in case of destination)
きれい → kireh
とう → toh
くさい → ksai


I’d say it’s 100% not lol



The normal “i” and “u” sounds in Japanese are voiced (meaning that you vibrate your throat when pronouncing them), but they are commonly devoiced when they occur between two unvoiced consonants (like “h” and “t”) or after an unvoiced consonant at the end of a word. Instead of being devoiced, they may also be silent, particularly at the end of a word.
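
For anyone who likes seeing a rule spelled out, here is a minimal Python sketch of the devoicing pattern described above. The name `mark_devoiced`, the segment lists, and the set of unvoiced consonants are all assumptions of mine for illustration; real speech has exceptions (pitch accent, speaking speed, dialect) that this ignores.

```python
# Unvoiced consonants in a simplified romaji segmentation (assumed set for this sketch).
UNVOICED = {"k", "s", "sh", "t", "ch", "ts", "h", "f", "p"}
HIGH_VOWELS = {"i", "u"}


def mark_devoiced(segments):
    """Wrap "i"/"u" in parentheses where the rule above says they get devoiced.

    `segments` is a word already split into sounds, e.g. ["h", "i", "t", "o"].
    """
    marked = []
    for idx, seg in enumerate(segments):
        prev_seg = segments[idx - 1] if idx > 0 else None
        next_seg = segments[idx + 1] if idx + 1 < len(segments) else None
        after_unvoiced = prev_seg in UNVOICED
        # Either the word ends here, or another unvoiced consonant follows.
        before_unvoiced_or_end = next_seg is None or next_seg in UNVOICED
        if seg in HIGH_VOWELS and after_unvoiced and before_unvoiced_or_end:
            marked.append("(" + seg + ")")
        else:
            marked.append(seg)
    return marked


print(mark_devoiced(["h", "i", "t", "o"]))             # 人
print(mark_devoiced(["d", "e", "sh", "i", "t", "a"]))  # でした
print(mark_devoiced(["m", "a", "s", "u"]))             # -ます
```

The vowels in parentheses are the ones you would whisper or drop, which lines up with the “desh-ta” and “mas-” examples earlier in the thread.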