Testing GPT-4’s accuracy on Japanese

Here’s a transcription of its answer:

I apologize for any confusion earlier. Let’s review the answers again:

Problem #1:

A) 3)みれば - The conditional form “みれば” is appropriate here as it translates to “if it rains”.

B) 1)で - The particle “で” is used to indicate the time when an action takes place, in this case, “now”.

C) 2)も - The particle “も” is used to express “even if”, making the sentence “Even if you try to call Tanaka-san, it doesn’t connect.”

D) 1)が - The particle “が” is used to indicate the subject of the sentence, in this case, “ピアノ” (piano).

E) 2)だけ - The particle “だけ” is used to express “only”, making the sentence “If it’s a taxi, it’s a distance that doesn’t cost more than 1000 yen.”

F) 4)ので - The particle “ので” is used to express “because”, making the sentence “The more work there is, the less free time there is.”

G) 2)んでいた - The verb form “んでいた” is used to express a hypothetical situation in the past, making the sentence “If he hadn’t helped, she would have died.”

H) 1)を - The particle “を” is used to indicate the direct object of the verb, making the sentence “Such a beautiful person likes me, sometimes I can’t believe it.”

Problem #2:

I) 2)ま - The kanji “目” is read as “ま” in the context of this sentence.

J) 1)きゅうる - The kanji “誰何” is read as “きゅうる” in the context of this sentence.

These answers are based on my understanding of Japanese grammar and kanji. However, please consult with a Japanese language expert or teacher for accurate answers.

I have no clue either, but over 5+ tests the answers using Wolfram were all more correct, so…

Heck, I don’t know why, but it finally managed to get the last one right:

I apologize for the confusion. The kanji “誰何” is read as “すいか” in this context, which means “to question” or “to interrogate”. So, the correct answer for J) is:
J) 王妃が、見極めるような目で(誰何)した。
Answer: 4)すいか

Which is especially interesting, because not only is すいか the only way to read 誰何, but すい and か are also the only on’yomi readings of 誰 and 何, respectively.

Just asking ChatGPT directly how it’s read yielded this (which is wrong):

In Japanese, the characters 誰何 can be read as “shika” (し・か) or “daika” (だ・い・か), depending on the context and pronunciation preference.

You know what’s funny? In the quiz, 9 times out of 10 it’ll answer きゅうる :joy:
Note that you posted the 8th question twice.

Or not feeding confidential data into it, but whoops, Samsung :stuck_out_tongue:.

It’s like with Stable Diffusion. Given the right combination of prompts it can give amazing results, but if you try to deviate too much, you’ll end up with 3 legs and 4 arms.

By the way, have you tried talking to it in Japanese? I found that to be way more useful than just telling it in English to reason about Japanese. A couple of VTubers tried it in Japanese and the results were pretty good.

Didn’t get the reference…?

Actually no, but it’s interesting. At the moment I’m just incapable of asking questions in JP, but I’ll try using DeepL, and if I have the chance I’ll ask that Japanese friend :joy:

Some people are extremely liberal with how they use generative AI at work and feed it code snippets containing core company asset names and/or user data. There was an incident at Samsung concerning exactly that. This is of course true for any AI model that users can feed data into, like DeepL. Worse yet, given shorter prompts, some of that data can leak back out as model output. It happened to me once in DeepL: I gave the model a short prompt and got back the full name of an institution in English, only vaguely related to my prompt.
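The obvious mitigation is to scrub anything sensitive before it leaves your machine. Here’s a minimal sketch of that idea in Python; every name in it (ProjectTitan, the hostname, the email) is invented for illustration, and a real redaction pass would need a far more thorough list of what counts as confidential:

```python
import re

# Hypothetical internal identifiers, invented for this sketch.
CONFIDENTIAL = {
    "ProjectTitan": "ProjectX",          # internal codename -> neutral alias
    "acme-prod-db.internal": "db-host",  # real hostname -> placeholder
}

def scrub(snippet: str) -> str:
    """Replace known confidential identifiers before a snippet is sent to an external model."""
    for secret, alias in CONFIDENTIAL.items():
        snippet = snippet.replace(secret, alias)
    # Crude catch-all for anything that looks like an email address (user data).
    return re.sub(r"[\w.+-]+@[\w.-]+", "<email>", snippet)

code = 'conn = connect("acme-prod-db.internal", user="alice@example.com")  # ProjectTitan'
print(scrub(code))
# conn = connect("db-host", user="<email>")  # ProjectX
```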

When it makes the difference between someone trying and not trying, I support it.

But it is also interesting to ask it questions you already know the answer to, to see how it does over time, as a way to keep tabs on its capabilities. It does frequently generate nonsense, but it also occasionally generates something unexpectedly useful that one might not think of on one’s own. It has improved, and it will keep improving.

If the question or concern is quality, everything is crap these days anyway thanks to managers/management pushing people to just get something out. I’m starting to think management actually wants crap and is not interested in quality at all. If it is going to be crap anyway, why not speed up hammering out the crap with this and have a bit of fun?

Yes, same thought; at times it feels like brainstorming with myself. I treat GPT like a kid that has to be told exactly how to do things, and I always keep in mind the chance of its answers being a bug or a misconception. But honestly, in the last few days of use I’ve noticed incredibly few hallucinations. Perhaps this judgement is totally biased for some reason, I don’t know (that’s why I decided to share my interactions over here).

I remember reading somewhere that using AIs to code is relatively safe because you quickly know whether it works or not, lol. I don’t program, so I’m curious about more experienced opinions on this.
Anyway, in the end I suppose it’s a matter of how big that “relatively” is.

I’m still waiting. Am I missing something?

Brother, I created the thread last night and fell asleep 3 minutes later, and this morning I was supposed to be at work and only just got out, so a little more patience :joy: I’m at a pub reading something right now; as soon as I have some questions I’ll post what I get from GPT :grin:

(1) General + context-specific question on the word 原型

Can anyone confirm/disprove?

  • Spot on
  • Generally accurate
  • Often off the mark
  • Completely inaccurate

I know anecdotally of people who have gone down rabbit holes trying to implement suggestions from ChatGPT that didn’t work because, e.g., the API they were trying to use simply didn’t offer the capability ChatGPT dreamed up.

It’s kind of like this:

  • to know whether the model is right or wrong, you need to know the answer beforehand (a priori) to verify it
  • to know whether the code is correct, you have to test it (a posteriori)

So as long as you can reason about the code snippet the AI model generated and/or test it, it is safe.
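For example, here’s a minimal sketch of that a posteriori check in Python. The parse_yen helper is hypothetical, standing in for whatever snippet the model generated; the point is that you never have to trust the model, you just run its code against cases whose answers you already know:

```python
import re
import unittest

def parse_yen(text: str) -> int:
    """Hypothetical AI-generated snippet: extract a yen amount like 1000円 from text."""
    match = re.search(r"(\d+)円", text)
    if match is None:
        raise ValueError(f"no yen amount found in {text!r}")
    return int(match.group(1))

class TestParseYen(unittest.TestCase):
    # A posteriori verification: run the generated code on known inputs.
    def test_finds_amount(self):
        self.assertEqual(parse_yen("タクシーなら1000円かからない距離だ"), 1000)

    def test_no_amount(self):
        with self.assertRaises(ValueError):
            parse_yen("there is no amount here")

if __name__ == "__main__":
    unittest.main()
```

If the tests pass you have actual evidence the snippet works; if they fail, you’ve caught the problem before it shipped.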

Oh I see. I suppose it wasn’t adequately prompted about the API’s limitations, could that be it?

That’s not how it works, I believe. In order for the model to know about the features of a specific API, it would need to be fed contextual information about that specific API, based on which it can infer keywords, create tokens + weights for them, and then be ready to respond when asked.
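A toy sketch of that point below; everything in it, including the WeatherClient documentation and the call_llm stub, is invented for illustration. With the docs placed in the prompt, the model has evidence in its context to answer from; without them, it can only pattern-match on whatever was in its training data and may happily invent endpoints:

```python
def call_llm(prompt: str) -> str:
    """Stub standing in for a real model call (e.g. an HTTP request to some LLM API)."""
    return f"[answer conditioned on {len(prompt)} characters of context]"

# Invented documentation for an invented API.
API_DOCS = """\
WeatherClient.get_forecast(city: str, days: int) -> list[str]
    days must be between 1 and 7. There is NO historical-data endpoint.
"""

def ask_about_api(question: str, docs: str = "") -> str:
    if docs:
        # Grounded: the model can answer from the documentation in its context.
        prompt = f"Using only this documentation:\n{docs}\nQuestion: {question}"
    else:
        # Ungrounded: nothing stops it from dreaming up a get_history() call.
        prompt = question
    return call_llm(prompt)

print(ask_about_api("Can WeatherClient fetch last year's weather?", API_DOCS))
```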

This sounds like an endless doubt spiral to me, what am I missing? :joy:

Congratulations on not choosing the word “posteri” like many would have :joy:

That’s what I said :nerd_face:

To me, “API limitations” refers to the things the API doesn’t consider: edge-case checks it misses, restrictions on input data types/sizes, and so on. That’s different from the API’s feature set.
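To make that distinction concrete, here’s a toy Python example (the API is invented): a limitation is a constraint on a call that actually exists, while a missing feature is a call that doesn’t exist at all, which is what the hallucinated suggestions run into:

```python
class ToyWeatherAPI:
    """Invented API, used only to illustrate the distinction."""

    MAX_DAYS = 7  # a *limitation*: the call exists but restricts its input

    def get_forecast(self, city: str, days: int) -> list[str]:
        if not 1 <= days <= self.MAX_DAYS:
            raise ValueError(f"days must be between 1 and {self.MAX_DAYS}")
        return [f"{city}, day {i}: sunny" for i in range(1, days + 1)]

api = ToyWeatherAPI()
api.get_forecast("Tokyo", 3)      # fine
# api.get_forecast("Tokyo", 30)   -> ValueError: a documented limitation
# api.get_history("Tokyo")        -> AttributeError: not a limitation; the
#                                    feature simply isn't in the API at all
```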

By the way, if you have any experience with Stable Diffusion I can use metaphors around that since it’s easier for me as well.

Well, I’d like to tell you that I oversimplified and misunderstandings happen when you do, but… I’m just ignorant on the subject.

Unfortunately, zero… but I’ve been struck by an interest in LLMs and related topics lately, and decided that while the flame burns I’ll be reading up on it.

Definitely an interesting field, I think! I treat it more like art, though, since generative AI models are more suited to being creative than to offering concrete evidence for something :slight_smile:.

Also, I’d appreciate it if anyone could provide verbal feedback after these posts, so I 1) know where it was wrong and can train it, and 2) get a better idea of its accuracy level.
