Testing GPT-4's accuracy on the Japanese language

Yeah, the problem I saw was that the system needs more instructions in order to be more accurate, and going by “is X right?” doesn’t accomplish that.

But then you almost have to be an expert in the subject (or at least in asking questions) in order to get it to produce a good explanation.

2 Likes

Oh, I scored it like I was marking a test. First part: one correct observation, bad example, no rendaku, one mark. Second part: basically correct, three marks. Third part: not relevant, zero marks. Total score: four.

2 Likes

EDIT: replied to the wrong person…

I don’t understand what this means, could you explain further? I’m not sure I ever approached a 1-10 vote with this logic (not that I fully understood what your logic was here).

Basically, for everything ChatGPT got right, it earned a point.

1 Like

OT

Interesting article from a Turing Award winner.
Summarized conclusion: analysis suggests that no current AI systems are conscious, but also shows that there are no obvious barriers to building conscious AI systems.

1 Like

Not an entirely surprising conclusion, but it’s always good to see what more intelligent people with more knowledge of the subject think :slight_smile:

3 Likes

@mariodesu thought this may interest you

2 Likes

Thanks for tagging, that confirmed my thoughts and it’s basically how I use it (in reference to the part about GPT).
It also gave me the idea of trying to follow some Japanese accounts on 𝕏. I think it could be a nice learning source, especially when I have no time to sit and read something serious.

Maybe this is more appropriate to post in my computer science thread, but since the subject was widely discussed here as well, and since this is really about checking the validity of GPT responses, here we go:

The answer’s format is due to some custom instructions I’ve been trying; they seem to work better, but I’m not really sure about accuracy.

Accuracy
  • Completely wrong
  • Mostly wrong
  • Mostly right
  • Completely right

I’m not sure it interpreted your question correctly. When you say “padding around the RGB triples grid”, that makes me think of padding outside the pixel array. Its answer, however, is purely about padding inside the pixel array, at the end of each row.

Is that what you were asking about or did it answer the wrong question?

In case that’s unclear:

The file format is essentially

Metadata
[padding]
Pixel array
[padding]
Metadata

And the pixel array is essentially

Row of pixels
[padding]
Row of pixels
[padding]
Row of pixels
[padding]
Row of pixels

Extended as far as needed, of course.

Where I interpret your question as you asking about the padding in the first block, but the actual answer is about the padding in the second block.
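In case a concrete illustration helps, here’s a rough Python sketch of how you’d locate the start of the pixel array (and with it, any padding in that first block), assuming the common BITMAPINFOHEADER layout. The function name is just mine, and it’s only an illustration, not a full parser:

import struct

def bmp_layout(path):
    # Read the fixed-size BMP headers and report where the pixel array
    # starts. Any gap between the end of the headers and that offset is
    # the first block of padding described above.
    with open(path, "rb") as f:
        header = f.read(30)
    if header[:2] != b"BM":
        raise ValueError("not a BMP file")
    # BITMAPFILEHEADER: bytes 10-13 hold bfOffBits, the offset from the
    # start of the file to the pixel array.
    pixel_offset = struct.unpack_from("<I", header, 10)[0]
    # BITMAPINFOHEADER: width at byte 18, height at byte 22,
    # bits per pixel at byte 28 (24 for RGB triples).
    width, height = struct.unpack_from("<ii", header, 18)
    bits_per_pixel = struct.unpack_from("<H", header, 28)[0]
    return pixel_offset, width, height, bits_per_pixel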

2 Likes

Yes, my question was about the pixel array! :ok_hand:

I thought it was more something like:

[padding] [pixel triple] [pixel triple] [...] [padding]
[padding] [pixel triple] [pixel triple] [...] [padding]
...
..
.

Where the padding is only added if the number of pixels in a row is not a multiple of 4 (for example, in case of a 3x3 grid).

1 Like

Ah right, in that case the answer is about the padding you meant, and it checks out to me.

The padding is added to the end of each row, for the reasons GPT mentions: every row must start on an address that is a multiple of 4, for efficiency. There is no padding at the beginning, only at the end. So you get:

[pixel triple] [pixel triple] [pixel triple] [...] [padding]
[pixel triple] [pixel triple] [pixel triple] [...] [padding]
[pixel triple] [pixel triple] [pixel triple] [...] [padding]

And so on.
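If you want to sanity-check the arithmetic, this is the usual row-stride calculation for a 24-bit BMP in Python (the function name is just for illustration):

def bmp_row_padding(width_px, bytes_per_pixel=3):
    # Each row is padded so its total length is a multiple of 4 bytes,
    # which keeps every row starting on a 4-byte-aligned address.
    row_bytes = width_px * bytes_per_pixel
    return (4 - row_bytes % 4) % 4

# A 3-pixel-wide row of RGB triples is 9 bytes, so 3 bytes of padding
# bring it up to 12; a 4-pixel-wide row is already 12 bytes, so none.
assert bmp_row_padding(3) == 3
assert bmp_row_padding(4) == 0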

2 Likes

I’ve been using ChatGPT to fill in nuance, but with answers like this I’m afraid it’s just confidently incorrect most of the time.

I could have sworn it was better before.

5 Likes

I love how it doubles down on ほぼく, but when you start to go “… are you really really sure of that?” it falters, and decides to just erase the whole reading from existence.

Heh, it summarised a long text and decided to omit some details.

3 Likes

Is that 3.5 or 4?
Anyway, in my experience it shouldn’t be trusted on that type of question (especially), regardless of the model…

Ouch, this one’s pretty bad. I haven’t seen it make mistakes like that before. @mariodesu I think we used to get better results on questions about kanji readings, right?

Sounds almost like a Seven Ocean translation of School of the Elites, lol.

1 Like

I think so. I keep seeing posts on 𝕏 from people claiming that its performance has been downgraded. I think it’s possible, not least because I imagine they’re prioritizing safety of use over accuracy.

1 Like

I guess that’s fair. But if the model can’t do simple dictionary look-ups + extra context, it becomes significantly less useful for these types of questions.

Not that it should be used for such in the first place :sweat_smile:

1 Like

I got access to the new multimodal GPT-4V today, which can accept images as input, so I tried giving it a screenshot of an N1 mock test. Unfortunately it made so many errors in scanning the text that it couldn’t answer correctly…

My guess is that GPT-4 could pass the N1 easily, but it’s a pain to enter an entire test manually, so I was waiting for the image version to try :sweat_smile:

3 Likes

That is weird. I think it has to do with the data it was trained on, because at the moment it can do some amazing things.



How did you access the image function? I got access to the voice mode by using a VPN, and it was able to speak Italian even before the language support was added, haha.

@Arzar33 would you mind me sharing that screenshot on Twitter?

Anyway, I recently realized that the only reason GPT-4 is still this inaccurate in foreign languages like Japanese (and it occasionally produces senseless sentences in Italian as well) is probably that the current models are not trained on enough data, or at least that’s my guess.

1 Like