Testing GPT-4.0 accuracy rate on Japanese language

pm215 · June 13, 2023, 2:12pm

What I was trying to get across is simply that CS isn’t the same as programming or software engineering; this remains true whether you’re interested in the former or the latter.

But from a purely practical point of view, there are a lot more jobs in software engineering than in computer science – the latter is a bit like being a mathematician in terms of the number of job opportunities and the competition for them.

WeebPotato · June 13, 2023, 2:30pm

Though to be fair, if the goal is AI and/or machine learning in general, a degree in CS would be useful. At least based on how my colleagues at work are managing so far.

mariodesu · June 13, 2023, 2:30pm

because the weight might be right from the start?
Is bias a fixed variable or can it change during the process?

Clear enough to be understood even in a sleep-deprived condition

Well, I don’t have any major… after high school I worked for the family business and there wasn’t time for it, but I kept self-studying a variety of subjects that completely exclude programming or mathematics. This is an old aspiration of mine but I neglected it for a long time.
But I’m so confident in my self-teaching skills that I believe there are few things I can’t achieve given enough time.

I think it’s a gigantic achievement!! I feel like the poor guy in the rich club now

Not glad to hear this… but is that true for a master’s in artificial intelligence as well? I expect this sector to grow, no?

mariodesu · June 13, 2023, 2:31pm

that was my expectation

pm215 · June 13, 2023, 2:39pm

I expect there to be a lot more jobs doing “wire somebody else’s LLM up into our software and fiddle about with the user interface to it” than doing the cutting edge research into actually creating and improving the LLMs themselves. But yes, for that sort of cutting edge job in industry you likely want the academic qualification. (Do the research on that rather than trusting my gut feeling, obviously!)

mariodesu · June 13, 2023, 2:59pm

This is what I was expecting. That’s the place they give to the greatest promises (the chance to actually do research), but even if not for a career, I’d still be doing it out of curiosity, so I’ll jump in and see what I can find.

mariodesu · June 13, 2023, 7:17pm

phase 2 & 3

Phase 2: Computer science

1. Computer Science Fundamentals

1.1. Algorithms and Data Structures: Understanding of basic data structures (arrays, linked lists, stacks, queues, hash tables, trees, graphs), algorithmic complexity, searching and sorting algorithms, recursion, dynamic programming

1.2. Computer Architecture: Basics of computer organization and design, assembly language, processor architecture, memory hierarchy

1.3. Operating Systems: Process management, memory management, file systems, concurrency, security and protection

1.4. Networks and Communications: Network models, data transmission, network topologies, routing and switching, network security

1.5. Databases: Relational databases, SQL, NoSQL, database design, normalization, transaction management, concurrency control

1.6. Software Engineering: Software development methodologies, version control, testing, debugging, system design and analysis

2. Programming

2.1. Python: Syntax, data types, control flow, functions, error handling, file I/O, libraries, object-oriented programming, functional programming

2.2. Java: Syntax, data types, control flow, object-oriented programming, error handling, file I/O, libraries, data structures, GUI programming

2.3. C/C++: Syntax, data types, control flow, functions, error handling, file I/O, libraries, pointers, memory management, data structures, object-oriented programming

3. Web Development

3.1. HTML/CSS: Elements, attributes, layout, styling, responsive design

3.2. JavaScript: Syntax, data types, control flow, functions, DOM manipulation, event handling, AJAX, frameworks and libraries (React, Angular, Vue)

3.3. Backend Development: Server-side scripting (Node.js, Express.js), databases (SQL, MongoDB), RESTful API design, authentication and authorization

4. Data Structures and Algorithms

4.1. Advanced Data Structures: Balanced search trees, heaps, hash tables, disjoint set union

4.2. Advanced Algorithms: Graph algorithms, greedy algorithms, divide and conquer, dynamic programming, network flows, NP-completeness

5. Theory of Computation

5.1. Automata and Formal Languages: Finite automata, context-free grammars, Turing machines

5.2. Computability and Complexity: Church-Turing thesis, decidability, time and space complexity, P vs NP problem

6. Computer Systems

6.1. Computer Organization: Digital logic, computer arithmetic, instruction set architecture, CPU design

6.2. Systems Programming: Process and thread management, inter-process communication, memory management, I/O management

6.3. Computer Networks: Internet protocols, network architectures, wireless and mobile networks

7. Software Development

7.1. Software Design: Object-oriented design, design patterns, user interface design

7.2. Software Testing: Unit testing, integration testing, system testing, test-driven development

7.3. Software Maintenance: Debugging, refactoring, legacy code management

Phase 3: Artificial Intelligence and Large Language Models

1. Machine Learning

1.1. Supervised Learning

1.1.1. Regression: Linear regression, polynomial regression, ridge regression, lasso regression

1.1.2. Classification: Logistic regression, k-nearest neighbors, support vector machines, decision trees, random forests

1.1.3. Evaluation: Accuracy, precision, recall, F1 score, ROC curve

1.2. Unsupervised Learning

1.2.1. Clustering: K-means, hierarchical clustering, DBSCAN

1.2.2. Dimensionality Reduction: Principal component analysis, t-SNE

1.2.3. Association Rules: Apriori, Eclat

1.3. Neural Networks

1.3.1. Perceptron: Model, learning algorithm

1.3.2. Multi-Layer Perceptron: Backpropagation, activation functions

1.3.3. Deep Learning: Convolutional neural networks, recurrent neural networks, long short-term memory, autoencoders

2. Natural Language Processing

2.1. Text Processing

2.1.1. Tokenization: Word tokenization, sentence tokenization

2.1.2. Stemming and Lemmatization: Porter stemmer, WordNet lemmatizer

2.1.3. POS Tagging: Penn Treebank, universal POS tags

2.1.4. Named Entity Recognition: Person, organization, location, expressions of time, quantities, monetary values

2.2. Vector Space Models

2.2.1. Bag of Words: Term frequency, document frequency, TF-IDF

2.2.2. Word Embeddings: Word2Vec, GloVe

2.2.3. Document Embeddings: Doc2Vec, BERT

2.3. Sequence Models

2.3.1. RNN: Model, vanishing and exploding gradients, applications

2.3.2. LSTM: Model, forget gate, input gate, output gate

2.3.3. GRU: Model, update gate, reset gate

3. Large Language Models

3.1. Transformer Models

3.1.1. Attention Mechanism: Scaled dot-product attention, multi-head attention

3.1.2. Transformer: Encoder, decoder, positional encoding

3.2. GPT

3.2.1. Architecture: Transformer decoders, masked self-attention

3.2.2. Training: Fine-tuning, transfer learning

3.2.3. Evaluation and Applications: Text generation, translation, summarization, question answering

4. Artificial Intelligence

4.1. Search Algorithms

4.1.1. Uninformed Search: Breadth-first search, depth-first search, uniform-cost search

4.1.2. Informed Search: Greedy best-first search, A* search

4.1.3. Local Search: Hill climbing, simulated annealing, genetic algorithms

4.2. Knowledge Representation

4.2.1. Logic: Propositional logic, first-order logic

4.2.2. Semantic Networks: Concepts, instances, attributes

4.2.3. Frames: Slots, fillers, inheritance

4.3. AI Ethics

4.3.1. Fairness: Bias in data, bias in algorithms

4.3.2. Accountability: Traceability, explainability

4.3.3. Transparency: Openness, communication

4.3.4. Privacy: Data protection, anonymity

@pm215 @WeebPotato @Vanilla @Kazzeon thoughts?

I think this will be a self-learning journey, with no courses, or at least none in the next 2 years, so I want a detailed map.

Also, I’m considering starting a thread-diary to track my studies. Would any of you be willing to drop in occasionally to check on my progress?

WeebPotato · June 13, 2023, 7:39pm

If it helps you, sure why not.

Regarding phase 2 and 3. Is that a curriculum pulled from somewhere? Like a CS course?

A good starting point, I think, is the CS50 Harvard course. However, you don’t actually need to learn Java and C++ as well. It’s worth dipping into C to understand how memory management works under the hood, but the rest you can do in Python.

Same for databases. Knowing a couple of SQL databases like PostgreSQL, MySQL and maybe the Microsoft one + writing and understanding queries is enough. You don’t need to learn NoSQL at all.

Since your goal is AI, you can cut down on a lot of fluff. It’s more important to get the CS fundamentals right.

mariodesu · June 13, 2023, 7:49pm

GPT-4

Ok great, how would you adjust Phase 2 accordingly then?

Of course it would help! Even just watching the thread. I think it’d help with my consistency and motivation as well

WeebPotato · June 13, 2023, 7:59pm

Which ChatGPT compiled from existing CS undergrad curricula. Like, the format looks bog-standard. That being said, I would recommend checking actual CS curricula to make sure ChatGPT is not mixing things up, and in general relying less on ChatGPT. Google-fu is an important skill for IT people in itself.

You could also ask ChatGPT for that. But more seriously, 1.4, 2.2 and the whole of 3 can be kicked out.

Udemy and Coursera have pretty decent courses and they tend to go on discount. CS50 from Harvard is free without grading, if I’m not mistaken.

sergiop · June 13, 2023, 8:29pm

ChatGPT is really, really incredibly good… at making up stuff that appears correct but is not. Example: it may fail to get the onyomi reading correct even when there is only a single onyomi reading.

pm215 · June 13, 2023, 8:30pm

I love the way it hedges about its training data only being up to 2021 and maybe the readings for the sutra have changed since then. But also, if you believe Wikipedia, the correct reading for 深 in that line of the sutra is indeed じん, so the problem is it doesn’t have the courage of its convictions when you press it on the matter…

WeebPotato · June 13, 2023, 9:01pm

in the context of the Heart Sutra

I think one improvement we don’t need in GPT-4 is it lying harder

yamitenshi · June 13, 2023, 9:03pm

In its defense, it’s not wrong

It’s just outside the context of the Heart Sutra as well

EDIT: oh wait, apparently it is wrong? Whelp

WeebPotato · June 13, 2023, 9:18pm

Arguably, things like that make it unintentionally good at philosophical discourse.

mariodesu · June 13, 2023, 9:19pm

I never blindly trust it, but I was a bit lazy this time: I provided a starting point and an end goal, and it gave back that entire roadmap.

I also have to check if MIT has anything on the subject

mariodesu · June 13, 2023, 9:20pm

Can anyone point out inaccuracies?

yamitenshi · June 13, 2023, 9:21pm

I can recommend CS50, took it myself years ago and it does teach some solid basics

mariodesu · June 13, 2023, 9:22pm

Can I ask, what background do you have on the subject?

yamitenshi · June 13, 2023, 9:28pm

Computer science? In a formal sense, not much, but I do write software for a living
