Mokuro: Read Japanese manga with selectable text inside a browser

Nothing on that website will install anything on your PC, but you still need to process the files using my fork of mokuro before uploading them.

I’ve actually been running mokuro on Google Colab recently. I don’t know how the service is free or if it’s region-locked, but it works great and actually seems faster than running it locally. Plus, running it on Colab means you don’t need to install mokuro on your machine. If you want to give it a try, I wrote up these instructions for my spouse to use (that link will take you to the Colab site).

1 Like

I think it needs mokuro’d files uploaded. At least if I’m reading this right:

2 Likes

Hello :eyes:

Just wondering, how long does it usually take mokuro to run on one volume of a manga?
Cause mine looks like it has an estimated time of 9 hours and that just doesn’t feel right :eye::mouth::eye:

2 Likes

For a 250-page volume running on a computer’s CPU, it shouldn’t take longer than maybe half an hour (unless your CPU came out in the ’90s).

Is it running on the CPU or GPU for you?

3 Likes

Well, that’s sad.
It’s on CPU. I’m not sure how to switch it, but I assume GPU would be faster.

2 Likes

More info on using the GPU:

Did you install PyTorch as described there? If installed correctly, it’s supposed to just use the GPU automatically. Unless you’re using Apple Silicon and running Mokuro inside of a Docker container (as I found out to my disappointment).

4 Likes

How do you actually see if it runs on the CPU vs GPU?

2 Likes

I believe it says “CPU” in one of the initial messages when it’s using the CPU.

And then for GPU, I believe it will always reference “CUDA”:

2025-06-07 08:50:17.570 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2025-06-07 08:50:20.162 | INFO | manga_ocr.ocr:__init__:19 - Using CUDA
2025-06-07 08:50:21.250 | INFO | manga_ocr.ocr:__init__:32 - OCR ready
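If you save the console output to a file, a few lines of Python can pull the device out of it. This is only a rough sketch: the function name `device_from_log` is made up for illustration, and it assumes the device is always reported on a `manga_ocr.ocr` line of the form "Using <something>", as in the log above.

```python
import re

# Matches the manga_ocr line that names the device, e.g. "... - Using CUDA".
DEVICE_LINE = re.compile(r"manga_ocr\.ocr:\w+:\d+ - Using (\w+)")


def device_from_log(log_text):
    """Return the device word after "Using", or None if no such line exists."""
    match = DEVICE_LINE.search(log_text)
    return match.group(1) if match else None


sample = """\
2025-06-07 08:50:17.570 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2025-06-07 08:50:20.162 | INFO | manga_ocr.ocr:__init__:19 - Using CUDA
2025-06-07 08:50:21.250 | INFO | manga_ocr.ocr:__init__:32 - OCR ready
"""

print(device_from_log(sample))  # CUDA
```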

3 Likes

I’d assume it would show OpenCL or something for AMD cards

4 Likes

Thank you for that. I was sure I did that when I first installed mokuro, but it turns out I had some PATH problems and version mismatches going on.

It’s using the GPU now and going much, much faster! :grinning_face_with_smiling_eyes:

7 Likes

Mine spits out a 50 line long dump of what I’m assuming is training data when it runs on GPU, so it’s very obvious to me. (It also does say CUDA).

2 Likes

Interesting. Here’s what I get (in this case, running two folders through it):

2025-06-07 08:50:17.186 | INFO     | mokuro.run:run:43 - Processing 1/2: /home/chris/Books/Comics/Japanese/老女的少女ひなたちゃん (7)
Processing pages...:   0%|                                                                                  | 0/164 [00:00<?, ?it/s]
2025-06-07 08:50:17.220 | INFO     | mokuro.manga_page_ocr:__init__:30 - Initializing text detector
2025-06-07 08:50:17.570 | INFO     | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
2025-06-07 08:50:20.162 | INFO     | manga_ocr.ocr:__init__:19 - Using CUDA
2025-06-07 08:50:21.250 | INFO     | manga_ocr.ocr:__init__:32 - OCR ready
Processing pages...: 100%|████████████████████████████████████████████████████| 164/164 [05:11<00:00,  1.90s/it]
2025-06-07 08:55:29.058 | INFO     | mokuro.run:run:43 - Processing 2/2: /home/chris/Books/Comics/Japanese/老女的少女ひなたちゃん (8)
Processing pages...: 100%|████████████████████████████████████████████████████| 164/164 [05:05<00:00,  1.86s/it]
2025-06-07 09:00:34.258 | INFO     | mokuro.run:run:51 - Processed successfully: 2/2
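As a quick sanity check on an estimate, the total time is roughly pages times seconds per page. The sketch below uses the 1.90 s/it figure from the log above; the helper names are made up for illustration.

```python
def estimate_seconds(pages, seconds_per_page):
    """Rough total processing time for one volume."""
    return pages * seconds_per_page


def format_duration(seconds):
    """Format a number of seconds as H:MM:SS, dropping fractions of a second."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:d}:{minutes:02d}:{secs:02d}"


# 164 pages at 1.90 s per page, as in the first volume of the log.
print(format_duration(estimate_seconds(164, 1.90)))  # 0:05:11
```

By the same arithmetic, a 9-hour estimate for a 250-page volume works out to about 130 seconds per page.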

2 Likes

i think i’m just an idiot, i’ve followed the instructions and i’m getting this

'mokuro' is not recognized as an internal or external command,
operable program or batch file.

okay, i need to run it from the python scripts folder?

now I’m just getting invalid path

okay, just copy the volumes into the script folder, I guess

You need to add the folder where pip keeps its executables to your Windows PATH. Just run

python3 -m site --user-base

The folder the executables get installed to is the “Scripts” folder within that, something like

C:\Users\YourUsername\AppData\Roaming\Python\Python311\Scripts

Search for “environment variables” in the Windows Start menu and pick the “Edit the system environment variables” result. In the System Properties window that opens, click “Environment Variables…”.

You should get a large list of variables with their values. Find PATH (I’m pretty sure either the user variables or the system variables will work for this, but it might not be present in one of them), double-click it, then add the path you got above. Restart your command line and it should allow you to run mokuro now.

Alternatively just do “python -m mokuro” instead, whichever you prefer.
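If you would rather not dig through folders by hand, Python can report where user-level scripts go and whether that folder is already on PATH. This is a sketch under the assumption that mokuro was installed with `pip install --user`; a system-wide or virtual-environment install puts the executable somewhere else. The helper names are made up for illustration.

```python
import os
import shutil
import sysconfig


def user_scripts_dir():
    """Folder where `pip install --user` places console scripts."""
    return sysconfig.get_path("scripts", f"{os.name}_user")


def on_path(folder):
    """True if `folder` is one of the entries in the PATH variable."""
    target = os.path.normcase(os.path.normpath(folder))
    entries = os.environ.get("PATH", "").split(os.pathsep)
    return any(os.path.normcase(os.path.normpath(e)) == target for e in entries if e)


scripts = user_scripts_dir()
print("User scripts folder:", scripts)
print("Already on PATH:", on_path(scripts))
# shutil.which returns None when the command cannot be found on PATH.
print("mokuro found at:", shutil.which("mokuro"))
```

If the last line prints `None`, the shell cannot find mokuro either, which matches the “not recognized” error.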

2 Likes

That’s a reasonable assumption, but I just upgraded to a ROCm-supported AMD GPU and have tested Mokuro with ROCm-powered PyTorch. It says “Using CUDA” in my console, so it seems to report any PyTorch GPU backend as CUDA.
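For anyone who wants to tell the two apart: ROCm builds of PyTorch expose the GPU through the same `torch.cuda` API, but they set `torch.version.hip`, which is `None` on CUDA builds. A minimal sketch, where `describe_torch_backend` is a made-up helper name and the function falls back gracefully if PyTorch is not installed:

```python
import importlib.util


def describe_torch_backend():
    """Describe which backend the installed PyTorch would use, as a string."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch

    if not torch.cuda.is_available():
        return "CPU only (torch.cuda.is_available() is False)"
    # ROCm builds set torch.version.hip; CUDA builds leave it as None.
    hip = getattr(torch.version, "hip", None)
    if hip:
        return f"AMD GPU via ROCm/HIP {hip} (reported through the torch.cuda API)"
    return f"NVIDIA GPU via CUDA {torch.version.cuda}"


print(describe_torch_backend())
```

Run it with the same Python that runs mokuro, otherwise you may be checking a different PyTorch install.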

I ended up making a script for this, since disabling the JS manually each time was a pain.

Script
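#!/usr/bin/env lua

-- Reads the mokuro HTML on stdin and writes the patched HTML to stdout.

-- Extra CSS: show every page stacked vertically and hide the links and top menu.
local css = [[
	body{
		overflow: scroll;
	}
	.page{
		display: flex !important;
		margin-bottom: 10px;
	}
	#pagesContainer{
		display: flex;
		flex-direction: column;
	}
	a, #topMenu{
		display: none !important;
	}
	.textBox{
		opacity: 0.75;
	}
]]

-- Injected at the start of each <script>: calling this undefined function
-- throws, so the rest of that script never runs.
local js = [[
	pleaseDie();
]]

local html = io.stdin:read("*a")

-- Append the CSS just before each closing </style> tag.
html = html:gsub("</style>", css .. "</style>")
-- Prepend the failing call right after each opening <script> tag.
html = html:gsub("<script>", "<script>" .. js)

io.stdout:write(html)

1 Like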

hey, just wanted to let you know @ChristopherFritz that I wasted an entire evening playing around with this tool and doing different things to make it work, also, Maciej Budyś, thank you very much for having created this tool :sob:

let me expand on my spiral into insanity

first, i built a web scraper that tackles an entire category of things on 一迅プラス, all of the manga that is either free or paid for by the user

something like:

the png files are scrambled to prevent people from scraping them: each page is cut into a 4x4 grid of rectangular tiles with the rows and columns swapped (the first column of tiles should be the first row of tiles), so you need to unscramble them:

Luckily, the algorithm for this is fairly simple; we’re just doing something like:

    // Method on the scraper object; this.retry and this.IMAGE_QUALITY are defined elsewhere on it.
    unscrambleImage(imageUrl, fullWidth, fullHeight, pageNumber) {
        return this.retry(() => new Promise((resolve, reject) => {
            const img = new Image();
            img.crossOrigin = "Anonymous";
            img.src = imageUrl;
            img.onload = () => {
                // The page is a 4x4 grid; tile sizes are rounded down to a multiple of 8 px.
                const division = 4, tileConstraint = 8;
                const potentialTileWidth = Math.floor(fullWidth / division);
                const actualTileWidth = Math.floor(potentialTileWidth / tileConstraint) * tileConstraint;
                const scrambledWidth = actualTileWidth * division;
                const potentialTileHeight = Math.floor(fullHeight / division);
                const actualTileHeight = Math.floor(potentialTileHeight / tileConstraint) * tileConstraint;
                const canvas = document.createElement('canvas');
                canvas.width = fullWidth;
                canvas.height = fullHeight;
                const ctx = canvas.getContext('2d');
                // Swap rows and columns: the tile at (col, row) is drawn at (row, col).
                for (let i = 0; i < division * division; i++) {
                    const col = i % division, row = Math.floor(i / division);
                    ctx.drawImage(img,
                        col * actualTileWidth, row * actualTileHeight, actualTileWidth, actualTileHeight,
                        row * actualTileWidth, col * actualTileHeight, actualTileWidth, actualTileHeight);
                }
                // Copy any leftover strip to the right of the tile grid unchanged.
                if (fullWidth > scrambledWidth) {
                    ctx.drawImage(img, scrambledWidth, 0, fullWidth - scrambledWidth, fullHeight, scrambledWidth, 0, fullWidth - scrambledWidth, fullHeight);
                }
                // Export the canvas as a JPEG blob named after the zero-padded page number.
                canvas.toBlob(blob => {
                    blob ? resolve({ filename: `page_${String(pageNumber).padStart(3, '0')}.jpg`, blob }) : reject(new Error(`Canvas toBlob failed for ${imageUrl}`));
                }, 'image/jpeg', this.IMAGE_QUALITY);
            };
            img.onerror = () => reject(new Error(`Failed to load image: ${imageUrl}`));
        }));
    },
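To make the tile swap easier to see without a canvas, here is the same coordinate math as a small Python sketch. It only computes where each tile moves and does no image work; the function name `tile_moves` is made up for illustration, and the defaults mirror the 4x4 grid and 8-pixel rounding in the code above.

```python
def tile_moves(full_width, full_height, division=4, tile_constraint=8):
    """Return (tile_w, tile_h, moves); each move is ((src_x, src_y), (dst_x, dst_y))."""
    # Tile sizes are rounded down to a multiple of tile_constraint pixels.
    tile_w = (full_width // division) // tile_constraint * tile_constraint
    tile_h = (full_height // division) // tile_constraint * tile_constraint
    moves = []
    for i in range(division * division):
        col, row = i % division, i // division
        src = (col * tile_w, row * tile_h)
        # Rows and columns are swapped: the tile at (col, row) lands at (row, col).
        dst = (row * tile_w, col * tile_h)
        moves.append((src, dst))
    return tile_w, tile_h, moves


tile_w, tile_h, moves = tile_moves(1000, 1500)
print(tile_w, tile_h)  # 248 368
print(moves[1])        # ((248, 0), (0, 368))
```

The 1000x1500 page size is just an example. Tiles on the diagonal, like the first one, stay where they are.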

then, when we have the image unscrambled:


we need to build the mechanism to scrape eeeverything from the website, the whole algorithm can be found in this Gist I made:

I added a ton of flexibility into it where you can choose which series you want to scrape, from which categories, and so on:


And then boom, scraping takes place:

The scraper compresses everything into .cbz files with jpegs inside of them:


…that you can later use Mokuro on:

and it… just works.
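For reference, a .cbz is just a ZIP archive of page images, so packing one takes only the standard library. This is a sketch and not the scraper's actual code: `pack_cbz` is a made-up name, and it assumes the pages sit in one folder as `.jpg` files whose names sort into reading order (like the zero-padded `page_001.jpg` names above).

```python
import zipfile
from pathlib import Path


def pack_cbz(image_dir, cbz_path):
    """Pack every .jpg in image_dir into cbz_path and return the names written."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    # JPEGs are already compressed, so store them without recompressing.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for image in images:
            # Store each page at the archive root under its own file name.
            archive.write(image, arcname=image.name)
    return [image.name for image in images]
```

Calling `pack_cbz("volume_01", "volume_01.cbz")` would bundle a folder of pages into one archive; both names are placeholders.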

im so good at this entire manga thing

this was more of a proof of concept slash getting bored thing, but I think I might have scraped thousands, thooousands of pages today :sob:

ahhh it was fun :sob:

8 Likes

@GolyBidoof, the Mokuro arc!

3 Likes

BRO what you’re actually really cool ??? WOAAH
Gah so much to read tho * sob *

2 Likes

so random question - when I originally did mokuro files were html and I didn’t need an internet connection to read them…so I could be camping w/o any internet for days and happily be sitting under a tree in the forest reading on the tablet or in the tent…

but now with the new mokuro format you have to go to the app page which is fine but that doesn’t seem to work if you don’t have an internet connection…

anyone know of a way to make “reader.mokuro.app” work w/o an internet connection? this might be a stupid question but … figured I’ll ask anyway