Spoiler
Sooo…
I was away for some time. Congratulations on discovering the first step, @Masayoshiro!
I wanted to know if any other articles were related to the riddle, so I started by manually searching each article’s source code for comments. After 5 articles, seeing that there are 20 pages of articles in the Tofugu archives, I decided to automate it.
I’m divided about this, because I actually found something… It might look like cheating, because we haven’t solved the second step of B3 yet to get to the third step. Then again, anybody with enough spare time could have done the same thing without a script.
Script
import urllib.request
from bs4 import BeautifulSoup
from bs4 import Comment

def get_articles_urls(archive_page_number):
    urls = []
    if archive_page_number == 1:
        archive_page_suffix = ''
    else:
        archive_page_suffix = 'page/' + str(archive_page_number)
    archive_page_url = 'https://www.tofugu.com/archive/' + archive_page_suffix
    with urllib.request.urlopen(archive_page_url) as response:
        html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    article_tags = soup.select('li.article-index-item > a')
    for article_tag in article_tags:
        urls.append(article_tag.get('href'))
    return urls

def get_comments_of_article(article_url):
    with urllib.request.urlopen(article_url) as response:
        html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find_all(string=is_body_comment)

def is_body_comment(text):
    # There must be a better way to get comments of the body tag
    return isinstance(text, Comment) and any(parent.name == 'body' for parent in text.parents)

for page_number in range(1, 21):
    print("Archive page " + str(page_number))
    article_urls = get_articles_urls(page_number)
    for article_url in article_urls:
        absolute_article_url = 'https://www.tofugu.com' + article_url
        print(" Article " + absolute_article_url)
        comments = get_comments_of_article(absolute_article_url)
        print(" " + str(comments))
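On the "there must be a better way" comment in the script: one possible alternative, sketched here with only the standard library's html.parser instead of BeautifulSoup, is to track whether the parser is currently inside the body tag and collect comments only while it is. The names BodyCommentCollector and get_body_comments are made up for this sketch, and it assumes the page has explicit <body> and </body> tags and that the HTML has already been decoded to a str.

```python
from html.parser import HTMLParser


class BodyCommentCollector(HTMLParser):
    """Collect HTML comments that appear between <body> and </body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.comments = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", whitespace included
        if self.in_body:
            self.comments.append(data)


def get_body_comments(html):
    """Return the comments found inside the body of an HTML string."""
    parser = BodyCommentCollector()
    parser.feed(html)
    parser.close()
    return parser.comments
```

To plug it into the script, the bytes from response.read() would first need decoding, for example with html.decode('utf-8') (assuming the pages are UTF-8). Staying with BeautifulSoup, something like soup.body.find_all(string=lambda s: isinstance(s, Comment)) should also avoid walking the parents of every string.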
The raw results are pretty long, so I trimmed them down to only the articles that have comments in their body tag. Surprisingly, there are only four:
https://www.tofugu.com/japanese/beginner-japanese-textbook/
Comments
[' 9784789014410 89 41 3 12 31 ', ' 9784789014403 209 1 24 29 41 50 84 ', ' 4883196038 36 2 10 45 130 4 55 ', ' 1568363850 23 33 2 5 10 34 19 ', ' 4889962344 50 4 33 8 13 434 4 ', ' 9780300038347 168 19 1 6 5 5 5 10 9 ', ' 0887275494 33 98 2 55 34 9 10 ', ' 9780976998129 39 4 3 45 134 2 66 ', ' 1880656906 43 99 3 24 55 16 9 ', ' 4789004546 181 8 49 3 20 2 33 ']
https://www.tofugu.com/japanese/how-to-install-japanese-keyboard/
Comments
[" Durt Durt! Hmm, not bad, you're on the right track! "]
https://www.tofugu.com/reviews/genki-textbook/
Comments
['Durt Durt! Almost! ']
https://www.tofugu.com/reviews/dictionary-of-basic-japanese-grammar/
Comments
[' Durt Durt! Nice try! ']
I skimmed through the articles, but it’s getting late here so I need to sleep.
Here are my notes so far:
- The comments of the beginner-japanese-textbook article are just ISBN codes (some ISBN-10, some ISBN-13) for the books. Those must be legitimate comments with no link to the riddle.
- The comment of the dictionary-of-basic-japanese-grammar article might be the result of following a red herring, hence the “Nice try!”
- The comment of the genki-textbook article makes it seem like a real step, but I did not see anything that looks like a clue in the article.
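To double-check that the first number of each beginner-japanese-textbook comment really is an ISBN, the standard check-digit rules can be applied. This is only a rough sketch: the function names are made up here, and it assumes the ISBN is always the first whitespace-separated token of the comment. For ISBN-10, the digits weighted 10 down to 1 must sum to a multiple of 11; for ISBN-13, the digits weighted alternately 1 and 3 must sum to a multiple of 10.

```python
DIGITS = "0123456789"


def is_valid_isbn10(code):
    """ISBN-10: weights 10..1, total must be divisible by 11 ('X' means 10)."""
    if len(code) != 10 or any(c not in DIGITS for c in code[:9]):
        return False
    last = code[9]
    if last not in DIGITS and last not in "Xx":
        return False
    values = [int(c) for c in code[:9]] + [10 if last in "Xx" else int(last)]
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


def is_valid_isbn13(code):
    """ISBN-13: alternating weights 1 and 3, total must be divisible by 10."""
    if len(code) != 13 or any(c not in DIGITS for c in code):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(code))
    return total % 10 == 0


def classify_comment(comment):
    """Classify the first token of a comment (assumed to be the ISBN)."""
    tokens = comment.split()
    if not tokens:
        return "not an ISBN"
    if is_valid_isbn10(tokens[0]):
        return "ISBN-10"
    if is_valid_isbn13(tokens[0]):
        return "ISBN-13"
    return "not an ISBN"
```

Calling classify_comment on each string from the list above would show which comments start with a well-formed ISBN. It says nothing about what the numbers after the ISBN mean.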
I don’t know what to do with the articles yet, and we might not know before solving step 2 of B3, but they may give us some insight into what we’re looking for.