One issue holding back the adoption of Hepburn has been that the standard national curriculum (gakushū shidō yōryō) calls for all children to be taught romaji beginning in the third grade (previously fourth grade) of elementary school. It's taught in Kokugo (national language, i.e., Japanese) classes and included in those textbooks, as romaji is used in Japanese writing alongside kana and kanji and, increasingly, in daily life (user names, passwords, etc.). At that age, native speakers of Japanese can acquire kunreishiki more easily, as its consonant representation corresponds more closely to the Japanese phonology they have internalized.
It doesn't sound like a lot to me, either. I have known many people who moved to another country for graduate study. Some of them ended up settling in that country, but others pursued further study or employment in yet other countries. And perhaps the largest group among my acquaintances are those who eventually moved back to their home countries. They feel more comfortable there, they have family there, or, in many cases, returning home is what they intended to do all along.
Long-time Tokyo/Yokohama resident here. I’m basically the same: Especially if I’m by myself and near a train station or retail area, I just walk around to see what’s available and choose someplace to eat. Only if I am planning a meal with others do I look for options online, and then, in addition to Google and Apple maps, I also use sites such as tabelog.com and restaurant.ikyu.com.
I haven’t been outside Japan for nearly a decade so I can’t compare it with other countries, but my impression is that Japan has more small restaurants than some other places. It’s not unusual to go into a ramen, curry, gyoza, soba, or other eating place with fewer than a dozen seats and staffed by just one or two people.
The existence of such small places increases the eating-out options. I don’t know why such small food businesses are viable here but not elsewhere; perhaps regulatory frameworks (accessibility, fire, health, tax, labor, etc.) play a role.
Totally. We’re definitely lucky over here. From my talks with people in the restaurant industry in NA, it’s just extremely expensive to start a business, on top of the regulatory restrictions that you’ve mentioned. And obviously the holy grail of money making - liquor. I can get beer in almost every random ramen shop near me. It takes months/years of approval to open a place with a liquor license in Vancouver, Canada. Margins on alcohol are huge, and that gives breathing room on top of the thin margins places make on food.
About 20 hours after the earthquake, the University of Tokyo sent out a follow-up advisory to faculty, students, and staff [1, scroll down for English]. This part hit home with me:
“The ‘follow-up earthquake advisory for the Hokkaido and Sanriku Coastal regions’ was established following the earthquake (M7.3) that occurred off the coast of Sanriku on March 9, 2011, two days prior to the Great East Japan Earthquake (Tōhoku Region Pacific Offshore Earthquake) that occurred on March 11, 2011.”
I was eating lunch in a fourth-floor restaurant in Nihonbashi, Tokyo, on March 9, 2011, when that preliminary tremor occurred. I had felt many earthquakes before, but that one seemed different: longer, slower, creepier. It didn’t cause any damage, but I often recalled it after the much bigger one struck two days later. (I missed the March 11 quake, as I happened to leave for Osaka just a few hours before it hit. My office back in Tokyo was damaged, though.)
Same here. And I've been old-guy grumbling for years now about kids-these-days getting into vinyl and other retro technology that I was happy to be rid of.
> I've tried reading some Japanese-language books meant for kids around age 10...
I’m not sure at what target age kids’ books stop using word spacing, but books for younger children generally use it. Nevertheless, if you are used to seeing words written in kanji, even with word spacing an all-hiragana text can still trip you up, for the reasons you noted.
Side comment: Something I haven’t seen remarked on much is how Japanese can be easier for small children to start reading than English is because of the nearly one-to-one correspondence between character and sound for kana. My two daughters and now my six-year-old grandson have all grown up with Japanese as their first language, and they all started reading hiragana-only children’s books earlier and more easily than I, at least, learned to read English when I was a child. My grandson has also picked up katakana on his own; he is into dinosaurs and his picture books give the names of dinosaurs in katakana.
I am unsure myself whether we should regard LLMs as mere token-predicting automatons or as some new kind of incipient intelligence. Despite their origins as statistical parrots, the interpretability research from Anthropic [1] suggests that structures corresponding to meaning do exist inside those bundles of numbers and that there are signs of activity within those bundles of numbers that seem analogous to thought.
That said, I was struck by a recent interview with Anthropic’s Amanda Askell [2]. When she talks, she anthropomorphizes LLMs constantly. A few examples:
“I don't have all the answers of how should models feel about past model deprecation, about their own identity, but I do want to try and help models figure that out and then to at least know that we care about it and are thinking about it.”
“If you go into the depths of the model and you find some deep-seated insecurity, then that's really valuable.”
“... that could lead to models almost feeling afraid that they're gonna do the wrong thing or are very self-critical or feeling like humans are going to behave negatively towards them.”
Amanda Askell studied under David Chalmers at NYU: the philosopher who coined "the hard problem of consciousness" and is famous for taking phenomenal experience seriously rather than explaining it away. That context makes her choice to speak this way more striking: this isn't naive anthropomorphizing from someone unfamiliar with the debates. It's someone trained by one of the most rigorous philosophers of consciousness, who knows all the arguments for dismissing mental states in non-biological systems, and is still choosing to speak carefully about models potentially having something like feelings or insecurities.
A person can study fashion extensively under the best designers; they can understand tailoring and fit and have a phenomenal eye for color and texture.
Their vivid descriptions of what the Emperor could be wearing don't make said emperor any less nakey.
>research from Anthropic [1] suggests that structures corresponding to meaning exist inside those bundles of numbers and that there are signs of activity within those bundles of numbers that seem analogous to thought.
Can you give some concrete examples? The link you provided is kind of opaque
>Amanda Askell [2]. When she talks, she anthropomorphizes LLMs constantly.
She is a philosopher by trade, and she describes her job (model alignment) as being, literally, to ensure models "have good character traits." I imagine that explains a lot.
Excerpt: “We found that there’s a specific combination of neurons in Claude’s neural network that activates when it encounters a mention (or a picture) of this most famous San Francisco landmark.”
Excerpt: “Recent research on smaller models has shown hints of shared grammatical mechanisms across languages. We investigate this by asking Claude for the ‘opposite of small’ across different languages, and find that the same core features for the concepts of smallness and oppositeness activate, and trigger a concept of largeness, which gets translated out into the language of the question.”
Excerpt: “Our new research provides evidence for some degree of introspective awareness in our current Claude models, as well as a degree of control over their own internal states.”
It’s important to note that these “research papers” Anthropic releases are not properly peer-reviewed and have not been accepted by any scientific journal or institution. Anthropic has a history of exaggerating its research, and it has an obvious monetary incentive to continue to do so.
My fridge happily reads inputs without consciousness, has goals and takes decisions without "thinking", and consistently takes action to achieve those goals. (And it's not even a smart fridge! It's the one with a copper coil or whatever.)
I guess the cybernetic language might be less triggering here (talking about systems and measurements and control), but it's basically the same underlying principles. One is just "human flavored" and is therefore more prone to invite unhelpful lines of thinking?
Except that the "fridge" in this case is specifically and explicitly designed to emulate human behavior so... you would indeed expect to find structures corresponding to the patterns it's been designed to simulate.
Wondering if it's internalized any other human-like tendencies — having been explicitly trained to simulate the mechanisms that produced all human text — doesn't seem too unreasonable to me.
> the interpretability research from Anthropic [1] suggests that structures corresponding to meaning do exist inside those bundles of numbers and that there are signs of activity within those bundles of numbers that seem analogous to thought
I did a simple experiment - took a photo of my kid in the park, showed it to Gemini and asked for a "detailed description". Then I took that description and put it into a generative model (Z-Image-Turbo, a new one). The output image was almost identical.
So one model converted the image to text, and the other reversed the process. The photo was completely new, personal, never put online, so it was not in any training set. How did these two models do it if not by actually using language like a thinking agent?
I have used LLMs heavily for work for about six months. I see almost zero "thought" going on and a LOT of pattern matching. You can use this to your advantage once you understand it. If you're relying on it to "think", disaster will ensue. At least that's been my experience.
I've completely given up on using LLMs for anything more than a typing assistant / translator and maybe an encyclopedia when I don't care about correctness.
the anthropomorphization (say that 3 times quickly) is kinda weird, but also makes for a much more pleasant conversation imo. it's kinda tedious being pedantic all the time.
It also leads to fundamentally wrong conclusions: a related issue I have with this is the use of anthropomorphic shorthand when discussing international politics. You've heard phrases like "the US thinks...", "China wants...", "Europe believes..." so often that you don't even notice them.
All useful shorthands, all of which lead people to display fundamental misunderstandings of what they're talking about - e.g., expressing surprise that a nation of millions doesn't display consistency of behavior on human-lifetime scales, even though the mechanisms of government are fairly obviously churning their makeup constantly and, depending on the context, may be made up of entirely different people.
It seems obvious to me that entities have emergent needs and plans and so on, independent of any of the humans inside.
For example, if you've worked at a large company, one of the little tragedies is when someone everyone likes gets laid off. There were probably no people who actively wanted Bob to lose his job. Even the CEO/Board who pulled the trigger probably had nothing against Bob. Heck, they might be the next ones out the door. The company is faceless, yet it wanted Bob to go, because that apparently contributed to the company's objective function. Had the company consisted entirely of different people, plus Bob, Bob might have been laid off anyway.
There is a strong will to do ... things that emerges from large structures of people and technology. It's funny like that.
I once spent a day debugging some data that came from an English doc written by someone in Japan that had been pasted into a system and caused problems. Turned out to be an en-dash issue that was basically invisible to the eye. No love for en-dash!
Compiler error while working on some ObjC. Nothing obviously wrong. Copy-pasted the line, same thing on the copy. Typed it out again, no issue with the re-typed version. Put the error version and the ok version next to each other, apparently identical.
I ended up discovering I'd accidentally leaned on the Option key while pressing the "-": in Xcode's monospace font, the em-dash and the minus sign looked identical.
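Not from the original comments, but a quick way to catch this kind of lookalike character is to dump the code points of any non-ASCII characters in the suspect line. A minimal Python sketch, with a made-up example string:

```python
import unicodedata

# Hypothetical example: a line that looks like plain ASCII but hides an
# en dash (U+2013) where a hyphen-minus was intended.
suspect = "total = price \u2013 discount"

# Print the offset, code point, and Unicode name of every non-ASCII character.
for i, ch in enumerate(suspect):
    if ord(ch) > 0x7F:
        print(f"offset {i}: U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

The same trick works for spotting stray em dashes, curly quotes, and the thin/hair spaces discussed below.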
Em dashes used as parenthetical dividers, and en dashes used as word joiners, are usually set continuous with the text. However, such a dash can optionally be surrounded with a hair space (U+200A, HTML named entity &hairsp;) or a thin space (U+2009, &thinsp;). These spaces are much thinner than a normal space (except in a monospaced, non-proportional font), with the hair space in particular being the thinnest of the horizontal whitespace characters.
1. (letterpress typography) A piece of metal type used to create the narrowest space.
2. (typography, US) The narrowest space appearing between letters and punctuation.
Now I'd like to see what the metal type looks like, but ehm... it's difficult to google.
Also a whole collection of space types and what they're called in other languages.
In German you use en-dashes with spaces, whereas in English it’s em-dashes without spaces. Some people dislike em-dashes in English though and use en-dashes with spaces as well.
In English, em-dashes used to separate appositives/parentheticals are typically set without spaces or with thin spaces (though that style isn't universal even in professional print; there are places that set them open, and en-dashes set open can also be used in this role); when representing an interruption, they generally have no space before but frequently have a space following. And other uses have other patterns.
In British English, en-dashes with spaces are more common than em-dashes without spaces, I think, but I don't have any data for that, just a general impression.
Double hyphen is replaced in some software with an en-dash (and in those, a triple hyphen is often replaced with an em-dash), and in some with an em-dash; it's usually used (other than as input to one of those pieces of software) in places where an em-dash would be appropriate, but in contexts where both an em-dash set closed and an en-dash set open might be used, it is often set open.
So, it's not unambiguously a substitute for either; it is essentially its own punctuation mark, used in ASCII-only environments, with some influence from both the use of em-dashes and that of en-dashes in more formal environments.
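For what it's worth, here is one illustrative way such a "smart dash" replacement could be implemented (a sketch only; real word processors vary in their exact rules):

```python
import re

def smart_dashes(text: str) -> str:
    """Illustrative scheme only: triple hyphen -> em dash, double hyphen -> en dash."""
    text = re.sub(r"(?<!-)---(?!-)", "\u2014", text)  # em dash
    text = re.sub(r"(?<!-)--(?!-)", "\u2013", text)   # en dash
    return text

# "pages 10--20 --- roughly half the book" becomes an en dash between the
# numbers and an em dash before the aside.
print(smart_dashes("pages 10--20 --- roughly half the book"))
```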
I have used a dash - like that - for almost 20 years, 100% of the time where I ought to use a semicolon and about half of the time in place of commas - it lets me just keep talking about things; the comma is a harder pause. I've recently started seriously writing at a literary level, and I have fallen in love with the em dash - it has a fantastic function within established professional writing, where it is used often - it's why the AI uses it so much.
Apparently, it's not only the em-dash that's distinctive. I went through the leader's comments and spotted that he also uses the curly apostrophe "’" instead of the straight one.
I added Opus 4.5 to my benchmark of 30 alternatives to your now-classic pelican-bicycle prompt (e.g., “Generate an SVG of a dragonfly balancing a chandelier”). Nine models are now represented:
I was about to say the same; suspiciously good, even. Feels like it's either memorised a bunch of SVG files, or has a search tool and is finding complete items off the web to include either in whole or in part.
Given that it also sometimes goes weird, I suspect it's more likely to be the former.
While the latter would be technically impressive, it's also the whole "this is just collage!" criticism that diffusion image generators faced from people that didn't understand diffusion image generators.