
I know that they don’t literally mean a single word, but it got me thinking: is a space ever a valid character in a word? I’m guessing no, because that’s fundamentally how languages separate tokens.

However, while I’m no language person, I have wallowed in Unicode enough times to recognize that the “rules” of language are so wild, varying, and riddled with exceptions that basically anything is possible.



Yes: an open compound has two words separated by a space, but still has a unique meaning. Ice cream, common sense, never mind, etc.

FWIW, there are plenty of languages that don't use any spaces at all in their writing (Japanese, Chinese, Thai), and the use of spacing is quite fluid even in English. Is it a cellphone or a cell phone?


So technically in linguistics, the spoken language is considered the ground truth while the way the language is written is just a crude representation of the spoken language.

What this implies is that a space being used between two tokens doesn't actually make it two words. You can still consider that to be one word.

(Also, if you wanna go even more meta, there isn't even a widely agreed upon definition for what a word even is in linguistics!)


> (Also, if you wanna go even more meta, there isn't even a widely agreed upon definition for what a word even is in linguistics!)

To be even more precise, there are several competing definitions of what "word" should mean that don't always agree. E.g. if you go by the criterion "word" = lexeme (dictionary unit), then something like "kick the bucket" would be a "word".


Are you allowed to use "se" in Scrabble because it exists as a single word in "per se"?


No, "se" here is Latin in the same way that "exempli" and "gratia" are. You similarly can't use "kung" or "fu".

That said, you may not like this answer, but ultimately whatever's in the official dictionary (now known as Collins Scrabble Words / CSW) is what goes. "mein" as in "chow mein" is not valid (nor is "chowmein"), but "lomein" and "wonton" are.


As always, Scheme has a solution to the problem.

Symbols that contain space-separated words are enclosed in vertical bars, as in |this is a symbol|, while a bare space remains the token separator.

Now if we can just remove all other punctuation for s-expressions we'd finally have the start of a language fit for the digital age.
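
For anyone who wants to poke at that bar-quoting rule, here is a minimal sketch in Python (not Scheme itself). The `tokenize` name and the regex are assumptions for illustration only; a real Scheme reader also handles escapes inside the bars, which this ignores.

```python
import re

# Hypothetical pattern: a |...| run is one token; anything else is split on whitespace.
TOKEN = re.compile(r"\|[^|]*\||[^\s|]+")

def tokenize(text):
    """Return tokens, keeping bar-quoted symbols (spaces included) intact."""
    # Strip the surrounding bars from quoted symbols; leave bare tokens as-is.
    return [t[1:-1] if t.startswith("|") else t for t in TOKEN.findall(text)]

print(tokenize("ice cream |ice cream| brainrot"))
# ['ice', 'cream', 'ice cream', 'brainrot']
```

Unquoted, "ice cream" comes out as two tokens; inside the bars it stays one, which is roughly the open-compound distinction being argued about here.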


Space is arguably the absence of a character. It is only treated as a character for the sake of encoding standards in computing systems (or, actually, since typewriters).


The practice of placing spaces between words is connected to writing, so not some fundamental part of language. There are early Latin texts that don't have word dividers at all, for instance, and some languages used different symbols like dots.


> is a space ever a valid character in a word?

No, because then how do you define a word?

I suspect grammar has another term(!?) for the concept you're looking for.


In English, yes. In many other languages, no.


At least in English (I don't know other languages well enough to speak with any kind of knowledge), whenever you want a term containing 2+ words meant to be taken together, you use a hyphen. Compound words, I believe they are called.

From what I can tell, they should have spelled it brain-rot.


One minute on Wikipedia would have disabused you of that wrong notion:

> If the joining of the words or signs is orthographically represented with a hyphen, the result is a hyphenated compound (e.g., must-have, hunter-gatherer). If they are joined without an intervening space, it is a closed compound (e.g., footpath, blackbird). If they are joined with a space (e.g. school bus, high school, lowest common denominator), then the result – at least in English – may be an open compound.

https://en.wikipedia.org/wiki/Compound_(linguistics)


A general heuristic is: the single word is the noun form (e.g. "I made a backup"), the two-word spelling is the verb form (e.g. "back up your data"), and the hyphenated spelling is the adjectival form (e.g. "the back-up copy is over there").


>whenever you want a term containing 2+ words meant to be taken together, you use a hyphen. Compound words, I believe they are called.

So the shopping festival after Thanksgiving should be called "black-friday"?


> So the shopping festival after Thanksgiving should be called "black-friday"?

I would say this is a prime example, yes.


Prime-day?


Or just brainrot.

It'll win in the end. Hitting that spacebar is effort.


Yes, many words that started out as two words separated by a hyphen merge to become one as language evolves. Notebook is a good example of this; a more recent example would be email.


In German it happens even more quickly.


I thought you only used hyphens in German in exceptional circumstances like proper names in compounds? Karl-Marx-Allee etc.


Well, what I meant was that concepts made up of several words almost immediately become one word in German (without hyphens).



