
legalese - Is there a grammar rule that defines the properties of a legally accepted word?



I would like to know whether there is a grammar rule (or set of rules) that defines whether a word is grammatically legal. I understand that words are given meaning by humans and that anyone can give meaning to anything, so it is probably impossible to create a set of laws that absolutely defines the legality of a string of letters. Barring that extreme, is there a practical, general set of such rules?


For example, I remember my grade 2 teacher saying that a word that does not contain at least one vowel is not a legal word. Based on that principle, I might claim that 'lkjsdlf' is not a legal word.
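As a rough sketch of that rule in code (treating only a, e, i, o, u as vowels is an assumption made here, and `has_vowel` is just an illustrative name):

```python
# Assumption: only a, e, i, o, u count as vowels; "y" is deliberately excluded.
VOWELS = set("aeiou")


def has_vowel(word: str) -> bool:
    """Return True if the word contains at least one of a, e, i, o, u."""
    return any(ch in VOWELS for ch in word.lower())


print(has_vowel("lkjsdlf"))
print(has_vowel("apple"))
```

This rejects 'lkjsdlf' and accepts 'apple', but it also rejects 'rhythm' and accepts any nonsense string that happens to contain a vowel, so the rule on its own is too crude.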


Is there a generally accepted set of grammatical parameters that define whether a word is legal or not (apart from looking it up in a dictionary)?


The reason I'm asking this is to determine if it's possible to programmatically validate a word (rather than using a list of 100,000+ words from a dictionary). The goal is to categorize 'lkjsdlf' and 'apple' as 'invalid' and 'valid' respectively.



Answer



This is not so much a grammar rule, but people have analysed how frequently letter combinations of various lengths occur in samples of English text, and then used those frequencies to randomly generate a kind of pseudo-English.


I'm not sure where I originally saw this (I think it was a little more scholarly), but here's an example of someone's generated pseudo-English: http://ibbly.com/Pseudo-words.html


and here's someone else's attempt: http://www.fourteenminutes.com/fun/words/


But you could use the same frequency data to quantify how typically "English" a word is, i.e. how probable it is as a word in English.
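A minimal sketch of that scoring idea, using letter pairs (bigrams) only. Everything named here is an assumption for illustration: `SAMPLE_WORDS` is a tiny made-up word list standing in for a real sample of English text, `train` and `score` are hypothetical helper names, and add-one smoothing is just one simple way to handle letter pairs that never appear in the sample.

```python
import math
from collections import Counter

# Assumption: a tiny illustrative sample. Real frequency data would come
# from a large body of English text, not twenty hand-picked words.
SAMPLE_WORDS = [
    "apply", "people", "table", "happy", "simple", "ample", "plate",
    "paper", "angle", "little", "about", "after", "again", "place",
    "please", "maple", "able", "and", "are", "ape",
]

# 26 letters plus the end-of-word marker "$" can follow any symbol.
NEXT_SYMBOLS = 27


def train(words):
    """Count letter pairs, using "^" and "$" as word-boundary markers."""
    pair_counts = Counter()
    context_counts = Counter()
    for word in words:
        padded = "^" + word.lower() + "$"
        for first, second in zip(padded, padded[1:]):
            pair_counts[first + second] += 1
            context_counts[first] += 1
    return pair_counts, context_counts


def score(word, model):
    """Average log-probability per letter transition (higher = more typical)."""
    pair_counts, context_counts = model
    padded = "^" + word.lower() + "$"
    total = 0.0
    for first, second in zip(padded, padded[1:]):
        # Add-one smoothing so unseen pairs get a small non-zero probability.
        prob = (pair_counts[first + second] + 1) / (
            context_counts[first] + NEXT_SYMBOLS
        )
        total += math.log(prob)
    return total / (len(padded) - 1)


MODEL = train(SAMPLE_WORDS)

for candidate in ("apple", "lkjsdlf"):
    print(candidate, round(score(candidate, MODEL), 3))
```

With this sample, 'apple' scores higher than 'lkjsdlf' even though 'apple' is not in the training list. Turning the score into a valid/invalid decision would still need a threshold chosen against real data, and longer letter combinations would probably discriminate better than pairs.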


Of course, there's more to words than just an unstructured letter sequence, as @curiousdannii has pointed out, so further considerations are possible in this kind of analysis.

