My OS X German word pool

I needed a list of correctly spelled German words to check some things for my BA thesis.
I couldn’t find a way to get all the words from the auto-correction function on OS X. I thought there might be a complete word list that I could simply read in and check against, but there doesn’t seem to be one.

So I googled and found out that the translation files ending in ‘.strings’, located in the ‘lproj’ folders, contain the translations used for internationalization. All I need to do is make a list of those files, then go through that list and extract the German translations. I don’t care about duplicate words because they can easily be filtered out later.
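To make the file format concrete: a ‘.strings’ file holds lines of the form "key" = "value"; plus comments. Here is a minimal sketch of pulling out only the value part. The sample entries and the function name extract_values are made up for illustration, and it assumes plain-text input (real files may be UTF-16 and would need converting first).

```shell
#!/bin/bash
# Hypothetical sample of a .strings file (entries are made up for illustration)
sample='/* Menu items */
"Open" = "Datei";
"Save As" = "Sichern unter";'

# Hypothetical helper: read .strings content on stdin and print only the
# translated value of each "key" = "value"; line (comments are skipped)
extract_values() {
  sed -n 's/^"[^"]*"[[:space:]]*=[[:space:]]*"\(.*\)";[[:space:]]*$/\1/p'
}

printf '%s\n' "$sample" | extract_values
```

Matching the whole "key" = "value"; pattern keeps multi-word keys such as "Save As" from leaking into the output.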

So here is my script:

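IFS=$'\n'   # split on newlines only: one file path / one line per iteration
for i in $(locate .lproj | grep "German" | grep "\.strings" | grep -v InfoPlist); do
  # skip comment lines (containing "/") and keep only key = value lines
  for line in $(grep --binary-files=text -v "/" "$i" | grep --binary-files=text "="); do
    IFS=" "   # split the cleaned-up line into single words
    for word in $(echo ${line//[\"-\+\.,:=;&\)\(%,@1234567890]/} | awk '{ $1=""; print }'); do
      # strip punctuation and digits, drop the key (first field), keep words longer than one character
      if [ ${#word} -gt 1 ]; then echo "$word" >> ~/Desktop/iMacDELang.txt; fi
    done
    IFS=$'\n'   # back to line splitting for the next file
  done
done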

For other languages, just replace “German” in the locate command with your desired language.

Execution takes about 5 minutes, and I end up with a list of (after filtering) 43756 unique German words.
This was enough for my first test, but later on I will really need a complete list of all words. That would be great.
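The filtering step is not part of the script. One possible way to do it, assuming filtering only means removing duplicate lines, is sort -u. The helper name unique_words and the demo words are made up for this sketch.

```shell
#!/bin/bash
# Hypothetical helper: print the unique words of a word list, one per line
unique_words() {
  # LC_ALL=C gives byte-wise, locale-independent sorting; -u drops duplicates
  LC_ALL=C sort -u "$1"
}

# Small self-contained demo with a throwaway file instead of the real list
demo=$(mktemp)
printf '%s\n' Datei Bearbeiten Datei Fenster Bearbeiten > "$demo"
unique_words "$demo"            # the unique words, sorted
unique_words "$demo" | wc -l    # how many unique words there are
rm -f "$demo"
```

On the real list this would be called as unique_words ~/Desktop/iMacDELang.txt. Case is kept as-is, since German nouns are capitalized and lowercasing would lose that.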
