Germanic Lexicon Project
Message Board
Author: Matthew Carver
Email: thiudans at yahoo dot com
Date: 2004-10-13 21:42:11
Subject: Re: Global corrections (JJ)
sean:
That makes sense. I wonder if "jj" occurs anywhere other than as a misreading of thorn (þ).
-Matthew
> > Sean et al.
> >
> > Hm. I found the eth for oacute to be common in some environments, specifically
> > "mód" (doing bt_b0225, -26 right now). These were
> > reserved yesterday.
>
> Yeah, that doesn't surprise me; I know that the program doesn't catch every single case. My main worry is to make sure it didn't introduce some other kind of problem which I didn't notice.
>
> In the case of eájj-mðd, for example, the program wasn't able to correct the error because there is a second error in the word (jj should be þ). The program couldn't find the (nonexistent) word eájj-mód in the Toronto DOE corpus, so it didn't change the word.
>
> Hyphens and similar characters can be a problem. The program already handles things like upper/lower case and stripping punctuation from the beginning and end of a word. I could make it more complicated so that it handles hyphens as well, but the more complicated the program, the greater the chance of accidentally introducing some other kind of error.
>
> --Sean
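
For anyone following the thread, here is a rough sketch of the kind of corpus check Sean describes above. It is not his actual program. The word list, the function names, and the rule that a change is made only when the original form is missing from the list are all assumptions for illustration. It ignores case, strips punctuation at the start and end of a word, and swaps ð for ó only if the result is a known word.

```python
import string

# Hypothetical stand-in for a word list drawn from a reference corpus.
KNOWN_WORDS = {"mód", "eáþ-mód"}


def split_punctuation(token):
    """Split a token into (leading punctuation, core, trailing punctuation)."""
    core = token.strip(string.punctuation)
    if not core:
        return token, "", ""
    start = token.index(core)
    return token[:start], core, token[start + len(core):]


def correct_token(token, known_words=KNOWN_WORDS):
    """Replace eth with o-acute only when the corrected word is attested."""
    lead, core, trail = split_punctuation(token)
    if "ð" not in core.lower():
        return token
    candidate = core.replace("ð", "ó").replace("Ð", "Ó")
    # Assumption: leave the word alone if it is already attested as written.
    if core.lower() in known_words:
        return token
    # Only accept the change when the corrected form is in the word list.
    if candidate.lower() in known_words:
        return lead + candidate + trail
    return token


if __name__ == "__main__":
    for word in ["mðd", "Mðd,", "(mðd)", "eájj-mðd"]:
        print(word, "->", correct_token(word))
```

With this toy word list, "mðd" becomes "mód", but "eájj-mðd" is left alone: the candidate "eájj-mód" is not in the list because of the second error, which is the behavior described in the quoted message.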