Germanic Lexicon Project
Message Board
Author: Keith Briggs
Date: 2004-11-08 09:46:59
Subject: Re: Probabilistic correction
> I wonder if that is a reason why you were able to get such a high level of accuracy in distinguishing corrected from uncorrected pages.
It might be simpler than that: each corrected page contains only features already seen in the training phase for the c model, while each uncorrected page contains at least one feature not seen in training.
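To make that explanation concrete, here is a rough Python sketch of a classifier that does nothing more than check for unseen features. It assumes character bigrams as the features, which may not match what dbacl actually extracts, and the function names and sample strings are invented for illustration only.

```python
def features(text, n=2):
    """Return the set of character n-grams in text (assumed feature type)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def train(corrected_pages):
    """Collect every feature that occurs in the corrected training pages."""
    seen = set()
    for page in corrected_pages:
        seen |= features(page)
    return seen


def unseen_features(page, seen):
    """Features of the page that never occurred in the training data."""
    return features(page) - seen


def looks_corrected(page, seen):
    """A page counts as corrected if it has no unseen feature at all."""
    return not unseen_features(page, seen)


# Toy strings made up for this sketch, not real dictionary text.
seen = train(["gan tó", "cuman tó", "tó him"])
print(looks_corrected("gan tó", seen))   # nothing new in this page
print(unseen_features("gan tþ", seen))   # the bigram "tþ" is new
```

If this is really all that is going on, the high accuracy says more about the training data covering the corrected pages than about the model itself.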
> I guess it is some reassurance to me that I'm not the only one scratching my head over this. It seems like there has to be some way to do some sort of useful correction using a probabilistic model.
Yes - or, at least, marking lines likely to contain an error. In some ways, the problem looks simple: we know that tþ is never correct and almost certainly should be tó, and we should not need to program that correction by hand - a suitable probabilistic model can detect it for us.
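As a sketch of what marking suspicious lines might look like, the Python below trains a character bigram model with add-one smoothing on corrected text, flags any line whose least likely bigram falls below a threshold, and proposes the most common bigram with the same first character as a replacement. This is only an illustration under my own assumptions, not how dbacl works; the class name, the threshold value, and the sample strings are all hypothetical.

```python
from collections import Counter


def bigrams(line):
    """All overlapping character pairs in a line, in order."""
    return [line[i:i + 2] for i in range(len(line) - 1)]


class BigramModel:
    """Character bigram model with add-one smoothing (illustrative only)."""

    def __init__(self, lines):
        self.pair_counts = Counter()
        self.first_counts = Counter()
        self.alphabet = set()
        for line in lines:
            self.alphabet.update(line)
            for bg in bigrams(line):
                self.pair_counts[bg] += 1
                self.first_counts[bg[0]] += 1

    def prob(self, bg):
        """Smoothed P(second char | first char); +1 slot for unknown chars."""
        v = len(self.alphabet) + 1
        return (self.pair_counts[bg] + 1) / (self.first_counts[bg[0]] + v)

    def worst_bigram(self, line):
        """Least likely bigram in the line, or None if the line is too short."""
        bgs = bigrams(line)
        return min(bgs, key=self.prob) if bgs else None

    def suggest(self, bg):
        """Most frequent trained bigram sharing bg's first character, if any."""
        same = {k: c for k, c in self.pair_counts.items() if k[0] == bg[0]}
        return max(same, key=same.get) if same else None


def flag_lines(model, lines, threshold):
    """Return (line number, worst bigram) for lines scoring below threshold."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        bg = model.worst_bigram(line)
        if bg is not None and model.prob(bg) < threshold:
            flagged.append((number, bg))
    return flagged


# Toy training text made up for this sketch; the threshold 0.1 is arbitrary.
model = BigramModel(["gan tó", "cuman tó", "tó him"])
for number, bg in flag_lines(model, ["gan tó", "gan tþ"], threshold=0.1):
    print(number, bg, "->", model.suggest(bg))
```

On real pages the threshold would need tuning against known-good text, and a longer context than two characters would probably be needed to avoid flagging rare but correct spellings.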
I think the challenge is to decide what kind of correction we want to do (there's not much hope with punctuation - the information content is too small), and to figure out regular expressions which achieve that. Maybe I should ask the dbacl author for advice.
Keith
Messages in this thread (subject; name; college/university; date):
Probabilistic correction; Keith Briggs; 2004-11-04 05:41:11
Re: Probabilistic correction; Keith Briggs; 2004-11-04 07:49:10
Re: Probabilistic correction; Sean Crist; Swarthmore College; 2004-11-04 22:42:53
Re: Probabilistic correction; Keith Briggs; 2004-11-05 05:31:16
Re: Probabilistic correction; Keith Briggs; 2004-11-05 06:59:54
Re: Probabilistic correction; Keith Briggs; 2004-11-05 07:29:53
Re: Probabilistic correction; Sean Crist; Swarthmore College; 2004-11-05 09:32:30
Re: Probabilistic correction; Sean Crist; Swarthmore College; 2004-11-05 09:48:16
Re: Probabilistic correction; Keith Briggs; 2004-11-08 05:07:19
Re: Probabilistic correction; Sean Crist; Swarthmore College; 2004-11-08 09:12:45
Re: Probabilistic correction; Keith Briggs; 2004-11-08 09:46:59
Re: Probabilistic correction; Keith Briggs; 2004-11-08 10:02:13
Re: Probabilistic correction; Keith Briggs; 2004-11-08 12:10:56
Re: Probabilistic correction; Sean Crist; Swarthmore College; 2004-11-08 15:26:04
Re: Probabilistic correction; Keith Briggs; 2004-11-09 06:47:45
Re: Probabilistic correction; Keith Briggs; 2004-11-09 08:50:46
Re: Probabilistic correction; Keith Briggs; 2004-11-09 09:43:19
Re: Probabilistic correction; Keith Briggs; 2004-11-09 10:59:49
Italics (was: Probabilistic correction); Sean Crist; Swarthmore College; 2004-11-09 13:39:13
Re: Probabilistic correction; Keith Briggs; 2004-11-11 06:57:20