Germanic Lexicon Project
Message Board


Author: Peter Tunstall
Email: penteract at oe dot eclipse dot co dot uk
Date: 2004-11-17 07:43:57
Subject: Re: A rival Bosworth & Toller project?


Hi Sean,

I've just written to Bekie, in my capacity as a neutral/freelance/mercenary volunteer, asking what her current thoughts are on the matter. I agree it's important to get things resolved as quickly as possible, for the sake of morale and momentum. So far I haven't done any BT, just a couple of pages of Cleasby & Vigfusson--so when I get time, I'd personally be inclined to concentrate on that for the moment, at least until we know what's happening.

As will be obvious, I know next to nothing about the technical side of things, but how different are the methodologies between this project and Bekie's? Are there major incompatibilities? Or could work done on one be incorporated into the other without too much difficulty? Are there significant advantages/disadvantages to one method of correcting or the other?

I guess there would be benefits to formal cooperation, but as a worst-case scenario, supposing this isn't possible for whatever reason, would it be simplest for volunteers to concentrate on the project that's nearest completion, giving their time on the understanding that the finished product will be free to view or copy or display wherever it can be, and in whatever way? I assume most people associated with either project are mainly interested in getting the complete dictionary available as quickly and efficiently as possible.

Peter








>
> Peter,
>
> Whew. Okay.
>
> This throws yet another unexpected loop into things. When I was
> regularly checking in on that project back in 2003, it looked like
> progress had stopped after a small number of pages. I stopped paying
> attention at some point because it looked like the project was dead.
> They obviously got it going again at some point.
>
> I agree that the sensible thing would be for the two projects to
> combine their efforts. I'd still be willing to do this. However, I
> think it's unlikely to happen, because of the history.
>
> Here is the story, or at least my understanding of it.
>
> I had decided as far back as 1999 to put Bosworth/Toller online, and I
> was exploring funding and data entry options. In 2001 I received a
> small grant to pay a student to scan the Bosworth/Toller. I put the
> page images online because I've always thought it best to "release
> early, release often" (uncompleted data is more use than no data). In
> 2002 I got another small grant, and another student did the OCR and
> wrote some programs to do the major automated corrections on the text
> (the current text which we're correcting is still messy, but it's far,
> far cleaner now than the raw output of the OCR).
>
> For some time, my grand plan for the project has been to digitize Torp
> first, then Bosworth/Toller, then Cleasby/Vigfusson. By late 2002,
> Torp was nearing completion, so I submitted a major grant proposal to
> the National Science Foundation in January 2003. This grant would
> have paid a fleet of students to work on correcting Bosworth/Toller,
> and would have given me one less course to teach each semester so that
> I could manage the project (I wouldn't have been any richer; this
> would have just freed up some of my teaching time for the project). I
> would hear in June 2003 whether I had gotten the grant.
>
> While I was waiting to hear the results of this grant proposal,
> something unexpected happened. On 28 March 2003, Bekie Marett wrote
> to me and introduced herself as the coordinator of the Online
> Anglo-Saxon Dictionary Project, which had begun on 17 March 2003
> (eleven days earlier). She was planning to create an online text
> version of Bosworth/Toller. She said that she had credited me on her
> site for the scanned pages because they had saved her team an immense
> amount of work.
>
> Well, this put me in an uncomfortable situation, to say the least. If
> it hadn't been for the grant proposal, I might have said, "This is
> good news; someone else has taken on Bosworth/Toller, so I can work on
> a different text instead." Now, true, this would have meant ditching
> the existing online text which a summer's worth of student salary had
> already gone into. But my goal was to get as much corrected text
> online as quickly as possible, so this arguably would have been the
> right move.
>
> But it wasn't possible for me to drop Bosworth/Toller, because of the
> pending grant proposal. If Bekie's project had started before I had
> submitted the proposal, I would have said, "Good, Bosworth/Toller is
> covered; I'll write the grant for another text instead." But at this
> point I had a specific proposal for a specific text in the works. A
> proposal of this kind is a _major_ deal to prepare, involving work by
> many people; the funding cycles are very slow. It simply wasn't
> possible for me to go to the funder while they were midstream in the
> lengthy evaluation process and say, "Oops, I decided to do another
> text instead."
>
> I wrote back to Bekie and explained the whole situation. I asked if
> we could discuss this together and figure out the right thing to do.
> I made the suggestion that she and her team consider working on one of
> the other texts whose scanned images I had posted.
>
> Bekie got angry at me for suggesting this. Now, I can understand
> this. When you've gotten yourself organized and enthusiastic around a
> project, it's hard to suddenly drop it and switch gears. I explained
> why I could not drop the project because of the grant. She said, "If
> there ends up with more than one version online, this is hardly a bad
> thing." I thought it would be much better to have one version each of
> two different texts, and said so. Hand corrections are a huge amount
> of work, and there's no point in doing the same text twice.
>
> When it became clear that Bekie was not going to switch texts, I asked
> whether we could somehow combine our efforts, and offered to talk with
> the programmer on her team to work out standards for character
> encoding and markup. She never responded to that suggestion.
>
> Around this time, Bekie's site un-credited me for the scanning work
> and credited Ian Marett instead. Bekie had already told me she was
> using the page images from my site, and I doubt very much that they
> re-scanned the entire 1,302 pages of text. It appears that Bekie's
> project builds on my students' work but does not give proper credit.
> Ian probably scanned the 14-page introduction, which I hadn't posted
> yet at the time. If so, he deserves credit for this. However, the
> credits make it look as if Ian Marett were responsible for scanning
> the whole book.
>
> I can only close my eyes and shake my head at this. This is so
> ridiculous; we're both trying to give something away for free.
>
> In fairness, there is one thing I did wrong myself, and which I
> regret. At one point in our discussion, I asked whether Bekie would
> consider taking her site offline until June, when the result of the
> grant proposal would be announced. I was worried that one of the
> evaluators might find Bekie's site and would deny my proposal on the
> grounds that the project was already being done by someone else.
> Bekie correctly pointed out that this was not honest of me. When I
> thought about it, I realized she was right, and said so. I shouldn't
> have suggested that, and I apologized to her.
>
> After April 2003, Bekie and I had no further communication. In June
> 2003, I learned that I hadn't gotten the NSF grant. I looked at
> Bekie's site from time to time. As I watched the completed number of
> pages, it looked to me as if they had gotten off to a good start, but
> then had lost interest in the project. When the number of completed
> pages seemed not to change for a long time, I stopped regularly
> checking their site.
>
> I still wanted an online Bosworth/Toller, and I had no funding, so it
> looked like the only way this was going to happen was going to be if I
> set up a volunteer-based project on my own time. So after I finished
> Torp, I went ahead and implemented the web-based correction system
> that I had had in mind for some time. That is the system which is in
> place now, and it's going well so far.
>
> Now comes unexpected news from Bekie's camp again. After I thought
> they had dropped off the radar, it turns out that they have 500-some
> pages corrected. When I was still checking their site last year, it
> really looked like the project was going nowhere. They must have
> really stepped up their efforts in the last few months.
>
> So, now what the hell do we do? I see a few choices:
>
> 1) Try approaching Bekie again, and ask for a second time whether she
> would consider combining our efforts. Unlikely to be of any use, given
> past experience.
>
> 2) Press on with our own corrections, even though this means doing pages
> which Bekie's team has already done. This involves some unnecessary
> duplication of effort.
>
> 3) We could incorporate the pages Bekie's team has already corrected
> into our own version of the dictionary, give due credit, and then
> continue with our own corrections. There is no legal problem with
> this, even if we don't have permission, because the text is out of
> copyright. However, this idea gives me an icky feeling which I have
> not fully sorted out.
>
> I'd definitely appreciate input on how to handle this.
>
> I'm concerned that the whole rotten issue is going to dampen the
> enthusiasm of volunteers for both projects (unless it has the positive
> effect of stoking a competition that gets the project done faster; I
> have no idea how others will react to the situation I've just
> described). I wish this issue would just go away. I just wanted to
> create something useful that we could all share. I didn't ever mean
> to get into this kind of politics.
>
> Bekie had told me that her team is only doing the main 1,302-page
> volume and is not planning to digitize the 768-page supplement. So if
> someone wants to be sure that their work isn't redundant, they could
> correct pages from the supplement. Nobody else is working on it.
>
> So I don't know what the hell to do. I'm not sure what I should have
> done differently to avoid this mess. I'm very much open to input on
> how to proceed from here.
>
> --Sean
>

Messages in this thread

Subject | Name | College/University | Date
A rival Bosworth & Toller project? | Peter Tunstall | - | 2004-11-16 07:41:26
Re: A rival Bosworth & Toller project? | Sean Crist | Swarthmore College | 2004-11-16 13:24:42
Re: A rival Bosworth & Toller project? | Peter Tunstall | - | 2004-11-17 07:43:57
Re: A rival Bosworth & Toller project? | Sean Crist | Swarthmore College | 2004-11-17 08:18:12
Re: A rival Bosworth & Toller project? | Bekie Marett | - | 2004-11-18 08:54:19