Germanic Lexicon Project
Message Board
Author: Sean Crist (Swarthmore College)
Email: kurisuto at unagi dot cis dot upenn dot edu
Date: 2004-10-27 08:18:41
Subject: Re: dictionary database available for download?
> I'm wondering if you, Sean, would be able to make available your
> Germanic database for download. It would relieve some of your servers
> and also it would make my search queries much faster. I don't
> understand why you don't make your dictionary database available
> for download. Perseus has their dictionary on CD, although for a price...
> Since you wrote a searching program for the database, I might as well
> ask if you are willing to make that available for download as well. ;-)
Actually, all of the texts in the search database are available for free download, and you can do whatever you want with them. Click on the "Texts" tab at the top of this page. Under Proto-Germanic is the Torp dictionary; under Old English is Bosworth/Toller; under Old Icelandic is Cleasby/Vigfusson. For all three texts, you can download the entire text. Let me know if you have any difficulty finding them.
As for the searching software, I'd consider sharing it, but this is more complicated than it might seem. It's not that I'm unwilling to share my code; I do believe in code-sharing and would be willing to release it under the GNU General Public License.
My reluctance is that the search system isn't just one neat self-contained program which you could download and easily run on your own computer. It's actually a Rube Goldberg contraption involving maybe a dozen different Perl scripts. It would be quite a bit of work to assemble all the code into a downloadable form, and still more work to write enough documentation for the code to be of much use. (I'd much rather put that work into the dictionaries themselves.)
There are all kinds of complicated dependencies in the code. For example, the search system has to talk to the corrections/reservations system to ask whether a particular page has been corrected or not, so that it can say "This entry has (or has not) been hand-corrected" as it displays the results. The scripts also aren't portable: to save time, I hard-coded certain assumptions about where various files and other programs are located in the file system, etc.
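To illustrate the two issues described above, here is a minimal sketch. It is written in Python for brevity, not taken from the actual Perl scripts, and every name in it (the `LEXICON_DATA_DIR` variable, the default path, the one-page-number-per-line file format, the function names) is a hypothetical assumption. It shows a hard-coded location made overridable, and a search result that consults a set of corrected pages before it is displayed.

```python
import os
from pathlib import Path

# Hypothetical names throughout; none of these come from the actual Perl scripts.
# This is the kind of path a non-portable script would hard-code.
DEFAULT_DATA_DIR = "/usr/local/lexicon/data"


def data_dir(env=os.environ):
    """Return the data directory, letting an environment variable override the default."""
    return Path(env.get("LEXICON_DATA_DIR", DEFAULT_DATA_DIR))


def load_corrected_pages(path):
    """Read a file with one page number per line (assumed format) into a set."""
    with open(path, encoding="utf-8") as handle:
        return {int(line) for line in handle if line.strip()}


def format_result(entry, page, corrected_pages):
    """Append the correction status of the entry's page to a search result."""
    status = "has" if page in corrected_pages else "has not"
    return f"{entry} (p. {page}): This entry {status} been hand-corrected."
```

With this arrangement, the search code only needs the set returned by `load_corrected_pages`, and moving the files to another machine means changing one environment variable rather than editing each script.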
--Sean