Germanic Lexicon Project
Message Board
Author: chicane
Date: 2004-10-27 12:55:06
Subject: Re: dictionary database available for download?
Hmmm... so basically, your database is just raw text and not reorganized as a database file?
As for the searching software, thanks for the background info. I suppose it would be way over my head as a beginning programmer to even try to do anything with the code, especially since I'm learning Python and not Perl. But I'm sure I'm not the only one interested in the progress of your program. Maybe once you're able to put more time into it, it might be a good idea to create a web page about the program's progress?
> Actually, all of the texts in the search database are available for free download, and you can do whatever you want with them. Click on the "Texts" tab at the top of this page. Under Proto-Germanic is the Torp dictionary; under Old English is Bosworth/Toller; under Old Icelandic is Cleasby/Vigfusson. For all three texts, you can download the entire text. Let me know if you have any difficulty finding them.
>
> As for the searching software, I'd consider sharing it, but this is more complicated than it might seem. It's not that I'm not willing to share my code; I do believe in code-sharing and would be willing to release it under the GNU General Public License.
>
> My reluctance is that the search system isn't just one neat self-contained program which you could download and easily run on your own computer. It's actually a Rube Goldberg contraption involving maybe a dozen different Perl scripts. It would be quite a bit of work to assemble all the code into a downloadable form, and still more work to write enough documentation for the code to be of much use. (I'd much rather put that work into the dictionaries themselves.)
>
> There are all kinds of complicated dependencies in the code. For example, the search system has to talk to the corrections/reservations system to ask whether a particular page has been corrected or not, so that it can say "This entry has (or has not) been hand-corrected" as it displays the results. The scripts also aren't portable: to save time, I hard-coded certain assumptions about where various files and other programs are located in the file system.
>
> --Sean