Germanic Lexicon Project
Message Board
Author: Sean Crist (Swarthmore College)
Email: kurisuto at unagi dot cis dot upenn dot edu
Date: 2004-11-27 16:27:22
Subject: Re: Correct HTML display
Joshua,
Thanks for your question.
The large .txt file is raw data. It is the base document from which
the various presentation forms are automatically derived (i.e., the
HTML files of individual pages, the Search system database, the PDF
files for correction volunteers).
The .txt really isn't intended for direct human consumption. I make it
available so that others can use the raw data in their own programming
projects or for whatever other purposes they want.
If you want to display that file in a browser, you'd have to convert
it to HTML first. You could do a rough conversion, using a text
editor, in two steps: 1) search-and-replace to add <p>
... </p> tags to each entry, and 2) tack on an HTML header and
footer. This would give something which is a good bit more legible,
and it might be adequate for your purposes.
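
For anyone who would rather script the two steps than do them in a text editor, here is a minimal sketch. It assumes each non-empty line of the .txt file is one entry, which may not match the real layout of the file; the function name, header, and footer are made up for illustration.

```python
# Rough conversion: wrap each entry in <p>...</p>, then add a header and footer.
# Assumption: one entry per non-empty line. Adjust the splitting if the raw
# file delimits entries differently.

HTML_HEADER = "<html>\n<head><title>Dictionary</title></head>\n<body>\n"
HTML_FOOTER = "</body>\n</html>\n"


def rough_convert(raw_text):
    """Return a rough HTML rendering of the raw text."""
    paragraphs = []
    for line in raw_text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            paragraphs.append("<p>" + line + "</p>")
    return HTML_HEADER + "\n".join(paragraphs) + "\n" + HTML_FOOTER
```

Entities in the text are deliberately left untouched here, so the standard ones will display correctly and the non-standard ones will not.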
However, this rough strategy wouldn't get everything right. It would
correctly display &aacute; as á, because
that entity is standard. However, it wouldn't display
&aelig-acute; as ǽ, because this entity
is non-standard (there is no standard entity for that character). You
could convert &aelig-acute; to its numeric entity form
&#x01FD; (there's a table of Unicode conversions for all
the entities under the About tab; but the support for different
characters varies from browser to browser). Or, you could embed an
image file which is a picture of the character (the Search system does
this for many characters). Also, you'd want to change the HEADER tag
to something else. You could do all these things with a bunch of
search-and-replace operations, but I'd write a script myself.
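
A script for the entity step might look roughly like the sketch below. It assumes the non-standard entity is spelled `&aelig-acute;` in the raw file and corresponds to U+01FD; the table holds only that one pair as a placeholder, and the real mappings would have to be filled in from the conversion table under the About tab. The names `NONSTANDARD` and `to_numeric_entities` are hypothetical.

```python
import re

# Hypothetical table: non-standard entity name -> Unicode code point.
# Only this single pair is shown; fill in the rest from the About tab table.
NONSTANDARD = {"aelig-acute": 0x01FD}

# Matches things like &aacute; or &aelig-acute; (letters, digits, hyphens).
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9-]*);")


def to_numeric_entities(text, table=NONSTANDARD):
    """Rewrite entities listed in `table` as hexadecimal numeric entities."""
    def replace(match):
        name = match.group(1)
        if name in table:
            return "&#x{:04X};".format(table[name])
        # Standard (or unknown) entities are left exactly as they were.
        return match.group(0)
    return ENTITY_RE.sub(replace, text)
```

Anything not in the table passes through unchanged, so standard entities keep working, and whether the numeric forms actually display still depends on the browser.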
I haven't made one huge HTML file like this because I figured that
such a large page would be unwieldy (it would be over 2000 printed
pages' worth of text). However, I've sometimes been wrong before in my
expectations about what will be the most useful for the most people.
I can certainly consider adding other presentation formats if there's
general interest.
--Sean
Messages in this thread | Name | College/University | Date
Correct HTML display | Joshua Tyra | | 2004-11-27 15:52:19
Re: Correct HTML display | Sean Crist | Swarthmore College | 2004-11-27 16:27:22