Summary: This form looks up German and English words using
a local copy of the open-source dictionary from the Technische
Universität Chemnitz.
It differs from most online German-English dictionaries in that it
strives to reduce the verbosity of the output and
to provide grammatical searches useful to those studying German
or English.
Dictionary Instructions
Smart search
looks up words in the dictionary using a series of
matching algorithms, each progressively more general.
If an algorithm has no match, it passes the query on to the next.
If there is a match, the results are returned and the search stops.
Stopping at the first successful algorithm keeps the output concise.
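The cascade described above can be sketched roughly as follows. This is only an illustration in Python, not the actual search engine: the three matchers (`exact`, `whole_word`, `substring`), their order, and the representation of entries as (German, English) pairs are all assumptions made for the example.

```python
import re

# Hypothetical matchers, ordered from most specific to most general.
def exact(query, entry):
    """True if either side of the entry equals the query (case-insensitive)."""
    return any(side.lower() == query.lower() for side in entry)

def whole_word(query, entry):
    """True if the query occurs as a whole word on either side of the entry."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return any(pattern.search(side) for side in entry)

def substring(query, entry):
    """True if the query occurs anywhere on either side of the entry."""
    return any(query.lower() in side.lower() for side in entry)

MATCHERS = [("exact", exact), ("whole word", whole_word), ("substring", substring)]

def smart_search(query, entries):
    """Try each matcher in turn and stop at the first one that finds anything."""
    for name, matcher in MATCHERS:
        hits = [entry for entry in entries if matcher(query, entry)]
        if hits:
            return name, hits
    return None, []

# Illustrative entries only; not taken from the real word list.
ENTRIES = [
    ("Haus", "house"),
    ("Haustür", "front door"),
    ("das Haus verlassen", "to leave the house"),
]
```

With these sample entries, a query for "haus" is answered by the exact matcher alone, so the longer entries that merely contain the word are never shown.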
Advanced search
provides traditional control over search methods and output formatting.
The matching algorithm, language, database, word type,
and output mode can be modified to suit the user's needs.
Phonetic Transcription
The
CMU Pronouncing
Dictionary has been integrated into the search engine powering
the
Lexica Online German English Dictionary.
Phonetic pronunciations for English words are provided in
IPA,
SAMPA, or
CMU format.
The online
English Phonetic Transcription
page provides this same functionality without the dictionary
interface and adds phonetic output in HTML and LaTeX formats
for those who are interested.
[
ðʌ
si em ju
prʌnaʊˈnsɪŋ dɪˈkʃʌneˌri
hæˈz bɪˈn ɪˈntʌgreɪˌtʌd ɪˌntuˈ ðʌ ðʌ sɜːˈtʃ eˈndʒʌn paʊˈɜːɪŋ ðʌ
ɔːˈnlaɪˌn dʒɜːˈmʌn ɪˈŋglɪˌʃ dɪˈkʃʌneˌri.
fʌneˈtɪk prəʊnʌˌnsieɪˈʃʌnz fɔːˈr ɪˈŋglɪˌʃ wɜːˈdz ɒˈr prʌvaɪˈdʌd ʌn
aɪˈ pi eɪˈ,
sɒˈmpɒˈ,
ɔːˈr
ei em ju
fɔːˈrmæˌt.
ðʌ ɔːnlaɪn
ɪŋglɪʃ fʌnetɪk trænskrɪpʃʌn
peɪdʒ prʌvaɪdz ðɪs seɪm fʌŋkʃʌnælʌti wɪθaʊt ðʌ dɪkʃʌneri ɪntɜːfeɪs ænd ædz fʌnetɪk aʊtpʊt ʌn HTML ænd leɪteks fɔːrmæts fɔːr ðəʊz hu ɒr ɪntrʌstʌd.
]
The IPA phonetic transcriptions require the
Lucida Sans Unicode
TrueType font and a browser that supports Unicode.
Dictionary History
The word list used in the online dictionary has its roots in one of
the earliest dictionaries on the web.
Maintained from its modest beginnings by
Frank Richter, the
word list has grown significantly over the last decade.
The word list is used in the online
dictionary at the
Technische Universität Chemnitz.
Frank Richter also maintains
das Ding,
a Linux German-to-English dictionary, which uses the same word
list as the TU-Chemnitz dictionary.
Notably, while other websites eventually limited access to their
collaborative translation dictionaries, this word list has
remained freely available.
Specifically, the dictionary is distributed freely, das Ding is released
under the GPL, and individual contributors are still
listed.
I say "still" because my last contribution was in 1994,
when I wrote Perl scripts to spell-check, alphabetize, and remove
duplicate entries from the dictionary.
At that time, I also changed the format from one in which entries
spanned multiple lines separated by the character "-" to the current
one, in which both languages share a single line separated by
the characters "::".
For better or worse, this feature persists to this day.
Dictionary Resources
This dictionary is written in HTML with interpreted in-line Perl,
JavaScript, and a Perl-script backend called
lexica.
The Perl script
lexica is a command-line dictionary
that uses
grep to perform an initial broad
search on a dictionary text file, followed by a second search implemented
using Perl regular expressions.
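The two-stage search described above might look something like the following sketch. It is written in Python rather than Perl and is not the lexica source: a plain substring scan stands in for the grep pass, the whole-word rule in the second stage is an assumption, and the function names are hypothetical. Only the "::" separator comes from the word list format itself.

```python
import re

def broad_filter(query, lines):
    """Stage 1: cheap case-insensitive substring scan (stand-in for a grep pass)."""
    needle = query.lower()
    return [line for line in lines if needle in line.lower()]

def refine(query, candidates):
    """Stage 2: keep candidates where the query is a whole word on either side of '::'."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    results = []
    for line in candidates:
        german, sep, english = line.partition("::")
        if not sep:
            continue  # skip lines that are not dictionary entries
        if pattern.search(german) or pattern.search(english):
            results.append((german.strip(), english.strip()))
    return results

def two_stage_search(query, lines):
    """Run the broad filter first, then the stricter regular-expression pass."""
    return refine(query, broad_filter(query, lines))

# Illustrative lines only; not taken from the real word list.
SAMPLE_LINES = [
    "Hand :: hand",
    "Handel :: trade",
    "Handschuh :: glove",
    "die Hand geben :: to shake hands",
]
```

The point of the split is that the first pass is fast but over-inclusive, so the more expensive regular-expression test only runs on a small candidate set.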
In order to reduce unnecessary output and provide more accurate search results,
lexica employs several search algorithms
of increasing complexity to find the best translation.
I will make
lexica available shortly after I
add textual output and finish the final search algorithm.
The complete word list is best obtained along with the source code for
das Ding.
However, I thought it would be useful to offer my simply formatted
files for download.
The original word list has been modified to conform to my
philosophy
concerning word lists: keep the word list uniform and simple while
offloading complexity to the dictionary program.
With that in mind, single lines in the word list that had multiple
translations have been split into multiple lines.
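As a rough illustration of that splitting step, here is a small Python sketch. It assumes that alternative translations on one line are separated by a semicolon; that delimiter and the function name `split_translations` are assumptions for the example, not a description of the original files or of the scripts actually used.

```python
def split_translations(line, sep="::", alt=";"):
    """Expand one entry with several translations into one entry per translation.

    Assumes `alt` separates the alternatives on the target side of `sep`.
    """
    source, found, target = line.partition(sep)
    if not found:
        return [line.strip()]  # not an entry line; pass it through unchanged
    source = source.strip()
    alternatives = [t.strip() for t in target.split(alt) if t.strip()]
    return [f"{source} {sep} {t}" for t in alternatives]
```

For example, under this assumption the line "Bank :: bank; bench" becomes the two lines "Bank :: bank" and "Bank :: bench", while a line with a single translation is returned as is.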
I also found that searching a word list that included phrases produced
excessive output, especially for common words.
To solve the problem, I split the word list into two files, one
containing phrases and the other words.
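One way such a split could be automated is sketched below in Python. The rule used here (an entry counts as a phrase when its German side has more than one word once annotations in braces or parentheses are removed) is a guess made for illustration, and the function names are hypothetical; the real files may have been separated by different criteria.

```python
import re

# Annotations such as "{n}" or "(ugs.)" are ignored when counting words (assumed convention).
ANNOTATION = re.compile(r"\{[^}]*\}|\([^)]*\)")

def is_phrase(entry_side):
    """Treat a side as a phrase if more than one word remains without annotations."""
    bare = ANNOTATION.sub(" ", entry_side)
    return len(bare.split()) > 1

def split_words_and_phrases(lines, sep="::"):
    """Partition entry lines into (words, phrases) based on the German side."""
    words, phrases = [], []
    for line in lines:
        german = line.partition(sep)[0]
        (phrases if is_phrase(german) else words).append(line)
    return words, phrases
```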
In an attempt to identify errors and redundancies,
I split the list of words further into several word lists
based on part of speech (adjective, verb, noun,
etc.).
With these simplified and separated databases, the
search engine can target searches to a specific category.
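A sketch of how entries might be sorted into such categories follows, again in Python. It assumes each entry carries a tag in braces on the German side, such as "{n}" or "{adj}"; the tag set, the mapping to category names, and the function names are assumptions for the example.

```python
import re
from collections import defaultdict

# Hypothetical mapping from annotation tags to category names.
TAG_TO_CATEGORY = {
    "m": "noun", "f": "noun", "n": "noun", "pl": "noun",
    "adj": "adjective", "adv": "adverb",
    "vt": "verb", "vi": "verb",
}

TAG = re.compile(r"\{([^}]+)\}")

def categorize(line, sep="::"):
    """Return the category of the first recognized tag on the German side."""
    german = line.partition(sep)[0]
    for tag in TAG.findall(german):
        category = TAG_TO_CATEGORY.get(tag.strip())
        if category:
            return category
    return "other"

def split_by_category(lines):
    """Group entry lines into one list per category."""
    groups = defaultdict(list)
    for line in lines:
        groups[categorize(line)].append(line)
    return dict(groups)
```

Each resulting list could then be written to its own file, so that a search restricted to verbs, for instance, never has to read the noun entries.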
$Id: dictionary.html,v 2.1 2003/06/22 06:31:13 forman Exp forman
$Id: lexica,v 1.4 2003/06/27 04:50:50 forman Exp forman