Volume 3, No. 1 
January 1999

Translation Journal

Terminology Search
on the World-Wide Web

by Gabe Bokor

The World-Wide Web (WWW) is a huge potpourri of business advertising, serious scholarship, pornography, political propaganda, scam artistry, and many other things. No authority has ever prepared a card system for the millions of sites or imposed any standard of quality for posting information on the Web. Yet it is easier to find even a single word on the WWW than in the best-organized library. While experienced web surfers may have their favorite specialized sites and on-line glossaries (Cathy Flick’s column lists a number of them in each issue of the Translation Journal), the usual point of departure for most research, including terminology research for translation, is one of the many Web search engines.
    Search engines are a magic tool for translators looking for an elusive word or abbreviation, or wishing to confirm a guess. It is important, however, to employ the right search strategy in order to take full advantage of their tremendous power.
    Search engines are Web sites equipped with special software that periodically combs the Web and collects each word it finds in a huge database. Most search engines also accept submissions from Web site owners. Because the engines collect, process, and store this information in different ways, the same search may return widely different results from one engine to the next. There are hundreds, if not thousands, of search engines on the Web, some of them regional, others highly specialized. We’ll refer only to the major general-purpose search engines:
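To make the idea of "collecting each word in a huge database" concrete, here is a minimal Python sketch of a word index. It is an illustration only, not how any particular engine works; the page names and texts are invented for the example.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercased word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

# Hypothetical pages, standing in for documents found by a crawler.
pages = {
    "page_a": "UCLA is a university in Los Angeles.",
    "page_b": "The university library has a card catalog.",
}
index = build_index(pages)

# Looking up a word is now a single dictionary access.
print(sorted(index["university"]))  # ['page_a', 'page_b']
```

Because the index is built ahead of time, looking up a single word does not require rereading the pages themselves.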

Yahoo! is a “Web directory,” where information is stored under 14 major categories and a large number of subcategories. This engine is ideal when you wish to obtain more information on a subject and are searching with a common, well-known term. For example, searching under the key word “UCLA” (University of California, Los Angeles), Yahoo! gave me 3 “category” hits and 863 “site” hits, while AltaVista returned over 378,000 hits. This doesn’t mean that Yahoo! is that much “smaller” or less complete than AltaVista, but only that Yahoo! will give you the sites where UCLA is the key word, while AltaVista will show you virtually all occurrences of the word (or abbreviation) on the Web. You can make Yahoo! search in the “AltaVista mode” by clicking the “Go To Web Page Matches” button next to your first search results, and you’ll get more pages than you’ll ever want to leaf through.
    Yahoo! can be used even if you don’t have a specific key word. Just select the major category and keep narrowing your field until you reach the area you’re interested in. For example, if you wish to know more about automobiles that use non-polluting fuels, you can go to Science-Engineering, then Automobiles, and then Alternative Fuel Vehicles, which will finally take you to Solar Vehicles, Electric Vehicles, and Hybrid Vehicles.
    AltaVista is the other extreme of the search engine spectrum. It is ideal when you want to find the use of an obscure term or abbreviation. It is my personal favorite, although you can occasionally be overwhelmed by the number of hits you get, not all of which are relevant to your search. One nice feature of AltaVista is that it lets you select the language of the pages searched. When I was trying to find the English equivalent of the German word Warenkunde, I knew that there was no exact, generally used term for the same concept. So I searched for Warenkunde among the English-language pages, and I actually found the personal page of a professor of Warenkunde, whose title on his English-language page was given both in German and in English, the latter as Professor of Commodity Science.
    The same feature also came in handy when I tried to confirm the use of the English terms “benchmark” and “benchmarking” in Portuguese. Searching for benchmark or benchmark* in AltaVista’s Advanced Mode among Portuguese sites, I got 1044 hits. It is interesting to note that not all of the texts found used the term in the computer context, but none (of those I checked) bothered to provide a translation in the vernacular. I also confirmed that both words are used in the masculine gender, which is not to be taken for granted, considering that marca is feminine in Portuguese, and the adopted foreign word often assumes the gender of its Portuguese cognate or equivalent. For example, the word Internet is feminine because rede (network) is feminine in Portuguese.
    A note is in order about searching for phrases like Federal Reserve Board as opposed to single words. In most engines, if you just type in the phrase, you get all the sites where any of the words occurs. To restrict your search, you must either put your key phrase in quotation marks or use the Advanced Option of the search engine, which either allows you to use the Boolean operators AND, OR, NEAR, and NOT, or gives you a menu selection to achieve the same result. Searching for “Defense Department” (with the quotation marks) will only bring up the exact phrase as typed. Typing Defense Department (without the quotation marks) will give you pages where either of the two words occurs in any context. Searching for defense AND department under advanced (Boolean) search will yield pages where both words occur, for example, Defense Attorney on line 3 and Justice Department on line 54. Searching for defense NEAR department will yield pages with Defense Department, Department of Defense, and probably a few irrelevant sites where the two words happen to be near each other. You must select your search strategy according to the specific case at hand and the result you wish to achieve.
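The difference between these four kinds of search can be sketched in a few lines of Python. This is a toy model run on invented sample texts, not the engines' actual logic; in particular, the 10-word window used for NEAR is an assumption made for the example.

```python
import re

def words(text):
    """Split text into lowercased words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())

def match_any(text, terms):
    """Unquoted search: at least one of the words occurs."""
    tokens = set(words(text))
    return any(t.lower() in tokens for t in terms)

def match_all(text, terms):
    """Boolean AND: every word occurs somewhere on the page."""
    tokens = set(words(text))
    return all(t.lower() in tokens for t in terms)

def match_phrase(text, phrase):
    """Quoted search: the words occur together, in the order typed."""
    tokens, target = words(text), words(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

def match_near(text, a, b, window=10):
    """NEAR: both words occur within `window` words of each other (assumed size)."""
    tokens = words(text)
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# Invented sample pages.
p1 = "The Defense Department issued a statement."
p2 = "Department of Defense budget figures."
p3 = ("A defense attorney spoke first. Many unrelated lines of text follow "
      "here before the next mention appears. The Justice Department declined.")

for name, page in [("p1", p1), ("p2", p2), ("p3", p3)]:
    print(name,
          match_any(page, ["defense", "department"]),
          match_all(page, ["defense", "department"]),
          match_phrase(page, "Defense Department"),
          match_near(page, "defense", "department"))
```

In this toy run, only the first page satisfies the quoted phrase, the first two satisfy NEAR, and all three satisfy AND, which mirrors the progressively broader results described above.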
    Excite is one search engine that claims to be context-sensitive; i.e., if you enter car, it will also find sites where only the words automobile or motor vehicle occur.
    Most search engines are not case-sensitive. Punctuation marks, including hyphens, and “stop words” such as and, or, is, for, etc. are ignored unless they are part of a phrase between quotation marks.
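As a rough illustration of this behavior, the following Python sketch normalizes a query the way the paragraph describes. The stop-word list is just the handful of examples named above; the lists real engines use are not known here and are certainly longer.

```python
import re

# Assumed stop-word list, taken from the examples in the text.
STOP_WORDS = {"and", "or", "is", "for"}

def normalize_query(query):
    """Lowercase the query; keep quoted phrases whole, and drop
    punctuation and stop words everywhere else."""
    terms = []
    for phrase, bare in re.findall(r'"([^"]*)"|(\S+)', query):
        if phrase:
            # Inside quotation marks, stop words are preserved.
            terms.append(phrase.lower())
        else:
            for word in re.findall(r"\w+", bare.lower()):
                if word not in STOP_WORDS:
                    terms.append(word)
    return terms

print(normalize_query('Search-Engines AND "for Fun and Profit"'))
# ['search', 'engines', 'for fun and profit']
```

Note that in this simple mode the word AND is discarded like any other stop word; it only acts as an operator in an engine's advanced (Boolean) mode.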
    Word fragments can be searched with some engines using the asterisk (*) as a wildcard. For example, searching for annelat* (to find occurrences of annelate, annelated, or annelating), AltaVista yielded 261 hits, Yahoo! 32 hits, and both Excite and Infoseek 0 hits.
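The effect of the asterisk can be imitated in Python with the standard fnmatch module, which uses the same wildcard convention. This is a sketch on an invented sentence, not a description of how any engine implements wildcards.

```python
import re
from fnmatch import fnmatchcase

def wildcard_matches(pattern, text):
    """Return the distinct words in `text` that match a pattern containing *."""
    pattern = pattern.lower()
    found = {w for w in re.findall(r"\w+", text.lower()) if fnmatchcase(w, pattern)}
    return sorted(found)

# Invented sample sentence.
text = ("The annelated ring system is formed by annelating a benzene ring; "
        "annealing is unrelated.")
print(wildcard_matches("annelat*", text))  # ['annelated', 'annelating']
```

The look-alike word annealing is correctly left out, because the wildcard only replaces what follows the typed fragment.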
    Abbreviations are easily found on the Web. When a large number of irrelevant hits occur or the expansion of the abbreviation is not immediately found, the search can be narrowed by specifying the language (with AltaVista) or using the known or suspected portion of the abbreviation with Boolean operators. For example, knowing that the first letter in the Portuguese abbreviation IPMF stands for imposto (tax), searching AltaVista in the Advanced mode for IPMF and imposto yielded 68 hits, many of which expanded the abbreviation as Imposto Provisório sobre a Movimentação Financeira.
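Once such a search has brought up a page, the expansion can often be spotted mechanically: it is a run of capitalized words whose initials spell the abbreviation, with small connecting words in between. The Python sketch below shows one hypothetical way to do this; the function name and the sample sentence are invented for the example, and the rule of skipping lowercase words is an assumption that will not suit every language.

```python
import re

def find_expansions(abbr, text):
    """Find runs of words whose capitalized initials spell `abbr`.
    Lowercase words inside a run (e.g. 'sobre', 'a') are skipped."""
    tokens = re.findall(r"\w+", text)
    results = []
    for start, first in enumerate(tokens):
        # A run must begin on a capitalized word other than the abbreviation itself.
        if first == abbr or not first[0].isupper():
            continue
        initials, end = "", start
        while end < len(tokens) and len(initials) < len(abbr):
            if tokens[end][0].isupper():
                initials += tokens[end][0]
            end += 1
        if initials == abbr:
            results.append(" ".join(tokens[start:end]))
    return results

# Invented sample sentence containing the abbreviation and its expansion.
sample = ("O IPMF (Imposto Provisório sobre a Movimentação Financeira) "
          "incide sobre débitos em conta.")
print(find_expansions("IPMF", sample))
```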
    The order in which the search engines list the hits is usually random. Excite does it by what it considers order of relevance, even giving percentage of relevance figures, but I’ve found this order to have little to do with the actual relevance of the sites found.
    Once you’re on a page that, according to the search engine, contains the word or expression you’re looking for, it’s easy to find it on the page. Just type Ctrl-F (Command-F on the Macintosh), and you get a search dialog which allows you to find the string (word, phrase, or fragment) on the page. You can also use the Find in Page (Netscape) or Find (on this page) options under the Edit menu.
    In doing terminology research with the search engines, you can find the sites where a given term is used (and possibly its meaning clarified); less frequently, you can find the translation of the term, as in bilingual or multilingual “Rosetta Stone” sites (such as those listed in Cathy Flick’s column) or in cases like the one mentioned above with Warenkunde. You can also confirm or disqualify your guess of a term. For example, if you’re not sure whether the German word Ventilhub in the automobile context is actually valve stroke, you can search for the latter to find that the 77 occurrences (in AltaVista) all refer to non-automotive contexts. Valve lift, on the other hand, gives you 799 hits, most of them in the automotive context. Using the search engines, you can find not only individual words, but also the type of lingo in which those words are used by those who work in that particular field.
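The reasoning behind the Ventilhub check, counting not just the hits for each candidate but the hits in the right context, can be sketched in Python as follows. The sample pages and the list of context words are invented for the illustration; with a real engine you would read the hit counts and skim the results yourself.

```python
def context_hits(phrase, pages, context_words):
    """Return (pages containing the phrase, pages where a context word also appears)."""
    phrase = phrase.lower()
    total = in_context = 0
    for text in pages:
        lowered = text.lower()
        if phrase in lowered:
            total += 1
            if any(word in lowered for word in context_words):
                in_context += 1
    return total, in_context

# Invented sample pages and an assumed set of automotive context words.
pages = [
    "The camshaft profile determines valve lift and duration.",
    "Increasing valve lift improves cylinder filling at high engine speed.",
    "The valve stroke of the pneumatic actuator is 20 mm.",
]
automotive = ("engine", "camshaft", "cylinder")

for candidate in ("valve stroke", "valve lift"):
    print(candidate, context_hits(candidate, pages, automotive))
```

A candidate with many hits but none in the relevant context is a warning sign, however plausible it looks as a literal rendering.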
    What I would not recommend is letting AltaVista’s Translate machine translation utility do your translation. The sample below is the actual translation, including unedited punctuation, of the first sentence of an Italian web page, performed by AltaVista’s machine translation software Babelfish. A (fairly literal) human translation is attached for comparison.

Italian Original:
Bisogna procedere coi piedi di piombo quando si scrive (e si legge) di biblioteche "elettroniche", "digitali" o "virtuali", termini che si sprecano, di questi tempi, anche nelle riviste e nei programmi televisivi più divulgativi, senza che sia sempre chiaro a chi legge (e talvolta nemmeno a chi scrive) a cosa ci si stia effettivamente riferendo.

AltaVista Translation:
It must proceed with the lead feet when law) of "electronic " libraries, " " virtual " digitalises " is written (and or, terms that are wasted, of these times, also in the reviews and the television programs more divulgativi, without that is always clearly to who law (and sometimes to who does not write) to what is effectively reporting to us.

Human Translation:
Extreme caution must be used when writing (and reading) about "electronic," "digital," or "virtual" libraries, these terms being liberally used today even in the most popular magazines and TV programs, while it is not always clear to the reader (and sometimes not even to the author) what is actually being discussed.

If the major search engines don’t give you the information you need, you can use them to access other regional or specialized search engines, such as the Brazilian Cadê, the European EuroSeek, the Japanese Info Navigator, the Russian search engines listed in Cathy Flick’s column in this issue of the TJ, or the Medical Search Engine. There is at least one for each need, taste, language, or specialty.