15 June 2005.
As much as I talk about how to use information discovery tools beyond
search engines, once in a while I find a new search tool that makes
me actually want to do some Web searching.
Exalead is a fairly new search engine from France, introduced in
October 2004. It is still officially in beta. Although it passed the
one-billion-page mark in 2005, it is only one-eighth the size of Google
or Yahoo. But what's a few billion pages among friends? Actually, after
a certain point, size really doesn't matter. The key factors in
evaluating a search engine should include timeliness, ability to handle
ambiguity, and plenty of power search tools. Exalead does a great job
on at least two of these criteria.
When you first connect, you see a stylishly minimalist page. But click
through to the Advanced Search page to appreciate Exalead's search
features. Among the features you won't always find in other search
engines are:
- The option to specify that results "preferably contain" all the terms you are searching for, in addition to "must contain" and "must not contain."
- You can also do this with the OPT operator, to indicate which specific words are "optional."
- Proximity searching, in which the words you search must be within 16 words of each other. (No, you can't tweak the number of intervening words.)
- Truncation. This isn't just the word stemming that many search engines employ behind the scenes (a search for "pencil" will also retrieve "pencils"), but true truncation, where you can search for "librar" and retrieve library, libraries, librarian, librarianship, and so on.
- Phonetic spelling and approximate spelling, through which you can search for a word even if you aren't sure of the spelling or if the word is frequently misspelled. Think "Arnold Schwarzenegger," for example.
- What Exalead calls "Regular Expressions," in which you can search for documents with words that match a certain pattern. Imagine, for example, that you're doing a crossword puzzle and have a word of six letters, of which the second is T and the sixth is C. By searching /.t...c/, you will retrieve sites with the word ATOMIC, perhaps the right word for your puzzle.
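To make the difference between these features concrete, here is a minimal Python sketch run against a tiny, made-up word list. It illustrates the ideas only and says nothing about how Exalead implements them; the names WORDS, truncation_search, pattern_search, and within_proximity are hypothetical, and treating "within 16 words" as a position difference of at most 16 is an assumption.

```python
import re

# Hypothetical vocabulary standing in for the words in a search index.
WORDS = ["atomic", "library", "libraries", "librarian", "librarianship",
         "liberty", "pencil", "pencils", "static"]

def truncation_search(stem, words):
    """True truncation: return every word that begins with the stem."""
    return [w for w in words if w.startswith(stem.lower())]

def pattern_search(pattern, words):
    """Return every word that matches the whole pattern, ignoring case."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if compiled.fullmatch(w)]

def within_proximity(text, first, second, window=16):
    """Check whether two words occur within `window` word positions of each other.

    Assumption: proximity is measured as the difference in word positions.
    """
    tokens = re.findall(r"\w+", text.lower())
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(abs(i - j) <= window for i in first_pos for j in second_pos)

print(truncation_search("librar", WORDS))
# ['library', 'libraries', 'librarian', 'librarianship']
print(pattern_search(r".t...c", WORDS))
# ['atomic', 'static']
```

Note that the crossword pattern also picks up STATIC from this word list, which is a reminder that a pattern search returns every word that fits, not just the one you had in mind.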
One gripe I have is that the option to limit your search by country,
unfortunately, only filters by two-letter top-level domain (e.g., .uk,
.jp). This means that, for example, if you limit your search to
Australian sites and search for Australian biotech associations, you
won't retrieve AusBioTech.org, a major biotech association in
Australia, because it does not have .au as its top-level domain.
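A short Python sketch shows why a filter based purely on the top-level domain misses a site like this. It is an illustration of the general idea, not Exalead's actual logic; the function name matches_country_tld is hypothetical, and the .com.au address is a made-up placeholder.

```python
from urllib.parse import urlparse

def matches_country_tld(url, country_code):
    """Return True if the URL's host name ends in the two-letter country code."""
    host = urlparse(url).hostname or ""
    return host.lower().endswith("." + country_code.lower())

urls = [
    "http://www.ausbiotech.org/",   # Australian association, but a .org domain
    "http://www.example.com.au/",   # made-up placeholder with a .au domain
]

print([u for u in urls if matches_country_tld(u, "au")])
# ['http://www.example.com.au/']
```

The Australian association is filtered out simply because its address ends in .org.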
In addition to search power, Exalead has a rich search results screen.
Beyond the usual display of results and snippets, each entry includes
an image of the retrieved page. There is also a column along the left
that displays relevant entries from the Open Directory Project, along
with tools to select "related terms," to limit your search by document
type, and to narrow the search by location. (Interestingly, this
doesn't use the two-letter top-level domain limit, but instead
retrieves only pages from the Open Directory Project that have been
categorized under that country.)
My one real objection to Exalead -- and it's a big issue -- is that
it appears Exalead has not updated its index since the beginning of
2005. One of its advanced search features lets you limit your search
by the date a file was last modified (note that you need to use the
European format of dd/mm/yyyy).
But repeated tests turned up no records
from 2005. Yes, Exalead is in beta, and that sometimes means there
are glitches, but a delay in updating the index is troubling.
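If you build date limits programmatically, the day-first format is easy to get wrong. The following Python sketch, using only the standard library, shows the dd/mm/yyyy convention mentioned above; the helper names format_day_first and parse_day_first are hypothetical and are not part of any Exalead interface.

```python
from datetime import date, datetime

def format_day_first(d):
    """Format a date as dd/mm/yyyy (day first, European style)."""
    return d.strftime("%d/%m/%Y")

def parse_day_first(text):
    """Parse a dd/mm/yyyy string into a date object."""
    return datetime.strptime(text, "%d/%m/%Y").date()

print(format_day_first(date(2005, 6, 15)))  # 15/06/2005
print(parse_day_first("03/04/2005"))        # 2005-04-03, i.e. 3 April, not 4 March
```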
Until Exalead gets its updating schedule back on track, use the
search engine to find older material, or
to verify spelling, identify alternative word
meanings or find authoritative
material from sites that have a track record. And show this site to
the next representative of one of the value-added online services.
The features in Exalead would add tremendous search power to, say,
Dialog, Factiva or LexisNexis.
© 2005 Mary Ellen Bates. All rights reserved.