Limit Scope to a Type of File. In
company research, it's often entertaining,
if not enlightening, to limit the scope of a
search to certain types of files. I once
limited a company search to PowerPoint
presentations and found a file the company
probably hadn't intended to release.
The command you use depends on the search
engine. At Yahoo, it's originurlextension:
(originurlextension:ppt). At Exalead, Live
(Microsoft) and Google, it's filetype:
(filetype:ppt). The command is not available
at Ask.
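The engine-by-engine differences above can be captured in a small lookup table. What follows is a minimal Python sketch, not a definitive tool: the operator names come from this article, while the function name filetype_query and the lowercase engine labels are hypothetical names chosen for illustration.

```python
# Operator names as reported in this article; the dictionary keys are
# labels invented for this sketch, not official identifiers.
FILETYPE_OPERATORS = {
    "yahoo": "originurlextension:",
    "exalead": "filetype:",
    "live": "filetype:",
    "google": "filetype:",
}


def filetype_query(engine: str, keywords: str, extension: str) -> str:
    """Return a query string limited to files with the given extension."""
    operator = FILETYPE_OPERATORS.get(engine.lower())
    if operator is None:
        # Ask, for example, has no file-type command according to the article.
        raise ValueError(f"no file-type command known for {engine!r}")
    return f"{keywords} {operator}{extension}"


print(filetype_query("google", "companyname", "ppt"))
print(filetype_query("yahoo", "companyname", "ppt"))
```

Raising an error for an unlisted engine, rather than silently returning the bare keywords, keeps you from running an unrestricted search by accident.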
Using Cached Pages. Cached pages have
lots of uses in research. You might want to
examine information as it previously
appeared. Or the page you need might not be
available. Sometimes you just want to weed
out all the bells and whistles to read the
matching content. Recently, one corporate
security researcher told me she uses cached
pages to find information about defunct
companies.
For many purposes, you would simply follow
the cached link in the search results. But
if you are using cached pages to avoid
contacting the site's server (because of
concerns about malicious code, or to work
around network filters), then you want to
limit your browsing to the text-only cache.
As far as I know, Google is the only search
engine to enable viewing just the text on a
page. To do this, add &strip=1 to the
cached page URL. Since the cached page
defaults to pulling non-textual elements
from the Web site's server, you should
add the parameter before you load the page
instead of following the cached link first.
Right-click the cached link, copy the URL,
paste it into the browser address line, and
add &strip=1 to the end of the URL.
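If you do this often, the copy-paste-append step can be scripted. Here is a minimal Python sketch, assuming the strip=1 parameter behaves as described above; the function name text_only_cache_url and the cache.example.com address are hypothetical and stand in for a real cached-link URL copied from the results page.

```python
from urllib.parse import urlsplit, urlunsplit


def text_only_cache_url(cached_url: str) -> str:
    """Append strip=1 to a cached-page URL unless it is already present."""
    parts = urlsplit(cached_url)
    params = parts.query.split("&") if parts.query else []
    if "strip=1" not in params:
        params.append("strip=1")
    # Rebuild the URL with the extended query string; the existing
    # parameters are left exactly as they were copied.
    return urlunsplit(parts._replace(query="&".join(params)))


# Hypothetical cached-link URL, for illustration only.
copied = "http://cache.example.com/search?q=cache:www.example.com/page.html"
print(text_only_cache_url(copied))
```

The sketch only builds the URL. Nothing is requested from any server until you paste the result into the browser yourself.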
Each of the search engines, except for Ask,
lets you view caches of certain file types
as HTML. For instance, run a query,
companyname filetype:ppt. Remember to
replace the filetype command with
originurlextension: if you use Yahoo.
Look for the cached links at Live,
the preview links at Exalead, or the
view as HTML links at Yahoo or
Google. These options let you display the
proprietary file type as HTML.
Google lets you display the cache of a
particular Web page. Use the cache (cache:)
command followed by the URL, like this:
cache:https://www.virtualchase.com/index.shtml.
Proximity Searching. Exalead is the
only search engine that provides a command (NEAR)
for proximity searching. It finds keyword1
within 16 words of keyword2, in any order.
For example, to find e-mail addresses at a
particular domain, you might search,
email NEAR domain.com.
While Exalead provides a familiar command,
you can simulate this query at other search
engines. Google and Yahoo let you
use an asterisk as a wildcard so that the
query, email * domain.com, searches
for the word, email, within one or
more words of the domain name, in that
order. (Correction, 15 October 2007: In the
article as originally published, I stated
that Ask and Live also support wildcard
searches. I should have run more tests. They
do not. Credit for the correction goes to
search experts
Gwen Harris and
Greg Notess.)
At Google, you can use any number of asterisks
between keywords, but doing so seems to
narrow the query. While it's not always
precise, two asterisks return matches with at
least two words (not including stop words)
between the key terms. See the difference
between the key terms. See the difference
between
privacy * pretexting and
privacy * * pretexting.
Because this technique is a word order
search, don't forget to reverse it if the
word order isn't important. For instance, in
a search for information about treatments
for a medical condition, you might try:
treatment * conditionname, and then
conditionname * treatment. You could
combine the two search statements to run a
single query, like this: treatment *
dyslexia | dyslexia * treatment. (The
vertical bar represents OR at Google.)
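Because it is easy to forget the reversed half of the query, you could generate both word orders at once. This is a minimal Python sketch, assuming the asterisk and vertical-bar behavior described above for Google; the function name either_order_query and its gap parameter are hypothetical.

```python
def either_order_query(term_a: str, term_b: str, gap: int = 1) -> str:
    """Build 'a * b | b * a', using `gap` asterisks between the terms."""
    if gap < 1:
        raise ValueError("gap must be at least 1")
    stars = " ".join(["*"] * gap)
    forward = f"{term_a} {stars} {term_b}"
    backward = f"{term_b} {stars} {term_a}"
    # The vertical bar stands for OR at Google, per the article.
    return f"{forward} | {backward}"


print(either_order_query("treatment", "dyslexia"))
# treatment * dyslexia | dyslexia * treatment
print(either_order_query("privacy", "pretexting", gap=2))
# privacy * * pretexting | pretexting * * privacy
```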
Date Searching. For the most part,
date searching continues to be a problem
because the date the search engines use is a
server time stamp. Recently, however, Google
added the ability to restrict queries to
newly indexed Web pages. This helps somewhat
by limiting a query to pages that Google
recently discovered and added to its index.
To find newly indexed pages, add the
command, &as_qdr=qN - where q equals
d (days), w (weeks) or y
(years) and N equals a number - to the URL
for the keyword search results. For example,
adding &as_qdr=d15 finds new (to Google) Web
pages on pretexting within the past 15 days.
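The same edit to the results URL can be sketched in Python. This is an illustration under stated assumptions: the as_qdr values accepted here are only the three units listed above, the function name add_recency_filter is hypothetical, and the results URL is a plain example of a Google keyword search.

```python
def add_recency_filter(results_url: str, unit: str, count: int) -> str:
    """Append as_qdr=<unit><count> to a search-results URL."""
    # Units as listed in the article: d (days), w (weeks), y (years).
    if unit not in {"d", "w", "y"}:
        raise ValueError("unit must be 'd', 'w' or 'y'")
    if count < 1:
        raise ValueError("count must be a positive number")
    separator = "&" if "?" in results_url else "?"
    return f"{results_url}{separator}as_qdr={unit}{count}"


# Example results URL for a keyword search on "pretexting".
url = "https://www.google.com/search?q=pretexting"
print(add_recency_filter(url, "d", 15))
# https://www.google.com/search?q=pretexting&as_qdr=d15
```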
Searching with Synonyms. In initial
research, you might want to conduct
trial-and-error queries to discover the
keywords that retrieve relevant results.
This technique may be especially useful if
you are unfamiliar with the topic.
While it's best to use a thesaurus to
identify possible synonyms, and then string
them together with OR, you can do a
quick-and-dirty synonym search at Google by
inserting a tilde (~) in front of the search
term. For example, the query,
teens addictive ~behavior, also finds
matches to teens addictive personality.
Note that not all keywords will have
synonyms at Google. Use of the tilde before
teens or addictive, for
instance, will not affect the search.
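Both approaches, the thesaurus list joined with OR and the quick tilde search, are simple string constructions. Below is a minimal Python sketch; the function names or_synonyms and tilde_query are hypothetical, and it assumes OR applies to the terms immediately on either side of it, as it does at Google.

```python
def or_synonyms(fixed_keywords: str, synonyms: list[str]) -> str:
    """Join synonyms with OR after the keywords that must stay fixed."""
    if not synonyms:
        raise ValueError("at least one synonym is required")
    # Quote multi-word synonyms so each is searched as a phrase.
    quoted = [f'"{s}"' if " " in s else s for s in synonyms]
    joined = " OR ".join(quoted)
    return f"{fixed_keywords} {joined}"


def tilde_query(fixed_keywords: str, term: str) -> str:
    """Prefix one term with a tilde for the quick synonym search."""
    return f"{fixed_keywords} ~{term}"


print(or_synonyms("teens addictive", ["behavior", "personality"]))
# teens addictive behavior OR personality
print(tilde_query("teens addictive", "behavior"))
# teens addictive ~behavior
```

The OR version gives you control over exactly which synonyms are searched, while the tilde version leaves that choice to the search engine.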
Want More? Several Web sites follow
developments in search commands and power
searching. A few of my favorites include
Search Engine Showdown,
Google Guide and
ResearchBuzz.