Illustrate how to analyze a Web address.

Victims of many past Internet fraud schemes could have protected themselves if they had learned to decode a Web address (URL).

Example

http://www.thoughtpolice.com/bayboyz/1040.html

Components of a Web Address

  1. Everything preceding the double forward slash (e.g., http://, ftp://) indicates the protocol, or format for transmitting data. Http:// stands for Hypertext Transfer Protocol, or simply, the Web.
  2. Characters between the protocol and the domain name -- typically, but not always, www -- reveal the name of the server. Sometimes this part of the address is missing; e.g. http://virtualchase.com, and that's okay.
  3. Characters following the server name and ending with a top-level domain name like .com, .net, .org, .gov, etc., or a country code (e.g., .uk, .au, .ca) make up the domain name. Thoughtpolice.com in the above URL is the domain name. The domain name often, although not always, provides a clue about the ownership of the Web site.
  4. Characters following the first single forward slash, and ending at the final forward slash -- /bayboyz/ in the above example -- indicate the path on the server where the information resides. The path may consist of a single directory (or folder) or multiple directories (e.g., http://www.virtualchase.com/tvcalert/tvcdocs/).
  5. Characters following the last forward slash and ending in .html, .htm, .sht, .shtml, .asp, .cfm, etc. (e.g., 1040.html above) make up the file name that contains (or in the case of a dynamic site, temporarily holds) the information.
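
The five components above can also be pulled apart programmatically. The following is a minimal sketch in Python using the standard library's urllib.parse module. The function name decode_url and the rule that treats the last two labels of the host as the domain name are assumptions made for illustration; that rule is too simple for country-code addresses such as example.co.uk.

```python
from urllib.parse import urlsplit

def decode_url(url):
    """Split a Web address into the five components described above."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Assumption: the domain name is the last two labels of the host
    # (e.g., thoughtpolice.com). This fails for addresses like example.co.uk.
    domain = ".".join(labels[-2:])
    server = ".".join(labels[:-2])  # empty when the address has no "www"-style prefix
    # Everything up to and including the final slash is the path;
    # whatever follows it is the file name.
    directory, slash, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "server": server,
        "domain": domain,
        "path": directory + slash,
        "file": filename,
    }

print(decode_url("http://www.thoughtpolice.com/bayboyz/1040.html"))
# {'protocol': 'http', 'server': 'www', 'domain': 'thoughtpolice.com', 'path': '/bayboyz/', 'file': '1040.html'}
```

For an address with no server name, such as http://virtualchase.com, the sketch returns an empty string for the server, path, and file.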

Variations

While the above steps will serve to decode many Web addresses, some variations exist. Moreover, as Web development moves more toward the use of dynamic data, URLs become more complex, and therefore, more difficult to decipher. Below appear some examples:

http://www.webcom.com/~pinknoiz/coldwar/ciaradiation.html

Follow the decoding steps above, but note the tilde (~) in the Web address. This usually, but not always, indicates a personal folder -- perhaps that of a Web host's customer, or a student at a university, etc. The existence of information on a personal page or site does not necessarily mean the information is substandard. It should, however, flag the user's attention and underscore the need to learn more about the author's expertise.
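
A check for the tilde can be sketched in a few lines of Python. The function name has_personal_folder is hypothetical, and the test only looks for a path segment that begins with a literal tilde; like the tilde itself, a positive result is a hint to look further, not proof of anything.

```python
from urllib.parse import urlsplit

def has_personal_folder(url):
    """Return True if any directory in the path starts with a tilde (~)."""
    path = urlsplit(url).path
    return any(segment.startswith("~") for segment in path.split("/"))

print(has_personal_folder("http://www.webcom.com/~pinknoiz/coldwar/ciaradiation.html"))  # True
print(has_personal_folder("http://www.thoughtpolice.com/bayboyz/1040.html"))             # False
```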

http://thomas.loc.gov/cgi-bin/query/z?c107:H.R.2975:

The use of a question mark (?) in a Web address typically means that behind the scenes, a script will call information from the server or a database. It's easy to determine the type of script used in the above URL because cgi-bin, the directory that conventionally holds CGI scripts, appears in the address. It's a CGI script. Everything that follows the question mark in the above example means something to the server or database. In this case, it commands the server to return a copy of House bill H.R. 2975 from the 107th Congress.
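
As a rough illustration, Python's urllib.parse separates the script's location from the text handed to it. The function name split_script_and_query is hypothetical, and treating a cgi-bin directory as a sign of a CGI script is a convention, not a guarantee.

```python
from urllib.parse import urlsplit

def split_script_and_query(url):
    """Return the script path, the raw query text, and whether cgi-bin appears in the path."""
    parts = urlsplit(url)
    uses_cgi_bin = "cgi-bin" in parts.path.split("/")
    return parts.path, parts.query, uses_cgi_bin

print(split_script_and_query("http://thomas.loc.gov/cgi-bin/query/z?c107:H.R.2975:"))
# ('/cgi-bin/query/z', 'c107:H.R.2975:', True)
```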

http://libraryjournal.reviewsnews.com/index.asp?layout=article&articleid=CA170412&display=breakingNews

The above URL indicates the use of both dynamic data and stylesheets. Read everything before the question mark (?) by following the guidelines above. Following the question mark is a series of commands to the server or database, separated by ampersands (&). For example, articleid=CA170412 commands the database to retrieve the article having the unique number CA170412. Layout=article and display=breakingNews probably indicate the use of stylesheets to display the Web page using a defined format.
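
The name=value pairs after the question mark can be listed with Python's standard urllib.parse module, as in the minimal sketch below. The function name query_parameters is hypothetical; what each pair actually does on the server cannot be determined from the address alone.

```python
from urllib.parse import urlsplit, parse_qsl

def query_parameters(url):
    """Return the name=value pairs that follow the question mark as a dictionary."""
    return dict(parse_qsl(urlsplit(url).query))

url = ("http://libraryjournal.reviewsnews.com/index.asp"
       "?layout=article&articleid=CA170412&display=breakingNews")
print(query_parameters(url))
# {'layout': 'article', 'articleid': 'CA170412', 'display': 'breakingNews'}
```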

COPYRIGHT: 2001 Ballard Spahr Andrews & Ingersoll, LLP all rights reserved.

This information appears in Teaching Internet Research Skills, a teaching Web of The Virtual Chase at URL http://www.virtualchase.com/researchskills/quality4.html.