Along came archie, a program to locate ftp sites using a keyword search. This program worked by collecting data at archie servers (located at some sites) and allowing archie clients to connect to the servers to search.
The next development was gopher (created at the University of Minnesota), a utility which combined several tools (a file viewer, ftp and telnet) in a single easy-to-use menu-driven interface.
At the same time, the publishing industry had been experimenting with so-called hypertext documents (electronic documents with nonlinear organization of data), though only on single machines. A standard called SGML (Standard Generalized Markup Language) was developed for writing such documents in plain ASCII text (similar to LaTeX, troff, etc.).
Ideally, SGML should be integrated with TCP/IP to provide links across the network. But SGML is large and complex. Thus came HTML (HyperText Markup Language), a much simpler formatting language developed by CERN in Switzerland that uses TCP/IP.
The whole idea in using HTML is to display more than text, that is, formatted text and images. For this, a "browser" is needed, most often one written for a windowing system, such as xmosaic (one of the earliest widely used graphical browsers), written for X-windows.
HTML documents are written in ASCII text, with commands specified by particular sequences of characters. Commands in HTML usually consist of 3 components: a start tag, the text the command applies to, and a stop tag.
For example, to specify the title of a document, such as Red Riding Hood, you would use the <title> command:

<title> Red Riding Hood </title>

Note that some characters, such as "<", ">" and "&", are reserved for HTML commands. To actually display these special characters in a document, see the section on special characters below.
Here's the basic structure of an HTML document:
<html>

<!-- This is an internal comment; it won't be displayed -->
<!-- Note the exclamation mark and the two dashes on either side -->

<head>
<title> Red Riding Hood </title>
</head>

<body>
Once upon a time, in a land far far away, there lived...
</body>

</html>

Now, HTML is written in free text, so the above document could just as well be written into a text file as:
<html> <!-- This is an internal comment; it won't be displayed --> <!-- Note the exclamation mark and the two dashes on either side --> <head> <title> Red Riding Hood </title> </head> <body> Once upon a time, in a land far far away, there lived... </body> </html>
NOTE: The <head> ... </head> part of the document is used to specify a title (so that it can be displayed in the title window of a browser).
An important point needs to be made here: you, the
author of an HTML document, do not have much control over how a browser
will display your document. Some control, but not much. This is because
a browser may be dynamically configured to various window sizes and the
designers of HTML wanted to keep the language simple enough to let browsers
decide "what looks nice". You, the author, can decide paragraph breaks,
some fonts, italics, a rough sense of font sizes and the like - but eventually
a browser decides what to display. Thus, your document could look quite
different when viewed through different browsers.
OK, on to formatting. That is, creating paragraphs,
using italics, selecting fonts, and other goodies that we all want for
our cheesy documents.
Let's cover a few simple HTML ideas by embellishing the example shown earlier:
<html>
<head>
<title> Red Riding Hood </title>
</head>
<body>

<h1> Red's Early Years </h1>

<h2> Birth </h2>
Once upon a time, in a land far far away, there lived...

<h2> Pre-School </h2>
When she was three, her mother enrolled her in
<i> Grimm's DayCare </i> ...

<h1> Red Goes to High-School </h1>

<h2> Red Sets up a Homepage </h2>
Acquiring a 486-based Linux machine running X-windows was
the first step in Red's long journey from internet initiate
to system guru.
<p>
She set up the soon-to-be hottest internet site:
<hr>
http://red.hood.com
<hr>

<address> The Brothers Grimm </address>

</body>
</html>

The example shows headings at two levels (<h1> and <h2>), italics (<i>), a paragraph break (<p>), horizontal rules (<hr>) and author information (<address>).
Start tag      Stop tag        Description
-------------------------------------------------------------------
<html>         </html>         HTML document indicator
<head>         </head>         Document head
<title>        </title>        Title (usually in <head> section)
<body>         </body>         Document body
<address>      </address>      Document author info
<!--           -->             Comment
<h1>           </h1>           Level-1 heading
<h2>           </h2>           Level-2 heading
  .              .               .
<h6>           </h6>           Level-6 heading
<i>            </i>            Italics
<em>           </em>           Emphasized text - similar to italics
<b>            </b>            Bold face
<strong>       </strong>       Strong - similar to bold
<tt>           </tt>           Teletype
<strike>       </strike>       Strike-through
<var>          </var>          Variable - similar to italics
<cite>         </cite>         Citation - similar to italics
<code>         </code>         Code - similar to teletype
<kbd>          </kbd>          Keyboard - similar to teletype
<samp>         </samp>         Sample - similar to teletype
<dfn>          </dfn>          Definition - for definitions
<key>          </key>          Keyword - for keywords
<p>                            End a paragraph, start a new one
<br>                           Line break - start a new line
<hr>                           Horizontal rule
<pre>          </pre>          Preformatted text - format exactly as entered in ascii
<blockquote>   </blockquote>   To set apart a quote
To create an itemized list, consider the following
example:
<body>
<h2> Red Riding Hood's shopping list </h2>
<ul>
<li> Picnic basket
<li> Iced tea
<li> Red items
  <ul>
  <li> Red delicious apples
  <li> Red sneakers
  <li> Red jacket with hood
  </ul>
<li> Safety items
  <ol>
  <li> Magnesium flare
  <li> Cellular phone
  <li> Uzi
  </ol>
</ul>
</body>

Observe that an unordered list is defined by <ul> list of things </ul> and an ordered list by <ol> list of ordered items </ol>. Each item is specified by a <li>.
Unordered lists are bulleted and ordered lists are numbered. There are also <menu> and <dir> types of lists, both being similar to unordered lists. Lists can be nested within other lists, as the example shows.
HTML also provides a description list to pair up items such as keywords and their definitions in a glossary. For example:
<dl>
<dt> HTML
<dd> A language spoken by nerds
<dt> Java
<dd> A language spoken by major nerds
</dl>

You get the idea.
Special characters are displayed using the following character codes:

Character code      Description
-----------------------------------------------------------
&lt;                the less-than symbol
&gt;                the greater-than symbol
&amp;               the ampersand symbol
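Programs that generate HTML, such as the CGI program at the end of this tutorial, have to make these substitutions themselves before printing arbitrary text. Here is a minimal sketch in C; the function name html_escape and its buffer-based interface are assumptions made for illustration, not part of HTML or of any standard library.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst, replacing <, > and & with
   their HTML character codes. dst holds at most dstsize bytes,
   including the terminating '\0'.
   Returns 0 on success, -1 if dst is too small. */
int html_escape(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;
    size_t len;
    const char *code;
    char single[2];

    if (dstsize == 0)
        return -1;

    for (; *src; src++) {
        switch (*src) {
        case '<': code = "&lt;";  break;
        case '>': code = "&gt;";  break;
        case '&': code = "&amp;"; break;
        default:                      /* ordinary character: copy as-is */
            single[0] = *src;
            single[1] = '\0';
            code = single;
            break;
        }
        len = strlen(code);
        if (used + len + 1 > dstsize) /* keep room for the final '\0' */
            return -1;
        memcpy(dst + used, code, len);
        used += len;
    }
    dst[used] = '\0';
    return 0;
}
```

A caller would declare a buffer, call html_escape(text, buffer, sizeof buffer), and print the buffer only when the return value is 0.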
Suppose we have created the following html
file (called, say, red.html)
in the current directory, along with the file home.html
and a subdirectory early
containing the files birth.html and
preschool.html.
<html>
<head>
<title> Red Riding Hood </title>
</head>
<body>

<h1> Red's Early Years </h1>
<a href="early/birth.html"> Birth </a>
<a href="early/preschool.html"> Pre-School </a>

<h1> Red Goes to High-School </h1>
<a href="home.html"> Red Sets up a Homepage </a>

</body>
</html>

Notice the new command used above - the anchor command - with the general form <a href="an address"> some text that will be highlighted </a>. When a browser displays this document, the text between the anchor tags will be underlined or highlighted. Mouse clicks on this portion will result in following the address to a new HTML document. The start tag provides the information needed to find the new html document (which has its own header, body etc).
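A full address (URL) looks like this:

http://www.seas.gwu.edu:80/tales/fairy/modern/masterlist.html

It specifies the following: the protocol (http), the machine running the web server (www.seas.gwu.edu), the port number (80), and the path to the file on that machine (/tales/fairy/modern/masterlist.html).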
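To make the parts of such an address concrete, here is a rough sketch in C that pulls one apart with sscanf. The function name split_url, the buffer sizes, and the choice of 80 as the port when none is written are assumptions for illustration; real URLs come in more forms than this handles.

```c
#include <stdio.h>

/* Hypothetical helper: split a URL of the form
       protocol://host[:port]/path
   into its parts. The caller supplies buffers of (assumed) sizes
   protocol[16], host[256] and path[1024].
   Returns 0 on success, -1 if the URL does not have this form. */
int split_url(const char *url, char *protocol, char *host,
              int *port, char *path)
{
    /* First try the form that includes an explicit port number. */
    if (sscanf(url, "%15[^:]://%255[^:/]:%d%1023s",
               protocol, host, port, path) == 4)
        return 0;

    /* Otherwise assume no port was written; 80 is the usual http port. */
    *port = 80;
    if (sscanf(url, "%15[^:]://%255[^/]%1023s",
               protocol, host, path) == 3)
        return 0;

    return -1;
}
```

Applied to the address above, this would yield the protocol http, the host www.seas.gwu.edu, the port 80 and the path /tales/fairy/modern/masterlist.html.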
Named anchors allow you to mark a place in the text to point to. For example, suppose the file stories.html contains a number of stories, whereas masterlist.html contains a list of story titles such as Red Riding Hood. Each title has a link to the appropriate story in stories.html. For example, here is a portion of masterlist.html:
<ul>
<li> <a href="stories.html"> Hansel and Gretel </a>
<li> <a href="stories.html"> Jack and the Beanstalk </a>
<li> <a href="stories.html"> Red Riding Hood </a>
<li> <a href="stories.html"> Aesop's Fables </a>
</ul>

Let's assume that stories.html contains the tales (each with hyperlinks to other files). Now, clicking on any story in masterlist.html makes the browser take you to the top of the stories.html file. You then have to scroll down to the story you want. To avoid this problem, we simply mark the beginning of each story in the file stories.html and use the mark in the href specification. For example, in stories.html, let us mark Red Riding Hood as follows:

<h1> <a NAME="red"> Red Riding Hood </a> </h1>

Now, in the appropriate href part in masterlist.html, we specify this mark:

<li> <a href="stories.html#red"> Red Riding Hood </a>

Observe the hash symbol being used to specify a named anchor. You can also use named anchors for rapid movement within a single HTML document.
Suppose the file masterlist.html is located at

http://www.seas.gwu.edu/tales/fairy/modern/masterlist.html

Then, as we have seen, links in the file masterlist.html are created by giving an address in the href part of an anchor. We can provide either a full address or a partial (relative) address. Above, we saw an example of a relative address:
<a href="stories.html"> Red Riding Hood </a>

We could have also given the complete address:

<a href="http://www.seas.gwu.edu/tales/fairy/modern/stories.html"> Red Riding Hood </a>
MIME (Multipurpose Internet Mail Extensions)
is a standard that incorporates many well-known file formats. The idea
is that the browser doesn't handle these formats and instead calls a "plug-in",
a program that knows what to do with the data. Thus, for "postscript" files,
a postscript viewer is called by the browser. You can, by setting options
in the browser, decide which application programs (plug-ins) handle which
file extensions. Here are some common extensions (some of which, like .gif,
are directly handled by the browser).
<body>
<h1> The Next President of the United Brewpub Tasters of America </h1>
<img alt="my mugshot" align=bottom src="mypicture.gif">
</body>

With the <img ... > command, we specify the source file (mypicture.gif), an alignment for the first following line of text, and an alternate ascii string (my mugshot) for browsers that don't support images.
<body>
<h1> The Next President of the United Brewpub Tasters of America </h1>
<a href="bio.html">
<img alt="my mugshot" align=bottom src="mypicture.gif">
</a>
Click on my picture to get my biodata
</body>
http://www.seas.gwu.edu/~simha

is really the file public_html/index.html, which the webserver knows to get. (You don't have to understand this last point.) Make sure that you grant public access to this directory and to the files you place in the directory.
http://www.seas.gwu.edu/~simha/cv.html

then others can `open' this URL and get the file.
BEER                 RATING
                 Chilled    Cool
----------------------------------
Dork Pilsener      5.6      6.7
Blech Dark         7.3      7.1
Bugwizer           1.2      0.5
Handiken           4.4      6.5

Now, in HTML:
<table border>
<tr> <th> BEER </th> <th colspan=2> RATING </th>
<tr> <th> </th> <th> Chilled </th> <th> Cool </th>
<tr> <td> Dork Pilsener </td> <td> 5.6 </td> <td> 6.7 </td>
<tr> <td> Blech Dark </td> <td> 7.3 </td> <td> 7.1 </td>
<tr> <td> Bugwizer </td> <td> 1.2 </td> <td> 0.5 </td>
<tr> <td> Handiken </td> <td> 4.4 </td> <td> 6.5 </td>
</table>

And now the explanation:
<body>
<h1> The Next President of the United Brewpub Tasters of America </h1>
<a href="bio.html">
<img alt="my mugshot" align=bottom src="mypicture.gif">
</a>
Click on my picture to get my biodata

<h1> What's new in this homepage </h1>
<!--#include file="whatsnew.html" -->
</body>

The file whatsnew.html is continually updated.
<FONT COLOR="#000000">black</FONT>
<FONT COLOR="#FFFFFF">white</FONT>
<FONT COLOR="#FFFF00">yellow</FONT>
<FONT COLOR="#FFDEAD">beige</FONT>
<FONT COLOR="#CD853F">dark beige</FONT>
<FONT COLOR="#00FF00">bright green</FONT>
<FONT COLOR="#008000">dark green</FONT>
<FONT COLOR="#00FFFF">light blue</FONT>
<FONT COLOR="#0000B0">navy blue</FONT>
<FONT COLOR="#000080">dark blue</FONT>
<FONT COLOR="#FFB6C1">light pink</FONT>
<FONT COLOR="#FF1493">dark pink</FONT>
<FONT COLOR="#FF0000">red</FONT>
<FONT COLOR="#B22222">dark brick red</FONT>
<FONT COLOR="#FFF8DC">lightly off-white</FONT>
<FONT COLOR="#FA8072">flesh-pink</FONT>

Fonts also come in "faces" such as roman or helvetica. Use the face attribute for a font family and the size attribute for a size.
Note that both face and size depend entirely on what the browser supports. For example, Netscape 3.0 only supports roman and helvetica. Here are some examples:
<font face="verdana" size=3> [verdana] </font>
<font face="arial" size=3> [arial] </font>
<font face="helvetica" size=3> [helvetica] </font>
<font face="roman" size=3> [roman] </font>
<font face="roman" size=1> [roman1] </font>
<font face="roman" size=5> [roman5] </font>
<font face="helvetica" size=4> [helvetica4] </font>
<body bgcolor="#FFFFFF">

Similarly, the background for a table, or even a row inside a table, can be set. (More detail below.)
<table height=30>
<tr> <td> </td> </tr>
</table>

If you look at the source for almost any sophisticated page, you will find tables being used for layout. Typically, a top "bar" contains information related to an organization and a "side bar" contains navigational help.
Use cellpadding and cellspacing to determine the white space around the table data and the spacing between individual cells.
Use border to specify the thickness of a border. For example:
<table cellpadding=2 cellspacing=5 border=10>
<tr> <td> A1 </td> <td> A2 </td> <td> A3 </td> </tr>
<tr> <td> B1 </td> <td> B2 </td> <td> B3 </td> </tr>
</table>

Although a table will expand to contain its elements, you can fix the width and height with the width and height attributes, e.g.
<table cellpadding=10 cellspacing=3 border=1 width=100>
...
</table>

Likewise, a background for the table can be specified with the bgcolor attribute, e.g.,
<table cellpadding=10 cellspacing=3 border=1 width=500 bgcolor=FFFF00>
...
</table>

Suppose you want to color the border. You can do this by placing a table inside another. For example, to get the borders to be red, place the table inside another table whose background color is red.
<table bgcolor=DC143C>
<tr> <td>
  <table border=1>
  ...
  </table>
</td> </tr>
</table>

Use rowspan and colspan for entries that span multiple rows or multiple columns.
Similarly use the align attribute to decide how each entry is justified within a cell.
The following complex example demonstrates many of these features:

<table bgcolor=DC143C>
<tr> <td>
  <table cellpadding=10 cellspacing=3 border=1 width=500 bgcolor=FFFF00>
  <tr>
    <td colspan=2 align=right> A1 </td>
    <td align=center> A2 </td>
  </tr>
  <tr>
    <td valign=bottom> B1 </td>
    <td bgcolor=FFDEAD> B2 </td>
    <td width=100> B3 </td>
  </tr>
  </table>
</td> </tr>
</table>
<body>
<h1> What's new in this homepage </h1>
<!--#include file="whatsnew.html" -->

<h1> Current directory of files </h1>
<!--#exec cmd="ls" -->
</body>

The ls command gets executed; the returned text is simply fed back to the browser as text in the <pre> format. Instead of "ls" you can call any program or script; these executables can return text or even html documents (with html commands).
Here is an example of a simple form:
<html>
<head>
<title> Tell me your name </title>
</head>
<body>
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname">
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>
</body>
</html>

Now, an explanation:
Note: This tutorial does not cover perl. Check out the URL

http://www.cis.ufl.edu/perl

The <form> tags indicate the form, while <input ...> allows you to specify a variety of input mechanisms, from clickable radio-type buttons to scrollable menus. The action attribute specifies the executable that will handle the data; generally, use the post method (the other one is the get method). Now let's look at some other input mechanisms.
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname" maxlength=25>
Tell me your life story
<input type="text" name="lifestory" maxlength=100 size="30,3"><p>
How many beers do you drink a day?
<input type="text" name="ntimes" value="0">
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>

First observe that a maximum length can be specified for a single input line. Second, a default value (string) can be placed in a text line or box. Now let's explain the textbox. The textbox is simply a text line followed by a <p>. A display size (30 chars by 3 lines) can be specified. Default text cannot be placed in a textbox. When a user enters text in a textbox and uses carriage returns, the string passed to query.pl will contain "%0A" in place of each carriage return.
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname" maxlength=25>
Tell me your life story
<input type="text" name="lifestory" maxlength=100 size="30,3"><p>
How many beers do you drink a day?
<input type="text" name="ntimes" value="0">
What are your favorite pick-up lines?
<textarea rows=3 cols=30 name="pickup">
Your eyes are like moonlit pools of honey
</textarea>
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname" maxlength=25>
Preferred number of beers per year <p>
<input type="radio" name="freq" value="1"> 1-10 times a year <p>
<input type="radio" name="freq" value="2"> 10-100 times a year <p>
<input type="radio" name="freq" value="3"> 100-1000 times a year <p>
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>

When the user clicks on one of the buttons, say the second one, the string freq=2 will be returned to query.pl.
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname" maxlength=25>
Have you tried the following brands?
<input type="checkbox" name="brand" value="1" checked> Pete's Wicked <p>
<input type="checkbox" name="brand" value="2"> Oktoberfest <p>
<input type="checkbox" name="brand" value="3"> Samuel Adams <p>
<input type="checkbox" name="brand" value="4"> Oregon <p>
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>
<form action="query.pl" method="post">
Your full name <input type="text" name="fullname" maxlength=25>
Have you tried the following brands?
<select name="loc" multiple>
<option selected> Pete's Wicked
<option> Oktoberfest
<option> Samuel Adams
<option> Oregon
</select>
<input type="submit" value="click when done">
<input type="reset" value="clear">
</form>

Note that multiple choices are allowed by specifying the attribute multiple. Also, default selections can be made using the attribute selected.
For a simple drop-down menu, use:

<SELECT NAME=platforms>
<OPTION>Windows
<OPTION>Macintosh
<OPTION>Unix
</SELECT>
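When a <select> carries the multiple attribute (or when several checkboxes share a name), the browser submits one name=value pair per selected item, so the same name can occur more than once in the submitted data. As a hedged sketch in C, assume the form data has already been split into a NULL-terminated array that alternates names and values (the layout used by the CGI program later in this tutorial). The hypothetical helper cgi_nth_value below, which is not part of any library, fetches the n-th value submitted under a name.

```c
#include <string.h>

/* Hypothetical helper: vars is a NULL-terminated array that alternates
   names and values (name1, value1, name2, value2, ..., NULL).
   Returns the n-th value (counting from 0) stored under name,
   or NULL if fewer than n+1 values were submitted under that name. */
const char *cgi_nth_value(char **vars, const char *name, int n)
{
    int i;

    for (i = 0; vars[i] && vars[i + 1]; i += 2) {
        /* Count down only on entries whose name matches. */
        if (strcmp(vars[i], name) == 0 && n-- == 0)
            return vars[i + 1];
    }
    return NULL;
}
```

A program can call it with n = 0, 1, 2, ... until it returns NULL to visit every selected option.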
NOTE: you really don't need to understand the code, other than to know that the form values are returned in the array cgivars. In particular, the even indices (cgivars[0], cgivars[2], etc.) contain the names of the parameters and the odd indices contain the values (which is typically all you need). The function you will typically rewrite is the function response below.
/* echo.c - echoes parameters */
/* To compile: gcc echo.c */
/* Adapted from James Marshall's CGI Made Really Easy */
/**                                                               **/
/** The x2c() and unescape_url() routines were lifted directly    **/
/** from NCSA's sample program util.c, packaged with their HTTPD. **/
/**                                                               **/

#include <stdio.h>
#include <string.h>
#include <strings.h>    /* strcasecmp() on POSIX systems */
#include <stdlib.h>

/* Declared here because main() calls it before its definition */
void response (char **cgivars) ;

/** Convert a two-char hex string into the char it represents **/
char x2c(char *what) {
    register char digit;

    digit = (what[0] >= 'A' ? ((what[0] & 0xdf) - 'A')+10 : (what[0] - '0'));
    digit *= 16;
    digit += (what[1] >= 'A' ? ((what[1] & 0xdf) - 'A')+10 : (what[1] - '0'));
    return(digit);
}

/** Reduce any %xx escape sequences to the characters they represent **/
void unescape_url(char *url) {
    register int i,j;

    for(i=0,j=0; url[j]; ++i,++j) {
        /* Only decode when two characters really follow the '%' */
        if((url[i] = url[j]) == '%' && url[j+1] && url[j+2]) {
            url[i] = x2c(&url[j+1]) ;
            j+= 2 ;
        }
    }
    url[i] = '\0' ;
}

/** Read the CGI input and place all name/val pairs into list.        **/
/** Returns list containing name1, value1, name2, value2, ... , NULL  **/
char **getcgivars() {
    register int i ;
    char *request_method ;
    char *envval ;          /* scratch pointer for getenv() results */
    int content_length;
    char *cgiinput ;
    char **cgivars ;
    char **pairlist ;
    int paircount ;
    char *nvpair ;
    char *eqpos ;

    /** Depending on the request method, read all CGI input into cgiinput **/
    /** (really should produce HTML error messages, instead of exit()ing) **/
    request_method= getenv("REQUEST_METHOD") ;
    if (!request_method) {
        printf("getcgivars(): REQUEST_METHOD is not set.\n") ;
        exit(1) ;
    }

    if (!strcmp(request_method, "GET") || !strcmp(request_method, "HEAD") ) {
        envval= getenv("QUERY_STRING") ;
        cgiinput= strdup(envval ? envval : "") ;
    }
    else if (!strcmp(request_method, "POST")) {
        /* strcasecmp() is not supported in Windows-- use strcmpi() instead */
        envval= getenv("CONTENT_TYPE") ;
        if ( !envval || strcasecmp(envval, "application/x-www-form-urlencoded")) {
            printf("getcgivars(): Unsupported Content-Type.\n") ;
            exit(1) ;
        }
        envval= getenv("CONTENT_LENGTH") ;
        if ( !envval || (content_length = atoi(envval)) <= 0 ) {
            printf("getcgivars(): No Content-Length was sent with the POST request.\n") ;
            exit(1) ;
        }
        if ( !(cgiinput= (char *) malloc(content_length+1)) ) {
            printf("getcgivars(): Could not malloc for cgiinput.\n") ;
            exit(1) ;
        }
        if (!fread(cgiinput, content_length, 1, stdin)) {
            printf("Couldn't read CGI input from STDIN.\n") ;
            exit(1) ;
        }
        cgiinput[content_length]='\0' ;
    }
    else {
        printf("getcgivars(): unsupported REQUEST_METHOD\n") ;
        exit(1) ;
    }

    /** Change all plusses back to spaces **/
    for(i=0; cgiinput[i]; i++)
        if(cgiinput[i] == '+') cgiinput[i] = ' ' ;

    /** First, split on "&" to extract the name-value pairs into pairlist **/
    pairlist= (char **) malloc(256*sizeof(char *)) ;
    paircount= 0 ;
    nvpair= strtok(cgiinput, "&") ;
    while (nvpair) {
        pairlist[paircount++]= strdup(nvpair) ;
        if (!(paircount%256))
            pairlist= (char **) realloc(pairlist,(paircount+256)*sizeof(char *)) ;
        nvpair= strtok(NULL, "&") ;
    }
    pairlist[paircount]= 0 ;    /* terminate the list with NULL */

    /** Then, from the list of pairs, extract the names and values **/
    cgivars= (char **) malloc((paircount*2+1)*sizeof(char *)) ;
    for (i= 0; i<paircount; i++) {
        if ((eqpos= strchr(pairlist[i], '=')) != NULL) {
            *eqpos= '\0' ;
            unescape_url(cgivars[i*2+1]= strdup(eqpos+1)) ;
        } else {
            unescape_url(cgivars[i*2+1]= strdup("")) ;
        }
        unescape_url(cgivars[i*2]= strdup(pairlist[i])) ;
    }
    cgivars[paircount*2]= 0 ;   /* terminate the list with NULL */

    /** Free anything that needs to be freed **/
    free(cgiinput) ;
    for (i=0; pairlist[i]; i++) free(pairlist[i]) ;
    free(pairlist) ;

    /** Return the list of name-value strings **/
    return cgivars ;
}

/** Standard "hello, world" program, that also shows all CGI input. **/
int main() {
    char **cgivars ;
    int i ;

    /** First, get the CGI variables into a list of strings **/
    cgivars= getcgivars() ;

    response (cgivars);

    /** Free anything that needs to be freed **/
    for (i=0; cgivars[i]; i++) free(cgivars[i]) ;
    free(cgivars) ;

    return 0 ;
}

/** THIS is the code you need to pay attention to **/
void response (char **cgivars) {
    int i ;

    /** Print the CGI response header, required for all HTML output. **/
    /** Note the extra \n, to send the blank line.                   **/
    printf("Content-type: text/html\n\n") ;

    /** Finally, print out the complete HTML response page. **/
    printf("<html>\n") ;
    printf("<head><title>CGI Results</title></head>\n") ;
    printf("<body>\n") ;
    printf("<h1>Hello, world.</h1>\n") ;
    printf("Your CGI input variables were:\n") ;
    printf("<ul>\n") ;

    /** Print the CGI variables sent by the user. Note the list of **/
    /** variables alternates names and values, and ends in NULL.   **/
    for (i=0; cgivars[i]; i+= 2)
        printf("<li>[%s] = [%s]\n", cgivars[i], cgivars[i+1]) ;

    printf("</ul>\n") ;
    printf("</body>\n") ;
    printf("</html>\n") ;
}
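When rewriting response, you will usually want one particular field rather than the whole list, and often as a number. The sketch below is one possible way to do that in C, relying only on the alternating name/value layout of cgivars described above. The helper name cgi_long and its fallback parameter are assumptions for illustration; they are not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find the field called name in the alternating
   name/value array and interpret its value as a whole number.
   Returns fallback when the field is missing, empty, or not
   entirely numeric. */
long cgi_long(char **cgivars, const char *name, long fallback)
{
    const char *text = NULL;
    char *end;
    long number;
    int i;

    /* Even indices hold names, odd indices hold the matching values. */
    for (i = 0; cgivars[i] && cgivars[i + 1]; i += 2) {
        if (strcmp(cgivars[i], name) == 0) {
            text = cgivars[i + 1];
            break;
        }
    }
    if (text == NULL || *text == '\0')
        return fallback;

    number = strtol(text, &end, 10);
    return (*end == '\0') ? number : fallback;
}
```

For the beer form shown earlier, response could call cgi_long(cgivars, "ntimes", 0) to read the number of beers, getting 0 back if the user typed something that is not a number.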