HTML - ThePix/QuestJS GitHub Wiki

All web pages use a mark-up language called HTML, which stands for HyperText Markup Language (some pages use XHTML, an XML-based variant, but the two are almost the same).

HTML is a mark-up language, which means it has codes within the text telling the browser how to display the text. Here is an example:

Here is some text with one word in <b>bold</b>.

The angle brackets, < and >, are used to flag that this is mark-up. The <b> is the start tag, and </b> the end tag, and as you can see the difference is the /. The tag name is "b", and has to be the same in the start tag and the end tag. The "b" tag name tells the browser this text should be in bold.

The bit between the tags, "bold", is called the content, and the entire thing, tags and content, is called an element.

Nesting

Elements have to be nested; they cannot overlap. This example uses the "i" tag to put some text in italics. The first line is badly formed because the bold and italic overlap - the bold starts before the italic, but then ends before it too. The second line is good, because the bold element is nested within the italic.

Here is some text with one word in <b><i>bold</b> and five in italic</i>.
Here is some text with one word in <i><b>bold</b> and five in italic</i>.
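
As a rough illustration of the nesting rule, here is a minimal JavaScript sketch that checks whether the tags in a piece of text are properly nested. The function name isProperlyNested is made up for this example and is not part of QuestJS or any browser API. It only understands simple tags and is not a real HTML parser; in particular it expects every start tag to have an end tag unless it is written with a closing /.

```javascript
// Hypothetical helper for illustration only, not a real HTML parser.
// Returns true if every end tag matches the most recently opened start tag.
function isProperlyNested(html) {
  const open = [];  // stack of tag names that have been started but not ended
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const [, closing, name, selfClosing] = match;
    if (selfClosing) continue;            // written as <name ... />, so no end tag
    if (!closing) {
      open.push(name.toLowerCase());      // start tag: remember it
    } else if (open.pop() !== name.toLowerCase()) {
      return false;                       // end tag does not match the last start tag
    }
  }
  return open.length === 0;               // nothing may be left open
}

const bad = "Here is some text with one word in <b><i>bold</b> and five in italic</i>.";
const good = "Here is some text with one word in <i><b>bold</b> and five in italic</i>.";
console.log(isProperlyNested(bad));   // false
console.log(isProperlyNested(good));  // true
```

The stack is what captures the rule: the last element to start must be the first to end.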

Attributes

Sometimes an element will have attributes. The "div" tag name indicates a division, and you often want to give a division an id so you can access it with JavaScript. Here, then, is a "div" element with an attribute set. Note that you should always put double quotes either side of the value, even if the value is a number (browsers will accept single quotes, or no quotes at all around simple values, but double quotes always work).

<div id="example">Here is some text with one word in <i><b>bold</b> and five in italic</i>.</div>
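
If you ever build tags from JavaScript, the quoting rule might be sketched like this. The function startTag is a hypothetical helper written for this page, not something QuestJS or the browser provides. It puts every value in double quotes, numbers included, and replaces any double quote inside a value so it cannot end the attribute early.

```javascript
// Hypothetical helper: builds a start tag with every attribute value
// wrapped in double quotes, whether it is a string or a number.
function startTag(name, attributes = {}) {
  let tag = "<" + name;
  for (const [key, value] of Object.entries(attributes)) {
    // A " inside the value would end the attribute early, so encode it
    // (and encode & first so the result stays unambiguous).
    const safe = String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
    tag += ` ${key}="${safe}"`;
  }
  return tag + ">";
}

console.log(startTag("div", { id: "example" }));                 // <div id="example">
console.log(startTag("img", { src: "house.png", width: 200 }));  // <img src="house.png" width="200">
```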

Empty elements

Sometimes an element has no contents, and there is a shorthand for that. The "img" tag name indicates an image. It has the "src" attribute, which gives the image filename, but has no contents. There is only one tag, but note the /, which signals that there is no end tag (modern browsers do not require the / for "img", but it does no harm). This is called an empty element.

<img src="house.png" />

The "script" tag name is used to add JavaScript, which could be the contents of the element, but could be from a file. The above form does not work for this tag name, because a "script" element is allowed to have contents: the browser ignores the / and treats the tag as an ordinary start tag, so you need an end tag.

<script src="clock.js"></script>
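
The difference between the two cases can be sketched in JavaScript. The function renderElement and the VOID_ELEMENTS set are names invented for this example. The list is my understanding of the elements that the HTML standard says can never have contents, so treat it as an assumption rather than a definitive reference.

```javascript
// Elements that never have contents and so never take an end tag
// (assumed list, based on the HTML standard's "void elements").
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Hypothetical helper: attributeText is the already-written attribute
// text, e.g. 'src="house.png"'; content is ignored for empty elements.
function renderElement(name, attributeText = "", content = "") {
  const attrs = attributeText ? " " + attributeText : "";
  if (VOID_ELEMENTS.has(name)) {
    return `<${name}${attrs} />`;               // one tag only
  }
  return `<${name}${attrs}>${content}</${name}>`; // start tag, content, end tag
}

console.log(renderElement("img", 'src="house.png"'));    // <img src="house.png" />
console.log(renderElement("script", 'src="clock.js"'));  // <script src="clock.js"></script>
```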

Bare bones of a web page

There are certain elements that browsers expect to be present in specific places.

<!DOCTYPE html>
<html lang="en">
<head>
  <title>A Test Page</title>
</head>
<body>
My contents
</body>
</html>

The first line is not really HTML, but is rather a declaration telling the browser that the rest of the document is HTML. The root element of an HTML document is the "html" element. Within that element there are two other elements, "head" and "body". The "head" element can contain various things that do not appear on the page, such as the title, meta-data and references to other files. What you see on the page goes into the "body" element.
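
To make the structure concrete, here is a small JavaScript sketch that assembles the same skeleton from a title and some body contents. The function buildPage is hypothetical and only for illustration; it assumes the title and body passed in are already valid HTML.

```javascript
// Hypothetical helper: returns the bare bones of a page as one string.
// Assumes title and bodyHtml contain no characters that need encoding.
function buildPage(title, bodyHtml) {
  return [
    "<!DOCTYPE html>",              // declaration, not an element
    '<html lang="en">',             // root element
    "<head>",                       // things that do not appear on the page
    `  <title>${title}</title>`,
    "</head>",
    "<body>",                       // what you see on the page
    bodyHtml,
    "</body>",
    "</html>",
  ].join("\n");
}

console.log(buildPage("A Test Page", "My contents"));
```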

Character entities

Certain characters have a special meaning in HTML, the most obvious being <. When the browser hits that character, it will assume what follows is a tag. So what do you do if you want a < to be displayed? Use a character entity. These start with an ampersand and end with a semi-colon. In between you can have a short name or a number.

< &lt;
> &gt;
& &amp;
" &quot;
' &apos;
¢ &cent;
£ &pound;
¥ &yen;
€ &euro;
© &copy;
® &reg;
™ &trade;
µ &micro;
° &deg;

Also worth a mention is &nbsp;, which is a non-breaking space (so hard to show in the table above). Browsers collapse a run of ordinary spaces into one, so if you want two or more spaces together, use this.
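
If you are putting text into a page from JavaScript, the first five entities in the table are the ones that matter, because those characters can be mistaken for mark-up. Here is a minimal sketch of a function to swap them in. The name escapeHtml is an assumption for this example, not a built-in function.

```javascript
// The characters that have a special meaning in HTML, and their entities.
const ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

// Hypothetical helper: replaces each special character with its entity
// so the browser displays it rather than treating it as mark-up.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

console.log(escapeHtml("Use <b> for bold & <i> for italic"));
// Use &lt;b&gt; for bold &amp; &lt;i&gt; for italic
```

Each character is replaced in a single pass, so the & in an entity that has just been inserted is not encoded a second time.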