HTML is a language for describing web pages.
HTML stands for HyperText Markup Language
HTML is not a programming language; it is a markup language
A markup language is a set of markup tags
HTML uses markup tags to describe web pages
HTML markup tags are usually called HTML tags
HTML tags are keywords surrounded by angle brackets like <html>
HTML tags normally come in pairs like <b> and </b>
The first tag in a pair is the start tag, the second tag is the end tag
Start and end tags are also called opening tags and closing tags
HTML documents describe web pages
HTML documents contain HTML tags and plain text
HTML documents are also called web pages
The purpose of a web browser (like Internet Explorer or Firefox) is to read HTML documents and display them as web pages. The browser does not display the HTML tags, but uses the tags to interpret the content of the page:
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
The text between <html> and </html> describes the web page
The text between <body> and </body> is the visible page content
The text between <h1> and </h1> is displayed as a heading
The text between <p> and </p> is displayed as a paragraph
You don't need any tools to learn HTML at W3Schools.
You don't need an HTML editor
You don't need a web server
You don't need a web site
HTML can be written and edited using many different editors like Dreamweaver and Visual Studio.
However, in this tutorial we use a plain text editor (like Notepad) to edit HTML. We believe using a plain text editor is the best way to learn HTML.
We suggest you experiment with everything you learn at W3Schools by editing your web files with a text editor (like Notepad).
Note: If your test page contains HTML markup tags you have not learned yet, don't panic. You will learn all about them in the next chapters.
When you save an HTML file, you can use either the .htm or the .html file extension. There is no difference between them; the choice is entirely up to you.
HTML headings are defined with the <h1> to <h6> tags.
<h1>This is a heading</h1>
<h2>This is a heading</h2>
<h3>This is a heading</h3>
HTML paragraphs are defined with the <p> tag.
<p>This is a paragraph.</p>
<p>This is another paragraph.</p>
HTML links are defined with the <a> tag.
<a href="http://www.w3schools.com">This is a link</a>
Note: The link address is specified in the href attribute.
(You will learn about attributes in a later chapter of this tutorial.)
HTML images are defined with the <img> tag.
<img src="w3schools.jpg" width="104" height="142" />
Note: The name and the size of the image are provided as attributes.
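The basic tags above can be combined into one small page. The following is only a sketch: the doctype line, the title, and the alt text are our additions for illustration, and w3schools.jpg is assumed to exist in the same folder as the page.

```html
<!DOCTYPE html>
<html>
<head>
<!-- The title text is a placeholder -->
<title>Basic examples</title>
</head>
<body>
<h1>Main heading</h1>
<h2>Subheading</h2>
<p>A paragraph with <a href="http://www.w3schools.com">a link</a> inside it.</p>
<!-- alt gives replacement text if the image cannot be shown; its wording is an assumption -->
<img src="w3schools.jpg" alt="W3Schools logo" width="104" height="142" />
</body>
</html>
```

Save the text as a .htm or .html file and open it in a browser to see the result.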
An HTML element is everything from the start tag to the end tag:
* The start tag is often called the opening tag. The end tag is often called the closing tag.
An HTML element starts with a start tag / opening tag
An HTML element ends with an end tag / closing tag
The element content is everything between the start and the end tag
Some HTML elements have empty content
Empty elements are closed in the start tag
Most HTML elements can have attributes
Tip: You will learn about attributes in the next chapter of this tutorial.
Most HTML elements can be nested (can contain other HTML elements).
HTML documents consist of nested HTML elements.
<html>
<body>
<p>This is my first paragraph.</p>
</body>
</html>
The example above contains three HTML elements.
The <p> element:
<p>This is my first paragraph.</p>
The <p> element defines a paragraph in the HTML document.
The element has a start tag <p> and an end tag </p>.
The element content is: This is my first paragraph.
The <body> element:
<body>
<p>This is my first paragraph.</p>
</body>
The <body> element defines the body of the HTML document.
The element has a start tag <body> and an end tag </body>.
The element content is another HTML element (a p element).
The <html> element:
<html>
<body>
<p>This is my first paragraph.</p>
</body>
</html>
The <html> element defines the whole HTML document.
The element has a start tag <html> and an end tag </html>.
The element content is another HTML element (the body element).
Some HTML elements might display correctly even if you forget the end tag:
<p>This is a paragraph
<p>This is a paragraph
The example above works in most browsers, because the end tag of the <p> element is considered optional.
Never rely on this. Many HTML elements will produce unexpected results and/or errors if you forget the end tag.
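To see why a forgotten end tag can cause trouble, consider this sketch, in which the </b> end tag has been left out on purpose. The exact rendering may differ between browsers, but the bold formatting typically continues into the second paragraph.

```html
<!-- The </b> end tag is missing on purpose -->
<p>This is <b>bold text with no end tag.</p>
<!-- Browsers typically keep the bold formatting active here as well -->
<p>This paragraph was not meant to be bold.</p>
```

Adding </b> after "bold text" restores the intended result.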
HTML elements with no content are called empty elements.
<br> is an empty element without a closing tag (the <br> tag defines a line break).
Tip: In XHTML, all elements must be closed. Adding a slash inside the start tag, like <br />, is the proper way of closing empty elements in XHTML (and XML).
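As a short illustration, both forms of the line break can be written like this (the address text is made up for the example):

```html
<!-- HTML form: <br> has no end tag -->
<p>First line of the address<br>
Second line of the address</p>

<!-- XHTML form: the empty element is closed inside the start tag -->
<p>First line of the address<br />
Second line of the address</p>
```

Browsers display both paragraphs the same way.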
HTML tags are not case sensitive: <P> means the same as <p>. Many web sites use uppercase HTML tags.
HTML elements can have attributes
Attributes provide additional information about an element
Attributes are always specified in the start tag
Attributes come in name/value pairs like: name="value"
HTML links are defined with the <a> tag. The link address is specified in the href attribute:
<a href="http://www.w3schools.com">This is a link</a>
Attribute values should always be enclosed in quotes.
Double quotes are the most common, but single quotes are also allowed.
Tip: In some rare situations, when the attribute value itself contains quotes, it is necessary to use single quotes: name='John "ShotGun" Nelson'
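The quoting rules above can be tried out with a sketch like the following. The title attribute is our choice for illustration (most browsers show its value as a tooltip); the last line shows the &quot; entity as an alternative to single quotes.

```html
<p title="A double-quoted value">Hover over this paragraph.</p>
<p title='A single-quoted value'>Hover over this one too.</p>
<!-- The value contains double quotes, so it is enclosed in single quotes -->
<p title='John "ShotGun" Nelson'>The value itself contains quotes.</p>
<!-- Alternative: keep the double quotes outside and write the inner ones as entities -->
<p title="John &quot;ShotGun&quot; Nelson">The same value written with entities.</p>
```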
Attribute names are case-insensitive in HTML. Some attribute values are case-insensitive as well, but others (such as id and class values, and most URLs) are case-sensitive.
However, the World Wide Web Consortium (W3C) recommends lowercase attributes/attribute values in their HTML 4 recommendation.
XHTML requires lowercase attribute names.
A complete list of legal attributes for each HTML element is listed in our:
Below is a list of some attributes that are standard for most HTML elements:
For more information about standard attributes:
Not valid in base, head, html, meta, param, script, style, and title elements.
Not valid in base, br, frame, frameset, hr, iframe, param, and script elements.
HTML 4 added the ability to let events trigger actions in a browser, like running a JavaScript when a user clicks on an element.
Below are the standard event attributes that can be inserted into HTML / XHTML elements to define event actions.
The two attributes below can only be used in <body> or <frameset>:
The attributes below can be used in form elements:
The attribute below can be used with the img element:
Attribute    Value     Description
onabort      script    Script to be run when loading of an image is interrupted
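As a rough sketch of how event attributes are written, the value of the attribute is a small script that runs when the event occurs. The onclick attribute, the alert() calls, and the image file name are our choices for illustration.

```html
<!-- Runs the script when the user clicks the button -->
<button type="button" onclick="alert('You clicked the button')">Click me</button>

<!-- Runs the script if loading of the image is interrupted -->
<img src="w3schools.jpg" alt="Example image" onabort="alert('Image loading was interrupted')" />
```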
Valid in all elements except base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, and title.
The following table lists all HTML/XHTML elements, and defines which doctype declarations (DTDs) each element appears in.
To display an HTML page correctly, the browser must know what character-set to use.
The character-set for the early world wide web was ASCII. ASCII supports the numbers from 0-9, the uppercase and lowercase English alphabet, and some special characters.
Since many countries use characters which are not a part of ASCII, the default character-set for modern browsers is ISO-8859-1.
If a web page uses a different character-set than ISO-8859-1, it should be specified in the <meta> tag.
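A minimal sketch of such a declaration is shown below, assuming the page is saved as UTF-8. The title text is a placeholder.

```html
<head>
<!-- HTML 4 / XHTML style declaration of the character-set -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- HTML5 also accepts the shorter form: <meta charset="UTF-8"> -->
<title>Page title</title>
</head>
```

The declared character-set should match the encoding the file is actually saved in, otherwise some characters may display incorrectly.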
It is the International Organization for Standardization (ISO) that defines the standard character-sets for different alphabets/languages.
The different character-sets being used around the world are listed below:
Because the character-sets listed above are limited in size, and are not compatible in multilingual environments, the Unicode Consortium developed the Unicode Standard.
The Unicode Standard aims to cover all the characters, punctuation marks, and symbols in the world.
Unicode enables processing, storage and interchange of text data no matter what the platform, no matter what the program, no matter what the language.
The Unicode Consortium develops the Unicode Standard. Its goal is to replace the existing character-sets with its standard Unicode Transformation Format (UTF).
The Unicode Standard has become a success and is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc. The Unicode standard is also supported in many operating systems and all modern browsers.
The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.
Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 and UTF-16:
Character-set    Description
UTF-8            A character in UTF-8 can be from 1 to 4 bytes long. UTF-8 can represent any character in the Unicode standard. UTF-8 is backwards compatible with ASCII. UTF-8 is the preferred encoding for e-mail and web pages.
UTF-16           16-bit Unicode Transformation Format is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire. UTF-16 is used in major operating systems and environments, like Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET byte code environments.
Tip: The first 256 character codes of Unicode correspond to the 256 characters of ISO-8859-1.
Tip: All HTML 4 processors already support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16!
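One way to see how these character codes relate is to write characters as numeric references of the form &#number;, where the number is the character's Unicode code. This is only a sketch; the three characters are our choices for illustration.

```html
<!-- Code 65: the same character in ASCII, ISO-8859-1, and Unicode -->
<p>&#65; is the letter A.</p>
<!-- Code 233: outside ASCII, but shared by ISO-8859-1 and Unicode -->
<p>&#233; is the letter e with an acute accent.</p>
<!-- Code 8364: only available in Unicode, not in ISO-8859-1 -->
<p>&#8364; is the euro sign.</p>
```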