Last updated: 2022-11-07 18:37:06
Introduction
In this chapter, we introduce the most basic and fundamental component of web technologies: HTML. As we will see, HTML is a data format used to encode the contents and structure of web pages. HTML is usually stored in plain text files with the
Starting from this chapter and onward, we are going to present computer code examples. Some examples are short, separate pieces of code used to illustrate an idea or concept. Other examples include the complete source code of a web page, which you can open and display in the browser, as well as modify and experiment with. The way that each of the complete code examples will appear when opened with the browser is shown in a separate figure, such as in Figure 1.1. As mentioned in Section 0.8, the online version of this book contains live versions of all ninety-plus complete examples (Appendices B–C), as well as a downloadable folder with all code files to experiment with the examples on your own computer (Appendix A).
Learning programming requires a lot of practice, so it is highly recommended to open the examples on your computer as you go along through the book. Better yet, you can modify the code of each example and observe the way that the displayed result changes, to make sure you understand what the purpose of each code component is. For instance, the first example (Figure 1.1) displays a simple web page with one heading and one paragraph; you can try to modify its source code (see Section 1.4 to learn how) to change the contents of the heading and/or paragraph, to add a second paragraph below the first one, and so on.
Chapter 2 in Introduction to Data Technologies (Murrell 2009) gives a gentle and gradual introduction to HTML as well as the practice of writing computer code. It is a highly recommended complementary reading to the present chapter, especially for readers who are new to computer programming.
How do people access the web?
Web servers
When you ask your browser for a web page, typing a URL such as https://www.google.com in the address bar, the request is sent across the internet to a special computer known as a web server which hosts the website.
Web servers are special computers that are constantly connected to the internet, and are optimized to send web pages out to people who request them. Your computer, the client, receives the file and renders the web page you ultimately see on screen. We will discuss web servers and server-client communication in Chapter 5.
When you are looking at a website, it is most likely that your browser will be receiving HTML and CSS documents from the web server that hosts the site. The web browser interprets the HTML and CSS code to create the page that you see. We will learn about HTML in Chapter 1 (this chapter) and about CSS in Chapter 2.
Most web pages also send JavaScript code to your browser to make the page interactive. The browser runs the JavaScript code, on page load and/or later on while the user interacts with the web page. The JavaScript code can modify the content of the page. We will introduce JavaScript in Chapters 3–4.
Web pages
At the most basic level, a web page is a plain text document containing HTML code. This book comes with several examples of complete web pages. The examples are listed in Appendices B–C. They can be viewed and/or downloaded from the online version of this book (Section 0.8). The first example,
Here is the source code you should see when opening the file
FIGURE 1.2: HTML document source code viewed in a text editor (Notepad++)
FIGURE 1.3: HTML document (left) and its source code (right)
The source code comprises the contents of an HTML document. The source code is sent to the browser, then processed to produce the display shown in Figure 1.1. The
Text editors
HTML, CSS, and JavaScript code, like any other computer code, is plain text stored in text files. To edit such files, you need to use a plain text editor. The simplest option is Notepad++. There are also more advanced editors, such as Visual Studio Code or Sublime Text. The more advanced editors contain additional features for easier text editing, such as shortcuts, syntax highlighting, bracket matching, etc. You can use any plain text editor you prefer.
What is HTML?
Overview
Hypertext Markup Language (HTML) is the language that describes the contents and structure of web pages. Most web pages today are composed of more than just HTML, but simple web pages, such as
HTML code consists of HTML elements. An HTML element may contain text and/or other elements. This makes HTML code hierarchical. An HTML element consists of a start tag, followed by the element content, followed by an end tag. A start tag is of the form
The following example shows a
Table 1.2 summarizes the basic components of an HTML element.
TABLE 1.2: HTML element structure
Some HTML elements are empty, which means that they consist of only a start tag, with no contents and no end tag. The following code shows an
An element may have one or more attributes. Attributes appear inside the start tag and are of the form
Table 1.3 summarizes the components of an HTML element with an attribute.
TABLE 1.3: HTML element attribute structure
There can be more than one attribute for an element, in which case they are separated by spaces. For example, the following
It is important to note that there is a fixed set of valid HTML elements (Section 1.6), and each element has its own set of possible attributes. Moreover, some attributes are required while others are optional. For example, the
As for the entire document structure, an HTML document must include a
Technically, everything except for the
As mentioned above, the primary role of HTML code is to specify the contents of a web page. The type of elements being used and their ordering determine the structure of information that is being displayed in the browser.
Block vs. inline
While learning about the various HTML elements (Section 1.6), it is important to keep in mind that HTML elements are divided into two general types of behaviors:
A block-level element, or simply a block element, is like a paragraph. Block elements always start on a new line in the browser window (Figure 1.4). Examples of block elements include:
It is helpful to imagine block elements as horizontal boxes. Box width is determined by the width of the browser window, so that the box fills the entire available space. Box height is determined by the amount of content. For example, a paragraph fills the entire available page width, with variable height depending on the amount of text. (This is the default behavior; in Chapter 2 we will see that the height and width can be modified, using CSS.)
An inline element is like a word within a paragraph. It is a small component that is arranged with other components inside a container. Inline elements appear on the same line as their neighboring elements (Figure 1.4). Examples of inline elements include:
FIGURE 1.4: Block vs. inline HTML elements
Common HTML elements
HTML element types
This section briefly describes the important behavior, attributes, and rules for each of the common HTML elements. We will use most of these elements throughout the book, so it is important to be familiar with them from the start. You don’t need to remember how to use each element; you can always come back to this section later on.
Keep in mind that the HTML elements we are going to cover in this chapter are just the most common ones. HTML defines a lot of other element types that we will not use in the book. For convenience, the HTML elements we will cover will be divided into three types according to their role (Table 1.4) in determining page contents and structure. Other than elements setting the basic document structure, there are elements giving general information about the page (mainly inside the
Structure
Overview
The
| Input type | Usage | Section |
|---|---|---|
| Numeric input | <input type="number"> | Section 1.6.13.2 |
| Range input | <input type="range"> | Section 1.6.13.3 |
| Text input | <input type="text"> | Section 1.6.13.4 |
| Text area | <textarea></textarea> | Section 1.6.13.5 |
| Radio buttons | <input type="radio"> | Section 1.6.13.6 |
| Checkboxes | <input type="checkbox"> | Section 1.6.13.7 |
| Dropdown lists | <select><option></option></select> | Section 1.6.13.8 |
| Buttons | <input type="button"> | Section 1.6.13.9 |
Numeric input
A numeric <input> element is used to get numeric input through typing or clicking the up/down buttons. A numeric input is defined using an <input> element with a type="number" attribute. Other important attributes are min and max, specifying the valid range of numbers that the user can enter. For example, the following HTML code creates a numeric input, where the user can enter numbers between 0 and 100, with the initial value set to 5:
<input type="number" name="num" value="5" min="0" max="100">
The name attribute identifies the form control and is sent along with the entered information when submitting a form to a server. It is not very useful within the scope of this book but is shown here for completeness as it is commonly used in other contexts (Section 1.6.13.1).
The way that the above numeric input element appears in the browser, along with all other types of input we cover next (Sections 1.6.13.3–1.6.13.9), is shown in Figure 1.9. The numeric input is in the top-left corner of the figure. Note that the code for example-01-06.html includes CSS styling rules (which we learn about in Chapter 2) for arranging the input elements in three columns.
Range input
A range <input> element is used for picking numeric values with a slider. This is usually more convenient and intuitive for the user in cases when the exact value is not important. A range input is defined using type="range". The purpose of the value, min, and max attributes is to specify the initial, minimal, and maximal values, respectively, just like in the numeric input (Section 1.6.13.2). Here is an example of a range input element:
<input type="range" name="points" value="5" min="0" max="100">
The result is shown in Figure 1.9.
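As a small variation on the example above, the following sketch adds the step attribute. It is not covered in the text but is part of standard HTML, and it sets the increment between allowed values. The name points10 and the chosen numbers are made up for illustration.

```html
<!-- Range input restricted to 0, 10, 20, ..., 100; the slider starts at 50 -->
<input type="range" name="points10" value="50" min="0" max="100" step="10">
```

The same attribute can be used with a numeric input (Section 1.6.13.2), where it controls how much the up/down buttons change the value.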
Text input
A text <input> is used for typing plain text. A text input is defined using type="text". For example, the following HTML code creates two text input boxes for entering first and last names, along with the corresponding labels. The <br> element is used to place each text input box on a new line, beneath its label:
First name:<br>
<input type="text" name="firstname"><br>
Last name:<br>
<input type="text" name="lastname">
The result is shown in Figure 1.9.
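The labels in the example above are plain text followed by <br>. As an alternative that the text does not cover, HTML also provides a <label> element, which can be tied to an input through matching for and id values, so that clicking the label moves the cursor into the box. The following is a minimal sketch; the id values fname and lname are made up for illustration.

```html
<!-- Each <label> is linked to the <input> whose id matches its "for" value -->
<label for="fname">First name:</label><br>
<input type="text" id="fname" name="firstname"><br>
<label for="lname">Last name:</label><br>
<input type="text" id="lname" name="lastname">
```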
Text area
A text area input is used for typing plain text, just like text input, but is intended for multi-line rather than single-line text (e.g., Figure 7.5). A text area is defined using the <textarea> element, as shown in the following example:
<textarea name="mytext"></textarea>
The result is shown in Figure 1.9.
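As an illustration of how a text area can be sized, the sketch below uses the standard rows and cols attributes, which are not discussed in the text, to request a box roughly 4 lines tall and 40 characters wide. Any text placed between the start and end tags appears as the initial content. The specific numbers and the placeholder sentence are arbitrary choices.

```html
<!-- A text area about 4 lines tall and 40 characters wide, with initial text -->
<textarea name="mytext" rows="4" cols="40">Type your comments here...</textarea>
```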
Radio buttons
Radio buttons are used to select one of several options. Each radio button is defined with a separate <input> element using type="radio". The user can select only one option of the radio buttons sharing the same value for the name attribute. The checked attribute can be used to define which button is selected on page load. Note that the checked attribute has no value. For example, the following HTML code creates two radio buttons, with corresponding labels:
<input type="radio" name="gender" value="male" checked> Male<br>
<input type="radio" name="gender" value="female"> Female<br>
The result is shown in Figure 1.9. The “Male” option is initially checked because of the checked attribute.
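To illustrate how the name attribute groups radio buttons, the following sketch places two independent groups on the same page. Selecting an option in one group does not affect the other, because the two groups use different name values. The second group (name="size" and its values) is made up for illustration.

```html
<!-- Group 1: buttons sharing name="gender" are mutually exclusive -->
<input type="radio" name="gender" value="male" checked> Male<br>
<input type="radio" name="gender" value="female"> Female<br>

<!-- Group 2: a different name creates a separate, independent group -->
<input type="radio" name="size" value="small"> Small<br>
<input type="radio" name="size" value="large" checked> Large<br>
```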
Checkboxes
Checkboxes are used to select one or more (or none) of several options. Each checkbox is defined with a separate <input> element using type="checkbox". For example, the following HTML code creates two checkboxes, with labels:
<input type="checkbox" name="vehicle1" value="Bike"> I have a bike<br>
<input type="checkbox" name="vehicle2" value="Car"> I have a car<br>
The result is shown in Figure 1.9.
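The checked attribute described for radio buttons (Section 1.6.13.6) also applies to checkboxes. The variation below is a small sketch in which both boxes are ticked on page load, something that is not possible within a single group of radio buttons.

```html
<!-- Unlike radio buttons, any number of checkboxes can be checked at once -->
<input type="checkbox" name="vehicle1" value="Bike" checked> I have a bike<br>
<input type="checkbox" name="vehicle2" value="Car" checked> I have a car<br>
```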
Dropdown menus
Dropdown lists, or dropdown menus, are used to select one option from a list. The list is initially hidden from view, expanding only when clicked. The list is also scrollable, so it can contain more items than fit on screen. This makes dropdown lists suitable for situations when we have a long list of options the user needs to choose from, and we do not want to “waste” page space displaying all possible options at all times (e.g., Figure 10.4).
The dropdown menu is initiated using the <select> element. Inside the <select> element, each option is defined with a separate <option> element. For example:
<select name="cars">
    <option value="volvo">Volvo</option>
    <option value="suzuki">Suzuki</option>
    <option value="fiat">Fiat</option>
    <option value="audi">Audi</option>
</select>
The result is shown in Figure 1.9.
Note that in radio buttons (Section 1.6.13.6), checkboxes (Section 1.6.13.7) and dropdown menus (Section 1.6.13.8), the value attribute identifies the currently selected option when sending the data to the server. The value does not necessarily have to be identical to the text contents we see on screen when interacting with the input element in the browser. For example, in the above HTML code the first <option> has value="volvo", which is used to identify the option when sending data to a server, while the text shown on screen is actually "Volvo" (with capital V).
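By default, the first option is the one displayed when the page loads. As a sketch of how this can be changed, the standard selected attribute (not covered in the text) can be added to a different <option>. Like checked, it takes no value.

```html
<!-- "Fiat" is displayed initially; value="fiat" is what identifies it -->
<select name="cars">
    <option value="volvo">Volvo</option>
    <option value="suzuki">Suzuki</option>
    <option value="fiat" selected>Fiat</option>
    <option value="audi">Audi</option>
</select>
```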
Buttons
A button is used to trigger actions on the page. A button can be created using the <input> element with the type="button" attribute. The value attribute is used to set the text label that appears on the button. For example, the following HTML code creates a button with the text “Click Me!” on top:
<input type="button" value="Click Me!">
The result is shown in Figure 1.9.
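The same button can also be written with the <button> element, an alternative the text does not cover. In this sketch, the label is given as the element content rather than through a value attribute, and type="button" keeps the button from submitting a form it may be placed in.

```html
<!-- Displays the same label as <input type="button" value="Click Me!"> -->
<button type="button">Click Me!</button>
```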
On their own, the input elements are not very useful. For example, interacting with the various input elements in example-01-06.html (Figure 1.9) has no effect whatsoever. To make the input elements useful, we need to capture the input element values and write code that does something with those values. In Section 4.12 we will learn how the current values of input elements can be captured and used to modify page appearance and/or contents, using JavaScript.
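As a preview of that idea, here is a minimal sketch of capturing an input value with JavaScript, which is only introduced later in the book. The id values slider and output are made up for illustration, and the code is one possible way to do it, not necessarily the approach used in Section 4.12.

```html
<!DOCTYPE html>
<html>
<head>
    <title>Capturing an input value</title>
</head>
<body>
    <input type="range" id="slider" value="5" min="0" max="100">
    <p id="output">Current value: 5</p>
    <script>
        // Look up the two elements by their (hypothetical) id values
        var slider = document.getElementById("slider");
        var output = document.getElementById("output");
        // Whenever the slider moves, copy its current value into the paragraph
        slider.addEventListener("input", function() {
            output.textContent = "Current value: " + slider.value;
        });
    </script>
</body>
</html>
```

Opening this page in the browser and dragging the slider updates the paragraph text, so the input element now has a visible effect on the page.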
id, class, and style attributes
Overview
So far we have mostly encountered specific attributes for different HTML elements. For example, the src attribute is specific to <img> (and several other) elements and the href attribute is specific to <a> (and several other) elements. All HTML elements also share three important non-specific attributes, which can appear in any element:
- id: Unique identifier
- class: Non-unique identifier
- style: Inline CSS
The following Sections 1.7.2–1.7.4 cover the purpose and usage of these three non-specific attributes.
id
The id attribute is used to uniquely identify an HTML element from other elements on the page. Its value should start with a letter or an underscore, not a number or any other character. It is important that no two elements on the same page have the same value for their id attributes; otherwise the value is no longer unique.
For example, the following page has three <p> elements with id attributes. Note that the values of the id attribute ("intro", "middle", and "summary") are different from each other and thus unique for each element.
<!DOCTYPE html>
<html>
<head>
    <title>A Minimal HTML Document</title>
</head>
<body>
    <p id="intro">The 1st paragraph is an overview.</p>
    <p id="middle">The 2nd paragraph gives more details.</p>
    <p id="summary">The 3rd paragraph is a summary.</p>
</body>
</html>
As we will see when discussing CSS (Chapter 2), giving an element a unique id allows us to style it differently than any other instance of the same element on the page. For example, we may want to assign one paragraph within the page a different color than all of the other paragraphs. When we go on to learn about JavaScript and interactive behavior (Chapter 4), we will also use id attributes to allow our scripts to uniquely affect the interactive behavior of particular elements on the page.
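As a preview of how an id can be used for styling, the sketch below adds a <style> element with a single CSS rule to the document above. CSS is only covered in Chapter 2, so this is an illustration under that assumption, not a full explanation: the selector #intro targets the element whose id is "intro", and the choice of red is arbitrary.

```html
<!DOCTYPE html>
<html>
<head>
    <title>A Minimal HTML Document</title>
    <style>
        /* Applies only to the single element with id="intro" */
        #intro {
            color: red;
        }
    </style>
</head>
<body>
    <p id="intro">The 1st paragraph is an overview.</p>
    <p id="middle">The 2nd paragraph gives more details.</p>
    <p id="summary">The 3rd paragraph is a summary.</p>
</body>
</html>
```

Only the first paragraph is displayed in red; the other two keep the default text color.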
class
Every HTML element can also carry a class attribute. Sometimes, rather than uniquely identify one element within a document using an id, we will want to identify a group of elements as being different from all other elements on the page. For example, we may have some paragraphs of text that contain information that is more important than others and want to distinguish these elements, or differentiate between links that point to other pages on our own site and links that point to external sites.
To mark multiple elements as belonging to one group we can use the class attribute. The value of the class attribute identifies the group those elements belong to. For example, in the following HTML document, the first and third <p> elements share the class attribute value of "important".
<!DOCTYPE html>
<html>
<head>
    <title>A Minimal HTML Document</title>
</head>
<body>
    <p class="important">The 1st paragraph is an overview.</p>
    <p>The 2nd paragraph gives more details.</p>
    <p class="important">The 3rd paragraph is a summary.</p>
</body>
</html>
Just like an id, the class attribute is commonly used for styling, or interacting with, a group of elements on the page.
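As a preview of styling by class, the following sketch adds one CSS rule to the document above. CSS is only covered in Chapter 2, so this is an illustration under that assumption: the selector .important targets every element whose class is "important", and bold text is an arbitrary choice.

```html
<!DOCTYPE html>
<html>
<head>
    <title>A Minimal HTML Document</title>
    <style>
        /* Applies to every element with class="important" */
        .important {
            font-weight: bold;
        }
    </style>
</head>
<body>
    <p class="important">The 1st paragraph is an overview.</p>
    <p>The 2nd paragraph gives more details.</p>
    <p class="important">The 3rd paragraph is a summary.</p>
</body>
</html>
```

Here the first and third paragraphs are displayed in bold, while the second paragraph is unaffected.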
style
All elements may also have a style attribute, which allows inline CSS rules to be specified within the element start tag. We will talk about inline CSS in Section 2.7.2.
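Until then, a minimal sketch of what the style attribute looks like may be helpful. The CSS rule used here (color: red;) is an arbitrary choice for illustration.

```html
<!-- Inline CSS: the rule applies to this one paragraph only -->
<p style="color: red;">The 1st paragraph is an overview.</p>
```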
Code layout
When writing code, it is useful to keep a uniform code layout. For example, we can use indentation to distinguish content that is inside another element, thus highlighting the hierarchical structure of code.
The following two HTML documents are the same as far as the computer is concerned, i.e., they are displayed exactly the same way in the browser. However, the second HTML document is much more readable to humans, for two reasons:
- Each element starts on a new line.
- Internal elements are indented with tabs.
<!DOCTYPE html><html><head><title>A Minimal HTML Document</title></head>
<body><p>The content goes here!</p></body></html>

<!DOCTYPE html>
<html>
<head>
    <title>A Minimal HTML Document</title>
</head>
<body>
    <p>The content goes here!</p>
</body>
</html>
Inspecting elements
When looking at the HTML code of a simple web page, such as the ones we created in this chapter, it is easy to locate the HTML element responsible for creating a given visual element we see on screen. However, as the HTML code becomes longer and more complex, it may be more difficult to make this association.
Luckily, browsers have a built-in feature for locating HTML code associated with any element you see on screen. For example:
- Open the example file named example-01-01.html in Chrome.
- Press Ctrl+Shift+I or F12.
The screen should now be split. The left pane still shows the web page. The right pane shows the developer tools. The developer tools are a set of web authoring and debugging tools built into modern web browsers, including Chrome. The developer tools provide web developers access into the internals of the browser and the web page being displayed.
- Press Ctrl+Shift+C.
This toggles the Inspect Element mode. (It also opens the developer tools in the Inspect Element mode if they are not already open.) In the Inspect Element mode, you can hover above different parts of the page (left pane) with the mouse pointer. The relevant elements are highlighted, and their name is shown (Figure 1.10). Clicking on an element highlights the relevant part of the page source code and scrolls it into view. This also works in the opposite direction: hovering over the code in the right pane highlights the respective visual element in the left pane.
FIGURE 1.10: Using the Inspect Element tool in Chrome
Remember how we mentioned that every (block-level) HTML element can be thought of as a horizontal box, where (by default) height is determined by amount of content and width is set to maximum of browser width (Section 1.5.3)? This becomes evident when the Inspect Element tool highlights those boxes (Figure 1.10).
Exercise
- Edit the minimal HTML document example-01-01.html to experiment with the HTML element types we learned in this chapter:
    - Modify the title of the page and the first-level heading.
    - Delete the existing paragraph and add a new paragraph with two to three sentences about a subject you are interested in.
    - Use the appropriate tags to format some of the words in italic or bold font.
    - Use the <a> tag to add a link to another web page.
    - Add a list with two levels, i.e., a list where each list item is also a list.
    - Add images which are loaded from another location on the internet, such as from Flickr.
References
Murrell, Paul. 2009. Introduction to Data Technologies. Boca Raton, FL, USA: Chapman & Hall/CRC.