Introduction to Web and Emergence of HTML

The internet, woven into the fabric of our lives, has a fascinating history of resilience and collaboration spanning several decades. While its birth is often dated to 1969, its seeds were sown earlier.

In 1962, J.C.R. Licklider proposed a “Galactic Network” of interconnected computers. In 1969, the US Department of Defense's Advanced Research Projects Agency (ARPA) launched ARPANET, which initially connected computers at just four sites using packet-switching technology. That year the first message was sent between two of those computers, an event often described as the birth of the internet.

In the years that followed, ARPANET expanded to connect more universities and research institutions, and in 1971 Ray Tomlinson developed networked email on it. This extended the network's capabilities and fostered communication.

As more networks appeared, the need for a common language arose. Transmission Control Protocol and Internet Protocol (TCP/IP), developed during the 1970s, became ARPANET's standard in 1983 and unified communication across diverse networks. The 1990s marked a significant turning point: Tim Berners-Lee's World Wide Web, made publicly available in 1991, introduced HTML, URLs and HTTP, making information accessible through a user-friendly interface. The release of the Mosaic browser in 1993 popularised a graphical interface for the web and spurred widespread internet usage.

The dot-com bubble was a period in the late 1990s and early 2000s when there was a rapid increase in the value of internet-based companies, often referred to as “dot-coms” due to their reliance on ".com" domain names. During this time, many investors were excited about the potential of the internet and poured a lot of money into these companies, causing their stock prices to soar to incredibly high levels.

However, much of this excitement and investment was based on speculation rather than solid business fundamentals. Many of these companies were not making profits or were not sustainable in the long term. Eventually, the market became saturated, and investors began to realise that some of these companies were overvalued. This led to a massive sell-off of stocks, causing their prices to plummet. As a result, numerous internet-based companies went out of business, stock values declined dramatically, and many investors lost substantial amounts of money.

In the 2000s, there were big improvements in internet technology. One important development was broadband internet, which made internet connections much faster for people. This led to the rise of what we call 'Web 2.0.' It was a time when the internet became more about people sharing their own content online - like posting pictures, writing blogs, making videos, and connecting with others on social media. This era was all about users creating content and having more interactive and fun experiences on the internet.

In the 2010s, lots of people started using smartphones, which made it easier for them to access the internet from anywhere. This brought big changes because now people could do more things online, like storing their files on the internet (cloud computing), using social media to connect with others, buying things online (e-commerce), and watching movies or shows online (streaming services). These changes made it much easier for everyone to get information and accomplish tasks using their phones.

Today, the internet continues to evolve with technologies like the Internet of Things (IoT), Artificial Intelligence (AI), and 5G networks. It has become an indispensable part of modern life, connecting billions of people worldwide and shaping communication, business, education, and societal interactions.

Advancements in web technologies, such as HTML5 and CSS3, led to new interactive and responsive websites and web applications. Throughout its history, the web has undergone significant developments, from its initial creation as a means to share scientific information to becoming an integral part of everyday life, revolutionising communication, commerce, and information access on a global scale.

The ABCs of HTML

HTML stands for HyperText Markup Language and is the standard language used to create web pages. Its inception is usually dated to the early 1990s, alongside the first web browser and web server, but its roots go back to 1989, when Tim Berners-Lee, a British computer scientist, proposed a system called 'Mesh' as a way to link and access documents over the internet.

In 1995, HTML 2.0 was published by the Internet Engineering Task Force (IETF), and it included features like forms. This version laid the groundwork for many elements still used in modern HTML. In 1997, HTML 3.2 followed with more features, such as support for tables, applets, and text flow around images.

In December 1999, the World Wide Web Consortium (W3C) published HTML 4.01, followed in January 2000 by XHTML 1.0, which emphasised a stricter and more standardised form of HTML. XHTML, short for eXtensible HyperText Markup Language, reformulated HTML 4.01 using the stricter syntax rules of XML, a general-purpose language for structuring and organising data. Its main goal was to make the code behind web pages cleaner and more consistent, and to let it work better with other XML-based languages.

In 2014, HTML5, a major revision of HTML, became a W3C Recommendation. It brought a wide range of new features and improvements, including better handling of content structure and native support for audio, video, and interactive elements.

Following HTML5, the focus has been on continuous updates and improvements to HTML through the WHATWG (Web Hypertext Application Technology Working Group) and W3C. The evolution continues to address new technologies, accessibility, and the ever-changing needs of web developers and users.

HTML has significantly evolved over time to accommodate the growing demands of the internet. Its continual development ensures it remains the backbone of web content creation, providing a standardised way to structure and present information on the World Wide Web.

The Fundamental Trio: Web Pages, Web Servers, and Web Browsers

The web relies on three essential components: web pages, web servers, and web browsers. Let’s look at how they work together to bring the web to life.

1. Web Pages: Picture them as digital documents containing text, images, videos, hyperlinks, and interactive elements accessed over the internet. They are written in languages such as HTML, CSS, and JavaScript, which instruct browsers on how to present the content. Every web page has a unique address known as a URL, which points to its location on a web server.

2. Web Servers: These robust computers store web pages and deliver them to devices requesting data, effectively hosting websites. When a user enters a URL, the server retrieves the corresponding web page and transmits it back to the user's device. Running around the clock, servers manage numerous requests concurrently, housing databases, executing scripts, and ensuring seamless content delivery.

3. Web Browsers: These software applications, like Chrome, Firefox, or Safari, installed on devices, facilitate access to web pages. They decipher the code of web pages, visually presenting them on screens. Browsers liaise with web servers, sending requests for specific pages and receiving corresponding data. They convert HTML, CSS, and JavaScript into the vibrant, interactive interfaces we encounter online.
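To make the idea of a URL more concrete, here is a minimal sketch in Python that splits an address into its parts using the standard library's urllib.parse module. The address itself (www.example.com, the port, the path) is invented purely for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the pieces a browser uses to locate a page."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol to speak, e.g. https
        "host": parts.hostname,      # which server to contact
        "port": parts.port,          # optional port on that server
        "path": parts.path,          # which page on the server
        "query": parts.query,        # extra parameters for the page
        "fragment": parts.fragment,  # position inside the page
    }


# Hypothetical address, used only as an example.
info = describe_url("https://www.example.com:8443/news/article.html?id=7#comments")
for name, value in info.items():
    print(f"{name}: {value}")
```

The host tells the browser which web server to contact, and the path tells that server which page is being asked for.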

Their Interaction:

i. Imagine you want to read a news article online. You enter the URL into your browser (the request).

ii. The browser dispatches the request to the web server linked with the URL.

iii. The web server pinpoints the requested web page and dispatches it back to your browser.

iv. The browser interprets the code and exhibits the article on your screen, complete with text, images, and potentially interactive elements.
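The four steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a toy server built with the standard library's http.server runs on the local machine, the page content and the /article path are made up, and a plain HTTP client stands in for the browser (it fetches the page but does not render it).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A made-up web page that the toy server hands out for every request.
PAGE = b"<!DOCTYPE html><html><body><h1>Headline</h1><p>Story text</p></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Step iii: the server locates the page and sends it back.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/article"):
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Steps i and ii: the client sends a request for the given path.
        client = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        client.request("GET", path)
        response = client.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
print(status, content_type)
print(body.decode())
```

A real browser would carry out step iv at this point, turning the returned HTML into the visible article.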

While web pages, web servers, and browsers form the core, other vital components like routers, databases, and security protocols play crucial roles in ensuring the smooth functioning of the internet.

Types of Web Servers:

Just like chefs specialise in different cuisines, web servers come in various flavors, each with its own strengths and weaknesses. Here are some of the most popular ones:

Apache HTTP Server: The Pioneer of Web Servers. Apache is an open-source web server and one of the longest-established. For many years it was the most widely used web server in the world, and it built a reputation for stability, versatility, and reliability over decades of web development. Like a seasoned chef adept at serving up anything from simple blogs to complex e-commerce sites, Apache handles a diverse array of requests with ease.
Nginx: The Speedy Web Server. Nginx is a modern, lightweight web server known for its speed and efficiency. Thanks to its event-driven design, it uses fewer resources than traditional servers while performing well under high traffic loads. This makes it a common choice for busy websites like online newspapers and social media platforms. Nginx works swiftly and precisely, acting as the sous chef that keeps up with a flood of requests without compromising quality. It is also often used as a reverse proxy server.

Let's take a little dive into what a reverse proxy server is:

A reverse proxy server is a type of server that sits between a client (like a web browser) and one or more backend servers. A typical (forward) proxy acts on behalf of clients, passing their requests out to whatever servers they want to reach. A reverse proxy acts on behalf of servers: it accepts requests from clients and forwards them to the appropriate backend server. It then takes the response from that server and sends it back to the client.

Here's a breakdown in simpler terms:

Imagine you're in a restaurant. You place an order with the waiter (the reverse proxy) who takes your request and relays it to the kitchen (the backend server). The kitchen prepares your order, gives it back to the waiter, and then the waiter serves your meal to you.

In the case of a reverse proxy server:

Client Request: A user sends a request (like accessing a website) to the reverse proxy server.
Reverse Proxy: The reverse proxy server takes that request and forwards it to the appropriate server (like a web server, application server, or another backend service) that can fulfill the request.
Backend Server: The backend server processes the request, retrieves the necessary information or data, and sends it back to the reverse proxy.
Reverse Proxy to Client: Finally, the reverse proxy server sends the response it received from the backend server back to the client that made the initial request.

Reverse proxies offer several benefits:

They can help distribute incoming traffic among multiple servers, improving performance and reliability.
They provide an extra layer of security by shielding backend servers from direct exposure to the internet.
They can cache content, serving it directly to clients without repeatedly fetching it from the backend servers, which speeds up responses for frequently accessed content.

Overall, a reverse proxy server acts as an intermediary between clients and servers, managing and optimising communication between them.
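The flow described above can be sketched with Python's standard library. This is a toy under stated assumptions, not how Nginx or any production proxy is implemented: the two backends (named kitchen-A and kitchen-B here, after the restaurant analogy) and the simple take-turns rule for choosing between them are invented for illustration, and only GET requests are handled.

```python
import http.client
import itertools
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_backend(name):
    """Build a handler for a hypothetical backend that reports its own name."""
    class Backend(BaseHTTPRequestHandler):
        def do_GET(self):
            body = f"{name} served {self.path}".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass
    return Backend


def make_proxy(backend_ports):
    """Build a handler that forwards each request to the next backend in turn."""
    rotation = itertools.cycle(backend_ports)
    lock = threading.Lock()

    class Proxy(BaseHTTPRequestHandler):
        def do_GET(self):
            with lock:
                port = next(rotation)  # pick a backend (simple round robin)
            upstream = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            try:
                upstream.request("GET", self.path)  # forward the client's request
                reply = upstream.getresponse()
                status = reply.status
                content_type = reply.getheader("Content-Type", "application/octet-stream")
                body = reply.read()
            finally:
                upstream.close()
            # Relay the backend's response to the original client.
            self.send_response(status)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass
    return Proxy


def start(handler):
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    backends = [start(make_backend("kitchen-A")), start(make_backend("kitchen-B"))]
    proxy = start(make_proxy([b.server_address[1] for b in backends]))
    replies = []
    try:
        for _ in range(4):
            # The client only ever talks to the proxy's address.
            client = http.client.HTTPConnection("127.0.0.1", proxy.server_address[1], timeout=5)
            client.request("GET", "/menu")
            replies.append(client.getresponse().read().decode())
            client.close()
    finally:
        for server in [proxy, *backends]:
            server.shutdown()
            server.server_close()
    return replies


replies = demo()
for line in replies:
    print(line)
```

The client never learns the backend addresses, which is the shielding benefit mentioned above, and alternating between the two backends is a very small example of distributing traffic.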

Microsoft Internet Information Services (IIS): The Windows Web Server. IIS is a web server tailored for Windows environments. Its seamless integration with Microsoft products and its security features make it a go-to choice for Windows users. Like a restaurant that perfects one specialty cuisine, IIS is optimised to serve Windows-based websites and applications, bringing Windows-centric tools and safeguards together in one complete web solution.
LiteSpeed Web Server: The High-Performance Web Server. LiteSpeed is a commercial web server built for speed under heavy load. It is known for handling complex workloads efficiently, and it aims to deliver smooth, fast responses at scale, even when serving demanding web applications.
Apache Tomcat: The Java Application Server. Tomcat is an open-source web server designed specifically for Java-based web applications. It provides a standard Java servlet container, making it a popular choice for hosting applications built with Java technologies such as servlets and JSPs. Tomcat compiles JSPs into servlets, which then run as ordinary Java classes. This tight coupling with Java lends itself to scalability, high performance, and cross-platform flexibility, and makes Tomcat a natural fit for Java-centric development environments.
Google Web Server (GWS): The Custom-Tuned Web Server. GWS is a proprietary web server developed in-house by Google to power its own services. Unlike typical servers, GWS is customised and optimised specifically for Google's infrastructure and usage patterns, and it scales to handle Google's massive traffic across products like Search, Gmail and Maps. Its exact technical details are not publicly known, since it is for Google's internal use, but its focus on high performance and efficiency under heavy load has been key to supporting Google's growth. GWS exemplifies a web server tailored end-to-end for one environment and shows the benefits of specialisation and control in web serving.
Node.js: The JavaScript-Powered Server. Node.js is not a conventional web server but a JavaScript runtime environment that enables server-side JavaScript execution, so fast and scalable web servers can be written in JavaScript. It uses an event-driven, non-blocking I/O model suited to data-intensive real-time applications. Using JavaScript on both the client and the server spares developers from switching between languages. Its single-threaded event loop handles concurrent requests without blocking on I/O, which makes Node.js well suited to I/O-bound applications such as web servers, where much of the time is spent waiting. With its lightweight architecture and JavaScript foundation, Node.js brings speed, simplicity and uniformity to the server side of modern web applications.
Caddy: The Beginner-Friendly Web Server. Caddy is a modern open-source web server that simplifies web serving through automation and sensible defaults. Unlike traditional servers that require manual configuration, Caddy aims to be useful out of the box. It handles tedious tasks like enabling HTTPS encryption automatically, allowing users to get started quickly, and its defaults cover common use cases such as static file serving and reverse proxying. Underneath its simple exterior, Caddy is written in Go and implements the latest web standards and best practices, and its modular architecture makes it extensible and flexible. With its focus on usability, Caddy lowers the barrier to entry for web serving while still delivering robust functionality and performance.
OpenLiteSpeed: The Open-Source LiteSpeed. OpenLiteSpeed is an open-source web server developed by LiteSpeed Technologies as a free alternative to their commercial LiteSpeed Web Server. It retains LiteSpeed's core advantages, like an event-driven architecture, scalability and modular design, while being available under an open-source licence at no cost. OpenLiteSpeed provides enterprise-level performance and features like HTTP/2 support, URL rewriting, clustering, etc. without the licensing fees. Its lightweight event-driven model allows it to handle thousands of concurrent connections with a low memory footprint. For organisations wanting high performance without vendor lock-in or cost, OpenLiteSpeed is an appealing open-source web serving solution.
Cherokee: The User-Friendly Web Server. Cherokee is an open-source, cross-platform web server designed with ease of use in mind. It stands out for its graphical configuration system, which provides a clean interface for changing settings without manually editing configuration files, along with a built-in administration panel for overseeing servers and sites visually. Under the hood, Cherokee is built for scalability and flexibility, with support for technologies like SSL, virtual hosting and URL rewriting, and its event-driven architecture keeps it lightweight and fast. For administrators looking for a user-friendly web server, Cherokee's graphical approach makes web serving more accessible.
Lighttpd: The Lightweight Speedster. lighttpd is an open-source web server designed and optimised for high-performance serving of static and dynamic content. It stands out for its speed, low resource usage and minimalist architecture. It is one of the lightest web servers available, with a memory footprint of around 2-3MB, and it is capable of handling thousands of concurrent connections. This makes it well suited to serving large volumes of static resources like HTML, images and video, where throughput matters. It also supports dynamic languages like PHP and can be extended via plugins, and its modular nature allows building a customised server with only the features needed. For applications where efficiency, scalability and speed on constrained resources are critical, lighttpd is an appealing lightweight solution.
Gunicorn: The Python Web Server. Gunicorn is a popular open-source Python WSGI HTTP server used for deploying Python web applications and frameworks. It allows Python apps built with frameworks like Django and Flask to be served in production. Gunicorn uses a pre-fork worker model, with optional asynchronous worker types, to handle concurrent requests under heavy load. It also supports automatic reloading of code for easier development, and its configuration options provide flexibility to tune performance for different deployment needs. With native Python functionality and tight integration with frameworks, Gunicorn makes it easy to take Python apps from development to production while avoiding compatibility issues. For Python developers looking to operationalise web projects, Gunicorn is a web serving solution tailored for Python environments.
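To give a feel for what a server like Gunicorn actually runs, here is a minimal sketch of a WSGI application in Python. The function name app, the greeting text and the helper call_app are assumptions made for this example; call_app simply invokes the application directly, using the standard library's wsgiref helpers, so the sketch can be tried without installing any server.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny WSGI application: one callable that answers every request."""
    body = f"<h1>Hello from {environ['PATH_INFO']}</h1>".encode()
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Call the application the way a WSGI server would, without a network."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a plausible fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app("/blog")
print(status)
print(body.decode())
```

If this were saved in a file called hello.py (a hypothetical name), running gunicorn hello:app would serve the same callable over HTTP, since Gunicorn locates applications by module name and callable name.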

Before concluding this article, let's take a moment to explore a few essential HTML tags; an introduction to HTML would be incomplete without them.

HTML uses markup to annotate various types of content for presentation within a web browser. Most people use the terms "element" and "tag" interchangeably, but the two are not quite the same:

In HTML, an "element" refers to a specific part of a webpage, like a paragraph or an image. These elements are enclosed or set off from the rest of the text by what are called "tags." Tags consist of the element's name enclosed by "<" and ">". For example, if you have a paragraph, it might be surrounded by <p> and </p> to indicate the start and end of the paragraph.

The important thing to note about these tags is that they are not case-sensitive. This means you can write them in uppercase (like <TITLE>), lowercase (like <title>), or even a mix of both (like <Title>). However, it's generally recommended and considered good practice to write tags in lowercase for consistency and easier readability.

Therefore, in the context of HTML, a "tag" is the piece of markup (the element's name wrapped in angle brackets) that marks where an element starts or ends, while an "element" is the complete unit: the start tag, the content, and the end tag together.
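The distinction between tags and elements, and the fact that tag names are not case-sensitive, can be seen with a minimal sketch using Python's built-in html.parser module. The class name TagLogger and the helper tag_events are invented for this example. The parser reports the start tag, the text content and the end tag as separate events, and it converts tag names to lowercase as it goes.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each start tag, piece of text and end tag in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Three spellings of the same paragraph element.
for sample in ("<p>Hello</p>", "<P>Hello</P>", "<P>Hello</p>"):
    print(sample, tag_events(sample))
```

All three spellings produce the same sequence of events: one start tag, one piece of text and one end tag, which together make up a single p element.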

H1 Tag:

The H1 tag in HTML denotes the primary heading for a webpage, summarising the core topic covered. It functions as a prominent title that conveys to visitors and search engines what the page is fundamentally about. The content within the starting <h1> and closing </h1> tags should describe the main subject in a concise and descriptive phrase. Proper use of the H1 tag is essential for accessibility, SEO and clean structure. It establishes the hierarchical relationship for other headings on the page, with H2 tags for subsections nested under the key focus outlined in H1. Just as the top headline captures the essence of a news article, the H1 tag encapsulates the predominant theme of a webpage in an eye-catching header at the top.

H2 Tag:

The H2 tag in HTML functions like a subheading, providing a title that is smaller than the main H1 heading. It helps segment web content into logical sections and subsections related to the primary focus denoted by the H1. Much like subheadings divide up sections in a document, H2 tags break up blocks of text and information on a webpage into more manageable and hierarchical chunks. Using H2 allows web authors to establish relationships between headings, guiding readers through the topics and themes covered on the page in an organised manner. The visual demarcation from H1 also makes it easy to scan for key subsections at a glance. So in essence, the H2 tag enables clearer structure and information architecture in webpages through labeled subdivisions under the main subject.
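As a rough illustration of how headings give a page its structure, the sketch below uses Python's built-in html.parser to pull the heading outline out of a small page. The sample markup and the names OutlineParser and heading_outline are made up for this example, and it assumes headings are not nested inside one another.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for every heading in a document."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(markup):
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.outline


# Invented sample page: one main heading with two subsections.
SAMPLE_PAGE = (
    "<h1>Types of Web Servers</h1><p>Intro text</p>"
    "<h2>Apache</h2><p>Details</p>"
    "<h2>Nginx</h2><p>Details</p>"
)

for level, text in heading_outline(SAMPLE_PAGE):
    print("  " * (level - 1) + text)
```

Printing each heading indented by its level gives a quick table of contents, which is roughly the structure that screen readers and search engines derive from H1 and H2 tags.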

P Tag:

The paragraph tag <p> in HTML indicates a block of text as a distinct paragraph. It demarcates sentences into separate chunks without indenting the first line. When web browsers render the <p> tag, the enclosed text will display as an individual paragraph unit with spacing above and below it to set apart from adjacent content. The <p> tag can be utilised within any section on a page to break up long portions of text into more readable segments. This improves scannability for users by organising content into logical blocks rather than a continuous stream. Proper paragraph structure also enhances accessibility for screen readers navigating page content. So the HTML paragraph tag enables easier consumption of text by segmenting it into localised blocks with visual whitespace separation.

For a <p> tag, the start tag is required. The end tag may be omitted if the <p> element is immediately followed by an <address>, <article>, <aside>, <blockquote>, <details>, <div>, <dl>, <fieldset>, <figcaption>, <figure>, <footer>, <form>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <header>, <hgroup>, <hr>, <main>, <menu>, <nav>, <ol>, <pre>, <search>, <section>, <table>, <ul> or another <p> element, or if there is no more content in the parent element and the parent element is not an <a>, <audio>, <del>, <ins>, <map>, <noscript> or <video> element, or an autonomous custom element.

IMG Tag:

The <img> tag in HTML embeds an image file onto a web page. At minimum, it requires the "src" attribute containing the image source path and "alt" for alternative text. The alt text provides a description of the image when it fails to load or for accessibility by screen readers.

The img tag also supports attributes like crossorigin, decoding, sizes and srcset. The crossorigin attribute controls whether an image from another domain is fetched using CORS. The decoding attribute hints to the browser whether to decode the image synchronously or asynchronously. The sizes and srcset attributes make images responsive by letting the browser choose among different resolutions based on screen pixel density and viewport size.

Additional attributes allow granular control over performance, security and rendering when embedding images. Using the core src and alt attributes ensures basic image insertion, while other attributes enhance optimisation, accessibility and responsiveness. The flexibility of the img tag caters to a wide range of use cases for presenting images effectively on sites.
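One practical use of these attributes is checking a page for images that lack alternative text. The sketch below does this with Python's built-in html.parser; the file names in the sample markup and the names ImageAudit, audit_images and missing_alt are assumptions made for this example.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Note the src of every img tag and which key attributes it carries."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)  # attribute names arrive in lowercase
            self.images.append({
                "src": attributes.get("src"),
                "has_alt": "alt" in attributes,
                "responsive": "srcset" in attributes,
            })


def audit_images(markup):
    parser = ImageAudit()
    parser.feed(markup)
    parser.close()
    return parser.images


def missing_alt(markup):
    """Return the src of each image that has no alt attribute at all."""
    return [image["src"] for image in audit_images(markup) if not image["has_alt"]]


# Invented sample markup with three images.
SAMPLE = (
    '<img src="logo.png" alt="Company logo">'
    '<img src="chart.png">'
    '<img src="photo.jpg" alt="" '
    'srcset="photo-480.jpg 480w, photo-960.jpg 960w" sizes="100vw">'
)

print(missing_alt(SAMPLE))
```

Note that an empty alt attribute, as on the third image, is counted as present here; an empty value is the usual way to mark an image as purely decorative.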

Lorem:

The Lorem Ipsum placeholder text commonly used today derives from a passage by the ancient Roman philosopher Cicero. This scrambled, nonsensical Latin text is said to have been used by the publishing industry as filler content during layout design since the 1500s.

Over centuries of use, Lorem Ipsum became the standard dummy text for visual mockups across publishing and graphic design. Its utility comes from the repetitive and meaningless nature of the passage. The unfamiliar Latin words allow designers to focus on stylistic elements like fonts, spacing and imagery without meaningful text distracting from the layout itself. The seamless flow of Lorem Ipsum also demonstrates how real prose will appear without conveying comprehensible information too early in the design process. This allows the final readable content to be inserted later without disrupting visual continuity. In essence, the longevity of Lorem Ipsum stems from its role as innocuous placeholder text that recedes into the background when required.

In tracing the progression from the early web to today's intricate digital realm, we witness the interplay between evolving technologies and their real-world impact. The malleable foundations of HTML allowed the web to transform from a basic academic network into a multifaceted platform enriching society. As HTML expanded to handle more diverse needs, web servers kept pace, growing in scope and sophistication. The nuanced roles of HTML elements and attributes further reflect the web's adaptability in enabling personalised experiences. While the specifics continue changing, the dynamic spirit of innovation persists. The web's capacity to reshape itself points to an exciting future where emerging advancements coalesce into new phases of human connection, understanding and achievement. Just as the pioneering technologies of yesterday shaped modern life, the digital frameworks of today carry on that legacy by laying the groundwork for tomorrow's possibilities.
