While our experience touring the World Wide Web may seem somewhat uniform and seamless, we are in fact interacting with a complex organism from our browsers. Here we will drill down through the facade of this experience to reveal the different layers of technology so as to understand what kind of animal we are dealing with.
The Web Browser
Perhaps you know it by the name ‘Internet Explorer’ on ‘Windows’, ‘Safari’ on a ‘Macintosh’, or Mozilla running on a ‘Linux’ computer. Regardless, what you are really seeing is a computer program which renders a ‘basic’ form of computer code known as HTML (not the only language used, but the most ubiquitous, and the parent code of any others carried within it). The browser renders graphics and text so that the page appears in a manner consistent with the wishes of the people who designed it. Consequently, the act of building web pages depends heavily on web browsers to preview the final effect of the code that goes into them.
The Internet Service Provider, or ISP
An ISP is a company which provides internet service to a home or business, most commonly by cable or DSL (over phone lines). The ISP commonly provides e-mail service to the user as well. Beyond that, the ISP does nothing more than provide a gateway onto the internet so that the end user may interact with the numerous ‘hosts’, in the way that a telephone company provides a gateway into the international telephone networks.
What is a Server?
A server, in this case a ‘web server’, is simply a computer located somewhere in the world that has a specific address. When you click on a link or type in a domain name, a request is sent to that specific machine in order to retrieve information. In most cases this consists of a single HTML file, which may itself refer to additional content (including bitmapped ‘pictures’ and other files which are needed to display the web page as intended). The server’s job is to handle all the requests to view the pages on a website as quickly and reliably as possible. A single server may ‘host’ many websites, in which case it is called a ‘shared’ server; for low traffic websites, this is a very economical solution. Websites that have a lot of users, the top 1% or so of websites in the world, will use either a dedicated server or, more likely, a bank of servers. Requests are distributed among the servers in the bank in order to effectively ‘serve’ the massive number of users who wish to view and interact with the content placed on them.
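To make the idea of a bank of servers more concrete, here is a minimal sketch in Python of one simple way requests could be shared out: each server takes the next request in turn. The turn-taking scheme, the server names and the function name are all assumptions made for illustration; real sites may distribute their traffic quite differently.

```python
from itertools import cycle

# Hypothetical server names; a real site would use actual machine addresses.
SERVERS = ["server-a", "server-b", "server-c"]


def assign_requests(requests, servers):
    """Pair each incoming request with a server, taking the servers in turn."""
    rotation = cycle(servers)  # endlessly repeats a, b, c, a, b, c, ...
    return [(request, next(rotation)) for request in requests]


if __name__ == "__main__":
    pages = ["/index.html", "/about.html", "/contact.html", "/index.html"]
    for page, server in assign_requests(pages, SERVERS):
        print(f"{page} -> {server}")
```

With three servers, the fourth request goes back to the first server, so no single machine has to answer everyone.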
The World Wide Web
The ‘web’ itself is the sum total of all these servers, all connected through various gateways and subnetworks. When you type in a URL (Uniform Resource Locator) or website name (for example ‘www.yahoo.com’ or ‘www.ebay.com’), the request goes out through your ISP to a domain name server as what’s called a ‘DNS’ (Domain Name System) inquiry. This database then matches the name to an ‘IP’ (Internet Protocol) number, which identifies the appropriate location. This number is very much like a phone number, and it is the way that host machines find each other on the web. The website names or URLs we type into our web browsers are more like aliases, a human interface that we prefer because they are easier to remember than IP numbers, which in the most common (IPv4) form consist of four numbers separated by dots, up to 12 digits in all.
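The name-to-number lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table below is a hypothetical stand-in for the real DNS database, the addresses are taken from ranges reserved for documentation, and the function names are our own.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit

# Hypothetical stand-in for the DNS database. The addresses come from ranges
# reserved for documentation, not from any real website.
NAME_TABLE = {
    "www.example.com": "192.0.2.10",
    "www.example.org": "198.51.100.7",
}


def host_from_url(url):
    """Extract the website name (host) from a full URL."""
    return urlsplit(url).hostname


def resolve(hostname, table=NAME_TABLE):
    """Look the name up and return its IP address, or None if it is unknown."""
    address = table.get(hostname)
    return ip_address(address) if address is not None else None


if __name__ == "__main__":
    host = host_from_url("http://www.example.com/index.html")
    print(host, "->", resolve(host))
```

On a computer connected to the internet, Python's standard `socket.gethostbyname` function performs a real lookup of this kind.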
We’ve already established that websites reside as a collection of files on a server, and that this may be together with other websites or else across multiple servers in the case of large and popular sites. The HTML code (HTML stands for ‘HyperText Markup Language’) that represents a viewed web page is normally a fairly small file. This file is downloaded onto your computer and then rendered as a web page by the browser, which also fetches any other files the page needs, such as ‘inline’ images (photos, animations, display text or other graphic elements). Contrary to what one might think, the page and its contents are not actually being viewed ‘remotely’, since they reside on your computer in something called a ‘cache’, which is a collection of temporary files.
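As a rough illustration of how a browser discovers the extra files a page needs, the following Python sketch scans a small piece of HTML for inline images and works out the full address of each one. The sample page, the ‘example.com’ address and the names `ImageFinder` and `find_images` are assumptions made for this example; a real browser does a great deal more.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageFinder(HTMLParser):
    """Collect the addresses of inline images referenced by a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Relative addresses are resolved against the page's own URL.
                self.images.append(urljoin(self.page_url, src))


def find_images(html, page_url):
    """Return the full address of every image the page refers to."""
    finder = ImageFinder(page_url)
    finder.feed(html)
    return finder.images


# A made-up page used only for this illustration.
SAMPLE_PAGE = """
<html><body>
  <h1>Welcome</h1>
  <img src="logo.gif">
  <img src="/photos/team.jpg">
</body></html>
"""

if __name__ == "__main__":
    for address in find_images(SAMPLE_PAGE, "http://www.example.com/about/index.html"):
        print(address)
```

Each address found this way would then be requested from the server in turn, just like the page itself.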
Every time the user clicks through to a new page, content unique to that page is loaded, while common files are displayed from the already loaded cache. Only pages which are directly requested are loaded, since this is far more efficient than downloading content from the website which may not be needed in a brief visit.
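The saving a cache provides can be shown with a small Python sketch. It assumes a simplified cache that keeps everything it has downloaded and never expires anything; the class name and the `fake_download` function are hypothetical stand-ins for the browser's real machinery.

```python
class PageCache:
    """Keep downloaded files so that each one is fetched only once."""

    def __init__(self, download):
        self.download = download  # function that fetches a file by address
        self.files = {}           # address -> contents already downloaded
        self.downloads = 0        # how many real downloads were needed

    def get(self, address):
        if address not in self.files:
            # Not seen before: download it and remember the result.
            self.files[address] = self.download(address)
            self.downloads += 1
        return self.files[address]


def fake_download(address):
    """Hypothetical stand-in for a real network download."""
    return f"<contents of {address}>"


if __name__ == "__main__":
    cache = PageCache(fake_download)
    # Two pages that share the same logo image.
    for address in ["/index.html", "/logo.gif", "/about.html", "/logo.gif"]:
        cache.get(address)
    print("files requested: 4, files downloaded:", cache.downloads)
```

The logo is shared by both pages, so only three of the four requests result in a download.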
Because of the need to occasionally submit ‘secure’ information such as a credit card number or personal details, it was necessary to set up a way to transmit such sensitive information without compromise. The most common way of doing this is by making use of what’s called a ‘signed certificate’. The certificate is a file, signed by a trusted third party, which identifies the website and contains its public encryption key; the matching private key never leaves the server. The browser uses the certificate to set up an encrypted connection, through which personal information is kept secure. Any interception of the transmission will yield only a meaningless digital cipher whose key is inaccessible to the eavesdropper. Web pages which employ such signed certificates use a secure variation of the HTTP protocol called ‘HTTP Secure’, designated by the name ‘HTTPS’; the URLs of all such pages start with ‘https’ instead of ‘http’.
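For the curious, here is a small Python sketch of the two checks involved: whether an address asks for the secure protocol, and whether the program making the connection insists on a valid certificate. It uses Python's standard `ssl` module; the function names and the ‘example.com’ addresses are assumptions for illustration, and nothing is actually sent over the network.

```python
import ssl
from urllib.parse import urlsplit


def is_secure_url(url):
    """Return True when the URL asks for the secure protocol."""
    return urlsplit(url).scheme == "https"


def make_client_context():
    """Build connection settings that insist on a valid, matching certificate."""
    return ssl.create_default_context()


if __name__ == "__main__":
    print(is_secure_url("https://www.example.com/checkout"))
    print(is_secure_url("http://www.example.com/"))

    context = make_client_context()
    # The default settings refuse servers without a trusted certificate,
    # and also check that the certificate matches the website's name.
    print(context.verify_mode == ssl.CERT_REQUIRED)
    print(context.check_hostname)
```

These default settings are what make the secure connection trustworthy: a server that cannot present a properly signed certificate for its own name is simply refused.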