The Internet was first created to help connect geographically distant laboratories and research centers conducting government research. Since 1994, the technology has evolved to the point that it now connects networks run by private companies, individuals, industry, government, and other organizations worldwide. Although the two terms are often used interchangeably, the World Wide Web is a service that runs on top of the Internet rather than the Internet itself. Unlike other communications media, the Internet has become a source of information and sharing for anyone who finds a way to connect to it. A common question among computer and mobile users is: how does the Internet work?

Understanding Internet Technologies

Today’s Internet consists of a combination of enabling technologies created over the past 50 years of computer research from across the globe. Understanding how it works requires a short look at each of its critical components. Each component is based on international standards published to allow technology to interoperate across national boundaries. These components are the way computer hosts are addressed on a network, the network communication methods and protocols, and the network infrastructure that makes up the Internet backbone.

How Do Internet Addresses Work?

Today’s Internet can be thought of as a global network of connected computing devices. In order to send information to and receive information from other destinations, each device requires a unique address. The addresses are assigned using the IP (Internet Protocol) standard and are known as IP addresses. An IPv4 address is written as four decimal numbers, each between 0 and 255, separated by periods: NNN.NNN.NNN.NNN.
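As a rough illustration of the address format, the sketch below uses Python's standard `ipaddress` module to check a dotted-decimal IPv4 address and show the single 32-bit number behind it. The address 192.0.2.10 is a documentation-only placeholder and the function name is an assumption made for this example.

```python
import ipaddress

def describe_ipv4(text):
    """Return the four octets and the 32-bit integer form of an IPv4 address."""
    addr = ipaddress.IPv4Address(text)            # raises ValueError if malformed
    octets = [int(part) for part in text.split(".")]
    return octets, int(addr)

octets, as_int = describe_ipv4("192.0.2.10")
print(octets)   # the four numbers, each between 0 and 255
print(as_int)   # the same address as one 32-bit integer
```

Passing a value such as "256.1.1.1" raises an error, because each of the four numbers must fit in 8 bits.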

In the 1990s, the current IPv4 standard was identified as not having enough address space to accommodate the exponential growth of the Internet. As a result, short-term workarounds were created to accommodate growth, including classless addressing and dynamic address assignment. With dynamic assignment, when a computing device connects to the Internet through an ISP (Internet Service Provider) or other host, a Dynamic Host Configuration Protocol (DHCP) server assigns it a temporary IP address for the duration of the networking session. The device has a unique IP address on that network for the online session, and the address can be handed to another device once the session ends.
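The temporary assignment described above can be pictured as a pool of addresses that are handed out and later returned. The following is a toy model only, not the DHCP protocol itself. The class name, device labels, and the 192.0.2.0/29 range are assumptions chosen for illustration.

```python
import ipaddress

class AddressPool:
    """Toy model of a DHCP-style pool: each address is leased to one device at a time."""

    def __init__(self, network):
        # hosts() skips the network and broadcast addresses of the range
        self.free = [str(ip) for ip in ipaddress.ip_network(network).hosts()]
        self.leases = {}  # device id -> leased address

    def request(self, device_id):
        if device_id in self.leases:           # device already holds a lease
            return self.leases[device_id]
        if not self.free:
            raise RuntimeError("address pool exhausted")
        address = self.free.pop(0)
        self.leases[device_id] = address
        return address

    def release(self, device_id):
        address = self.leases.pop(device_id, None)
        if address is not None:
            self.free.append(address)          # address becomes reusable

pool = AddressPool("192.0.2.0/29")
laptop_ip = pool.request("laptop")
phone_ip = pool.request("phone")
print(laptop_ip, phone_ip)  # two different addresses from the same pool
```

Because an address returns to the pool when a session ends, a provider can serve more customers over time than it has addresses, as long as they are not all online at once.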

The longer-term fix for the finite IPv4 address space is the IPv6 standard. IPv6 lengthens addresses from 32 bits to 128 bits, which vastly increases the address space, and it includes other technological improvements intended to benefit security and quality of service. Since IPv6 requires a significant investment in new networking infrastructure, adoption by industry and society is slow but ongoing.
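The difference in address space is easy to work out. An IPv4 address is 32 bits long and an IPv6 address is 128 bits long, so the short calculation below compares the two totals. The variable names are only for this example.

```python
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS   # every possible 32-bit address
ipv6_total = 2 ** IPV6_BITS   # every possible 128-bit address

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv6 space is 2**{IPV6_BITS - IPV4_BITS} times larger")
```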

How Does Network Communication Work?

Once a computer host has a unique IP address, it requires a means of communicating with other computer hosts on the Internet. TCP (Transmission Control Protocol) is the communication protocol used on the Internet which, when combined with IP, is referred to as the TCP/IP protocol stack. The stack is divided into four layers:

Application Protocol Layer – Designed to support specific applications such as email, FTP, or the World Wide Web.
Transmission Control Protocol Layer – Responsible for sending network packets to a specified application on a remote computer using a port number.
Internet Protocol Layer – Sends packets to a destination IP address.
Hardware Layer – Responsible for converting the binary packets to and from network signals. Normally accomplished by a network interface card or modem.

For example, if your computer were to send a message to another computer stating “Hello ByteGuide fan”, the communications flow would be:

–  The message would be encoded by a specific program at the application protocol layer and split into one or more packets depending on the length of the message.

–  The packets would continue to the TCP layer, where each is assigned the port number that the destination host needs in order to deliver the information. The port number is specific to the program on the destination computer, which listens for incoming traffic on a pre-assigned port. (For example, web servers listen on port 80 for HTTP traffic.)

–  The packets proceed to the IP layer where the destination IP address is assigned.

–  Once the packets have a destination address and port number, they are ready to be sent over the network or Internet. The hardware layer translates the packet information into electronic signals and transmits to the router of the ISP or local network.

–  Once the network router receives an outbound packet, it determines the next destination for the information. This is typically another router on the way to the ultimate destination.

–  The data forwarding process continues until the network packets are received by the destination computer.

–  When received, the data packets start at the bottom of the TCP/IP stack at the hardware layer and move upwards in the stack until reaching the application layer.

–  In the TCP layer, the data packets are put back into the correct order, since they may be received out of order. The data is then passed up to the application layer, where the message is presented in its original form, “Hello ByteGuide fan”.
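The flow above can be sketched in a few lines of Python. This is a simplified model, not real TCP/IP. Actual headers are binary and carry many more fields, and the dictionary keys, chunk size, and placeholder address 192.0.2.10 are assumptions made for the example. Sequence numbers are what allow the receiver to restore the order.

```python
import random

def send(message, dest_ip, dest_port, chunk_size=8):
    """Split a message into packets tagged with a port, sequence number, and address."""
    data = message.encode("utf-8")                      # application layer: encode the text
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    packets = []
    for seq, chunk in enumerate(chunks):
        segment = {"port": dest_port, "seq": seq, "data": chunk}   # TCP layer
        packets.append({"dest_ip": dest_ip, "segment": segment})   # IP layer
    return packets

def receive(packets):
    """Reorder packets by sequence number and rebuild the original message."""
    segments = [p["segment"] for p in packets]          # strip the IP wrapper
    segments.sort(key=lambda s: s["seq"])               # undo any reordering in transit
    return b"".join(s["data"] for s in segments).decode("utf-8")

packets = send("Hello ByteGuide fan", "192.0.2.10", 80)
random.shuffle(packets)                                 # simulate out-of-order arrival
message = receive(packets)
print(message)
```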

How Does Networking Infrastructure Work?

In order for network packets to travel from one computing device to another on the Internet, communications infrastructure is required to connect the computers. Before the spread of high-speed Internet, ISPs would maintain a pool of modems, connected to a central port or line server, that dial-in customers used to reach the Internet. Today, a similar function is performed by one or more routers that manage connectivity to the ISP’s backbone. Smaller ISPs buy or rent bandwidth on one of the Internet backbones from larger companies. Once the network traffic reaches the Internet backbone, it passes through a number of routers on other backbones or networks until the destination address is found or the data packet times out. If everything works properly, the data packet is delivered to the destination computer.

What is the Internet Backbone?

The Internet backbone is made up of a number of large computer networks that connect with each other. Their operators are referred to as Network Service Providers (NSPs) and exchange packet traffic with each other. Each NSP is required to connect to at least three Network Access Points (NAPs), where Internet traffic is allowed to move from one backbone to another. The NSPs also connect at Metropolitan Area Exchanges (MAEs), which perform the same role as NAPs except that they are privately owned. Both NAPs and MAEs are referred to as Internet Exchange Points (IXs). Smaller ISPs will often purchase bandwidth from NSPs.

The largest Internet backbone providers, or NSPs, are referred to as tier 1 providers. The five tier 1 providers are Cable & Wireless Worldwide, UUNet, Sprint, AT&T, and Genuity. Although not listed as a tier 1 provider, Verizon runs one of the largest Internet backbones in the world. Occasionally, if a major portion of any of the NSP backbones goes down or is inundated with traffic, quality of service degrades significantly across much of the Internet.

How Does the Internet Routing Hierarchy Work?

Many computer users do not realize that there is no centralized control computer or network that makes the Internet work. Network communication is not broadcast to every computer or network on the Internet. Instead, when a computer sends network traffic, it is sent to the closest router, which uses routing tables to help the traffic move closer to its ultimate destination. Internet routers act as packet switches. Each router knows about the sub-networks it connects to and the IP addresses used by those networks, but will not normally know the IP addresses of individual computers located on a remote network.

When a local router receives a network packet, it will look at the destination IP address placed in the packet by the originating computer or host. It will then check its routing table to see if the network containing the address is in the table. If it is, the packet will be forwarded to that network for further routing. If the address is not found, the packet will be forwarded along a default route (normally further up the Internet backbone hierarchy) to the next router. If the network the IP address is assigned to is known, it will then be routed. If not, it will continue to be routed until it reaches one of the NSP backbones. These routers store the larger network routing tables which will allow the data to be sent to the correct backbone NSP. Once it arrives, it will then be sent down the backbone until finding the ultimate destination.
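The lookup described above can be sketched as a small routing table with a default route. The networks and next-hop names below are hypothetical, and real routers use far larger tables and specialized data structures. When more than one entry matches, the most specific one (longest prefix) wins, which is how the default route ends up being used only when nothing else fits.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop)
ROUTES = [
    (ipaddress.ip_network("192.0.2.0/24"), "local-network"),
    (ipaddress.ip_network("198.51.100.0/24"), "router-b"),
    (ipaddress.ip_network("0.0.0.0/0"), "upstream-default"),  # default route matches everything
]

def next_hop(dest_ip, routes=ROUTES):
    """Pick the next hop whose network contains dest_ip, preferring the longest prefix."""
    dest = ipaddress.ip_address(dest_ip)
    candidates = [(net, hop) for net, hop in routes if dest in net]
    best_net, best_hop = max(candidates, key=lambda item: item[0].prefixlen)
    return best_hop

print(next_hop("192.0.2.55"))     # address on a known network
print(next_hop("203.0.113.9"))    # unknown network, falls back to the default route
```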

How Does Domain Name Resolution Work?

Most Internet services start with a user or computer making a request to a plain-language Internet address such as www.byteguide.com. To resolve the human-readable name to an IP address, the Domain Name System (DNS) is used. DNS servers on local networks hold a database covering a subset of all Internet domain names and their assigned address(es), which they use to answer name requests from local computer hosts. If a DNS server does not have an entry for a requested domain, it requests the resolution from the next DNS host in the hierarchy. This works much like the IP routing hierarchy, continuing until a DNS server is located that can resolve the request. Depending on the configuration of the local server, past answers are stored for a predetermined amount of time to improve lookup speed for subsequent requests.
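A minimal sketch of this behavior, assuming a simple chain of servers, is shown below. It is a toy model and not the DNS protocol. The `Resolver` class, the 300-second lifetime, and the address 192.0.2.10 are all assumptions for illustration, and the real address of the example domain is not implied.

```python
import time

class Resolver:
    """Toy DNS server: answers from its own records or cache, else asks its parent."""

    def __init__(self, records, parent=None, ttl_seconds=300):
        self.records = dict(records)   # names this server knows directly
        self.parent = parent           # next server in the hierarchy, if any
        self.ttl_seconds = ttl_seconds
        self.cache = {}                # name -> (address, expiry time)

    def resolve(self, name):
        if name in self.records:
            return self.records[name]
        cached = self.cache.get(name)
        if cached and cached[1] > time.monotonic():
            return cached[0]           # still fresh, no upstream request needed
        if self.parent is None:
            raise KeyError(f"cannot resolve {name}")
        address = self.parent.resolve(name)
        self.cache[name] = (address, time.monotonic() + self.ttl_seconds)
        return address

upstream = Resolver({"www.byteguide.com": "192.0.2.10"})
local = Resolver({}, parent=upstream)
first = local.resolve("www.byteguide.com")    # forwarded to the upstream server
second = local.resolve("www.byteguide.com")   # answered from the local cache
print(first, second)
```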

When an Internet service is configured at home or work, one or more DNS servers are typically assigned for lookups as part of the initial configuration of the network connection. The typical setup limits these entries to a primary and a secondary DNS server, to guard against outages of the primary. Once set up, this is the host to which the local computer sends DNS requests for websites or other domain name resolution needs.

How Does the HTTP Protocol Work?

HTTP is the primary application protocol that allows the World Wide Web to work for normal daily use. HTTP differs from HTML in that it is the protocol used by web servers and web browsers to establish two-way communications on the Internet, whereas HTML is the markup language in which web pages are written. HTTP resides in the application layer of the TCP/IP stack and is used by two kinds of application (web servers and web browsers) to send and receive communications.

HTTP is often described as a connectionless protocol. This comes from the manner in which the client (web browser) and server (web server) communicate with each other. When a web page address is entered into a web browser, the request is pushed downward through the TCP/IP stack and sent to the designated server that hosts the website. Once the request is received by the server, the requested page is sent back to the client (web browser) and the communication is disconnected. Each new request from the client results in a new connection being started. This differs from other communication protocols in that no persistent connection is maintained between the two applications or computers, although HTTP/1.1 and later versions commonly keep a connection open so that it can be reused for several requests.

How Does a Web Browser HTTP Request Work?

When a website URL (uniform resource locator) is entered into a web browser, the browser forwards a request to the DNS server to resolve the IP address of the desired website if it is not already stored on the local computer. Once the address is resolved, the browser connects to the web server through the TCP/IP stack and sends an HTTP request for the page. The web server checks to see if the requested page is available. If so, the page is sent to the requesting web browser. If not, a “Page Not Found” or HTTP 404 error is returned. Once the page or error message is returned, the network connection with the requesting web browser is closed.
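To make the exchange more concrete, the sketch below builds the text of a minimal HTTP/1.1 GET request and reads the status line of a response. Both helper functions are assumptions written for this example, and the response string is hand-made rather than captured from a real server.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",   # ask the server to close the connection after replying
        "",                    # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

def parse_status_line(response_text):
    """Extract the numeric status code and reason phrase from a response."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return int(code), reason

request = build_get_request("www.byteguide.com", "/index.html")
code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n")
print(request)
print(code, reason)
```

In practice a browser or a library such as Python's `http.client` builds and parses these messages, so application code rarely handles the raw text.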

After a web browser receives a requested page, it will parse the content to take action on embedded components which can include Java applets, remotely hosted images, JavaScript, etc. For each of the elements located on remote servers, a new HTTP request will be sent to request, receive, and display the element. Once all information is received, the complete web page will be displayed in the Internet browser on the client computer or device.
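The parsing step can be illustrated with Python's standard `html.parser` module. The sketch below only collects the `src` attribute of images and scripts, which is a small fraction of what a real browser does. The class name and the sample page are made up for the example.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect the URLs of embedded images and scripts found in a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "script"):
            src = dict(attrs).get("src")
            if src:                      # inline scripts have no src attribute
                self.resources.append(src)

page = (
    '<html><body><img src="/logo.png">'
    '<script src="https://example.com/app.js"></script>'
    "</body></html>"
)
finder = ResourceFinder()
finder.feed(page)
print(finder.resources)   # each entry would trigger a further HTTP request
```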

What Does the SMTP Protocol Do?

Widespread email usage predates the World Wide Web by a number of years. Email servers use another application layer protocol called SMTP (Simple Mail Transfer Protocol). Unlike HTTP, SMTP is a connection-oriented application protocol. When an email client such as Microsoft Outlook is opened and checks for or sends email, it establishes a connection to the default mail server. The server’s domain name or IP address is configured when the email client is initially set up.

Once a connection is established, the mail server sends a greeting message to identify itself to the email client or application. The client program then sends a “HELO” command, to which the server responds with a “250 OK” message. Once the connection is authenticated or established, the client sends further commands to transfer mail. The connection is not broken until a “QUIT” command is transmitted by the email application (client). Since SMTP is primarily a delivery protocol, POP (Post Office Protocol) is commonly used alongside it by individual user applications to receive email and manage inboxes.
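The command and reply pattern can be sketched without a real mail server. The function below returns a typical reply code for each command in a simplified session. In the protocol itself the greeting command is spelled HELO. The addresses are placeholders, actual reply text varies between servers, and the message body that follows DATA is left out.

```python
def smtp_reply(command):
    """Return a typical reply for an SMTP command (simplified; real servers vary)."""
    verb = command.split(" ", 1)[0].upper()
    replies = {
        "HELO": "250 OK",
        "MAIL": "250 OK",
        "RCPT": "250 OK",
        "DATA": "354 Start mail input",
        "QUIT": "221 Bye",
    }
    return replies.get(verb, "500 Command not recognized")

session = [
    "HELO client.example.com",
    "MAIL FROM:<alice@example.com>",
    "RCPT TO:<bob@example.com>",
    "DATA",
    "QUIT",
]
transcript = [(command, smtp_reply(command)) for command in session]
for command, reply in transcript:
    print(f"C: {command}")
    print(f"S: {reply}")
```

Python's standard `smtplib` module handles this dialogue for applications that need to send mail through a real server.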

Webmail applications send and receive email in a similar way; however, the Internet Message Access Protocol (IMAP) is typically used to communicate with the email server. Many Internet Service Providers allow both IMAP and POP access to email accounts, to give the end user as much flexibility as possible in how email is accessed, sent, and received.

The Future of the Internet

To get a glimpse into research on advanced networking capabilities across speed, security, and reliability, and into how the future Internet will work, the non-profit Internet2 research project is the place to look. In existence for more than a decade, Internet2 is a collaboration of partners from across industry, government, and academia devoted to advanced networking and interoperability using cutting-edge network technologies. The network uses the IPv6 networking protocol and is expanding to become the first 100 Gigabit Ethernet network in the world, with 8.8 terabits of capacity. Researchers expect that, as more of the devices owned by individuals become reliant on Internet connectivity, the lessons learned from the Internet2 project will help ensure that the right infrastructure is developed and deployed worldwide to support the long-term growth of the Internet.