How The Internet Works - ryantc94/Knights-of-Arthur GitHub Wiki

Intro

  • Every computer connected to the Internet has a unique address. This address is called an IP (Internet Protocol) address and follows the pattern nnn.nnn.nnn.nnn, where each nnn is a number from 0 to 255.
  • If you connect to the Internet through an ISP, you are usually assigned a temporary IP address. If you connect through a LAN, you may have a permanent IP address or a temporary one from a DHCP server.
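
A quick way to check the nnn.nnn.nnn.nnn pattern described above is Python's standard `ipaddress` module. This is only a small sketch; the helper name `is_valid_ipv4` is made up for illustration.

```python
import ipaddress

def is_valid_ipv4(text: str) -> bool:
    """Return True if text is a dotted-quad IPv4 address (each part 0-255)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:  # raised for out-of-range or malformed addresses
        return False

print(is_valid_ipv4("192.168.1.10"))   # True
print(is_valid_ipv4("192.168.1.256"))  # False, 256 is out of range
```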

Protocols and Stacks

  • When sending a message from computer to computer, the message has to be converted from text to electronic signals, transmitted over the Internet, and then translated back into text. This is done by the Protocol Stack, which is usually built into the computer's OS.
  • Connections to the Internet are built on the TCP/IP Protocol Stack which looks like this:
Application Layer - protocols specific to applications
Transmission Control Protocol Layer - directs packets to a specific application on a computer using a port number
Internet Protocol Layer - directs packets to a specific computer using an IP address
Hardware Layer - converts binary packet data to electronic signals and back

  • The message is broken down into chunks as it moves through the layers; these chunks of data are called packets. Data is sent over the Internet in these manageable packets.
  1. Packets start at the Application layer and move to the TCP layer, where each is assigned a port number.
  2. Packets are then sent to the IP layer, where each receives its destination IP address.
  3. The Hardware layer then turns each packet into electronic signals and sends them over the phone line (??? and how does wifi work???).
  4. The ISP's router has a direct connection to the phone line; it looks at each packet's IP address and figures out where to send the packet.
  5. When a packet arrives at its destination, the process is reversed until the data is back in its original form.
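
The wrapping and unwrapping in the steps above can be sketched as a toy example. The `PORT=...;` and `IP=...;` text headers are invented for readability; real TCP and IP headers are binary and carry many more fields, and the Hardware layer is left out entirely.

```python
def send(message: str, port: int, dest_ip: str) -> bytes:
    app_data = message.encode("utf-8")                    # Application layer
    tcp_segment = f"PORT={port};".encode() + app_data     # TCP layer adds the port
    ip_packet = f"IP={dest_ip};".encode() + tcp_segment   # IP layer adds the address
    return ip_packet  # the Hardware layer would turn these bytes into signals

def receive(ip_packet: bytes):
    # Reverse the process: peel off one header per layer.
    ip_header, tcp_segment = ip_packet.split(b";", 1)
    tcp_header, app_data = tcp_segment.split(b";", 1)
    dest_ip = ip_header.decode().split("=", 1)[1]
    port = int(tcp_header.decode().split("=", 1)[1])
    return app_data.decode("utf-8"), port, dest_ip

packet = send("hello", 80, "10.0.0.5")
print(packet)           # b'IP=10.0.0.5;PORT=80;hello'
print(receive(packet))  # ('hello', 80, '10.0.0.5')
```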

Networking Infrastructure

  • The ISP maintains a pool of modems for dial-in customers. The pool is managed by some form of computer that controls data flow to a backbone or line router; this is called a Port Server setup.
  • After packets traverse this network, they make their way onto the ISP's backbone and travel through routers and other backbones to reach their destination.

Internet Infrastructure

  • The Internet backbone is made up of many Network Service Providers (NSPs), which exchange packet traffic with each other.
  • Each NSP is required to connect to at least three Network Access Points (NAPs) or Metropolitan Area Exchanges (MAEs). This is where the exchange of packet traffic occurs.

Internet Routing Hierarchy

  • Routers are usually used to connect Networks together.
  • Each router knows its sub-network and the IP addresses within it.
  • When a packet arrives at a router, the router reads the destination IP address that the IP Protocol Layer added to the packet and checks its routing table (a table of known addresses). If the address is not there, the packet is routed up to another router. This continues up to the NSP level, where the routers hold the largest routing tables and direct the packet to the correct network.
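
A rough sketch of the table lookup described above, using the standard `ipaddress` module. The networks and next-hop names are hypothetical, and picking the most specific match (longest prefix) is an assumption about how the lookup is done, not something these notes state.

```python
import ipaddress

# Hypothetical routing table: known network -> where to forward the packet.
ROUTES = {
    ipaddress.ip_network("192.168.1.0/24"): "local-network",
    ipaddress.ip_network("10.0.0.0/8"): "office-router",
}
DEFAULT_ROUTE = "upstream-router"  # "route it up" when nothing matches

def next_hop(dest: str) -> str:
    addr = ipaddress.ip_address(dest)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return DEFAULT_ROUTE
    best = max(matches, key=lambda net: net.prefixlen)  # most specific network
    return ROUTES[best]

print(next_hop("192.168.1.7"))  # local-network
print(next_hop("8.8.8.8"))      # upstream-router
```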

Domain Name & Address Resolution

  • The Domain Name System (DNS) is a distributed database that keeps track of computers' names and their IP addresses.
  • DNS servers are computers that host parts of the DNS database and the software that accesses it.
  • No DNS server contains the entire database; each holds only a subset. If a server doesn't have the domain name, it redirects the request to another server.
  • When your Internet connection is set up, one primary and one or more secondary DNS servers are specified for it. The secondary servers are used as a fallback if the primary does not respond.
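
The "each server only holds a subset and passes the request on" idea can be sketched with plain dictionaries. The server names, domain names, and addresses below are all made up, and real DNS resolution involves root, TLD, and authoritative servers plus caching, none of which is modeled here.

```python
# Hypothetical servers: each knows a few names and which server to ask next.
SERVERS = {
    "primary": {
        "records": {"intranet.example": "192.0.2.10"},
        "refer_to": "upstream",
    },
    "upstream": {
        "records": {"www.example.com": "198.51.100.7"},
        "refer_to": None,
    },
}

def resolve(name: str, server: str = "primary"):
    """Follow referrals until some server knows the name, or give up."""
    while server is not None:
        entry = SERVERS[server]
        if name in entry["records"]:
            return entry["records"][name]
        server = entry["refer_to"]  # redirect the request to another server
    return None

print(resolve("www.example.com"))  # 198.51.100.7
print(resolve("missing.example"))  # None
```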

Application Protocol: HTTP

  • One of the services on the Internet is the World Wide Web (WWW). HTTP is the application protocol that web browsers and web servers use to communicate with each other.
  • HTTP is a text-based protocol that is often described as connectionless: in the basic model, after the Client sends a request to the Server and the Server sends information back, the connection is closed.
When you type a URL into the browser, this is what happens:
1. DNS resolves the domain name to an IP address.
2. Web Browser connects to Web Server (what does a connection actually mean???) and sends an HTTP request.
3. Web Server handles the request and sends back the appropriate info.
4. Web Browser receives the info from Web Server and ends the connection.
5. Browser then parses the page, looking for additional resources it needs to request.
6. Browser makes additional HTTP requests if needed.
7. When there are no more requests, the page finishes loading.
  • Most Internet protocols are specified in documents called Requests for Comments (RFCs).
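
Steps 5 and 6 in the list above (parsing the page for more things to request) can be sketched with Python's standard `html.parser`. This only looks at `img`, `script`, and `link` tags; a real browser handles many more cases. The class and function names are invented for illustration.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect URLs the browser would have to request after the first page."""
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.urls.append(attrs["href"])

def find_resources(html: str) -> list:
    finder = ResourceFinder()
    finder.feed(html)
    return finder.urls

page = ('<html><head><link rel="stylesheet" href="/style.css">'
        '<script src="/app.js"></script></head>'
        '<body><img src="/logo.png"></body></html>')
print(find_resources(page))  # ['/style.css', '/app.js', '/logo.png']
```

Each URL found this way would become one more HTTP request to the server.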

Application Protocol: SMTP

  • Email uses an application protocol called Simple Mail Transfer Protocol (SMTP). SMTP is connection-oriented.
When you open your mail client:
1. Mail Client opens a connection to its default Mail Server. The Mail Server's IP address or domain name is usually set up in the Mail Client.
2. Mail Server transmits the first message to identify itself.
3. Mail Client responds with the SMTP HELO command and Mail Server sends back a 250 OK message.
4. Depending on what the user does, Mail Client sends the appropriate SMTP commands.
5. The request/response exchange continues until Mail Client sends a QUIT command and Mail Server says goodbye.
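
The back-and-forth above can be imitated with a toy server function. This is not a real mail server and does not use `smtplib`; it only knows HELO and QUIT, and the host names are placeholders. The reply codes 220, 250, 221, and 500 are standard SMTP codes.

```python
def smtp_reply(command: str) -> str:
    """Toy server side: map a client command to a reply line."""
    verb = command.strip().split(" ", 1)[0].upper()
    if verb == "HELO":
        return "250 OK"
    if verb == "QUIT":
        return "221 Bye"
    return "500 Command not recognized"

def run_session(commands):
    transcript = ["S: 220 mail.example ready"]  # server identifies itself first
    for command in commands:
        transcript.append("C: " + command)
        reply = smtp_reply(command)
        transcript.append("S: " + reply)
        if reply.startswith("221"):  # server said goodbye, session is over
            break
    return transcript

for line in run_session(["HELO client.example", "QUIT"]):
    print(line)
```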

Transmission Control Protocol

  • TCP is responsible for routing application protocol data to the correct application on the destination computer.
  • TCP uses port numbers to keep track of application destinations.
TCP Process:
1. When TCP receives application protocol data, it breaks it up into manageable "chunks".
2. A TCP header with TCP information (e.g. the port number) is added to each "chunk".
3. When TCP on the destination computer receives a packet from the IP layer, it strips the TCP header and uses its information to send the rest of the data to the correct application.
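
A simplified sketch of the TCP process above: split the data into chunks, prefix each with a header, then use the header on the receiving side to hand the data to the right application. The 2-byte header holding only a destination port is an assumption made to keep this short; a real TCP header has many more fields.

```python
import struct

HEADER = struct.Struct("!H")  # toy header: destination port only (16 bits)

def segment(data: bytes, port: int, chunk_size: int = 4) -> list:
    """Break data into chunks and put the toy header in front of each one."""
    return [HEADER.pack(port) + data[i:i + chunk_size]
            for i in range(0, len(data), chunk_size)]

def deliver(segments, apps: dict) -> dict:
    """Strip each header and append the payload to the app on that port."""
    for seg in segments:
        (port,) = HEADER.unpack(seg[:HEADER.size])
        apps.setdefault(port, bytearray()).extend(seg[HEADER.size:])
    return apps

apps = deliver(segment(b"hello world", 80), {})
print(bytes(apps[80]))  # b'hello world'
```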

  • TCP is not a textual protocol; it is a connection-oriented, reliable, byte stream service.
  • Connection-Oriented means two applications using TCP must first establish a connection before exchanging data.
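
To make "establish a connection before exchanging data" concrete, here is a small sketch using Python's `socket` module on the local machine only. The server accepts one connection and echoes back what it receives. The function names are made up, and port 0 just asks the OS for any free port.

```python
import socket
import threading

def echo_once(server: socket.socket) -> None:
    conn, _addr = server.accept()        # connection is established here
    with conn:
        conn.sendall(conn.recv(1024))    # echo the received bytes back

def demo() -> bytes:
    with socket.create_server(("127.0.0.1", 0)) as server:  # 0 = any free port
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        # The client must connect first; only then can data be exchanged.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"ping")
            reply = client.recv(1024)
        worker.join()
    return reply

print(demo())  # b'ping'
```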

Internet Protocol

  • IP is a connectionless and unreliable protocol: it doesn't care whether a packet gets to its destination (huh???). IP also doesn't know about connections (ughhh what is a connection???).
  • IP's job is to route and send packets to other computers.
  • IP packets may arrive out of order or not at all. TCP's job is to make sure packets arrive, and arrive in order.
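
A toy illustration of the reordering job described above. Each packet is assumed to carry a simple sequence number (0, 1, 2, ...); real TCP numbers bytes rather than packets and uses acknowledgements and retransmission, which are not modeled here.

```python
def reassemble(packets, expected_count: int):
    """packets: iterable of (sequence_number, data) in any arrival order.

    Returns (data, []) when everything arrived, or (None, missing_numbers).
    """
    by_seq = {seq: data for seq, data in packets}
    missing = [seq for seq in range(expected_count) if seq not in by_seq]
    if missing:
        return None, missing  # a real TCP would wait for a retransmission
    return b"".join(by_seq[seq] for seq in range(expected_count)), []

print(reassemble([(2, b"c"), (0, b"a"), (1, b"b")], 3))  # (b'abc', [])
print(reassemble([(0, b"a"), (2, b"c")], 3))             # (None, [1])
```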

URL

  • Uniform Resource Locator
scheme/protocol - e.g. http
domain/ip address
port - optional, 80 is the default HTTP port
path
query_string
fragment_id
 
protocol://domain:port/path?query_string#fragment_id
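
The pieces of the template above can be pulled apart with Python's standard `urllib.parse`. The URL itself is just a made-up example.

```python
from urllib.parse import urlsplit, parse_qs

parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en&page=2#intro")

print(parts.scheme)           # http
print(parts.hostname)         # www.example.com
print(parts.port)             # 8080
print(parts.path)             # /docs/index.html
print(parse_qs(parts.query))  # {'lang': ['en'], 'page': ['2']}
print(parts.fragment)         # intro
```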

HTTP

  • Hypertext Transfer Protocol (HTTP) is the application protocol the web uses to retrieve documents; it is a basic request-response protocol.
  • A client connects to the address of a server to establish a TCP connection, then communicates over that connection.

Request

  • Structure of an HTTP Request Message consists of:
1) Request Line
2) Request Headers [list of request header fields](https://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Request_fields)
    a) Host: hostname of server being connected to.
3) Empty Line (\r\n) [carriage return/line feed](https://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2)
4) Body
  • Request Methods aka "verbs" [list of request methods](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods) tell the server what action to perform:
GET - retrieve a resource/data without other effects; data is passed through the **Query String** of the URL.
POST - requests that the server store data (create data); data is passed through the body of the request.
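
To see the request structure above as actual text, here is a minimal sketch that assembles a request message by hand. The function name, host, and paths are placeholders; a real client would normally use a library instead of building strings.

```python
def build_request(method: str, host: str, path: str = "/", body: str = "") -> str:
    lines = [
        f"{method} {path} HTTP/1.1",  # 1) request line
        f"Host: {host}",              # 2) request headers
    ]
    if body:
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # 3) empty line (\r\n) separates the headers from 4) the body
    return "\r\n".join(lines) + "\r\n\r\n" + body

# GET passes data in the query string, POST passes it in the body.
print(repr(build_request("GET", "www.example.com", "/search?q=cats")))
print(repr(build_request("POST", "www.example.com", "/users", "name=arthur")))
```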

Response

  • Structure of an HTTP Response Message consists of:
1) Status Line
2) Response Headers [list of response header fields](https://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Response_fields)
    a) Content-Type: type of resource the server is responding with
    b) Location: URL to redirect to if the status is 3XX
3) Empty Line (\r\n) [carriage return/line feed](https://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2)
4) Body (usually an HTML document)
4) Body (usually an HTML document)
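
Going the other way, a raw response can be split back into the four parts listed above. This is a bare-bones sketch with a made-up response string; it ignores things like repeated headers and chunked bodies.

```python
def parse_response(raw: str):
    head, _, body = raw.partition("\r\n\r\n")        # empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return int(code), reason, headers, body

raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Hi</h1>"
print(parse_response(raw))
# (200, 'OK', {'Content-Type': 'text/html'}, '<h1>Hi</h1>')
```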

Status Codes

  • 1XX : Informational, request received
  • 2XX : Success, request received, understood, and accepted
  • 3XX : Redirection, additional action must occur to complete request
  • 4XX : Client Error
  • 5XX : Server Error
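
The first digit is all that is needed to sort a status code into one of the classes above. A tiny sketch (the function name is made up):

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code: int) -> str:
    return STATUS_CLASSES.get(code // 100, "Unknown")

print(status_class(200))  # Success
print(status_class(404))  # Client Error
```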

netcat/curl/

HTTP/2

TCP/IP

https://web.stanford.edu/class/msande91si/www-spr04/readings/week1/InternetWhitepaper.htm
JSON / XML
https://cs.nyu.edu/courses/fall16/CSCI-UA.0480-007/
https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Objects/JSON
https://www.youtube.com/watch?v=7_LPdttKXPc
Response/Request ??? How does it ask for JS + CSS and express serves back html?