Masterdoc - Dleifnesor/NET-215 GitHub Wiki
A network protocol defines rules and conventions for communication between network devices, including:
- Mechanisms for devices to identify and make connections with each other
- Formatting rules that specify how data is packaged into messages
- Rules on how those data "packages" are sent and received
One universal protocol for all data transmission is inefficient because:
- Transmission media can change (wired to wireless)
- Addressing can change (IPv4 to IPv6)
- Systems may run multiple instances of a service
Network developers realized the need to develop modular protocols that focus on one aspect of communication at a time.
- The fundamental abstraction used to collect protocols into a unified whole is known as a layering model
- Layers help protocol designers and implementers manage complexity
- Layering allows focus on one aspect of communication at a given time
Using a layered model:
- Assists in protocol design with defined information and interfaces
- Fosters competition because products from different vendors can work together
- Prevents technology changes in one layer from affecting other layers
- Provides a common language to describe networking functions
The OSI (Open Systems Interconnection) model uses encapsulation, the process of prepending headers (and sometimes appending trailers) to the data being sent. Each layer wraps the data in its own header, which the same layer on the receiving system removes when the data reaches its destination.
The term frame format refers to the way a packet is organized, including details such as the size and meaning of individual fields.
- MAC addresses (Media Access Control) are 6 bytes in hexadecimal format
- Example: B8-AC-6F-CA-2A-E6
- First 3 bytes are the manufacturer address (OUI - Organizationally Unique Identifier)
- Last 3 bytes uniquely identify the device
- The type field in an Ethernet frame allows a computer to have multiple protocols operating simultaneously
- 2-byte field after the Destination and Source MAC Address
- Indicates which protocol to send to next
- Layer 2 addresses (MAC) are used for local transmissions between directly connected hardware devices
- Layer 3 addresses (IP) allow for indirectly connected devices on different networks
- When sending a request using IP, it is sent one hop at a time, from one physical network to the next
- At each hop, actual transmission occurs using MAC addresses
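The Ethernet header layout described above (destination MAC, source MAC, 2-byte type) can be sketched with Python's `struct` module. This is a minimal illustration, not a full frame parser: the function name and the sample frame are made up for the example, and it assumes an untagged Ethernet II frame.

```python
import struct

def parse_ethernet_header(frame: bytes) -> dict:
    """Split the first 14 bytes of an Ethernet II frame into its fields."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac: bytes) -> str:
        return "-".join(f"{b:02X}" for b in mac)

    return {
        "dst": fmt(dst),
        "src": fmt(src),
        "oui": fmt(src[:3]),      # manufacturer part of the source MAC
        "type": hex(ethertype),   # 0x800 is IPv4, 0x806 is ARP
    }

# Hypothetical frame: broadcast destination, the example MAC as source, type ARP
frame = bytes.fromhex("ffffffffffff" "b8ac6fca2ae6" "0806") + b"payload"
header = parse_ethernet_header(frame)
print(header)
```

The type value is what tells the receiving system which protocol handler gets the payload next.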
Translation from a computer's IP address to an equivalent hardware address is known as address resolution. ARP involves encoding the IP address of the intended recipient in a broadcast message, which is sent on a local network to allow the intended recipient to respond with its data link layer address.
- An ARP transaction begins when a source device on an IP network has an IP datagram to send
- It must first decide whether the destination device is on the local or a distant network
- If local, it will broadcast asking for the MAC address of the destination's IP
- If distant, it will broadcast asking for the MAC address of the default gateway's (router) IP
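The local-versus-distant decision can be sketched with the standard `ipaddress` module. The function name and all addresses below are illustrative assumptions; a real IP stack makes this decision from its routing table.

```python
import ipaddress

def arp_target(src_interface: str, dst_ip: str, gateway: str) -> str:
    """Return the IP address whose MAC the sender should ask for via ARP."""
    iface = ipaddress.ip_interface(src_interface)
    if ipaddress.ip_address(dst_ip) in iface.network:
        return dst_ip    # local: resolve the destination itself
    return gateway       # distant: resolve the default gateway instead

local = arp_target("192.168.3.10/24", "192.168.3.77", "192.168.3.1")
distant = arp_target("192.168.3.10/24", "8.8.8.8", "192.168.3.1")
print(local, distant)
```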
Four different addresses are involved in each message:
- Sender Hardware Address: Layer 2 address of the sender
- Sender Protocol Address: Layer 3 (IP) address of the sender
- Target Hardware Address: Layer 2 address of the target
- Target Protocol Address: Layer 3 (IP) address of the target
Bandwidth Issues:
- Each ARP message ties up the local network
- ARP messages aren't large, but sending them for every hop would be inefficient
Performance Issues:
- Sending system has to wait for answer
- ARP Request messages are broadcast, requiring CPU time from every device
Solution: ARP caching
- Reduces network traffic and ensures resolution of commonly used addresses is fast
- ARP maintains a small table of bindings in memory as a cache
- The oldest entry is removed when the table is full or an entry has not been updated for a long period
- Kept in memory on most operating systems
- On Windows, use `arp -a` to view the table contents
- Modern operating systems set a random initial timeout between 15-45 seconds
- If communication continues, the timeout can extend to a few minutes
- Addressing: Mechanism to identify both the network address and unique host address
- Routing/Indirect Delivery: When the destination is on a distant network, datagrams must be delivered indirectly using routers
- Fragmentation and Reassembly: IP can fragment datagrams into pieces when necessary due to physical layer limits, and reassemble them at the destination
Important fields include:
- Version: IPv4 or IPv6
- Header Length: Multiply by 4 to get byte count
- TTL: Time-To-Live
- Prevents packets from roaming forever
- Value set when datagram is created (max 255)
- Decreased by one as it passes each router
- If it reaches 0, the router discards it and may send an ICMP TTL expired message
- Default TTL value is usually 64 or 128
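As a rough sketch of how these fields are read, the snippet below pulls the version, header length, and TTL out of a raw IPv4 header and mimics the per-router TTL decrement. The sample header bytes and function names are invented for illustration.

```python
def parse_ipv4_basics(header: bytes) -> dict:
    """Extract version, header length in bytes, and TTL from an IPv4 header."""
    first_byte = header[0]
    return {
        "version": first_byte >> 4,               # high 4 bits
        "header_bytes": (first_byte & 0x0F) * 4,  # low 4 bits, in 32-bit words
        "ttl": header[8],
    }

def router_hop(ttl: int):
    """Decrement TTL as a router would; None means the packet is discarded."""
    ttl -= 1
    return ttl if ttl > 0 else None

# Hypothetical 20-byte header: version 4, header length 5 words, TTL 64
sample = bytes.fromhex("45000054" "00004000" "40010000" "c0a8030a" "08080808")
fields = parse_ipv4_basics(sample)
print(fields, router_hop(fields["ttl"]), router_hop(1))
```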
Routers transfer packets between different networks:
- IP networks use the Net ID portion of the IP address to identify specific networks
- Having no routers would be like having no post offices - only one zip code for the entire network
- All systems would be in the same broadcast domain
- Switching (Layer 2):
  - Looks at Layer 2 headers and tracks MAC addresses and their ports
  - Forwards packets to appropriate ports based on MAC
- Routing (Layer 3):
  - Builds table of Network IDs and the IP of the next router in that direction
  - Inspects packets for NetID of IP address and forwards to next router
  - Only needs to know the next router, not the whole path
At minimum, router tables need:
- Network Address: The address of a network
- Subnet Mask: To know how many bits are in the Net ID
- Next Hop: IP address of a neighboring router
- Interface: Which of the router's physical interfaces to use
May also contain data like distance (how many hops) and preference for multiple entries.
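A routing table with these four columns can be sketched as a list of tuples. The lookup below applies each subnet mask to the destination and keeps the most specific match (longest prefix), which is how routers commonly choose between overlapping entries. Only the first route comes from these notes; the other two, and the function name, are assumptions for the example.

```python
import ipaddress

# (network address, subnet mask, next hop, interface)
ROUTES = [
    ("192.168.4.0", "255.255.255.0", "192.168.3.2", "FA0/1"),
    ("192.168.0.0", "255.255.0.0", "192.168.3.3", "FA0/1"),  # hypothetical
    ("0.0.0.0", "0.0.0.0", "192.168.3.1", "FA0/0"),          # hypothetical default route
]

def lookup(dst: str):
    """Return (next hop, interface) for the most specific matching route."""
    dst_int = int(ipaddress.ip_address(dst))
    best = None
    for net, mask, next_hop, iface in ROUTES:
        mask_int = int(ipaddress.ip_address(mask))
        if (dst_int & mask_int) == int(ipaddress.ip_address(net)):
            prefix_len = bin(mask_int).count("1")  # number of Net ID bits
            if best is None or prefix_len > best[0]:
                best = (prefix_len, next_hop, iface)
    return best[1:] if best else None

print(lookup("192.168.4.20"), lookup("192.168.9.9"), lookup("8.8.8.8"))
```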
- Static: Network admins manually configure routes
  - Example: `ip route 192.168.4.0 255.255.255.0 192.168.3.2 FA0/1`
  - Does not scale well for large networks
- Dynamic: Protocols allow routers to exchange routing information
  - Routers broadcast their routing tables
  - Other routers listen and update their tables
- Tables typically sent every 30-60 seconds
- Routing preference based solely on hops (distance vector protocol)
- Advantages: Easy to configure, good for simple/small networks
- Disadvantages: Noisy (lots of broadcasts), slow to converge, doesn't scale well
- More scalable method for IGP (Interior Gateway Protocol)
- Routers identify their neighbors
- After initial convergence, only send Hello messages and Link State updates
- Uses advanced route-selection metrics (notably bandwidth)
- Open Shortest Path First (OSPF) is the most popular IGP
- Adjacent routers send "Hello" messages every 10 seconds
- Changes propagate through Link State Advertisements (LSAs)
NAT remaps one IP address space into another by modifying network address information in IP datagram packet headers.
For example, a medical center with:
- ISP-provided public IP network: 210.84.27.0/24 (254 public IPs)
- Thousands of clients that need internet access
NAT helps by allowing clients to share public IPs:
- Clients make outbound connections
- NAT router translates private-to-public IPs
- NAT router tracks translations for return packets
- Common type of NAT used in labs
- Used on "client" networks making outbound connections
- "Hides" networks with private addresses
- The private source IP address is changed to a public IP address
- Enables communication only when conversation originates inside the masqueraded network
- Masquerading routers maintain stateful translation tables
- Common implementation of IP Masquerading NAT
- Permits multiple devices on a LAN to be mapped to a single public IP
- Primary goal: Conserve IP addresses
- Secondary goal: Masquerading
- Widely used in enterprises for client/wireless subnets and by ISPs for home networks
- PAT devices use source TCP port numbers to track different sessions
- For client-to-server packets:
- Destination port is the service (22/SSH, 80/HTTP)
- Source port is "ephemeral": a temporary port chosen by the client, typically from a high-numbered range (IANA suggests 49152-65535)
- PAT router uses source port to track sessions
- PAT router maintains a translation table
- Like a router rewriting the Layer 2 header at every hop
- A NAT router also rewrites the Layer 3 header with a new source IP
- The router builds a table to track translations for proper routing of response packets
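The translation table can be sketched as two dictionaries, one for each direction. This is a simplified model of PAT, not any vendor's implementation: the class name, the starting port, and the private addresses are all assumptions.

```python
import itertools

class PatRouter:
    """Toy PAT table: maps (private IP, source port) to a public source port."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self.outbound_map = {}  # (private_ip, private_port) -> public_port
        self.inbound_map = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of a client packet, creating an entry if needed."""
        key = (src_ip, src_port)
        if key not in self.outbound_map:
            port = next(self._ports)
            self.outbound_map[key] = port
            self.inbound_map[port] = key
        return self.public_ip, self.outbound_map[key]

    def inbound(self, dst_port: int):
        """Find the private endpoint for a reply; None if no session exists."""
        return self.inbound_map.get(dst_port)

router = PatRouter("210.84.27.5")
first = router.outbound("10.0.0.5", 51000)
second = router.outbound("10.0.0.6", 51000)
print(first, second, router.inbound(40001), router.inbound(12345))
```

Because `inbound` only finds entries created by `outbound`, the sketch also shows why conversations must originate inside the masqueraded network.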
- 32-bit number - a string of 32 1's and 0's
- Displayed as four 8-bit numbers with dots (dotted decimal notation)
- Each 8-bit number is called an octet
- Example:
- Computer sees: 10000001101010100001001011011100
- Humans see: 129.170.18.220
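The conversion between the two views is mechanical. A small sketch (function names are illustrative):

```python
def to_dotted(bits: str) -> str:
    """Convert a 32-character bit string to dotted decimal notation."""
    return ".".join(str(int(bits[i:i + 8], 2)) for i in range(0, 32, 8))

def to_bits(dotted: str) -> str:
    """Convert dotted decimal notation to a 32-character bit string."""
    return "".join(f"{int(octet):08b}" for octet in dotted.split("."))

print(to_dotted("10000001101010100001001011011100"))
print(to_bits("129.170.18.220"))
```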
An IP address includes both:
- Network ID (like a zip code) - always at the beginning of the address
- Host ID (like a street address) - the remainder
The tricky part:
- IP address is always 32 bits but the Net ID can be different lengths (8-30 bits)
- Host ID is whatever is left over
- Subnet mask defines how many bits are in the Net ID
The subnet mask uses 1 and 0 bits to indicate:
- Network portion (1)
- Host portion (0)
Example:
IP Address: 172.16.4.35/24

| | Dotted decimal | Binary |
| --- | --- | --- |
| Host | 172.16.4.35 | 10101100 00010000 00000100 00100011 |
| Mask | 255.255.255.0 | 11111111 11111111 11111111 00000000 |

Subnet Mask Values Within an Octet:

| Mask (Decimal) | Mask (Binary) | Network Bits | Host Bits |
| --- | --- | --- | --- |
| 0 | 00000000 | 0 | 8 |
| 128 | 10000000 | 1 | 7 |
| 192 | 11000000 | 2 | 6 |
| 224 | 11100000 | 3 | 5 |
| 240 | 11110000 | 4 | 4 |
| 248 | 11111000 | 5 | 3 |
| 252 | 11111100 | 6 | 2 |
| 254 | 11111110 | 7 | 1 |
| 255 | 11111111 | 8 | 0 |

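
Applying a mask is a bitwise AND between the address and the mask. The sketch below uses the 172.16.4.35/24 example with the standard `ipaddress` module; the variable names are illustrative.

```python
import ipaddress

iface = ipaddress.ip_interface("172.16.4.35/24")
mask = iface.network.netmask

# Net ID: keep the bits where the mask is 1
network_id = ipaddress.ip_address(int(iface.ip) & int(mask))
# Host ID: keep the bits where the mask is 0
host_id = int(iface.ip) & int(iface.network.hostmask)

print(mask, network_id, host_id)
```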
IPv6 retains many successful features of IPv4:
- Connectionless (not bound to route)
- Header contains maximum hop count
- But IPv6 changes all the details
Key differences:
- Address Size: 128 bits instead of 32 bits
- Header Format: Completely different from IPv4
- Extension Headers: Uses a standard header with extension headers for optional protocol data
Although the IPv6 base header (40 bytes) is twice as large as the minimum IPv4 header (20 bytes), it contains fewer fields:
- VERS (Version 6)
- TRAFFIC CLASS: Specifies traffic class (differentiated services)
- PAYLOAD LENGTH: Similar to IPv4's packet length field
- HOP LIMIT: Corresponds to IPv4's TIME-TO-LIVE field
- FLOW LABEL: Associates datagrams with particular paths
- NEXT HEADER: Specifies type of information following the current header
- Fixed, constant format 40-byte base header
- Optional extension headers
- Payload (data)
- All encapsulated within a Data Link layer frame
- Fixed-size header reduces processing time
IPv6 uses colon hexadecimal notation:
- Each group of 16 bits written in hex with colons separating groups
- Example: 69DC:8864:FFFF:FFFF:0:1280:8C0A:FFFF
Notation shortcuts:
- Leading zeros can be omitted in each field
- Double colon (::) can be used once to replace multiple fields of zeros (Zero Compression)
- Example: fe80:0:0:0:200:f8ff:fe21:67cf can be written as fe80::200:f8ff:fe21:67cf
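Python's `ipaddress` module applies both shortcuts and can also undo them, which makes it handy for checking a hand-written abbreviation:

```python
import ipaddress

addr = ipaddress.ip_address("fe80:0:0:0:200:f8ff:fe21:67cf")
short_form = addr.compressed  # zero compression applied
long_form = addr.exploded     # every group written as 4 hex digits

print(short_form)
print(long_form)
```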
IPv6 addresses typically have two logical parts:
- 64-bit network prefix for routing
- 48 bits: Routing Prefix assigned to organization
- 16 bits: For subnetting within organization
- 64-bit interface identifier for hosts
- A 16-bit Subnet ID allows 65,536 subnets, comparable in size to the host space of an IPv4 Class B network
- 64 bits recommended for host addresses (required for auto-configuration)
- IPv6 subnetting works like Variable Length Subnet Masking in IPv4
- CIDR notation is effective (no real boundaries)
- Examples:
- 1090::9:900:210D:325F/60 (60 bits for network, 68 for host)
- fe80:0:0:0:200:f8ff:fe21:67cf/24 (24 bits for network, 104 for host)
Champlain College is assigned the prefix 2620:E4:C000::/48 and can create subnets:
- 2620:E4:C000:1:x:x:x:x
- 2620:E4:C000:2:x:x:x:x
- ...
- 2620:E4:C000:FFFF:x:x:x:x
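The subnet list can be generated rather than written by hand. This sketch assumes the site prefix is 2620:E4:C000::/48, as the listed subnets suggest, and carves it into /64 subnets (16 subnet bits, 64 interface bits).

```python
import ipaddress
import itertools

site = ipaddress.ip_network("2620:E4:C000::/48")  # assumed site prefix

# 16 subnet bits between the /48 prefix and the /64 boundary
subnet_count = 2 ** (64 - site.prefixlen)

# subnets() is a generator, so only the first few are materialized
first_three = list(itertools.islice(site.subnets(new_prefix=64), 3))

print(subnet_count)
print([str(net) for net in first_three])
```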
- IPv6 doesn't use Layer 3 broadcasts (no special network broadcast address)
- No ARP: Uses Neighbor Discovery and Router Advertisements/Solicitations
- ICMPv6:
- Critical to IPv6 network function
- Supports Neighbor Discovery, Router Solicitation/Advertisements, and Link-Local Addresses
TCP is a Layer 4 protocol that improves upon "best effort" delivery when needed.
TCP features:
- Connection Orientation: Application must request a connection to a destination
- Point-to-Point Communication: Each TCP connection has exactly two endpoints
- Complete Reliability: Guarantees data delivery completely and in order
- Full Duplex Communication: Allows data to flow in either direction simultaneously
- Connection-Oriented: 3-way handshake and session teardown
- Sequencing: Tracking data order in both directions
- Reliability: Acknowledging packet receipt to detect and resend missing packets
- Reliable connection startup using a 3-way handshake
- Each side sends a control message with initial buffer size and sequence number
- The handshake ensures TCP won't open/close a connection until both ends agree
Connection Establishment sequence:
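- Client sends SYN: SEQ=0, ACK=0 (relative sequence numbers, as packet analyzers display them)
- Server replies with SYN-ACK: SEQ=0, ACK=1
- Client sends ACK: SEQ=1, ACK=1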
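The three segments are the SYN, SYN-ACK, and ACK of the handshake. The sketch below computes their sequence and acknowledgement numbers from each side's initial sequence number (ISN). With both ISNs set to 0 it reproduces the relative numbers above. The function name is illustrative, and real stacks pick random ISNs.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return (flags, seq, ack) for the three segments of connection setup."""
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = ("SYN", client_isn, 0)
    syn_ack = ("SYN-ACK", server_isn, (client_isn + 1) % wrap)
    ack = ("ACK", (client_isn + 1) % wrap, (server_isn + 1) % wrap)
    return [syn, syn_ack, ack]

steps = three_way_handshake(0, 0)
for flags, seq, ack_no in steps:
    print(f"{flags}: SEQ={seq}, ACK={ack_no}")
```

Each SYN consumes one sequence number, which is why the acknowledgement is the peer's ISN plus one even though no data has been sent yet.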
TCP flags in the header:
- URG: Bit 106
- ACK: Bit 107
- PSH: Bit 108
- RST: Bit 109
- SYN: Bit 110
- FIN: Bit 111
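Bits 104-111 of the header make up byte 13 (counting from 0), so the six flags are the low six bits of that byte. A minimal decoding sketch follows; the sample header values are invented for the example.

```python
import struct

# Bit masks within byte 13 of the TCP header (bit 106 is 0x20, bit 111 is 0x01)
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def decode_flags(tcp_header: bytes) -> list:
    """Return the names of the flags set in a TCP header."""
    flag_byte = tcp_header[13]
    return [name for name, bit in FLAG_BITS.items() if flag_byte & bit]

# Hypothetical 20-byte header: ports 80 -> 51000, data offset 5 words, SYN+ACK set
syn_ack_header = struct.pack("!HHIIBBHHH", 80, 51000, 0, 1, 5 << 4, 0x12, 65535, 0, 0)
flags = decode_flags(syn_ack_header)
print(flags)
```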
Source and Destination Port:
- 16-bit numbers (0-65,535)
- Source Port identifies the sending application
- Destination Port identifies the receiving application
Sequence and Acknowledgement:
- 32-bit numbers to track communication
- Sequence tracks what's being sent in that direction
- Acknowledgement tracks what's been received in that direction
- Each end generates a random 32-bit initial sequence number
- Sequence numbers increase based on bytes sent
- Using bytes allows packets to be placed in correct order when received
- Acknowledgement numbers indicate what has been received
- Equals the peer's initial SEQ#, plus 1 for the SYN, plus all bytes received (that is, the next byte expected)
- Acknowledgements don't have to be sent for every packet received
- Graceful shutdown ensures both sides agree to close the connection
- Open connections can be security risks
- FIN flag used to close connections
- FIN and ACK sent in each direction to guarantee all data arrived before termination
Congestion results in delay:
- If persistent, devices run out of memory and discard packets
- Retransmission recovers lost packets but adds more traffic
- TCP avoids congestion collapse by monitoring the network and reacting quickly
- Uses sliding window and retransmission optimizations
General principles:
- Receiver indicates how many bytes it can handle
- Sender knows how many bytes it can send
- Sender knows how many bytes to send between acknowledgements
- Set in the 16-bit "Window" field in TCP header
- Tells sender how much data can be sent before acknowledgment
- If receiver can't process data quickly enough, it reduces window size
- This alerts sender to reduce data flow or allow receiver to clear buffer
When a client advertises zero window size:
- TCP receive buffer is full
- Processor may be busy or stuck
- When client resumes data processing, it sends a TCP Window Update
- A persistent Zero Window indicates a performance problem on the receiving host rather than on the network path
- Fixed timeout not suitable for Internet
- TCP makes retransmission adaptive
- Uses weighted moving average of RTT (Round Trip Time)
- Helps reset retransmission timer if delay returns to lower value
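A weighted moving average of RTT can be sketched in a few lines. The notes do not give the weight, so the value 0.125 used here is an assumption borrowed from RFC 6298; the function name and sample values are illustrative.

```python
def smoothed_rtt(samples: list, alpha: float = 0.125) -> list:
    """Return the running weighted average after each RTT sample (ms)."""
    srtt = samples[0]          # first measurement seeds the estimate
    history = [srtt]
    for sample in samples[1:]:
        # New estimate: mostly the old estimate, nudged toward the new sample
        srtt = (1 - alpha) * srtt + alpha * sample
        history.append(srtt)
    return history

estimates = smoothed_rtt([100, 100, 500, 100])
print(estimates)
```

A single 500 ms spike moves the estimate only to 150 ms, and the estimate drifts back down once delay returns to normal, so one slow round trip does not inflate the retransmission timer for long.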
- Used by applications not requiring reliability, acknowledgment, or flow control
- Simple and fast
- Provides only transport layer addressing (UDP ports) and optional checksum
When performance is more important than completeness:
- Streaming (humans only notice significant disruptions)
- Retransmissions not helpful for past points in streams
For short exchanges:
- Simple request/reply (like DNS)
For applications handling reliability in upper layers:
- Some modern protocols use UDP and add TCP features in upper protocols
UDP is simple with a minimal header
- TCP and UDP use ports and sockets for virtual software addressing
- Enables multiple applications to function simultaneously on an IP device
- Different applications communicate between client and server
- Multiplexed for transmission using the same IP and physical connection
- Received data is demultiplexed to appropriate applications
- Source and destination port numbers in TCP/UDP headers
- 16-bit fields allow 65,536 possible port numbers (0-65,535)
- Each port identifies a particular software process
- Clients initiate communication
- Clients need to know server port numbers
- Servers use well-known/registered port numbers
- Example: web server uses port 80
- Clients don't need "known" ports
- Can pick random ports for initial requests
- Servers use this port for responses
- Combination of IP address and port number
- Notation: <IP Address>:<Port Number>
- Example: 41.199.222.3:80 for HTTP server
- Uniquely identifies each connection
- Contains four elements:
- Client IP address
- Client port
- Server IP address
- Server port
- Example: 41.199.222.3:80 and 177.41.72.6:3022
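Because all four elements are part of the identity, a server can hold many connections on the same port. The sketch below models demultiplexing as a dictionary keyed by the four-tuple; the second client port and the session labels are made up for the example.

```python
def socket_pair(client_ip: str, client_port: int, server_ip: str, server_port: int) -> tuple:
    """Build the four-element key that identifies one connection."""
    return (client_ip, client_port, server_ip, server_port)

# Two connections from the same client to the same server port
sessions = {
    socket_pair("177.41.72.6", 3022, "41.199.222.3", 80): "browser tab A",
    socket_pair("177.41.72.6", 3023, "41.199.222.3", 80): "browser tab B",
}

def demultiplex(client_ip: str, client_port: int, server_ip: str, server_port: int):
    """Find which session an arriving segment belongs to, or None."""
    return sessions.get(socket_pair(client_ip, client_port, server_ip, server_port))

print(demultiplex("177.41.72.6", 3022, "41.199.222.3", 80))
print(demultiplex("177.41.72.6", 3023, "41.199.222.3", 80))
```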
Network configuration information needed:
- IP Address
- Subnet Mask
- Default Gateway/Router
- DNS Server Address
DHCP assigns this information automatically.
Two primary phases:
- Initialization Process: Clients request, receive, and confirm IP address assignment
- Renewal Process: Clients coordinate with DHCP server to continue using assigned IP
DHCP uses UDP:
- UDP port 67 for server
- UDP port 68 for client
Four phases:
- Discover: Client attempts to discover a DHCP server
- Offer: IP lease offer from server to client
- Request: Client requests to use the IP lease
- Acknowledgement: Server confirms lease acceptance
- Client has no IP and doesn't know the network
- "Shout into the wilderness"
- Layer 2: Destination MAC is FF:FF:FF:FF:FF:FF (global broadcast), Source MAC is client's
- Layer 3: Destination IP is 255.255.255.255 (global broadcast), Source IP is 0.0.0.0
- Sent from server to client
- Includes valid IP addresses
- Server communicates using client's hardware address
- Includes offered IP as destination in IP header
- Sent from client to server
- Requesting to use the offered IP address
- Final step
- Server confirms IP address for client
- Records information in its database
- Client can now communicate on the network
Depends on network needs:
- Highly static subnets (lab systems): Hours or days
- Reduces broadcast traffic and address changes
- Highly dynamic subnets (dining hall wireless): Minutes
- Prevents holding leases after clients leave
Operations:
- Renewal: Client requests continued use of its lease
  - T1 (default 50% of lease period): Client unicasts to server holding lease
  - If server responds with DHCPACK, lease is renewed
- Rebinding: If no response from original server
  - T2 (default 87.5% of lease period): Client broadcasts requests to any DHCP server
- Expiration: If lease expires with no response
  - Client releases IP and TCP/IP stops except for DHCPDISCOVER broadcasts
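The T1 and T2 timers follow directly from the lease length. A small sketch using the default fractions given above (function and state names are illustrative):

```python
def lease_timers(lease_seconds: float) -> dict:
    """Compute the default renewal (T1) and rebinding (T2) times."""
    return {
        "T1": lease_seconds * 0.5,    # start unicast renewal
        "T2": lease_seconds * 0.875,  # start broadcast rebinding
        "expire": lease_seconds,
    }

def lease_state(elapsed: float, lease_seconds: float) -> str:
    """Return what the client is doing this long after obtaining the lease."""
    timers = lease_timers(lease_seconds)
    if elapsed >= timers["expire"]:
        return "EXPIRED"
    if elapsed >= timers["T2"]:
        return "REBINDING"
    if elapsed >= timers["T1"]:
        return "RENEWING"
    return "BOUND"

one_hour = lease_timers(3600)
print(one_hour, lease_state(2000, 3600))
```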
- OpCode: Request or reply indicator
- Hardware Type: Type of hardware address
- Hardware Length: Length of hardware address
- Hops: Used by relay agents
- Transaction ID: Random number to pair requests with responses
- Seconds Elapsed: Time since first request
- Flags: Traffic types client can accept
- Client IP Address: The client's current IP, filled in only when it already has one (for example, during renewal); otherwise 0.0.0.0
- Your IP Address: IP offered by DHCP server
- Server IP Address: DHCP server's IP
- Gateway IP Address: IP of the DHCP relay agent, if the request was relayed (the default gateway itself is delivered in Options)
- Client Hardware Address: Client's MAC
- Server Host Name: Optional
- Boot File: Optional
- Options: Expands DHCP packet features
The distributed, hierarchical naming structure for Internet systems:
- Root Servers
- Top-level-domain Servers (TLD)
- Authoritative Servers
Authoritative servers contain records for their domains:
- Name (Fully-qualified Domain Name)
- Type (Record type)
- TTL (Time in seconds for caching)
- Value (What the FQDN resolves to)
- Type A: Maps hostname to IP address
- Type NS: Maps domain to authoritative name server
- Type CNAME: Maps alias to canonical name
- Type MX: Maps domain to mail server
Two methods:
- Iterative: Server responds with the name of another server
- Recursive: Server sends requests to other servers and returns results to client
- Request/Response Protocol
- Uses short messages suitable for UDP
- ID Number matches queries with responses
- Primarily uses UDP port 53 for server, ephemeral port for client
- Some activities (zone transfers) use TCP port 53
Key fields:
- Identifier (ID Number): Associates queries with responses
- Q/R: Query or Response indicator
- AA: Authoritative Answer flag
- RD: Recursion Desired
- RA: Recursion Available
- Reduces need to repeatedly ask the same questions
- DNS Records include TTL (Time-to-Live) value
- TTL tells recursive servers/resolvers how long to cache records
- System admins strategically plan TTLs to balance:
- Query volume (longer TTL)
- Propagation speed for changes (shorter TTL)
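TTL-based caching can be sketched as a dictionary that stores an expiry time next to each record. This is a simplified model, not how any particular resolver is built; the class name, the injectable clock, and the sample record are assumptions for the example.

```python
import time

class DnsCache:
    """Toy resolver cache that honors each record's TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # (name, record type) -> (value, expiry time)

    def put(self, name: str, rtype: str, value: str, ttl: int) -> None:
        self._records[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name: str, rtype: str):
        """Return the cached value, or None if missing or past its TTL."""
        entry = self._records.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._records[(name, rtype)]  # expired: must be asked again
            return None
        return value

# A fake clock makes the expiry visible without waiting
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("www.example.com", "A", "192.0.2.10", ttl=300)
fresh = cache.get("www.example.com", "A")
now[0] = 301.0
stale = cache.get("www.example.com", "A")
print(fresh, stale)
```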
A VLAN is:
- A group of devices configured to communicate as if attached to the same wire
- Located on different LAN segments
- Based on logical rather than physical connections
- Extremely flexible
- Traffic cannot pass directly between VLANs within a switch
- To interconnect VLANs, routers or Layer 3 switches are needed
- VLANs are often associated with IP subnets
- Traffic between VLANs must be routed
- After subnetting an organization, the scheme must be implemented on physical infrastructure
- VLANs keep devices on separate networks even when physically adjacent
Switch ports need VLAN configuration:
- Access: For end-devices, assigned a single VLAN
- Trunk: Carry multiple VLANs, connect switches
  - Trunk ports "tag" packets with VLAN ID
  - VLAN IDs must be consistent across switches
How trunk ports identify VLAN membership:
- "Tag" added to Ethernet header with VLAN ID
- Untagged packets assumed to be on "Native" VLAN
- Native VLAN is a default VLAN defined by admin
Common VLAN tagging standard:
- 802.1Q header inserted into Ethernet frame before crossing trunk ports
- 32-bit header (4 bytes):
- 16 bits: Tag Protocol ID (0x8100)
- 16 bits: Tag Control Information including:
- 3 bits: Priority Code Point
- 1 bit: Drop Eligible Indicator
- 12 bits: VLAN ID (allows up to 4,094 VLANs)
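The bit layout of the tag can be sketched with `struct` and a few shifts. The function names and the VLAN number 215 are illustrative.

```python
import struct

TPID = 0x8100  # Tag Protocol ID for 802.1Q

def build_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Pack the 4-byte 802.1Q tag: 16-bit TPID followed by 16-bit TCI."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1-4094")
    tci = (priority << 13) | (dei << 12) | vlan_id  # 3 + 1 + 12 bits
    return struct.pack("!HH", TPID, tci)

def parse_tag(tag: bytes) -> dict:
    """Unpack a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "priority": tci >> 13,
        "dei": (tci >> 12) & 1,
        "vlan_id": tci & 0x0FFF,
    }

tag = build_tag(215, priority=5)
parsed = parse_tag(tag)
print(tag.hex(), parsed)
```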
Three foundations:
- Consistent Formatting: HyperText Markup Language (HTML)
- Addressing Scheme: Uniform Resource Locator (URLs)
- Exchange Protocol: HyperText Transfer Protocol (HTTP)
- HTTP relies on TCP/IP for network communications
- TCP provides reliable, connection-oriented transmission
- TCP provides sequencing and error-checking needed for HTTP
- Uniform Resource Locator (URL) is the most common resource identifier
- URLs describe specific location of a resource on a particular server
- They specify exactly how to fetch a resource from a precise location
- Absolute paths: Include domain name
- Relative links: Point to file or path without domain
  - Example: /graphics/image.png
  - Only usable within the same site
- Paths in URLs aren't the same as filesystem paths
- They start at the "web root" directory
- Web root is the folder that web server software uses to publish resources
- Apache on Linux: often /var/www/html
- IIS on Windows: often c:\InetPub
Two types of HTTP messages:
- Request messages: Request action from web server
- Response messages: Carry results back to client
Both have similar structure with start lines:
- Request line: What to do (method, resource, version)
- Response line: What happened (version, status, description)
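Both start lines are three space-separated parts, so they can be split the same way. A minimal sketch follows; the function names are illustrative, and a real parser would also validate the input.

```python
def parse_request_line(line: str) -> dict:
    """Split a request line into method, resource, and version."""
    method, resource, version = line.split(" ", 2)
    return {"method": method, "resource": resource, "version": version}

def parse_status_line(line: str) -> dict:
    """Split a response line into version, status code, and description."""
    version, status, description = line.split(" ", 2)
    return {"version": version, "status": int(status), "description": description}

request = parse_request_line("GET /graphics/image.png HTTP/1.1")
response = parse_status_line("HTTP/1.1 404 Not Found")
print(request)
print(response)
```

Limiting the split to two separators keeps multi-word descriptions such as "Not Found" in one piece.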
Key status codes:
- 200: OK
- 304: Not Modified
- 400: Bad Request
- 401: Unauthorized
- 403: Forbidden
- 404: Not Found
- Host: Domain name (required in HTTP/1.1)
- User-Agent: Information about requestor
- Accept-Language: Preferred language
- Accept-Encoding: Supported compressions
- If-Modified-Since: Checks for page changes
- Referer: Page where link was clicked
- Cache-Control: Instructions for handling cached pages
- Content-Type: Type of data returned
- Last-Modified: When page was last changed
- Allow user data entry
- Can be processed by:
- Browser scripts (client-side)
- Server scripts (server-side)
- Enclosed in an HTML `<form>` tag
- Tag specifies endpoint and method (GET or POST)
GET:
- Sends data in URL query string
- Example: `GET /test/demo_form.asp?first_name=adam&last_name=goldstein HTTP/1.1`
- Can be cached, bookmarked, remains in history
- Should not be used for sensitive data
- Has length restrictions
POST:
- Sends data in request body
- Example: `POST /test/demo_form.asp HTTP/1.1` with `first_name=adam&last_name=goldstein` in body
- Never cached, doesn't remain in history, can't be bookmarked
- No restrictions on data length
Comparison Table:

| Feature | GET | POST |
| --- | --- | --- |
| Visibility | Data in URL | Data in request body |
| Caching | Can be cached | Never cached |
| History | Remains in browser history | Does not remain in history |
| Bookmarks | Can be bookmarked | Cannot be bookmarked |
| Data sensitivity | Not secure for sensitive data | More appropriate for sensitive data |
| Length restrictions | Limited by URL length | No restrictions |
| Use case | Simple retrievals, searches | Form submissions, uploads |

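
The difference is easiest to see by building both requests from the same form fields. This sketch only assembles the request text and sends nothing; the `Host` value and the header set are assumptions for the example.

```python
from urllib.parse import urlencode

fields = {"first_name": "adam", "last_name": "goldstein"}
encoded = urlencode(fields)  # same encoding is used in both cases

# GET: the encoded fields travel in the URL query string
get_request = (
    f"GET /test/demo_form.asp?{encoded} HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "\r\n"
)

# POST: the encoded fields travel in the body, after the blank line
post_request = (
    "POST /test/demo_form.asp HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```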
- Cryptography: "Lock and key" that protects data through "disguise"
- Cryptographers: Create lock and key
- Cryptanalysts: Attempt to remove the disguise
- Cryptology: Study of cryptography and cryptanalysis
Additional terms:
- Cipher: Method to disguise text
- Plaintext: Original text
- Ciphertext: Disguised text
- Encrypt: Process of disguising
- Decrypt: Remove disguise
Technical terms:
- Algorithm: Step-by-step procedure for calculations
- Key: Information that determines the output of a cryptographic algorithm
- Hash function: Algorithm that takes a data block and returns a fixed-size bit string
- Message Digest: Result of a hash function
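The fixed-size property of a hash function is easy to observe with `hashlib`. SHA-256 is used here only as one common example; the notes do not name a specific algorithm.

```python
import hashlib

def message_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short_input = message_digest(b"NET-215")
long_input = message_digest(b"NET-215" * 10_000)

# Both digests are 64 hex characters (256 bits), whatever the input size
print(len(short_input), len(long_input))
print(short_input != long_input)
```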
Security goals (CIA plus NR):
- Confidentiality: Using cryptography to keep data private
- Integrity: Ensuring data has not been altered
- Authentication: Verifying user identity
- Non-repudiation: Ensuring users cannot deny their actions
- Use encryption to prevent unauthorized disclosure
- Common applications:
- Encryption of data in transit
- Encryption of data at rest
- Two basic methods:
- Symmetric Encryption (Secret Key)
- Asymmetric Encryption (Public Key)
- Some processes combine both types
- Uses a secret key (number, word, random string)
- Applied to a message to change content
- Can be as simple as shifting letters in the alphabet
- Requires sender and recipient to know the secret key
- Pros: Fast, simple, effective
- Cons: Key exchange challenges, keeping keys private
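The letter-shifting idea can be sketched as a classic shift (Caesar) cipher. It is shown only to illustrate that both sides need the same secret key; it offers no real security, and the function name and sample text are illustrative.

```python
def shift_cipher(text: str, key: int) -> str:
    """Shift each ASCII letter by `key` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)  # digits and punctuation pass through unchanged
    return "".join(result)

secret_key = 3
ciphertext = shift_cipher("NET-215", secret_key)
plaintext = shift_cipher(ciphertext, -secret_key)  # same key, opposite direction
print(ciphertext, plaintext)
```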
- Also known as Public Key encryption
- Uses "key pairs":
- Public key: Available to anyone
- Private key: Only known by owner
- Messages encrypted with public key can only be decrypted by matching private key
- Messages encrypted with private key can only be decrypted by matching public key
- Pros: No need to exchange private/secret keys
- Cons: Slower, requires more processing
- Modern techniques combine asymmetric and symmetric encryption:
- Use asymmetric to start communication
- Share a symmetric key for ongoing communication
- Common implementations:
- SSL (Secure Sockets Layer): the original protocol behind HTTPS, now deprecated
- TLS (Transport Layer Security): the successor to SSL, used by HTTPS today
- SSH (Secure Shell) - common in Linux/Unix
- Identifies a user or server (subject)
- Contains information such as:
- Organization name
- Certificate issuer
- Subject's email address and country
- Subject's public key
SSL/TLS process:
- Systems exchange public keys (certificates)
- Use public/private key pairs to establish encrypted tunnel
- Create a secret key (symmetric) and send through tunnel
- Use the secret key to encrypt data for transmission