Networking - RicoJia/notes GitHub Wiki

========================================================================

Learning Path

========================================================================

  • Books: If you are looking for an intro, these Zhihu posts are good
    • Flask web development
      • Templates (HTML code with logic)
        • rendered by a template engine called Jinja
      • great for prototyping; FastAPI, by contrast, is natively asynchronous
        • Flask uses WSGI
    • scrapy?
    • SQL (Structured Query Language) is for relational databases. They are often a poor fit for Big Data workloads with a lot of concurrency, where NoSQL stores such as MongoDB are used instead.

========================================================================

Basics

========================================================================

  1. Web Structure:

    %% graph TD
    flowchart LR
        C(client) --> N(nginx server)
        N --> U(uwsgi server)
    
        %% This is how you highlight
        classDef measurement stroke-width:4px
        class N,U measurement
    
    • WSGI (Python Web Server Gateway Interface): the interface through which the web server side (uWSGI behind nginx) talks to the Python application
    • challenge in front end: new features may not be supported by every browser.
      • CSS (controls how things look; elements are targeted with selectors)
      • JavaScript: powerful, supports interactive widgets without reloading the page, etc.
    • Mozilla dev tools: you can edit the page in place. Refresh the page and the original content comes back
  2. Networking Basics

    • LAN (local area network) vs WAN (wide area network)
    • NAT (network address translation): a router rewrites the source or destination IP address of packets passing through it, so machines on a private network can share one public address
      • usually in physical lan, route requests from one IP to corresponding backend server
        • every time Docker starts, it tries to get onto a new subnet. docker network create mynetwork creates that new subnet; docker network inspect mynetwork shows it
          • docker network inspect: gives the subnet and ID
            • config ip a should
            • when you run nginx with docker run -p 80:80 and then open <IP address>:80, you can see it
      • bridge vs natting: br- in IP
        • wireshark can run tcpdump, but need to run with sudo
          • capture everything, then filter
          • linux networking: create a VPN: see network interface. Route network interface to the new?
        • Pritunl: a new network interface. Bridge
          • create a new subnet
          • External network interface
    • MAC (Media Access Control) address: identifies a specific device; each network interface has a unique address
    • SSH is an application-layer protocol
    • Network interface: every device has one. --> switch (like a network port expander) --> router (assigns IPs to MAC addresses)
      • Could be a network interface card, or operating-system-level software. Includes virtual interfaces like loopback.
      • Identified by a unique MAC address
      • a switch is a network bridge with inbound and outbound Ethernet ports. It analyzes incoming traffic and directs it only to the destination device.
        • An old network hub blasts all traffic to all outbound ports, which is its shortcoming!
  3. An IP address is part of the TCP/IP protocol suite. TCP is Transmission Control Protocol, IP is Internet Protocol.

    1. An IP address can be public, assigned by the ISP, or private, assigned by the router.
      • 192.168.x.x is a private address range: it identifies a machine only within its own local network
      • A port is a 16-bit number (0 to 65535, i.e. 2^16 values), used for differentiating applications on the same IP address.
      • IP address + port number = socket address.
      • Web servers use port 80 by default for HTTP (443 for HTTPS), but any port can be configured. Since there are only 65536 port numbers, picking an unusual port is not really a safety feature.
      • Once a connection is gone, the socket should be cleaned up.
    2. a switch is faster than a router
    3. Lo
      • In ifconfig, lo means "loopback interface", a virtual interface that connects the device to itself.
        • 127.0.0.1 is called the "loopback" address. No external machine can connect to it
        • 0.0.0.0 as a bind address means "all local interfaces", so a server bound to it is reachable from external machines
      • A loopback test is used to check whether the device can send/receive signals, to detect hardware failures.
      • It's always 127.0.0.1, which is called localhost during web server setup (the local machine).
      • It's not the address your computer uses to communicate with other machines.
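
The difference between binding to 127.0.0.1 and to 0.0.0.0, and the idea that a socket address is an IP address plus a port, can be sketched in Python. This is only an illustration: the helper name bound_address is made up for this note, and port 0 is used so the OS picks any free port.

```python
import socket

def bound_address(host: str) -> tuple:
    """Bind a TCP socket to (host, any free port) and return the (ip, port) it got."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))        # port 0: let the OS choose an unused port
        return s.getsockname()   # (ip, port) is the socket's local endpoint

loop_ip, loop_port = bound_address("127.0.0.1")  # reachable only from this machine
any_ip, any_port = bound_address("0.0.0.0")      # listens on every local interface
print(loop_ip, loop_port)
print(any_ip, any_port)
```

The printed port numbers differ from run to run, since the OS assigns them.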
  4. Half duplex, full duplex, multiduplex

    • HTTP is half duplex.
  5. Polling: the client periodically asks the server for updates. HTTP/1.1 and HTTP/2 have persistent connections that last for several request-response exchanges. The server sends a response to each request, but to learn about updates the client still needs to poll.

    • short polling: if there are no updates, the server answers immediately (with an empty response) instead of holding the request
    • long polling: if there are no updates, the server holds the request open until an update arrives or a timeout expires
    • WebSocket is full duplex (fully bidirectional) over a long-lived connection, so once there's an update the client gets it immediately. No more polling
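
A rough way to see the difference between short and long polling, without any real HTTP, is to model the server's pending updates as a queue. This is a toy sketch: the names updates, short_poll and long_poll are invented here, and a real server would hold an HTTP request open rather than block on a queue.

```python
import queue
import threading
import time

updates = queue.Queue()   # stands in for the server's pending updates

def short_poll():
    """Server answers immediately: an update if one is ready, else None."""
    try:
        return updates.get_nowait()
    except queue.Empty:
        return None

def long_poll(timeout: float):
    """Server holds the request until an update arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout)
    except queue.Empty:
        return None

print(short_poll())                                   # nothing ready, returns at once
threading.Timer(0.2, updates.put, args=["new data"]).start()
start = time.monotonic()
print(long_poll(timeout=2.0))                         # returns once the update is published
print(f"waited about {time.monotonic() - start:.1f}s")
```

With short polling the client would have to keep asking; with long polling a single held request returns as soon as the update exists.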

Protocols

  1. HTTP request: runs on top of TCP/IP.

    • GET: everything is in the request line and headers (no body). Used for queries, no side effects. Can be cached and bookmarked by the browser. For example, curl http://10.0.1.85/L sends GET /L
    • HEAD: ask for only the headers of a page.
    • POST: the content is in the HTTP body; used for changing something on the server. Not cached or bookmarked by the browser, so safer
    • PUT: ask the server to replace a resource with the enclosed content
    • DELETE: delete a resource
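
To see the methods side by side, here is a small sketch using only Python's standard library: a throwaway local server plus a client that sends GET, HEAD and POST to it. The Handler class, the request helper and the /L path (borrowed from the curl example) are illustrative choices, not part of any existing setup.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def _reply(self, body: bytes, send_body: bool = True):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):                       # query only, no request body
        self._reply(b"GET " + self.path.encode())

    def do_HEAD(self):                      # same headers as GET, but no body
        self._reply(b"GET " + self.path.encode(), send_body=False)

    def do_POST(self):                      # content travels in the request body
        length = int(self.headers.get("Content-Length", 0))
        self._reply(b"POST got " + self.rfile.read(length))

    def log_message(self, *args):           # keep the demo quiet
        pass

def request(method: str, path: str, body=None):
    """Send one request to the local demo server and return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request(method, path, body=body)
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return resp.status, data

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

print(request("GET", "/L"))
print(request("HEAD", "/L"))
print(request("POST", "/L", b"name=rico"))
```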
  2. what is a socket?

    1. an "endpoint" (endpoint is not a networking term) of a bi-directional inter-process communication flow.
    2. an endpoint = IP address + port number. A socket is bound to an endpoint.
      • An IPv4 address is 4 bytes.

      • Q: Every TCP connection has 2 endpoints; does that mean I need to create 2 sockets? A: Your library creates only 1 endpoint for a TCP socket; the endpoint on the client's side is implemented by the client. Additionally, you're using ServerSocket, which implements a socket that servers can use to listen on a port and accept connections.

      • Once we accept a connection, a new socket is created, bound to the same local endpoint, with its remote endpoint set to the client's address and port.

      • Can we connect this to the web?

        • A: unless you use URL sockets, which is a related class.
      • Make sure your socket doesn't clash with other nodes.

  3. RTP, RTSP, TCP: RTSP controls video streaming (play, pause, etc.) over RTP. RTP handles synchronization of media streams and identifies lost packets (each packet carries a sequence number). RTP can be transmitted over UDP or TCP

========================================================================

Networking Workflows

========================================================================

  1. Router

    1. has a route table: a set of rules that says where to send packets
    2. local network VLANs (virtual local area networks, subnets)
      • easy to find them on the router
      • ARP (Address Resolution Protocol) scans may be limited to the current network segment.
        • ARP purpose: find the MAC address of a machine from its IP address:
          1. Two devices that communicate with each other on the same network must know each other's MAC addresses.
          2. Given an IP address, the sender first looks it up in its ARP table.
          3. If there is no entry, it broadcasts an ARP request; the device that owns the IP replies with its MAC address, and the sender updates its ARP table.
            • The device can choose not to respond
          4. This is the ARP protocol, not ICMP
        • arp-scan does network discovery by scanning the current network segment
      • ICMP (Internet Control Message Protocol, same layer as IP) is what ping uses. Used for diagnostics, not for carrying application data
    3. A proxy: private IPv4 addresses are not routable on the public internet. IPv6 was designed so that every device can have a globally routable address, although it also defines unique local addresses (fc00::/7).
  2. VPN-Firewall

    • palo alto firewall is great capacity, 7000, vs Unify, by cisco, $80000. malware can reach all other devices. DNS filtering, different levels of subscriptions, come with global protect suite. Blocking other vpn tools.
    • Split tunnel
      • only specific traffic is "routed" through the VPN. The VPN takes over the internet gateway (NAT) for the subnet block it handles
  3. Docker Container

    • Networking works the same as regular Linux networking.
    • network resolution
      
      

========================================================================

Building a Server

========================================================================

Socket

  1. Workflow

    1. listen on a socket (socket() -> bind() -> listen() -> accept() -> block until a connection is established)
      • client does socket() -> connect()
    2. read() (while the client is writing)
      • client does write()
    3. write() (analyze the request -> if valid, compute the result (run CGI and return its output, or just return text) -> build the HTTP response)
      • client does read()
    4. read() (wait for the client to finish sending)
      • client sends EOF
    5. close()
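
The call sequence above can be tried out with a small Python sketch in which a server thread and a client talk over localhost. It is simplified compared to the steps listed: the server reads until the client signals EOF and only then writes its reply. The function names and the "echo: " prefix are made up for illustration.

```python
import socket
import threading

def serve_once(server_sock: socket.socket) -> None:
    """Server side: accept() one client, read() until EOF, write() a reply, close()."""
    conn, _client_addr = server_sock.accept()       # blocks until a client connects
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)                  # b"" means the client sent EOF
            if not data:
                break
            chunks.append(data)
        conn.sendall(b"echo: " + b"".join(chunks))  # write the response

def echo_roundtrip(message: bytes) -> bytes:
    """Run a one-shot server and a client against it; return what the client read."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    server_sock.bind(("127.0.0.1", 0))                               # bind(), any free port
    server_sock.listen(1)                                            # listen()
    port = server_sock.getsockname()[1]
    threading.Thread(target=serve_once, args=(server_sock,), daemon=True).start()
    try:
        # client: socket() -> connect() -> write() -> EOF -> read()
        with socket.create_connection(("127.0.0.1", port)) as s:
            s.sendall(message)
            s.shutdown(socket.SHUT_WR)              # signal EOF, keep the read side open
            reply = b""
            while chunk := s.recv(1024):
                reply += chunk
            return reply
    finally:
        server_sock.close()

print(echo_roundtrip(b"hello"))
```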
  2. Definitions:

    • A server listens on an IP address and a port.
    • Clients send requests to it.
    • A socket is a communication endpoint for multiple processes, possibly across a network
      • each process should create a socket
      • Two processes can communicate only when their socket type and address domain match
        • Address domain (one or the other):
          • Unix domain
            • two processes sharing the same file system
            • the address of the socket is a character string, an entry in the file system
          • Internet domain (most common)
            • two computers (host machines)
            • the address of the socket is the IP address plus a port
        • Socket type
          • TCP (aka stream socket)
            • continuous
              • like a pipe, bi-directional.
              • any number of bytes
              • bytes arrive in the order they were sent
            • does error checking and retransmission (data arrives intact and in order)
              • the other end gets notified if one side closes/resets the connection. If the connection is lost, it is reset. Heavy
          • UDP (User Datagram Protocol, aka datagram socket)
            • each message is sent in one shot as a single datagram (8 KB)
            • no delivery or ordering guarantees
            • So lightweight, but unreliable
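
For contrast with the stream socket, a datagram exchange needs no listen(), accept() or connect(). The sketch below sends one UDP message over localhost; udp_roundtrip is a hypothetical helper name, and on a real network the datagram could simply be lost.

```python
import socket

def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # any free port
    receiver.settimeout(2.0)             # don't wait forever if the datagram is lost
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(message, receiver.getsockname())  # no handshake beforehand
        data, _addr = receiver.recvfrom(65535)          # one recvfrom() = one whole message
        return data
    finally:
        sender.close()
        receiver.close()

print(udp_roundtrip(b"ping"))
```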
  3. Serialization & deserialization, Good Read

    • nanopb and protobuf both serialize and deserialize a data structure for transmission over the wire; nanopb is a small C implementation of Protocol Buffers aimed at embedded systems
    • protobuf (Protocol Buffers)
      • invented by Google; produces smaller data packets than JSON or XML
        • uses a compact binary encoding rather than a raw string
        • Not human readable, whereas JSON and XML are.
      • the definition of the data to be serialized is in a .proto file. The definitions are called messages.
        • How it works: both ends of the transmission share this pre-determined schema, so only field numbers and values are transmitted, not field names.
           syntax = "proto3";
           message Person {
             uint64 id = 1;
             string email = 2;
             bool is_active = 3;
             enum PhoneType {
               MOBILE = 0;
               HOME = 1;
               WORK = 2;
             }
             message PhoneNumber {
               string number = 1;
               PhoneType type = 2;
             }
             repeated PhoneNumber phones = 4;
           }
          
          • uint64, bool, float are scalar types; enum is its own type; PhoneNumber is a nested message type
          • repeated means array. proto2 also had required and optional
          • 1, 2, 3, 4 are field tags; they replace the field names on the wire.
      • a .proto file can be compiled to generate code in the user's programming language.
      • Might be too easy to crack, as with the MitM attacks on Pokemon Go
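
To make "only the numbers are transmitted" concrete, here is a hand-rolled sketch of the protobuf wire format for three fields of the Person message (id, email, is_active). It covers only varint and string fields and is not the real protobuf library; encode_varint and encode_field are names invented for this note, and the field values are made up.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_field(field_number: int, value) -> bytes:
    """Encode one field: key = (field_number << 3) | wire_type, then the payload."""
    if isinstance(value, str):        # wire type 2: length-delimited
        payload = value.encode("utf-8")
        return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload
    return encode_varint((field_number << 3) | 0) + encode_varint(int(value))  # wire type 0: varint

# Person{id=150, email="a@b.c", is_active=true}, using field tags 1, 2, 3 from the schema
message = encode_field(1, 150) + encode_field(2, "a@b.c") + encode_field(3, True)
print(message.hex(" "))   # 08 96 01 12 05 61 40 62 2e 63 18 01
```

The field names id, email and is_active never appear in the 12 bytes; only the tags 1, 2 and 3 do, which is why both ends need the same .proto file.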
  4. Broadcasting:

    • This is UDP only. Every time a machine comes online, it broadcasts to all machines.
      1. You can broadcast through a network switch?
  5. Websocket

  • If already joined a room, rejoining it will block.
  • socket will try connecting in a lazy fashion

Server

  1. Create a socket
        socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)	// address family (IPv4 or IPv6), socket type, protocol (TCP or UDP).
    	Returns a descriptor referencing the socket.
    	close(server_socket or client_socket)
    
    • port number: 18000 here. 80 is the HTTP default, but ports below 1024 usually need superuser privileges.
    • socket() just allocates resources. SOCK_STREAM is like a file stream, but one that goes to the client.
  2. Bind a socket
    sockaddr_in service;                    // structure for storing an IPv4 address
    service.sin_family = AF_INET;           // address family
    service.sin_addr.s_addr = INADDR_ANY;   // accept connections on any local interface
    service.sin_port = htons(5555);         // network byte order; this port number should be > 1024
    if (bind(server_socket, (SOCKADDR*)&service, sizeof(service)) == SOCKET_ERROR) {
        printf("bind failed");
        closesocket(server_socket);
    }
    // Remember: clients can connect from the same machine or from other machines on the network.
    
  3. Listen Function
    • listening: telling the operating system to start queueing incoming connections on the socket.
        if (listen(server_socket, request_queue_size) == SOCKET_ERROR) printf("ERROR listening on socket");
    
  4. Accept Connection
        SOCKADDR_IN client_addr;
        SOCKET client_socket;
        int client_addr_size = sizeof(client_addr);
        client_socket = accept(server_socket, (SOCKADDR*)&client_addr, &client_addr_size);	// new socket wrapping the connection to this client
        if (client_socket == INVALID_SOCKET) printf("accept failed");
    
  5. receive
    recv(client_socket, buffer, sizeof(buffer), 0)
    
  6. disconnect
    • 3-6 can be summarized as
      while (1) {
          connfd = accept(server_socket, NULL, NULL);	// NULL: we don't need the client's address (one connection per HTTP client request)
          // connfd is what we actually use to talk to the client. The original socket is still listening. accept() is blocking.
      
          memset(recvline, 0, MAXLINE);	// reset the char buffer
          while ((n = read(connfd, recvline, MAXLINE - 1)) > 0) {
              bin2hex(recvline);	// the HTTP request arrives as raw bytes; dump it as hex
              if (recvline[n - 1] == '\n') break;	// crude end-of-request check
          }
          snprintf(buff, sizeof(buff), "HTTP/1.0 200 OK\r\n\r\nHello");	// server writes a response into a buffer
          write(connfd, buff, strlen(buff));
          close(connfd);
      }
    • Should be able to work with a web browser
      • build using cmake, run it.
      • open browser, localhost:port_num; if no port_num is specified, it's 80 by default.
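
The same accept/read/write/close loop can be tried quickly in Python before writing the C version. This is a sketch under assumptions: it serves exactly one request, stops reading at the blank line that ends the HTTP headers, and uses http.client in place of a browser; serve_hello and fetch_hello are made-up names.

```python
import http.client
import socket
import threading

def serve_hello(server_sock: socket.socket) -> None:
    """Accept one connection, read the request headers, answer with a fixed response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:      # a blank line ends the headers
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        body = b"Hello"
        conn.sendall(
            b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n" + body
        )

def fetch_hello() -> bytes:
    """Start the one-shot server on a free port and fetch / from it."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    threading.Thread(target=serve_hello, args=(server_sock,), daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", "/")
        data = client.getresponse().read()
        client.close()
        return data
    finally:
        server_sock.close()

print(fetch_hello())
```

Without the status line and headers, most browsers will not render the reply, which is why the response is more than the bare "Hello" text.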

Custom Lib - crow

  • What does crow do? It is a C++ library for creating web servers, websockets, etc.
  • Add routes (endpoints) to your app
     CROW_ROUTE(app, "/")([](){
         return "Hello world";
     });
    
  • run your app once you're happy with the route settings
     app.port(18080).multithreaded().run();
    

========================================================================

AWS VPC, EC2

========================================================================

Basics

  1. VPC is like a parking lot

  2. VPN server: t2.micro is free tier (allows 2 machines max)

    1. Set up an Amazon VPN server. video
      • Getting into ssh server: sudo ssh -i "vpn.pem" [email protected]
        • vpn.pem should be downloaded there.
      • Choose "use routing" instead of "use NAT"
      • In advanced VPN settings, enable Inter-Client Communication
    2. On client computers: Install Client
      sudo wget https://swupdate.openvpn.net/repos/openvpn-repo-pkg-key.pub
      sudo apt-key add openvpn-repo-pkg-key.pub
      
    3. How to connect
      1. Go to EC2 AWS Console
      2. find Public IPv4 Address, under EC2->Instances->i-0d24ca8ee730a9a8c
      3. Admin: https://3.22.234.161:943/admin. username is openvpn
      4. For client, do https://3.22.234.161:943/, username is also openvpn
      5. On local machine, openvpn3 session-start --config profile-4.ovpn
      6. To Disconnect: openvpn3 session-manage --session-path /net/openvpn/v3/sessions/3d98ca84sd46cs437ds9b80s5484f1bc6726 --disconnect
        • or openvpn3 session-manage --session-path $(openvpn3 sessions-list | grep "Path" | sed 's/Path://g') --disconnect
    4. How to connect on rpi:
      • Install OpenVPN
      • Get the openvpn profile.
      • sudo openvpn --config profile-4.ovpn

========================================================================

Edge Server, K8S

========================================================================

Whole Process

  1. edge device: e.g. a car with 50 CPU, with quite some computing power
  2. edge server: IT equipment to monitor the production process. Talks to many edge devices. Each is run in containers (distributed by K8S)
  3. security matters, because the edge server is not in our central datacenter

========================================================================

OSI model (Open System Interconnect)

========================================================================

  1. Application: user apps like a web browser

  2. Presentation layer: compresses & encrypts data, like JPEG, MPEG4S, SSH, etc.

    • encodings such as big-endian, little-endian, UTF-8/UTF-16
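
The encodings mentioned here are easy to inspect from Python. The snippet below is just an illustration with arbitrarily chosen values: it packs the same 32-bit integer in both byte orders and encodes the same character in UTF-8 and UTF-16.

```python
import struct

value = 0x12345678
big = struct.pack(">I", value)      # big-endian: most significant byte first
little = struct.pack("<I", value)   # little-endian: least significant byte first
print(big.hex(), little.hex())      # 12345678 78563412

char = "é"                          # U+00E9
print(char.encode("utf-8").hex())       # c3a9 (two bytes)
print(char.encode("utf-16-be").hex())   # 00e9 (one 16-bit unit, big-endian)
```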
  3. Session: opens & closes "sessions" with the host. A virtual connection (aka transport connection) is established, no network communication. (lower layers will send one-time msgs.)

    • the whole big "session", in which the smaller "transport" sessions and the failure communication take place.
      • RPC is for remote computers; SQL is for databases; X Window System
  4. Transport Layer

    • packages, addresses, and sends data from the session layer (TCP and UDP are here)
    • Firewall
    • TLS (Transport Layer Security) is the successor to SSL. Uses digital certificates to authenticate the server; HTTPS (HTTP Secure) runs over TLS
  5. Network layer: main purpose is to find the way through.

    • routing: finds how to get the packet to the destination host. Splits data into packets and reassembles them (so the data becomes complete again)
      • logical addressing (IPv4, IPv6)
      • data enc
  6. Data Link Layer

    • It is the network driver that controls the network card (hardware)
      • Manages requests from 2 competing layers
      • handles physical addressing of the connection.
    • MAC sublayer (media access control): provides network interaction
      • device network interaction
    • LLC sublayer (logical link control)
      • addressing and multiplexing
  7. Physical Layer: network card

  • Worth mentioning that TCP/IP predates OSI and uses its own model, the Internet Protocol Suite. It is just a different partitioning of the same things

  • each private network is isolated from the others. Inside each network, addresses are unique.

    • To connect different subnets, we have NAT (Network Address Translation)
    • I.e., 192.168.0.1 can exist in every single subnet
  • IPv4 vs IPv6:

    • IPv4: 192.168.0.1 is 1100 0000 - 1010 1000 - 0000 0000 - 0000 0001 (32 bits)
    • Whereas IPv6: 1203:8fe0:fe80:b897:8990:8a7c:99bf:323d... (128 bits)
  • subnetting:

    • 192.168.0.x and 192.168.1.x are subnets; the last part identifies the host within the subnet.
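
Python's standard ipaddress module is a convenient way to check this kind of address arithmetic. The /24 prefix length below is an assumption that matches the "192.168.0" and "192.168.1" subnets in these notes.

```python
import ipaddress

addr = ipaddress.ip_address("192.168.0.1")
print(f"{int(addr):032b}")                 # the 32 bits of the IPv4 address
print(addr.is_private)                     # True: 192.168.0.0/16 is a private range

subnet_a = ipaddress.ip_network("192.168.0.0/24")
subnet_b = ipaddress.ip_network("192.168.1.0/24")
print(addr in subnet_a, addr in subnet_b)  # True False
print(subnet_a.num_addresses)              # 256 addresses, .0 through .255
print(ipaddress.ip_address("::1").max_prefixlen)  # 128 bits for IPv6
```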

========================================================================

Commandline tools

========================================================================

  • netstat -plntc
  • sudo arp-scan to scan devices on the same network

========================================================================

HTML Basics

========================================================================

  1. tags
    • <strong>: bold
    • <a href="ref">some text</a>

========================================================================

MQTT

========================================================================

What is MQTT:

  • We have one broker, and everybody else is a client. Client -- publish --> broker <-- subscribe -- subscriber 1, subscriber 2...
  • The broker can be run on a local machine, or in private AWS, Azure, etc., or be a public one
  • Typical workflow:
    1. Clients establish a TCP/IP connection with the broker
    2. A client publishes to a certain topic. The broker receives it, then directs it to the subscribed clients

MQTT is great for IoT applications:

  • has a very lightweight client implementation
  • Its message headers are very short
  • has 3 Quality of Service (QoS) levels:
    • QoS 0: message is delivered at most once (so it can get lost)
    • QoS 1: message is delivered at least once
    • QoS 2: message is delivered exactly once
  • Has TLS (Transport Layer Security) / SSL encryption.
  • Can choose between a clean session or a persistent session when a client reconnects after a disconnection
    • clean session: after reconnecting, messages published while the client was offline are not available
  • Will: a message the broker sends on the client's behalf (optionally after a delay) once the client disconnects unexpectedly

MQTT topic subscription supports wildcards:

  • topic/+ matches a single level, e.g. topic/x, topic/y
  • topic/# matches any number of levels, e.g. topic/x/a, topic/y/b, etc.
  • These wildcards cannot be used when publishing
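
The wildcard rules can be captured in a few lines. The function below is a simplified sketch written for these notes (it ignores special cases such as topics starting with $); it is not taken from any MQTT client library.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if an MQTT subscription filter matches a concrete topic name."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(sub_levels):
        if level == "#":                  # multi-level wildcard: matches everything below
            return True
        if i >= len(topic_levels):        # filter is longer than the topic
            return False
        if level != "+" and level != topic_levels[i]:   # "+" matches exactly one level
            return False
    return len(sub_levels) == len(topic_levels)

print(topic_matches("topic/+", "topic/x"))      # True
print(topic_matches("topic/+", "topic/x/a"))    # False: "+" covers one level only
print(topic_matches("topic/#", "topic/x/a"))    # True
```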

When creating an MQTT client:

  • Server URL, port (8883 for SSL)
  • endpoint
  • Cert, key
  • on_connection_interrupted, on_connection_resumed
  • client id
  • clean_session, keep_alive_secs
  • on_connection_success, on_connection_failure, on_connection_closed

  • Question:
    sslopts.set_trust_store("/path/to/ca.crt");
    sslopts.set_key_store("/path/to/client.crt");
    sslopts.set_private_key("/path/to/client.key");
    sslopts.set_enabled_cipher_suites("TLSv1.2");
    sslopts.set_enable_server_cert_auth(true);
    

./build/basic-connect --endpoint "akx2axdo1d3zl-ats.iot.us-east-1.amazonaws.com" --cert "./pcdu0/pcdu0-device.pem.crt" --key "./pcdu0/pcdu0-private.pem.key" --ca_file "./Amazon-root-CA-1.pem" --port_override 8883 --client_id "pcdu0"

========================================================================

MODBUS

========================================================================

guide: https://www.debugself.com/2020/01/01/modbus_guide/

  1. 3 main types of Modbus interfaces: ASCII, TCP/IP (over Ethernet), RTU (over RS-232, RS-485).

    • RS-232 is older, and only meant for point-to-point, short range. RS-485 is point-to-multipoint, long range
    • Binary is written as hex here, as it's easier to read
    • in Modbus ASCII, the hex number 0F is represented as the characters "0" and "F" (2 chars, so 2 bytes) when sent over the wire. Modbus RTU just sends 0F as 1 binary byte.
  2. Modbus RTU (remote terminal unit):

    • Master initiates: | Address of slave, 1B | Function code (action being taken, read or write), 1B | number of data | required data (nB) | CRC low byte, CRC high byte |
      1. address 1B, after a silent gap of 3.5 character times. On receiving a frame, all slaves decode the address to see if it is meant for them
  3. 4 types of registers (data types). Each type of register has its own address space starting from 0, so writing to one type doesn't affect the others

    1. Coil: 1 bit (named after the coil on a relay switch), read/write
    2. Discrete input: 1 bit, read only
    3. Input register: 16 bits, read only
    4. Holding register: 16 bits, read/write
  4. Function codes (1 byte): specify which operation is to be done on which registers. E.g.,

    • Public ones, defined in the Modbus manual
      • 0x01, batch reading register type 0 (coils)
      • 0x02, batch reading register type 1 (discrete inputs)
      • ...
    • User-defined ones: see the register map
  5. PDU (Protocol Data Unit): function code + payload

    • Request: start address and count of registers
    • Response: each register's data is packed into 2 bytes, high byte first, low byte second.
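
As a worked example of the RTU frame layout and the register byte order, the sketch below builds a "read holding registers" request (function code 0x03) and unpacks register data. The helper names are invented for this note, and the register bytes passed to parse_registers are made-up sample data.

```python
import struct

def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def build_read_request(slave: int, function: int, start: int, count: int) -> bytes:
    """RTU request: slave address, function code, start address, register count, CRC."""
    body = struct.pack(">BBHH", slave, function, start, count)   # data fields: high byte first
    return body + struct.pack("<H", crc16_modbus(body))          # CRC: low byte first

def parse_registers(payload: bytes) -> list:
    """Unpack response data: each 16-bit register is two bytes, high byte first."""
    return list(struct.unpack(f">{len(payload) // 2}H", payload))

frame = build_read_request(slave=1, function=0x03, start=0, count=1)
print(frame.hex(" "))                                    # 01 03 00 00 00 01 84 0a
print(parse_registers(bytes([0x00, 0x2A, 0x01, 0x00])))  # [42, 256]
```

Note the mixed byte order: addresses and register values go high byte first, while the CRC goes low byte first, as in the frame layout in item 2.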
  6. Modbus over TCP/IP: no more slave address and CRC, just the function code and payload, because TCP/IP already handles addressing and error checking for you.

    • Modbus is half duplex (only one side can talk at a time), but TCP is full duplex