Networking - RicoJia/notes GitHub Wiki
========================================================================
========================================================================
- Books: If you are looking for intro, these Zhihu posts are good
- flask webdevelopment
- Templates (HTML code with logic)
- comes from template engine, called Jinja
- great for prototyping, while FastAPI is natively asynchronous
- Use WSGI
- scrapy?
- SQL (Structured Query Language) is used with relational databases, which can be a poor fit for Big Data workloads with a lot of concurrency. MongoDB is a NoSQL alternative.
========================================================================
========================================================================
-
Web Structure:
    %% graph TD
    flowchart LR
    C(client) --> N(nginx server)
    N --> U(uwsgi server)
    %% This is how you highlight
    classDef measurement stroke-width:4px
    class N,U measurement
- WSGI (Python Web Server Gateway Interface) can talk to nginx
- challenge in front end: new features may not be supported by browser.
- CSS (how things look, there are selectors we can use.)
- Javascript: powerful, supports gadgets without reloading pages, etc.
- Mozilla dev tools: you can edit elements directly on the page. Refreshing the page restores the original content.
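The WSGI interface mentioned in the list above is only a calling convention between a web server (such as uWSGI behind nginx) and a Python application. A minimal sketch using just the standard library; `hello_app` and `call_app` are made-up names for illustration, not part of any framework:

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: a callable taking (environ, start_response)."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]  # WSGI apps return an iterable of bytes


def call_app(app, path="/"):
    """Drive the app the way a WSGI server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible request environment
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

A Flask application object is a WSGI callable of this same shape, which is why the same app can be served by different WSGI servers.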
-
Networking Basics
- Lan (local area network) vs WAN (wide area network)
- NAT (network address translation): a router rewrites the source or destination IP address of packets as they pass through it
- usually in physical lan, route requests from one IP to corresponding backend server
- every time docker starts, it tries to get on to a new subnet. So docker network create mynetwork creates that new subnet.
docker network inspect mynetwork
- docker network inspect: gives subnet, and ID;
- config ip a should
- when you do `docker run -p 80:80` to run nginx, then open `IP address:80`, you can see it
- bridge vs natting: br- in IP
- wireshark can run tcpdump, but need to run with sudo
- capture everything, then filter
- linux networking: create a VPN: see network interface. Route network interface to the new?
- Pritunl: a new network interface. Bridge
- create a new subnet
- External network interface
- MAC (Media Access Control) address: distinguishes specific devices; each device has a unique address
- SSH is on application layer
- Network interface: every device has one. --> switch (like a network port expander.) --> Router (will assign IPs to MAC.)
- Could be a network interface card, or operating-system-level software. Includes virtual interfaces like loopback.
- Identified by a unique MAC address
- a switch works like a multi-port network bridge, with inbound and outbound ethernet ports. It analyzes incoming traffic and directs each frame only to the destination device.
- An old network hub blasts all traffic to every outbound port, which is its main shortcoming.
-
Ip address is part of TCP/IP protocol. TCP is transmission control protocol, IP is internet protocol.
- An IP address can be public, assigned by ISP, or private, assigned by router.
-
192.168.x.x is a private address range, used by machines on the same local network
- A port is a 16-bit number (0 to 65535), used for differentiating applications on the same IP address.
- IP Address + Port number = socket.
- Web servers conventionally use port 80 (HTTP), but a different port can be assigned. Since there are only 65536 port numbers, picking an unusual port is not really a safety feature.
- Once a connection is gone, the socket should be cleaned.
-
- a switch is faster than a router
- Lo
- In ifconfig, lo means "loopback interface", a virtual interface that connects the device to itself.
- 127.0.0.1 is called the "loopback" address. No external machine can connect to it
- A server bound to 0.0.0.0 listens on all interfaces, so it is accessible to external machines
- A loop back test is used to test if the device can send/receive signals, for hardware failures.
- It's always 127.0.0.1, which is called localhost during web server setup. (the local machine)
- It's not the address your computer uses to communicate with other machines.
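The difference between the loopback address and 0.0.0.0 shows up when a server binds its listening socket. A small sketch, assuming a helper named `bind_and_report` (made up for this example) and using port 0 so the OS picks any free port:

```python
import socket


def bind_and_report(address):
    """Bind a TCP socket to (address, any free port) and return what was bound."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((address, 0))      # port 0 asks the OS for any free port
        s.listen(1)
        return s.getsockname()    # (ip, port) the socket is actually bound to


# Only processes on this machine can reach a socket bound to the loopback address.
loopback_ip, loopback_port = bind_and_report("127.0.0.1")

# A socket bound to 0.0.0.0 accepts connections arriving on any local interface.
any_ip, any_port = bind_and_report("0.0.0.0")
```

The (ip, port) pair returned by `getsockname()` is the "IP address + port number" combination described earlier.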
-
Half duplex, full duplex, multiduplex
- HTTP is half duplex.
-
Polling: the client periodically asks the server for updates. HTTP/1.1 and HTTP/2 have persistent connections that last for several request-responses. The server sends a response to each request, but for updates the client still needs to poll.
- short polling: if there are no updates, the server answers right away instead of holding the request
- long polling: the server holds the request open until there is an update (or a timeout), then responds
- WebSocket is full duplex (fully bidirectional), with a long-lived connection, so once there's an update, the client gets it immediately. No more polling
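The difference between short and long polling can be sketched on the server side with a queue of pending updates. This is a simplified in-process model, not real HTTP handling; `short_poll`, `long_poll`, and the `updates` queue are names invented for the illustration:

```python
import queue


def short_poll(updates):
    """Short polling: answer immediately, even when there is nothing new."""
    try:
        return updates.get_nowait()
    except queue.Empty:
        return None


def long_poll(updates, timeout=30.0):
    """Long polling: hold the request until an update arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout)
    except queue.Empty:
        return None
```

With short polling the client has to decide how often to ask again; with long polling the server decides when to answer, so the client typically re-issues the request as soon as it gets a response.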
-
HTTP Request: runs on top of TCP/IP.
- GET carries its parameters in the URL and headers; it is for queries, with no side effects. Can be cached or bookmarked by the browser
- `curl http://10.0.1.85/L` is doing `GET /L`
- HEAD: asks for the headers of a page only.
- POST carries its content in the HTTP body, for changing something on the server. The content is not saved by the browser, so it is somewhat safer
- PUT: asks the server to replace something with the given content
- DELETE: deletes a page
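To make the GET/POST distinction concrete, the sketch below builds the raw bytes of a minimal HTTP/1.1 request. `build_request` is a hypothetical helper written for this note, and the `page=2` parameter is an assumed example; the host comes from the curl example above:

```python
def build_request(method, path, host, body=b""):
    """Return the raw bytes of a minimal HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # a blank line ends the headers
    return head.encode("ascii") + body


# GET: the query travels in the request line, there is no body.
get_request = build_request("GET", "/L?page=2", "10.0.1.85")

# POST: the same data travels in the body, after the blank line.
post_request = build_request("POST", "/L", "10.0.1.85", b"page=2")
```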
-
what is a socket?
- an "endpoint" (end point is not a networking term) of a bi-directional inter-process communication flow.
- an endpoint = IP address + port number. A socket is bound to an endpoint.
-
IP address is 4 bytes.
-
Every TCP connection has 2 endpoints (does that mean I need to create 2 sockets?) - A: your lib creates 1 endpoint for a TCP socket; the endpoint on the client's side is created by the client. Additionally, you're using ServerSocket, which implements a socket that servers can use to listen on a port and accept connections.
-
Once we accept a connection, a new socket is created, bound to the same local endpoint, with its remote endpoint set to the client's address and port.
-
Can we connect this to the web?
- A: unless you use URL sockets, which is a related class.
-
Make sure your socket doesn't clash with other nodes.
-
-
RTP, RTSP, TCP: RTSP controls video streaming (play, pause, etc.) over RTP. RTP handles synchronization of media streams and identifying lost packets (each packet carries a packet number). RTP can be transmitted over UDP or TCP
========================================================================
========================================================================
-
Router
- has a Route table, where to send packets to. It is a set of rules
- local network VLANS (virtual local area networks, subnets)
- easy to find them on router
- ARP (address resolution protocol) scans are limited to the current network segment.
- ARP: Address Resolution Protocol: Purpose: find MAC address of machine:
- Two devices that communicate on the same segment must know each other's MAC addresses.
- Given an IP address, the sender first checks its ARP table; if there is no entry, it broadcasts an ARP request and updates the ARP table from the reply.
- The device can choose not to respond
- It's ARP protocol, not ICMP
- ARP-scan is network discovery, scanning the current network segment
- ICMP (Internet Control Message Protocol, same layer as IP) is what ping uses. It is used for diagnostics and carries no application data
- A proxy: private IPv4 addresses are not routable on the public internet. IPv6 addresses are normally globally routable (IPv6 does define unique local addresses, fc00::/7, but private addressing is mostly an IPv4 habit).
-
VPN-Firewall
- palo alto firewall is great capacity, 7000, vs Unify, by cisco, $80000. malware can reach all other devices. DNS filtering, different levels of subscriptions, come with global protect suite. Blocking other vpn tools.
- Split tunnel
- specific traffic will be "routed". VPN takes over internet gateway (NAT), handling the subnet. Subnet block
-
Docker Container
- Network is the same as the linux networking.
- network resolution
========================================================================
========================================================================
-
Workflow
- listening to socket (socket() -> bind() -> listen() -> accept()->block until connection is established)
- client is socket() -> connect()
- read() (while client is writing)
- client is write()
- write() (analyze request -> if valid, calculate (run CGI and return result, or just return text) -> make HTTP response)
- client is read()
- read() (wait for client to send finish)
- client sends eof
- close
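The workflow above can be sketched with Python's `socket` module over the loopback interface. This is a simplified version (the server reads until the client signals EOF, then writes one reply); `serve_once` and `run_demo` are names made up for the example:

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept() one client, read its request, write a reply, close."""
    conn, _client_addr = server_sock.accept()   # blocks until a client connects
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)              # server read()
            if not data:                        # empty read: client sent EOF
                break
            chunks.append(data)
        conn.sendall(b"echo: " + b"".join(chunks))   # server write()


def run_demo(message=b"hello"):
    """Run one request/response exchange between a client and the server."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))      # port 0: let the OS pick a port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)             # client write()
            client.shutdown(socket.SHUT_WR)     # client sends EOF, can still read
            reply = b""
            while True:
                data = client.recv(1024)        # client read()
                if not data:
                    break
                reply += data
        worker.join()
        return reply
```

Calling `run_demo(b"ping")` performs the socket() -> bind() -> listen() -> accept() sequence on the server side and socket() -> connect() on the client side.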
-
Definitions:
- A server listens on a port and an IP.
- clients send requests
- A socket is an endpoint for communication over a network between multiple processes
- each process should create a Socket
- Only when the socket, socket type, and address domain match can two processes communicate
- Address Domain: (either or)
- Unix domain
- two processes sharing the same file system
- the address of the socket is a character string, an entry in the file system
- Internet Domain (Most common)
- Two computers (hostmachines)
- Address of socket is the IP address
- Socket type
- TCP (aka stream socket)
- continuous
- like a pipe, bi-directional.
- Any number of bytes
- all the ordering is correct
- does error-checking (order is correct)
- the other end will get notified if one closes/resets the connection. If connection is lost, then reset. Heavy
- UDP (User Datagram Protocol, aka datagram socket)
- a message is sent in one shot (8kb)
- no delivery or ordering guarantees (there is only a checksum)
- So lightweight, but unreliable
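For contrast with the stream socket, a datagram socket sends each message in one shot with no connection setup. A minimal loopback sketch; `udp_round_trip` is a made-up helper name, and delivery here is only dependable because both sockets are on the same machine:

```python
import socket


def udp_round_trip(payload=b"hello"):
    """Send one datagram over loopback and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a port
        receiver.settimeout(2.0)            # do not wait forever if the datagram is lost
        # No listen(), accept() or connect(): the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(65535)
        return data
```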
-
Serialize & deserialization, Good Read
- nanopb and protobuf both serialize and deserialize a data structure for transmitting it over the wire; nanopb is a small C implementation of protobuf aimed at embedded systems
- protobuf (proto buffer)
- invented by google, better than JSON, XML for smaller data packets
- is compressed binary rather than raw string
- Not human readable. But JSON, XML is.
- definition of data to be serialized is in .proto file. The configs are called msgs.
- How it works: both ends of the transmission use this pre-determined schema, so on the wire they just use the field numbers.

    syntax = "proto3";
    message Person {
      uint64 id = 1;
      string email = 2;
      bool is_active = 3;
      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }
      message PhoneNumber {
        string number = 1;
        PhoneType type = 2;
      }
      repeated PhoneNumber phones = 4;
    }

- uint64, bool, float are sent as scalars; enum is a different type; PhoneNumber is a different type too
- repeated means array. But before proto3, there's required and optional
- 1, 2, 3, 4 are field tags (field numbers), which are sent in place of the field names.
- protobuf can be compiled to generate code in the user's programming language.
- Might be too easy to crack, like pokemon go using MitM Attack
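To see how field numbers replace names on the wire, here is a hand-rolled sketch of the protobuf encoding for two of the `Person` fields. It is for illustration only (real code would use the classes generated by `protoc`); the helper names and the sample values are assumptions:

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)   # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_varint_field(field_number, value):
    """Varint field: key = (field_number << 3) | wire type 0, then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number, text):
    """String field: key with wire type 2 (length-delimited), length, then bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Person { id = 150, email = "a@b.c", is_active = true } with sample values
person_bytes = (
    encode_varint_field(1, 150)        # id
    + encode_string_field(2, "a@b.c")  # email
    + encode_varint_field(3, 1)        # is_active (bool is sent as varint 0 or 1)
)
```

Note that the field names `id`, `email`, and `is_active` never appear in `person_bytes`; only the numbers 1, 2, and 3 do, which is why both ends need the same `.proto` schema.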
-
Broadcasting:
- This is only in UDP. Everytime someone comes on, they broadcast to all machines.
- You can broadcast to exchange? (network switch)
-
Websocket
- If already joined a room, rejoining it will block.
- socket will try connecting in a lazy fashion
- Create a socket
    // arguments: address family (IPv4 or IPv6), socket type, protocol (TCP or UDP)
    // returns a descriptor referencing the socket
    server_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    // when finished, close server_socket or client_socket

- port number: 18000. 80 is the default, but you sometimes need superuser privileges for it.
- socket is just allocating resources. SOCK_STREAM is like a file stream, but that goes to client.
- Bind a socket
    sockaddr_in service;                    // structure for storing an IPv4 address
    service.sin_family = AF_INET;           // address family
    service.sin_addr.s_addr = INADDR_ANY;   // accept clients from the same machine or other machines on the network
    service.sin_port = htons(5555);         // this port number should be > 1024
    if (bind(server_socket, (SOCKADDR*)&service, sizeof(service)) == SOCKET_ERROR) {
        printf("bind failed");
        closesocket(server_socket);
    }
- Listen Function
- listening: telling the operating system to listen.
    if (listen(server_socket, request_queue_size) == SOCKET_ERROR)
        printf("ERROR listening on socket");
- Accept Connection
    SOCKADDR_IN client_addr;
    int client_addr_size = sizeof(client_addr);
    // accept() returns a new socket: a wrapper for the client and the connection itself
    SOCKET client_socket = accept(server_socket, (SOCKADDR*)&client_addr, &client_addr_size);
    if (client_socket == INVALID_SOCKET)
        printf("accept failed");
- receive
recv(client_socket, buffer, sizeof(buffer), 0)
- disconnect
- 3-6 can be summarized as
    while (1) {
        // NULL accepts any sort of connection (a single HTTP client request?); accept() is blocking.
        // connfd is what we actually use to talk to the client. The original socket is still listening.
        connfd = accept(server_socket, NULL, NULL);
        memset(recvline, 0, MAXLINE);                        // reset the char buffer
        while ((n = read(connfd, recvline, MAXLINE - 1)) > 0) {
            bin2hex(recvline);                               // the HTTP request is binary; convert it to hex
        }
        snprintf(buff, sizeof(buff), "Hello");               // server writes a response to a buffer
        write(connfd, buff, strlen(buff));
        close(connfd);
    }
-
Should be able to work with a web browser
- build using cmake, run it.
- open browser, localhost:port_num; if no port_num specified, then it's by default 80.
Custom Lib - crow
- What does crow do? - create websocket, webservers, etc.
- Add route (end points) to your app ??
CROW_ROUTE(app, "/")([](){ return "Hello world";});
- run your app, once happy with route settings
app.port(18080).multithreaded().run();
========================================================================
========================================================================
-
VPC is like a parking lot
- each section of the VPC is a subnet
- a subnet can have a public facing route
- Availability zone: choose your data center?
- 2 types
- site-to-site (connect remote offices to corporate network),
- VPN connect the robot and the laptop;
- client-to-client
- client VPN: single client, need VPN client
-
VPN server: t2 micro is free tier (Can allow 2 machines max)
- Set up an Amazon VPN server. video
- Getting into ssh server:
sudo ssh -i "vpn.pem" [email protected]
-
`vpn.pem` should be downloaded there.
-
- Choose "use routing" instead of "use NAT"
- In advanced VPN settings, enable Inter-Client Communication
- On client computers: Install Client
    sudo wget https://swupdate.openvpn.net/repos/openvpn-repo-pkg-key.pub
    sudo apt-key add openvpn-repo-pkg-key.pub
- How to connect
- Go to EC2 AWS Console
- find Public IPv4 Address, under
EC2->Instances->i-0d24ca8ee730a9a8c
- Admin: https://3.22.234.161:943/admin (username is openvpn)
- For client, go to https://3.22.234.161:943/ (username is also openvpn)
- On local machine,
openvpn3 session-start --config profile-4.ovpn
- To Disconnect:
openvpn3 session-manage --session-path /net/openvpn/v3/sessions/3d98ca84sd46cs437ds9b80s5484f1bc6726 --disconnect
- or
openvpn3 session-manage --session-path $(openvpn3 sessions-list | grep "Path" | sed 's/Path://g') --disconnect
- How to connect on rpi:
- Install OpenVPN
- Get the openvpn profile.
sudo openvpn --config profile-4.ovpn
========================================================================
========================================================================
Whole Process
1. edge device: car with 50 CPU, with quite some computing power
2. edge server: IT equipment to monitor the production process. Talks to many edge devices. Each is run in containers (distributed by K8S)
3. security matters, because the edge server is not in our central datacenter
========================================================================
========================================================================
-
Application: user apps like web browsers
-
Presentation Layer: compress & encrypt data, like JPEG, MPEG4S, SSH, etc.
- encodings such as big-Endian, Little-Endian, UTF8/UTF-16
-
Session: opens & closes "sessions" with the host. A virtual connection (aka transport connection) is established, with no network communication. (Lower layers will send one-time msgs.)
- the whole big "sessions" where smaller sessions "transport" sessions and the failure communication take place.
- RPC is for remote computer; SQL is for database; x windows
-
Transport Layer
- packages, addresses, sends data from session layer (TCP, UDP is here)
- Firewall
- TLS (Transport Layer Security) is the successor to SSL. It uses digital certificates to authenticate the server; HTTPS (HTTP Secure) runs over TLS
-
Network Layer: main purpose: finds the way thru.
- routing: finds how to get the packet to the destination host. It also fragments data into packets and reassembles them (so it becomes a complete packet again)
- logical addressing (ipv4, ipv6)
- data enc
-
Data Link Layer
- It is a network driver, that controls network card (hardware)
- Manages Request from 2 competitive layers
- handles physical addressing of connection.
- Mac Sublayer (media access control): provides network interaction
- device network interaction
- LLC Sublayer (logic link control )
- addressing and multiplexing ()
-
Physical Layer: network card
-
Worth mentioning that TCP/IP predates OSI, so it uses Internet Protocol Suite. Just a different partitioning of things
-
each network is isolated from each other. Inside each network, addresses are unique.
- To connect different subnets, we have NAT (Network Address Translation)
- E.g., 192.168.0.1 can exist in every single subnet
-
IPV4 vs IPV6:
- IPv4: 192.168.0.1 is 1100 0000 - 1010 1000 - 0000 0000 - 0000 0001 (32 bits)
- Whereas IPv6: 12 03-8f e0 - fe 80 - b897-8990-8a7c-99bf-323d... (128 bits)
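The bit widths can be checked with Python's standard `ipaddress` module. A small sketch; `to_bits` is a made-up helper, and `fe80::1` is just an assumed example of an IPv6 address:

```python
import ipaddress


def to_bits(address):
    """Return the address as a binary string, one 8-bit group per byte."""
    packed = ipaddress.ip_address(address).packed   # 4 bytes for IPv4, 16 for IPv6
    return " ".join(f"{byte:08b}" for byte in packed)


ipv4_bits = to_bits("192.168.0.1")
ipv4_width = ipaddress.ip_address("192.168.0.1").max_prefixlen   # address width in bits
ipv6_width = ipaddress.ip_address("fe80::1").max_prefixlen
```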
-
subnetting:
-
192.168.0 and 192.168.1 are subnets. Then you have the hosts within each subnet.
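Subnet membership can be tested with the standard `ipaddress` module. The sketch below assumes a /24 prefix, which is what treating 192.168.0 and 192.168.1 as subnets implies; `same_subnet` and the host addresses are made up for the example:

```python
import ipaddress

subnet_a = ipaddress.ip_network("192.168.0.0/24")   # hosts 192.168.0.x
subnet_b = ipaddress.ip_network("192.168.1.0/24")   # hosts 192.168.1.x


def same_subnet(host, network):
    """True if the host address falls inside the given network."""
    return ipaddress.ip_address(host) in network


host_in_a = same_subnet("192.168.0.42", subnet_a)
host_in_b = same_subnet("192.168.0.42", subnet_b)
is_private = ipaddress.ip_address("192.168.0.1").is_private
```

Because 192.168.x.x addresses are private, the same address can be reused behind different routers, which is where NAT comes in.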
-
========================================================================
========================================================================
- netstat -plntc
-
`sudo arp-scan` to scan devices on the same network
========================================================================
========================================================================
- tags
-
`<strong>`: bold
- `<a href="ref">some text</a>`: link
-
========================================================================
========================================================================
What is MQTT:
- We have one broker, and everybody else is a client.
Client -- publish --> broker <-- subscribe -- subscriber 1, subscriber 2...
- The broker can be run on a local machine, or in private AWS, Azure, etc, or a public one
- Typical workflow:
1. Clients establish TCP/IP connection with broker
1. Client publishes to a certain topic. Broker receives it, then direct it to the listening clients
MQTT is great for IoT applications
- has a very lightweight client implementation.
- Its message headers are very short
- has 3 Quality of Service (QoS) levels:
- QoS 0: message is delivered at most once (so it can get lost)
- QoS 1: message is delivered at least once
- QoS 2: message is delivered exactly once
- Supports TLS (transport layer security, the successor to SSL) encryption.
- Can choose between a clean session or a persistent session, which matters when a client reconnects after a disconnection
- clean session: when reconnected, messages published while it was offline are not available
- Will
- message that the broker publishes on the client's behalf after the client disconnects unexpectedly
MQTT topic subscription supports wild cards:
- `topic/+` matches a single level, e.g. `topic/x`, `topic/y`
- `topic/#` matches multiple levels, e.g. `topic/x/a`, `topic/y/b`, etc.
- These wild cards cannot be used when publishing
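The wildcard rules can be sketched as a small matcher. This is a simplified illustration rather than a broker implementation (for instance, it ignores the special handling of topics starting with `$`); `topic_matches` and `is_valid_publish_topic` are names invented here:

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT subscription filter matches a published topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: must be the last level of the filter and
            # matches the parent level plus everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False          # "+" matches exactly one level, anything else must be equal
    return len(filter_levels) == len(topic_levels)


def is_valid_publish_topic(topic):
    """Wildcards are only allowed when subscribing, never when publishing."""
    return "+" not in topic and "#" not in topic
```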
When Creating an MQTT client:
- ServerURL, port (SSL is 8883)
- end point
- Cert, key
- on_connection_interrupted, on_connection_resumed
- client id
- clean_session, keep_alive_secs
- on_connection_success, on_connection_failure, on_connection_closed
- Question:
    sslopts.set_trust_store("/path/to/ca.crt");
    sslopts.set_key_store("/path/to/client.crt");
    sslopts.set_private_key("/path/to/client.key");
    sslopts.set_enabled_cipher_suites("TLSv1.2");
    sslopts.set_enable_server_cert_auth(true);
./build/basic-connect --endpoint "akx2axdo1d3zl-ats.iot.us-east-1.amazonaws.com" --cert "./pcdu0/pcdu0-device.pem.crt" --key "./pcdu0/pcdu0-private.pem.key" --ca_file "./Amazon-root-CA-1.pem" --port_override 8883 --client_id "pcdu0"
========================================================================
========================================================================
guide: https://www.debugself.com/2020/01/01/modbus_guide/
-
3 main types of Modbus interfaces: ASCII, TCP/IP (over ethernet), RTU (over RS-232, 485).
- RS-232 is older, and only meant for point-point, short range. RS-485 is point-multiple point, long range
- Represent binary as hex, as it's easier to read
- in Modbus ASCII, hex number `0F` will be represented as "0" and "F" (2 chars, so 2 bytes), then sent over. Modbus RTU will just send `0F` as 1 byte of binary.
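The size difference between the two encodings is easy to see in code. This sketch only covers how data bytes are represented; the rest of the Modbus ASCII framing (start character, checksum, line ending) is left out, and the function names are made up:

```python
def rtu_encode(data):
    """Modbus RTU sends each data byte as-is (raw binary)."""
    return bytes(data)


def ascii_encode(data):
    """Modbus ASCII sends each data byte as two uppercase hex characters."""
    return bytes(data).hex().upper().encode("ascii")


rtu_bytes = rtu_encode([0x0F])       # one byte on the wire
ascii_bytes = ascii_encode([0x0F])   # the characters "0" and "F", two bytes on the wire
```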
-
Modbus RTU: (remote terminal unit)
- Master initiates: |Address of Slave, 1B|Function Code (action being taken, read or write), 1B|number of data|required data (nB)|CRC low byte, CRC high byte|
- address is 1B, and frames are separated by a silent gap of 3.5 character times. After the gap, all slaves decode the address and see if the frame is sent to them
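The frame layout above can be sketched as follows, using the CRC-16 variant commonly used by Modbus RTU (reflected polynomial 0xA001, initial value 0xFFFF). `crc16_modbus` and `build_rtu_frame` are hypothetical helper names, and the example request (slave 1, function 0x03, start address 0, 10 registers) is an assumed sample:

```python
def crc16_modbus(data):
    """CRC-16/MODBUS: reflected polynomial 0xA001, initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_rtu_frame(slave_address, function_code, payload):
    """Assemble |address|function code|payload|CRC low|CRC high|."""
    body = bytes([slave_address, function_code]) + bytes(payload)
    crc = crc16_modbus(body)
    return body + bytes([crc & 0xFF, crc >> 8])   # CRC low byte goes first


frame = build_rtu_frame(0x01, 0x03, [0x00, 0x00, 0x00, 0x0A])
```

A receiver can validate a frame by running the same CRC over the whole frame, including the two CRC bytes: for an undamaged frame the result is 0.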
-
4 types of registers (data types). Each type has its own address space starting from 0, so writing to one type doesn't affect the others
- Coil - aka bit (coil on a relay switch), R/W
- Discrete input - 1bit, read only
- input registers - 16bits, read only
- Holding register, 16 bit, read write
-
Function codes (1 byte): specify which operation is to be done on which registers. E.g.,
- Public ones, defined in modbus manual
- 0x01, batch reading register 0 (coils)
- 0x02, batch reading register 1 (discrete input digits)
- ...
- User defined ones: see register map
-
PDU (Protocol Data unit): function code + payload
- Request: start address and count of registers
- Response: each register's data is packed into 2 bytes: the first byte is the high byte, the second is the low byte.
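The request and response PDUs can be sketched with `struct`, using big-endian fields so that the high byte comes first. The example assumes function code 0x03 (read holding registers) and a response layout of function code, byte count, then register data; the function names and sample values are made up:

```python
import struct


def read_holding_registers_request(start_address, count):
    """Request PDU: function code 0x03, start address, register count."""
    return struct.pack(">BHH", 0x03, start_address, count)


def parse_read_registers_response(pdu):
    """Response PDU: function code, byte count, then 2 bytes per register."""
    function_code, byte_count = struct.unpack_from(">BB", pdu, 0)
    registers = struct.unpack_from(f">{byte_count // 2}H", pdu, 2)
    return function_code, list(registers)


request = read_holding_registers_request(0, 10)
# Sample response carrying two registers: 0x1234 and 0x0001
response = bytes([0x03, 0x04, 0x12, 0x34, 0x00, 0x01])
function_code, registers = parse_read_registers_response(response)
```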
-
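Modbus over TCP/IP: the RTU slave address and CRC fields are dropped, but the function code and payload remain, because TCP/IP already handles addressing and error checking for you. (A unit identifier in the Modbus TCP header can still address devices behind a gateway.)
- Modbus is half duplex (one party can talk at a time), but TCP is full duplex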