Session 11: HTTP protocol - myTeachingURJC/2018-19-PNE GitHub Wiki
Goals:
- Learn about the HTTP protocol
- Write our first web server using sockets
- Duration: 2h
- Date: Week 6: Tuesday, Feb-26th-2019
- This session consists of the teacher's guidelines for driving the lecture, following the learn-by-doing approach
- Introduction to the HTTP protocol
- Creating our first http server
- HTML
- Exercise 1
- Authors
- Credits
- License
- The HTTP protocol is the language spoken between a browser (client) and a web server
- This is our general scenario, in which one client communicates with one server. There are two kinds of sockets: one used only for listening for new connections on the server (red dot), and others for exchanging data between the client and the server (blue dots)
Let's understand what is happening when a browser connects to a web server for viewing a web page. This is the initial scenario:
The client is the browser running on our device (computer, mobile, tablet...). The server is running on another computer on the internet. It is waiting for clients to connect
When we type a URL in the browser, we are requesting a web page from the server. The client creates a socket and establishes a connection with the server. The server creates a new socket (clientsocket) for exchanging data with the client (in both directions). The original socket continues listening for new connections
Now the client and server can communicate by means of the "blue" sockets. When they write to the sockets, the data is sent. When they read from them, the data is received. There is a bidirectional communication channel established
The client takes the initiative (always) and sends a request message for obtaining the web page that the user wants to see
The server receives the request message and reads the html file from the hard disk
The server builds a response message, composed of different fields. The HTML contents are located at the end of the message
The client receives the HTML content and shows it on the screen
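The whole exchange can be reproduced in a few lines, playing the role of the browser ourselves. This is only a minimal sketch under some assumptions that are not part of the session code: the server is a throwaway one bound to 127.0.0.1 on a port chosen by the operating system, it attends a single client, and the names `tiny_server` and `fetch` are hypothetical helpers.

```python
import socket
import threading


def tiny_server(listening_socket):
    """Attend one client: read its request and answer with a fixed response."""
    cs, _ = listening_socket.accept()   # new "blue" socket for this client
    cs.recv(2048)                       # request message (not inspected here)
    body = b"Hello"
    head = "HTTP/1.1 200 OK\r\n"
    head += "Content-Type: text/plain\r\n"
    head += "Content-Length: {}\r\n\r\n".format(len(body))
    cs.sendall(head.encode("utf-8") + body)
    cs.close()


def fetch(ip, port, resource="/"):
    """Play the browser: connect, send a request, read the whole response."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((ip, port))
    request = "GET {} HTTP/1.1\r\nHost: {}:{}\r\n\r\n".format(resource, ip, port)
    s.sendall(request.encode("utf-8"))
    chunks = []
    while True:
        data = s.recv(2048)
        if not data:            # empty read: the server closed the connection
            break
        chunks.append(data)
    s.close()
    return b"".join(chunks).decode("utf-8")


# The "red" socket: it only listens for new connections
ls = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ls.bind(("127.0.0.1", 0))       # port 0: let the operating system pick a free port
ls.listen(1)
port = ls.getsockname()[1]

server_thread = threading.Thread(target=tiny_server, args=(ls,))
server_thread.start()
response = fetch("127.0.0.1", port)
server_thread.join()
ls.close()

print(response)
```

The client always speaks first (the request) and the server answers afterwards, which matches the order of the steps described above.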
There are two types of messages in HTTP: request and response. Both have the same format: they consist of lines of plain text (strings), each one terminated by the two special characters '\r' and '\n', in that order
The lines are divided into two parts: the header and the body. A blank line separates the two parts
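As an illustration of this layout, the following sketch splits a message into its first line, its header lines and its body, using the blank line as the separator. The function name `split_message` and the shortened sample request are assumptions made for the example; they are not part of the session code.

```python
def split_message(raw):
    """Split an HTTP message into its first line, header fields and body."""
    # The blank line ("\r\n\r\n") separates the header part from the body
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Only the first ':' separates name and value (values may contain ':')
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


request = (
    "GET / HTTP/1.1\r\n"
    "Host: 192.168.124.41:8089\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

start_line, headers, body = split_message(request)
print(start_line)
print(headers)
print(repr(body))
```

For a request like this one the body comes back as an empty string, because nothing follows the blank line.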
This is the format of the Request messages
And this is an example of a real message:
In this example, there is no body (it is empty)
This is the format of the response message. It is the same as for the request message, except that the first line is a status line instead of a request line
This is an example of a response message:
Let's create our first HTTP server, step by step, learning while doing
We start from a simple server, from the previous week, that just receives the request message and prints it on the console. It does not generate a response yet. We already know the code:
import socket
import termcolor
# Change this IP to yours!!!!!
IP = "192.168.124.41"
PORT = 8089
MAX_OPEN_REQUESTS = 5
def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""
    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")
    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')
    # Close the socket
    cs.close()
# MAIN PROGRAM
# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))
# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)
print("Socket ready: {}".format(serversocket))
while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()
    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))
    # Service the client
    process_client(clientsocket)
Notice that we are using the termcolor library for highlighting the message received from the client
To test it, open a new tab in your browser and type this URL:
http://192.168.124.41:8089/
Now have a look at the console in PyCharm. You can see in green the request message sent by the browser
Notice that many identical request messages appear. This is because we have not generated a response to the client's request. The browser re-sends the request several times, until it gives up and shows an error message
This is the request message received from the browser:
GET / HTTP/1.1
Host: 192.168.124.41:8089
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0
Our response message should have the following format:
- Status line. We will inform the browser that everything went well. The typical status line looks like this:
HTTP/1.1 200 OK\r\n
- The header should contain at least two elements:
- Content-Type: It indicates the type of content returned by the server. It will typically be text/html (but it can also be image/png when sending back an image in PNG format)
- Content-Length: It indicates the total length, in bytes, of the information sent in the body of the response
- The body with the contents we are sending to the browser
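The three parts listed above can be put together by a small helper. This is a minimal sketch, not the session's server code: the name `build_response` is hypothetical, and the sample text was chosen only to show that Content-Length counts bytes of the encoded body, not characters of the string.

```python
def build_response(contents, content_type="text/plain"):
    """Build a complete HTTP response message (as bytes) with a 200 OK status."""
    body = contents.encode("utf-8")
    status_line = "HTTP/1.1 200 OK\r\n"
    header = "Content-Type: {}\r\n".format(content_type)
    # The length is measured on the encoded body (bytes), not on the string
    header += "Content-Length: {}\r\n".format(len(body))
    # A blank line separates the header from the body
    return (status_line + header + "\r\n").encode("utf-8") + body


text = "caf\u00e9"          # 4 characters, but 5 bytes once encoded in UTF-8
response_msg = build_response(text)
print(len(text))
print(response_msg)
```

For text that only uses ASCII characters both lengths coincide, which is why the difference is easy to overlook.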
In our server we will generate a simple response, whose content is the string: "Hello from my first server!"
import socket
import termcolor
IP = "192.168.124.41"
PORT = 8090
MAX_OPEN_REQUESTS = 5
def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""
    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")
    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')
    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)
    contents = "Hello from my first server!"
    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"
    # -- Build the header
    header = "Content-Type: text/plain\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))
    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # -- sendall() keeps sending until the whole message has been written
    cs.sendall(response_msg)
    # Close the socket
    cs.close()
# MAIN PROGRAM
# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))
# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)
print("Socket ready: {}".format(serversocket))
while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()
    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))
    # Service the client
    process_client(clientsocket)
Now we can see the answer in the browser! Our first mini web server is working! :-)
Let's analyze the information we have in our console. We can see that we have received two requests. The first request message is:
GET / HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
The first line is the request line, the most important part. We can ignore the rest of the message. Our request line is this one:
GET / HTTP/1.1
It has three parts:
- The method: The first word is called the method. It indicates the operation that the client needs. In this case it is the GET method, which means that the client wants to access some resource
- The resource: The second word is the resource. The meaning of the "/" resource is: "I want to have access to your main page"
- The HTTP version that is being used
In our case, the browser wants to get our main page with the first request
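Extracting these three parts from a received message takes only a couple of string operations. The sketch below assumes a well-formed request line with exactly two spaces; the function name `parse_request_line` is hypothetical, and a real server would also need to cope with empty or malformed messages.

```python
def parse_request_line(msg):
    """Return (method, resource, version) taken from the first line of a request."""
    request_line = msg.split("\r\n")[0]
    method, resource, version = request_line.split(" ")
    return method, resource, version


msg = "GET / HTTP/1.1\r\nHost: 192.168.124.41:8090\r\n\r\n"
method, resource, version = parse_request_line(msg)
print("Method:   {}".format(method))
print("Resource: {}".format(resource))
print("Version:  {}".format(version))
```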
The second request message is this one:
GET /favicon.ico HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Let's focus only on the request line:
GET /favicon.ico HTTP/1.1
The browser is asking for the resource /favicon.ico. The favicon is a small image file that stores the icon of the web page you are accessing. We are ignoring this request
Let's respond with our first web page written in HTML. We know nothing about HTML yet. It is the language used for creating web pages, and it describes the structure of the document
In our server we are changing the contents. Instead of responding with a plain string, we will send a message in HTML. It is important to change the Content-Type header from text/plain to text/html to indicate that we are sending HTML code instead of plain text
import socket
import termcolor
IP = "192.168.124.41"
PORT = 8080
MAX_OPEN_REQUESTS = 5
def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""
    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")
    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')
    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)
    # These new contents are written in HTML
    contents = """
    <!DOCTYPE html>
    <html lang="en" dir="ltr">
      <head>
        <meta charset="utf-8">
        <title>Green server</title>
      </head>
      <body style="background-color: lightgreen;">
        <h1>GREEN SERVER</h1>
        <p>I am the Green Server! :-)</p>
      </body>
    </html>
    """
    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"
    # -- Build the header
    header = "Content-Type: text/html\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))
    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # -- sendall() keeps sending until the whole message has been written
    cs.sendall(response_msg)
    # Close the socket
    cs.close()
# MAIN PROGRAM
# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))
# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)
print("Socket ready: {}".format(serversocket))
while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()
    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))
    # Service the client
    process_client(clientsocket)
Now we will see a different page in the browser:
HTML is a special language used for defining the structure and the contents of web pages. It consists of text inside tags. Most elements have an opening tag and a closing tag (a few, such as meta, have no closing tag). This is the HTML code for the green server we used in the previous example (green-server.html)
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Green server</title>
</head>
<body style="background-color: lightgreen;">
<h1>GREEN SERVER</h1>
<p>I am the Green Server! :-)</p>
</body>
</html>
- HTML documents should always start with the special tag: <!DOCTYPE html>
- The rest of the html code is inside the <html> and </html> tags
- Every HTML document consists of two parts: the head and the body
- The head contains information about the document for the browser
- The actual content is located in the body
- In this example there are two elements inside the body:
- The heading: GREEN SERVER. It is shown in a bigger font
- A paragraph: "I am the Green Server! :-)"
- The background color of the body is set with the style attribute of the body tag
- You can learn more about HTML by following these tutorials from W3Schools
- You can also learn more HTML in these notes that I prepared for the CSAAI subject (in Spanish)
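The nesting described in this list can also be checked from Python. The following sketch uses `HTMLParser` from the standard library to record every opening tag of the green server page together with its depth. The class name `OutlinePrinter` and the small set of tags treated as having no closing tag are assumptions made for this example.

```python
from html.parser import HTMLParser


class OutlinePrinter(HTMLParser):
    """Record each opening tag together with its nesting depth."""

    # Tags that have no closing tag (only the ones relevant to this example)
    NO_CLOSING_TAG = {"meta", "br", "img", "link", "hr"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        if tag not in self.NO_CLOSING_TAG:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in self.NO_CLOSING_TAG:
            self.depth -= 1


page = """<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Green server</title>
</head>
<body style="background-color: lightgreen;">
<h1>GREEN SERVER</h1>
<p>I am the Green Server! :-)</p>
</body>
</html>
"""

parser = OutlinePrinter()
parser.feed(page)
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

The printed outline shows html at the top level, head and body one level below it, and the remaining elements inside one of those two.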
- Modify the green server so that the HTML contents are read from the index.html file
- Change index.html and reload the page in the browser to check that the server returns the new contents
- If you finish this exercise, you can start with Practice 4
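As a possible starting point for the exercise, the sketch below shows one way of reading the page from disk each time it is needed, so that changes to the file are picked up without restarting the program. It is only a hint, not the reference solution: the helper name `load_page` is hypothetical, and a temporary file stands in for index.html so that the example can run anywhere.

```python
import tempfile
from pathlib import Path


def load_page(filename):
    """Return the text of an HTML file. The file is read again on every call."""
    return Path(filename).read_text(encoding="utf-8")


# Demo: a temporary file plays the role of index.html
with tempfile.TemporaryDirectory() as folder:
    page = Path(folder) / "index.html"

    page.write_text("<h1>Version 1</h1>", encoding="utf-8")
    first = load_page(page)

    # The file is modified: the next read returns the new contents
    page.write_text("<h1>Version 2</h1>", encoding="utf-8")
    second = load_page(page)

print(first)
print(second)
```

In the server, the result of such a function would take the place of the hard-coded contents string inside process_client.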
- Boni García
- Juan González-Gómez (Obijuan)
- Alvaro del Castillo. He designed and created the original content of this subject. Thanks a lot :-)