Session 11: HTTP protocol - myTeachingURJC/2018-19-PNE GitHub Wiki

  • Goals:
    • Learn about the HTTP protocol
    • Write our first web server using sockets
  • Duration: 2h
  • Date: Week 6: Tuesday, Feb-26th-2019
  • This session consists of the teacher's guidelines for driving the lecture, following the learn-by-doing approach

Introduction to the HTTP protocol

  • The HTTP protocol is the language spoken between a browser (client) and a web server
  • This is our general scenario, in which one client communicates with one server. There are two kinds of sockets: one used only for listening for new connections on the server (red dot), and others for exchanging data between the client and the server (blue dots)

Requesting a web page

Let's understand what is happening when a browser connects to a web server for viewing a web page. This is the initial scenario:

The client is the browser running on our device (computer, mobile, tablet...). The server is running on another computer on the internet, waiting for clients to connect

Step 1: Connection establishment

When we type a URL in the browser, we are requesting a web page from the server. The client creates a socket and establishes a connection with the server. The server creates a new socket (clientsocket) for exchanging data with the client (in both directions). The original socket continues listening for new connections

Now the client and server can communicate by means of the "blue" sockets. When they write to the sockets, the data is sent. When they read from them, the data is received. There is a bidirectional communication channel established

Step 2: The client sends a request message for a web page

The client takes the initiative (always) and sends a request message for obtaining the web page that the user wants to see

Step 3: The server reads the page from the disk

The server receives the request message and reads the html file from the hard disk

Step 4: The server sends a response message

The server builds a response message, composed of different fields. The HTML contents are located at the end of the message

Step 5: The browser renders the page on the screen

The client receives the HTML content and shows it on the screen
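The five steps above can be tried out in a single program. What follows is only a minimal sketch, not the server developed later in this session: the functions serve_once, fetch and demo are hypothetical names chosen for illustration, the server runs in a background thread on 127.0.0.1, and port 0 is used so that the operating system picks a free port. The "page" is a fixed string instead of a file read from the disk.

```python
import socket
import threading


def serve_once(listening_socket):
    """Server side: attend one client and send back a fixed page."""
    # Step 1: accept() returns a NEW socket dedicated to this client
    client_socket, address = listening_socket.accept()
    with client_socket:
        # Step 2: read the request message (not inspected in this sketch)
        request = client_socket.recv(2048).decode("utf-8")
        # Step 3: a real server would read the page from the disk here
        body = "<h1>Hello</h1>"
        # Step 4: build and send the response message
        response = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/html\r\n"
            "Content-Length: {}\r\n"
            "\r\n"
            "{}"
        ).format(len(body.encode("utf-8")), body)
        client_socket.sendall(response.encode("utf-8"))


def fetch(ip, port, resource="/"):
    """Client side: play the role of the browser."""
    with socket.create_connection((ip, port), timeout=5) as s:
        request = "GET {} HTTP/1.1\r\nHost: {}:{}\r\n\r\n".format(resource, ip, port)
        s.sendall(request.encode("utf-8"))
        # Read until the server closes the connection
        chunks = []
        while True:
            data = s.recv(2048)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def demo():
    """Run one complete request/response exchange and return the response."""
    listening = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listening.bind(("127.0.0.1", 0))  # port 0: the OS chooses a free port
    listening.listen(1)
    ip, port = listening.getsockname()
    server = threading.Thread(target=serve_once, args=(listening,))
    server.start()
    response = fetch(ip, port)
    server.join()
    listening.close()
    return response


if __name__ == "__main__":
    # Step 5: a browser would render the HTML. Here we just print the message
    print(demo())
```

Running the program prints the whole response message: the status line, the header lines, a blank line and finally the HTML contents.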

HTTP messages

There are two types of messages in HTTP: request and response. They both have the same format: they consist of lines of plain text (strings), each one terminated by the two special characters '\r' and '\n' (in that order)

The lines are divided into two parts: the header and the body. A blank line separates the two parts
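As a small illustration of this format, the following sketch splits a message into its first line, the remaining header lines and the body, using the blank line as the separator. The function name split_http_message is hypothetical, and the request string is a shortened version of a browser request.

```python
def split_http_message(message):
    """Split a raw HTTP message into (first_line, header_lines, body)."""
    # The blank line is seen as two consecutive line terminators
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    return lines[0], lines[1:], body


request = (
    "GET / HTTP/1.1\r\n"
    "Host: 192.168.124.41:8089\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

start_line, header_lines, body = split_http_message(request)
print(start_line)      # The request line
print(header_lines)    # The rest of the lines before the blank line
print(repr(body))      # Empty string: this request has no body
```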

Request messages

This is the format of the Request messages

And this is an example of a real message:

In this example, there is no body (it is empty)

Response messages

This is the format of the response message. It is the same as for the request message

This is an example of a response message:

Creating our first HTTP server

Let's create our first HTTP server, step by step, learning while doing

Reading the browser's request message

We start from a simple server, from the previous week, that just receives the request message and prints it on the console. It does not generate a response yet. We already know the code:

import socket
import termcolor

# Change this IP to yours!!!!!
IP = "192.168.124.41"
PORT = 8089
MAX_OPEN_REQUESTS = 5


def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""

    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")

    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')

    # Close the socket
    cs.close()


# MAIN PROGRAM

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))

# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)

print("Socket ready: {}".format(serversocket))

while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()

    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))

    # Service the client
    process_client(clientsocket)

Notice that we are using the termcolor library for highlighting the message received from the client

To test it, open a new tab in your browser and type this URL:

http://192.168.124.41:8089/

Now have a look at the console in PyCharm. You can see in green the request message sent by the browser

Notice that many request messages appear (all the same). This is because we have not generated a response to the client's request messages. The browser re-sends the request message many times, until there is a timeout and the browser shows an error message

This is the request message received from the browser:

GET / HTTP/1.1
Host: 192.168.124.41:8089
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

Sending a simple response message

Our response message should have the following format:

  • Status line. We will inform the browser that everything went well. The typical status line is like this:
HTTP/1.1 200 OK\r\n
  • The header should contain at least two elements:
    • Content-Type: It indicates the type of content returned by the server. It will typically be text/html (but it can also be image/png when sending back an image in PNG format)
    • Content-Length: It indicates the total length, in bytes, of the information sent in the body of the response
  • The body with the contents we are sending to the browser
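Content-Length counts bytes, not characters. That is why the length should be measured after encoding the contents. The following short sketch shows the difference, using a made-up string with non-ASCII characters (it is only an example, not part of the session's server):

```python
# A string with two non-ASCII characters: each takes 2 bytes in UTF-8
contents = "¡Hola, señor!"

num_chars = len(contents)                  # Characters in the string
num_bytes = len(contents.encode("utf-8"))  # Bytes actually sent to the browser

print("Characters:", num_chars)
print("Bytes:", num_bytes)

# The value for the header must be the number of bytes
header = "Content-Length: {}\r\n".format(num_bytes)
print(header)
```

For contents that only use ASCII characters both numbers are the same, so the difference is easy to miss.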

In our server we will generate a simple response, whose contents are the string: "Hello from my first server!"

import socket
import termcolor

IP = "192.168.124.41"
PORT = 8090
MAX_OPEN_REQUESTS = 5


def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""

    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")

    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')

    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)

    contents = "Hello from my first server!"

    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"

    # -- Build the header
    header = "Content-Type: text/plain\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))

    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # sendall() keeps sending until the whole message has been transmitted
    cs.sendall(response_msg)

    # Close the socket
    cs.close()


# MAIN PROGRAM

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))

# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)

print("Socket ready: {}".format(serversocket))

while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()

    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))

    # Service the client
    process_client(clientsocket)

Now we can see the answer in the browser! Our first mini web server is working!!! :-)

Let's analyze the information we have in our console. We can see that we have received two requests. The first request message is:

GET / HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1

The first line is the request line, the most important part. We can ignore the rest of the message. Our request line is this one:

GET / HTTP/1.1

It has three parts:

  • The method: The first word is called the method. It indicates the operation that the client requests. In this case it is the GET method, which means that the client wants to access some resource
  • The resource: The second word is the resource. The meaning of the "/" resource is: "I want to have access to your main page"
  • The HTTP version that is being used

In our case, the browser wants to get our main page with the first request
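The three parts of the request line can be obtained from the received message with a few string operations. This is just a sketch: the function name parse_request_line is hypothetical, and it assumes that the three parts are separated by spaces, as in the messages shown above.

```python
def parse_request_line(msg):
    """Return (method, resource, version) from a request message."""
    # The request line is everything before the first line terminator
    request_line = msg.split("\r\n", 1)[0]
    parts = request_line.split()
    if len(parts) != 3:
        raise ValueError("Malformed request line: {!r}".format(request_line))
    method, resource, version = parts
    return method, resource, version


msg = "GET / HTTP/1.1\r\nHost: 192.168.124.41:8090\r\n\r\n"
method, resource, version = parse_request_line(msg)
print("Method:", method)
print("Resource:", resource)
print("Version:", version)
```

In the server, such a function could be called with the msg variable read inside process_client.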

The second request message is this one:

GET /favicon.ico HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

Let's focus only on the request line:

GET /favicon.ico HTTP/1.1

The browser is asking for the resource /favicon.ico. The favicon is a small image file that stores the icon of the web page you are accessing. We are ignoring this request
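The server in this session sends the same response to every request, including the one for the favicon. As a possible next step (an assumption, not something this session's server does), the response could depend on the requested resource. In the sketch below, build_response is a hypothetical function that answers with the standard "404 Not Found" status for anything other than the main page:

```python
def build_response(resource):
    """Build the response message (as bytes) for the given resource."""
    if resource == "/":
        # The main page: everything is OK
        status_line = "HTTP/1.1 200 OK\r\n"
        contents = "Hello from my first server!"
    else:
        # Any other resource (for example /favicon.ico) is not available
        status_line = "HTTP/1.1 404 Not Found\r\n"
        contents = "Resource not available: {}".format(resource)

    header = "Content-Type: text/plain\r\n"
    header += "Content-Length: {}\r\n".format(len(contents.encode("utf-8")))

    return (status_line + header + "\r\n" + contents).encode("utf-8")


print(build_response("/").decode("utf-8"))
print(build_response("/favicon.ico").decode("utf-8"))
```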

Response with HTML contents

Let's respond with our first web page written in HTML. We know nothing about HTML yet. It is the language used for creating web pages, and it describes the structure of the document

In our server we are changing the contents. Instead of responding with a plain string, we will send a message in HTML. It is important to change the Content-Type header from text/plain to text/html to indicate that we are sending HTML code instead of plain text

import socket
import termcolor

IP = "192.168.124.41"
PORT = 8080
MAX_OPEN_REQUESTS = 5


def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""

    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")

    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')

    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)

    # These new contents are written in HTML
    contents = """
    <!DOCTYPE html>
    <html lang="en" dir="ltr">
      <head>
        <meta charset="utf-8">
        <title>Green server</title>
      </head>
      <body style="background-color: lightgreen;">
        <h1>GREEN SERVER</h1>
        <p>I am the Green Server! :-)</p>
      </body>
    </html>
    """

    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"

    # -- Build the header
    header = "Content-Type: text/html\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))

    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # sendall() keeps sending until the whole message has been transmitted
    cs.sendall(response_msg)

    # Close the socket
    cs.close()


# MAIN PROGRAM

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))

# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)

print("Socket ready: {}".format(serversocket))

while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()

    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))

    # Service the client
    process_client(clientsocket)

Now we will see a different page in the browser:

HTML

HTML is a special language used for defining the structure and the contents of web pages. It consists of text inside tags. Most elements have an opening tag and a closing tag (a few, such as <meta>, have no closing tag). This is the HTML code for the green server we used in the previous example (green-server.html)

<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Green server</title>
  </head>
  <body style="background-color: lightgreen;">
    <h1>GREEN SERVER</h1>
    <p>I am the Green Server! :-)</p>
  </body>
</html>

  • HTML documents should always start with the special declaration: <!DOCTYPE html>
  • The rest of the HTML code is inside the <html> and </html> tags
  • Every HTML document consists of two parts: the head and the body
  • The head contains information about the document for the browser
  • The actual content is located in the body
  • In this example there are two elements inside the body:
    • The heading: GREEN SERVER. It is shown in a bigger font
    • A paragraph: "I am the Green Server! :-)"
  • The background color of the body is set with the style attribute
  • You can learn more about HTML by following these tutorials from W3Schools
  • You can also learn more HTML in these notes that I prepared for the CSAAI subject (in Spanish)

Exercise 1

  • Modify the green server so that the HTML contents are read from the index.html file
  • Change the index.html to check that a new page is generated if this file is modified
  • If you finish this exercise, you can start with practice 4

Authors

Credits

  • Alvaro del Castillo. He designed and created the original content of this subject. Thanks a lot :-)

License

Links
