Web Principles - LearningRbcRegistry/Wiki GitHub Wiki

#Web principles
*Antoine Bour / Oct 16*

###1. HTTP
#####a. Reminder over TCP/IP
#####b. History
#####c. HTTP Request
#####d. HTTP Response
###2. HTTPS
#####a. Secure Socket Layer
#####b. Pro and Cons
#####c. Certificates
#####d. CSR, Certificate Signing Request. How to request officially for a certificate.
###3. Domain Name Server
###4. Web – Browser
#####a. CORS: Cross Origin Resource Sharing
#####b. Resource Building / Lifecycle
#####c. HTTP Header Caching (.htaccess file)
#####d. HTTP Server Side Caching
#####e. CDN (Akamai, google CDN), Content Delivery Network
#####f. SOAP vs REST
#####g. Import a certificate on Was


###1. HTTP ###

#####a. Reminder over TCP/IP #####
| Name | Description |
| --- | --- |
| IP | Internet Protocol = sender address + receiver address + message. It defines the routing that allows communication between two points. An IP packet is typically limited to 1500 bytes by the Ethernet MTU. |
| IP address | Unique address identifying a computer on the network |
| UDP/IP | Protocol using IP + port number + message (port examples: HTTP server 80, telnet server 23, SMTP email 25 …) |
| TCP/IP | UDP + check that the receiver is able to read the message + splitting of big messages so that they fit into IP packets + verification of the messages by the receiver + reassembly of the split messages + acknowledgement sent back to the sender |
| HTTP | HyperText Transfer Protocol, at OSI layer 7 (application layer). It allows the transfer of files located by a URL between a browser and a web server. URL format: protocol://server-address:port/path/resource. Version 0.9 only transferred HTML between a client and a web server, and only GET was possible. Since version 1.0, messages contain headers that describe the content of the message (typed with MIME), and GET, POST and HEAD are available. |
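
As an illustration of the URL format described in the table, here is a minimal Python sketch that splits a URL into protocol, server address, port and path using the standard `urllib.parse` module. The function name `describe_url`, the `DEFAULT_PORTS` table and the example URL are assumptions made for this illustration only.

```python
from urllib.parse import urlsplit

# Default ports used when the URL does not give one explicitly (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Split a URL into the parts protocol://server-address:port/path."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "server_address": parts.hostname,
        "port": port,
        "path": parts.path,
    }


if __name__ == "__main__":
    # Hypothetical URL, used only to show the four parts.
    print(describe_url("http://www.example.com:8080/docs/index.html"))
```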

#####b. History #####

| Year | Version | Features |
| --- | --- | --- |
| 1991 | HTTP 0.9 | Read only (GET); HTML only; connection closed after the transfer is completed |
| 1996 | HTTP 1.0 | Headers; not only HTML: images, …; read, input, delete; separate TCP connection for each request |
| 1997 | HTTP 1.1 | Increased header restrictions/possibilities; performance optimizations; persistent connections and request pipelining; interaction with the server possible; dynamic content possible; caching mechanism |
| 2015 | HTTP 2.0 | Improved transport performance |

#####c. HTTP Request #####

  • HTTP Request = Request line + Request header + Request body.

  • Request line = Method + URL + protocol version (1.1)

  • Request header = Additional information (browser, OS, …)

  • Request body = Optional data.

  • Example:

  • Available methods:

    • GET: Request the resource at the specified URL.
    • HEAD: Request only the headers of the resource at the specified URL.
    • POST: Send data (to be processed) to the application hosted at the URL.
    • PUT: Send data (to be stored) at the specified URL (More info)
    • DELETE: Delete the resource at the specified URL.
  • Some available request headers:

| Header | Description |
| --- | --- |
| Accept | Content types accepted by the browser (MIME) |
| Authorization | Credentials identifying the client to the server |
| Content-Encoding | Encoding type of the request body |
| From | Email address of the client user |
| Orig-Url | Origin URL of the request |
| User-Agent | Client information such as browser name and version, OS.. |

header.png
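
To make the structure "request line + request header + request body" concrete, here is a minimal Python sketch that assembles the raw bytes of an HTTP/1.1 request. The function name `build_request` and the host `www.example.com` are assumptions for illustration; a real client would normally rely on a library such as `http.client`.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]  # request line + mandatory Host header
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The body is optional; when present, its length is announced in a header.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # an empty line separates headers and body
    return head.encode("ascii") + body


if __name__ == "__main__":
    raw = build_request("GET", "/index.html", "www.example.com",
                        {"Accept": "text/html", "User-Agent": "demo-client"})
    print(raw.decode("ascii"))
```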

#####d. HTTP Response #####

  • HTTP Response = Status line + Response header + Response body.

  • Status line = Protocol + status code + meaning of the code

  • Response header = Optional lines about the response or the server status

  • Response body = The requested document

  • Example:

header2.png

  • Available response headers:

| Header | Description |
| --- | --- |
| Content-Encoding | Encoding of the response body |
| Content-Language | Natural language of the response body |
| Date | Date and time at which the response was generated |
| Expires | Date after which the response is considered stale |
| Forwarded | Intermediary servers between the browser and the server |
| Server | Server characteristics |

[…](https://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Response_fields)

  • Some status codes:

| Code | Description |
| --- | --- |
| 2XX | Success of the request |
| 4XX | Client issue, the request is wrong |
| 5XX | Server issue |
| 200 | OK, request successful |
| 201 | CREATED, following a POST |
| 202 | ACCEPTED, request accepted but backend procedure not done (yet) |
| 203 | NON-AUTHORITATIVE INFORMATION (formerly PARTIAL INFORMATION), after a GET indicates that the returned metadata is not the definitive set from the origin server |
| 204 | NO CONTENT, request received, but the server has no information to return |
| 400 | BAD REQUEST |
| 401 | UNAUTHORIZED, authentication is missing or has failed |
| 404 | NOT FOUND |
| 500 | INTERNAL SERVER ERROR, the server cannot handle the request |
| 501 | NOT IMPLEMENTED, requested service not implemented (yet) |
| 502 | BAD GATEWAY, a proxy or gateway received an invalid response from the upstream server |
| 503 | SERVICE UNAVAILABLE (too much traffic..) |
| 504 | GATEWAY TIMEOUT |
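
The response structure and the status code families can be sketched in a few lines of Python. This is a simplified parser for illustration only: the function names `parse_response` and `status_class` are made up here, and the sample response is invented. It does not handle chunked bodies or repeated headers.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line parts, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # an empty line ends the header block
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)           # protocol, status code, meaning
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


def status_class(code):
    """Map a status code to its family (2XX, 4XX, 5XX as listed in the table)."""
    families = {2: "success", 4: "client error", 5: "server error"}
    return families.get(code // 100, "other")


if __name__ == "__main__":
    sample = b"HTTP/1.1 404 Not Found\r\nServer: demo\r\nContent-Language: en\r\n\r\n<h1>missing</h1>"
    version, code, reason, headers, body = parse_response(sample)
    print(version, code, reason, status_class(code), headers, body)
```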

###2. HTTPS ###

![URL.png](https://github.com/LearningRbcRegistry/WikiPictures/blob/master/webprinciples/URL.png "")

The HTTP protocol does not encrypt the data exchanged between the client's browser and the server.

HTTPS = HTTP + SSL (TLS)

#####a. Secure Socket Layer #####

Server and client share a secret key; only they know the decryption key. The SSL protocol encapsulates the HTTP protocol data. SSL certificates link the domain name, server name and host name together. They also link these to the name and address of the organization.

HTTPSchema.png

#####b. Pro and cons #####

| Pro HTTPS | Cons |
| --- | --- |
| Traffic unreadable by unwanted parties | Uses a lot of server resources (encryption) |
| Data integrity | Latencies |
| Ranking on search engines | IE6 cannot manage HTTPS browser caching |
| Trust | An SSL certificate is usually not free |
| Client identification with certificates | Mixed-mode issues (an HTTPS page containing assets loaded over HTTP) = warning for insecure content |
| | Proxy cache issues |
| ... | |

  • SSL 1 way: Only the server is authenticated, by sending its certificate. The communication is encrypted in both directions (client to server and server to client).

  • SSL 2 ways: Client and server both have a certificate. This is used when the server needs to authenticate the client.
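
The difference between the two modes can be sketched with Python's standard `ssl` module. This is only an outline under assumptions: the function names and the certificate file parameters are hypothetical, and a real server would also have to load its own certificate with `load_cert_chain` before accepting connections.

```python
import ssl


def client_context(client_cert=None, client_key=None):
    """Client side: always verifies the server certificate (1 way).

    If a client certificate is given, it is presented to the server (2 ways).
    """
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if client_cert is not None:
        # Hypothetical file paths supplied by the caller.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def server_context(require_client_cert=False):
    """Server side: optionally demands a certificate from the client (2 ways)."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    if require_client_cert:
        context.verify_mode = ssl.CERT_REQUIRED
    return context


if __name__ == "__main__":
    print(client_context().verify_mode)
    print(server_context(require_client_cert=True).verify_mode)
```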

#####c. Certificates #####

Certificates are data files that link a cryptographic key to the details of an organization/server. A certificate enables the secure mode and the HTTPS protocol (port 443) in the client's browser.

  • The SSL certificate has to be installed on the web server in order to initialize a secure session with the browser.
  • The server submits a verification request to the certificate authority.
  • Once the SSL certificate is installed on the website, visitors can access the site through an HTTPS connection, which tells the server that the connection has to be secure (the address bar turns green, http is replaced by https, the organization name is shown in the address bar, and a padlock is displayed).

HTTPSchema2.png

In this example, GlobalSign is a known Certificate Authority (CA) that issues digital certificates to companies or individuals after verifying their identity.

  • Known certificate authorities are Symantec, GlobalSign, DigiCert, Comodo, VeriSign

#####d. CSR, Certificate Signing Request. How to officially request a certificate. #####

A CSR is a block of encoded text given to a Certificate Authority when applying for an SSL certificate for the desired domain (rbc.com).

It is usually generated on the server where the certificate will be installed and contains information that is included in the certificate, such as the company name, the locality and the country. It also contains the public key that will be included in the certificate. A private key is created at the same time in order to make the key pair. The certificate created with a particular CSR only works with the private key that was generated with it. If the private key is lost, the certificate no longer works.

| Description | Code |
| --- | --- |
| Country Name (2 letter code) [AU]: | LU |
| State or Province Name (full name) [Some-State]: | Luxembourg |
| Locality Name (eg, city) []: | York |
| Organization Name (eg, company) [Internet Widgits Pty Ltd]: | RBCLtd |
| Organizational Unit Name (eg, section) []: | IT |
| Common Name (eg, YOUR name) []: | rbc.com |
| Email Address []: | |

  • The private key has to stay private. Do not share it.
  • The common name has to be the web server name.
  • Once the CSR is created, open it and copy/paste its content into the application form when asked.

###3. Domain Name Server###

  • It converts a website name into an IP address
  • Host name to IP = forward resolution
  • IP to host name = reverse resolution
  • DNS resolution can be seen as a tree (the starting point is the root server) with up to 127 levels

DNS.png

  • There are 13 root servers. They hold the addresses of the first-level DNS servers (.com / .fr / .edu...). Some of them use anycast (many DNS servers share the same IP address), adopted following attacks on the root servers.

  • Many local caches are consulted before a real DNS server is called. Your internet provider maintains its own cache, so if two SFR users ask for the same host name, the second answer comes back very fast.

  • How to get a domain name:

    • ICANN manages the first-level domains (.com / .net…)
    • It delegates them to different operators (called registries). For .fr it is AFNIC.
    • AFNIC in France delegates to resellers (called registrars), like OVH, 1&1…
    • You give your IP address to the registrar and you get a name that points to it.
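
The tree walk and the role of local caches described in this section can be illustrated with a toy resolver. Everything in this sketch is an assumption made for illustration: the tree content, the host names, the addresses (taken from a documentation-only range) and the class name `CachingResolver`. It is not how a real DNS client is implemented.

```python
# Toy DNS tree: the root knows the first-level domains, which know the next level, and so on.
TOY_TREE = {
    "com": {"rbc": {"www": "203.0.113.10"}},
    "fr": {"example": {"www": "203.0.113.20"}},
}


class CachingResolver:
    """Resolve host names by walking the tree from the root, with a local cache."""

    def __init__(self, tree):
        self.tree = tree
        self.cache = {}
        self.tree_lookups = 0  # number of times the tree had to be walked

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]  # answered locally, no tree walk needed
        self.tree_lookups += 1
        node = self.tree
        # Labels are read from right to left: com -> rbc -> www.
        for label in reversed(hostname.split(".")):
            node = node[label]
        self.cache[hostname] = node
        return node


if __name__ == "__main__":
    resolver = CachingResolver(TOY_TREE)
    print(resolver.resolve("www.rbc.com"), resolver.tree_lookups)
    print(resolver.resolve("www.rbc.com"), resolver.tree_lookups)
```

A name that is not in the toy tree raises `KeyError`; a real resolver would return an error response instead.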

###4. Web – Browser ###

#####a. CORS: Cross Origin Resource Sharing #####

  • By default, the browser forbids access to resources that are not in the same domain as the server that delivered the page (for example, if your own rbc.com site embeds a feed from Facebook).
  • The `<script>` tag is not subject to this restriction, which is what JSONP relies on (GET only).
  • CORS has been standardized by the W3C; all compatible browsers can execute cross-domain HTTP requests.
  • CORS uses dedicated HTTP headers. They allow the browser and the server to learn enough about each other to decide whether the cross-domain request or response should succeed.
  • Browsers that support CORS:
    • IE 10+
    • Firefox 3.5+
    • Chrome 3+
    • Safari 4+
  • To allow full cross-origin support, send the Access-Control-Allow-Origin header in each HTTP response.
  • Sometimes a browser supporting CORS sends a preflight request in order to ask the server for permission to send the real request. This happens if:
    • The HTTP method is not GET, POST or HEAD
    • The Content-Type is not text/plain, multipart/form-data or application/x-www-form-urlencoded
  • Cookies and authentication:
    • By default, a cross-domain request does not propagate cookies or SSL client certificates.
    • On the client side, with jQuery: set xhrFields: { withCredentials: true }.
    • On the server side, send the following header: Access-Control-Allow-Credentials: true

CORS is strict: a cross-origin response cannot be read by the client without explicit server authorization.
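
The two preflight conditions listed above can be written as a small check. This is a simplified sketch: the function name `needs_preflight` is made up, and real browsers apply additional rules (for example on custom request headers) that are not covered here.

```python
SIMPLE_METHODS = {"GET", "POST", "HEAD"}
SIMPLE_CONTENT_TYPES = {
    "text/plain",
    "multipart/form-data",
    "application/x-www-form-urlencoded",
}


def needs_preflight(method, content_type=None):
    """Return True when one of the two listed conditions triggers a preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    if content_type is not None:
        # Ignore parameters such as "; charset=utf-8" when comparing.
        media_type = content_type.split(";")[0].strip().lower()
        return media_type not in SIMPLE_CONTENT_TYPES
    return False


if __name__ == "__main__":
    print(needs_preflight("GET"))
    print(needs_preflight("POST", "application/json"))
```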

  • Examples: Run the following web page in a browser (through a server) and check the request/responses:

CORS.png

Lequipe.fr allows cross-origin requests; lessentiel.lu does not:

CORS2.png

  • A server can define whether it allows cross-origin requests
  • The server can define a list of allowed clients
  • JSONP is an older (now obsolete) way to work around the same-origin restriction
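
On the server side, the decision "is this client allowed?" comes down to choosing which CORS headers to put in the response. The sketch below shows one possible way to do it; the function name `cors_headers` and the origins used are assumptions for illustration, and a real application would usually let its web framework handle this.

```python
def cors_headers(origin, allowed_origins, allow_credentials=False):
    """Return the CORS response headers for a request coming from `origin`."""
    headers = {}
    if origin in allowed_origins:
        # Echo the allowed origin back; the browser will then expose the response.
        headers["Access-Control-Allow-Origin"] = origin
        headers["Vary"] = "Origin"  # the response differs per origin, caches must know it
        if allow_credentials:
            headers["Access-Control-Allow-Credentials"] = "true"
    # No header at all means the browser blocks the cross-origin read.
    return headers


if __name__ == "__main__":
    allowed = {"https://rbc.com"}  # hypothetical list of allowed clients
    print(cors_headers("https://rbc.com", allowed, allow_credentials=True))
    print(cors_headers("https://other.example", allowed))
```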

#####b. Resource Building / Lifecycle #####

| Description | Step | Details |
| --- | --- | --- |
| A request is made when a link is clicked | REQUEST | |
| Page and resources are downloaded | RESPONSE | |
| The web browser uses the page resources to build the page | BUILD | Build the DOM (what to display); build the CSSOM (CSS Object Model) (how to display it); build the render tree (combines the DOM and CSSOM) |
| Page is rendered to the user | RENDER | Layout / reflow; paint |

The attached links are very useful to understand the lifecycle and to find tools for optimizing page rendering.

#####c. HTTP Header Caching (.htaccess file) #####

#####d. HTTP Server Side Caching #####

  • Caching is done on a proxy server
  • It does not overload the server itself; cached data are stored in the proxy
  • It relies on a reverse proxy
  • How it works (example with Apache Traffic Server):

ServerSideCaching.png

    • A client browser sends an HTTP request addressed to a host called host.com on port 80. Traffic Server receives the request because it is acting as the origin server (the origin server's advertised hostname resolves to Traffic Server).
    • Traffic Server locates a map rule in the config file and remaps the request to the specified origin server (com).
    • If the request cannot be served from cache, Traffic Server opens a connection to the origin server (or more likely, uses an existing connection it has pre-established), retrieves the content, and optionally caches it for future use.
    • If the request was a cache hit and the content is still fresh in the cache, or the content is now available through Traffic Server because of the previous step, Traffic Server sends the requested object to the client directly from the cache.
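
The four steps above can be summarized in a small in-memory model. This is not Apache Traffic Server code: the class name `ReverseProxyCache`, the fixed time-to-live used to decide freshness and the `fetch_from_origin` callback are all assumptions made to illustrate the principle.

```python
import time


class ReverseProxyCache:
    """Toy reverse proxy: remap the host, serve from cache, else ask the origin."""

    def __init__(self, remap_rules, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self.remap_rules = remap_rules          # advertised host -> real origin server
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds          # how long a cached object stays fresh
        self.clock = clock
        self.cache = {}                         # (origin, path) -> (stored_at, body)

    def handle(self, host, path):
        origin = self.remap_rules[host]         # step 2: apply the map rule
        key = (origin, path)
        entry = self.cache.get(key)
        now = self.clock()
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]                     # step 4: cache hit, content still fresh
        body = self.fetch_from_origin(origin, path)  # step 3: go to the origin server
        self.cache[key] = (now, body)           # keep it for future requests
        return body


if __name__ == "__main__":
    def fake_origin(origin, path):
        # Stands in for a real HTTP call to the origin server.
        print("origin contacted:", origin, path)
        return "content of " + path

    proxy = ReverseProxyCache({"host.com": "origin.internal"}, fake_origin)
    proxy.handle("host.com", "/index.html")  # contacts the origin
    proxy.handle("host.com", "/index.html")  # served from the cache
```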

#####e. CDN (Akamai, google CDN), Content Delivery Network #####

  • Refer to DNS and anycast: same principle.
  • Computers linked in a network cooperate in order to provide content to the user.
  • CDN operators provide cache servers to enterprises. This kind of solution is used by Microsoft, allocine, voyages-sncf, adobe, apple…
  • The Google CDN solution lets you configure your own CDN for free (using the Google cloud).
  • They are intermediaries between a server and the final client:
    • Cache solution in multiple locations
    • They reduce the load on the server
    • And accelerate the website response.
  • The user asks for a website. Instead of contacting the origin server for an HTML page with images…, the user is directed to a CDN server which is probably nearer than the origin server:
    • If the content is in the CDN server, it is returned to the user.
    • If it is not, the search continues in another CDN server of the same organization (Akamai…).
    • If the content is not known anywhere in the CDN organization, the request is sent to the origin server and the response is saved in the CDN cache.
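
The lookup order described above (nearest CDN server, then other servers of the same organization, then the origin) can be sketched as follows. The function name `cdn_lookup`, the use of plain dictionaries as caches and the choice to copy a sibling hit into the nearest cache are assumptions for illustration, not a description of how Akamai or Google actually work.

```python
def cdn_lookup(path, edge_cache, sibling_caches, fetch_from_origin):
    """Return (content, source) following the edge -> sibling -> origin order."""
    if path in edge_cache:
        return edge_cache[path], "edge"
    for sibling in sibling_caches:
        if path in sibling:
            edge_cache[path] = sibling[path]  # assumption: keep a copy close to the user
            return sibling[path], "sibling"
    content = fetch_from_origin(path)         # nobody in the CDN has it yet
    edge_cache[path] = content                # saved in the CDN cache for next time
    return content, "origin"


if __name__ == "__main__":
    edge = {}
    sibling = {"/logo.png": b"PNG"}
    print(cdn_lookup("/logo.png", edge, [sibling], lambda p: b"<html>"))
    print(cdn_lookup("/logo.png", edge, [sibling], lambda p: b"<html>"))
    print(cdn_lookup("/index.html", edge, [sibling], lambda p: b"<html>"))
```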

#####f. SOAP vs REST #####

  • SOAP is a protocol, REST is an architectural style.
  • The exchange between a SOAP client and a SOAP server is based on XML messages, so the client is strongly dependent on server updates, and the server on client updates.
  • REST creates less dependence because it relies on the standard HTTP methods. A RESTful API only requires the server to indicate the API entry point and the expected data types.

More details in an upcoming training.

#####g. Import a certificate on Was #####