CHARP Design - pupitetris/charp GitHub Wiki

Table of Contents: Structure, Principles, Transaction Structure, Diagram, Software Stack

Structure

The structure of a CHARP project is divided into three main parts:

Web Client Application, written in Javascript/CSS/HTML5, which is served as static files by the web server. We provide some libraries for the CHARP part and some basic helper routines for you to code your own application, but you can code your app however you want. We currently do our hacking in jQuery, and may go to ExtJS in the near future.
Web Server RPC broker, which you basically just install and configure in your web server. Made in Perl/FastCGI. Unless you have specific needs, you won't edit this code.
Database Server functionality, routines written in PostgreSQL's plpgsql, or perhaps plpython or another procedural language. Some routines come from the CHARP project and handle the request/response cycle, but most will be your own code for operating on your database information.

Principles

Our main principles are:

Distributed Processing: let the computer of the client do the hard work. If you are generating a report, let the client do the ordering and formatting, to reduce the requirements on the server as much as possible.
Minimum Transfer: once the application has loaded, the amount of information that travels between client and server should be kept to a minimum. For a report, for example, you send raw numbers with no formatting. Avoid sending unnecessary JOINs with information catalogs; if your catalogs are small enough, consider sending them to the client and letting it do the matching itself once the information arrives. Regarding markup, since XHR connections are the norm with CHARP, there's no need to repeatedly send HTML code to render the application's shell. That is mostly done on demand, but only once for every module.
Stateless transactions: each transaction should be independent from those called before. Each transaction should encompass a whole operation, and yet operations should be defined to be as reusable as possible. CHARP takes care of the authentication process, but not of authorization; you should check that the requesting user is authorized to perform the requested procedure, for every procedure. Parameter consistency asserts (validation) should be performed for every remote procedure as well. Basically, remote procedures should be agnostic of other remote procedures. CHARP remote procedures are essentially web services and, as such, the client cannot be trusted.
UNIX-style toolkit: every piece of technology must have a single, well-defined purpose. No lock-in into complex, multi-megabyte-download bloatware. You can choose your own CHARP stack; maybe you prefer Nginx instead of Apache; maybe you like vi more than emacs; maybe you want to use Prototype instead of jQuery. We are kind of locked into Postgres, but an Oracle database-side implementation wouldn't be too difficult to code, and a MySQL implementation would be great. CHARP is composed of ~550 lines of Perl, ~200 lines of plpgsql and ~250 lines of Javascript/jQuery. Simple, direct software.
Per-hacker environment: provide tools that make it trivial to set up an installation of the project on every developer's machine, so they can test their code on their own copy. This includes test databases, so they can code out in the field or while traveling without needing remote sessions or an internet connection. This also helps for quick QA setups.
Simple edit and test cycle: on client-side code, if you change a file, you just reload your browser to test your changes. No compilation, preprocessing or server restart necessary (there are a couple of exceptions to this). If you edit database-side code, just reload that code onto the DB server, or re-init the database with an automatic command, and you are ready to test.
Exception-driven: by far, the easiest way to handle errors is using exceptions. We make use of them so we can write compact code that doesn't check result codes after every call. CHARP takes full advantage of Postgres' exception handling and propagates exceptions up to the Javascript level, allowing for easy default handlers that give the user informative, meaningful reports.
Unit testing: unit testing infrastructure would be quite easy to implement, especially for the RPC broker and the database remote procedures.
Free Software: of course, all of our stack is Free. No lock-in into proprietary APIs or tools, so your intellectual investment is kept secure. CHARP is so easy and small that you may already have most of the knowledge required to use it, and if you do need to learn, it will be knowledge that you may later use for other non-CHARP developments.
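As an illustration of the Distributed Processing and Minimum Transfer principles, here is a minimal sketch of a client that receives raw rows and a small catalog and does the matching, ordering and formatting itself. The data, the field names and the buildReport function are hypothetical and not part of CHARP.

```javascript
// Hypothetical raw rows as a server might send them: only IDs and numbers.
const sales = [
    { product_id: 2, units: 5, amount: 1250.5 },
    { product_id: 1, units: 12, amount: 300 },
    { product_id: 2, units: 1, amount: 250.1 }
];

// Hypothetical catalog, downloaded once and kept on the client.
const products = [
    { id: 1, name: 'Widget' },
    { id: 2, name: 'Gadget' }
];

// Index the catalog by id so each lookup is a single Map access.
function indexById(catalog) {
    const index = new Map();
    for (const item of catalog)
        index.set(item.id, item);
    return index;
}

// Join, sort and format on the client instead of on the server.
function buildReport(rows, catalog) {
    const byId = indexById(catalog);
    return rows
        .map(function (row) {
            const product = byId.get(row.product_id);
            return {
                name: product ? product.name : '(unknown)',
                units: row.units,
                amount: row.amount
            };
        })
        .sort(function (a, b) { return b.amount - a.amount; })
        .map(function (row) {
            return row.name + ': ' + row.units + ' units, $' + row.amount.toFixed(2);
        });
}

const report = buildReport(sales, products);
```

The server only ships numbers and identifiers; product names travel once, with the catalog.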

Transaction Structure

In the beginning, CHARP was designed to send authenticated requests to a non-secure (HTTP) server without revealing the user's password over the network. To achieve this, a challenge scheme was implemented. The main idea of challenge authentication is that the client proves to the server that it has the right password for its account without revealing the password itself. To do this, the server sends a random number, called a challenge, which the client signs using its password. This signature is sent back through the network; the server then signs the same challenge with the password it has on record for that account, and if both signatures match, the user is authenticated.
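The idea can be sketched in a few lines using Node's built-in crypto module. This is only an illustration, not CHARP's actual code: the function names are hypothetical, and the exact concatenation order and encoding are assumptions based on the description in the Diagram section (SHA-256 over username, challenge and password).

```javascript
const crypto = require('crypto');

// Server side: a random challenge (32 random bytes, hex-encoded).
function createChallenge() {
    return crypto.randomBytes(32).toString('hex');
}

// Both sides: sign the challenge with the shared password.
// Assumption: SHA-256 over username + challenge + password.
function sign(username, challenge, password) {
    return crypto.createHash('sha256')
        .update(username + challenge + password)
        .digest('hex');
}

// Server side: recompute the signature from the password on record
// and compare it with the one received from the client.
function verify(username, challenge, signature, storedPassword) {
    const expected = Buffer.from(sign(username, challenge, storedPassword), 'hex');
    const received = Buffer.from(signature, 'hex');
    return expected.length === received.length &&
        crypto.timingSafeEqual(expected, received);
}

const challenge = createChallenge();
const accepted = verify('alice', challenge, sign('alice', challenge, 'secret'), 'secret');
const rejected = verify('alice', challenge, sign('alice', challenge, 'wrong'), 'secret');
```

Only the challenge and the signature cross the network; the password itself never does.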

On CHARP, authentication takes place for every single transaction. This allows the system to be essentially stateless: no session information is required for the user to make requests. Authentication is achieved by using two HTTP requests (XHRs) for each transaction: one for the request, and the other for the reply. For the request, the client asks for a resource with optional parameters and states its username. The server's result is just the challenge, which identifies the transaction. The client then immediately computes the signature and initiates the reply, sending both challenge and signature. If the server finds that the signature is right, the result of the reply is the requested information.
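The two-step exchange could be simulated in memory as follows. Everything here is a hypothetical stand-in: the Maps replace the database tables, handleRequest and handleReply replace the two HTTP endpoints, the error messages borrow the error keys used in the example further below, and discarding the challenge after one use is an assumption of this sketch.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory stand-ins for user records, resources and pending requests.
const users = new Map([['alice', 'secret']]);
const resources = new Map([
    ['user_auth', function () { return [{ success: true }]; }]
]);
const pendingRequests = new Map();

// Assumption: SHA-256 over username + challenge + password.
function sign(username, challenge, password) {
    return crypto.createHash('sha256')
        .update(username + challenge + password)
        .digest('hex');
}

// First HTTP request: register the request and return only the challenge.
function handleRequest(username, resource, params) {
    if (!users.has(username)) throw new Error('SQL:USERUNK');
    if (!resources.has(resource)) throw new Error('unknown resource');
    const challenge = crypto.randomBytes(32).toString('hex');
    pendingRequests.set(challenge, { username, resource, params });
    return challenge;
}

// Second HTTP request: check the signature, then run the stored request.
function handleReply(challenge, signature) {
    const request = pendingRequests.get(challenge);
    if (!request) throw new Error('unknown challenge');
    pendingRequests.delete(challenge); // assumption: a challenge is single-use
    const expected = sign(request.username, challenge, users.get(request.username));
    if (signature !== expected) throw new Error('SQL:REPFAIL');
    return resources.get(request.resource).apply(null, request.params);
}

// Client side: one transaction is a request followed by a reply.
function transaction(username, password, resource, params) {
    const challenge = handleRequest(username, resource, params);
    return handleReply(challenge, sign(username, challenge, password));
}

const result = transaction('alice', 'secret', 'user_auth', []);
```

The only state kept between the two requests is the pending request itself, keyed by its challenge; nothing resembling a user session is stored.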

New transaction structures have been implemented since. We now have anonymous transactions for requesting resources whose contents or execution are allowed for any client and so they are completed in only one HTTP request. We also have "file" transactions (which may also be anonymous), where the result is not a JSON-formatted string, but data that may be put into an HTML object, such as an image, a script, a PDF file, etc. Future transaction types include basic authentication, in case CHARP is running over HTTPS where challenge-authentication would be unnecessary, and transactions with non-HTTPS encryption, in case you don't have an HTTPS server available but you still want to somewhat protect the data of your transactions.

Diagram

These are the steps, at a moderate level of detail, that occur during a challenge-authenticated transaction.

We will assume the following example generates the transaction. The user_auth resource is a trivial function that always returns success; it is normally used to check whether the credentials provided by the user are correct. The idea is that if the credentials are wrong, the user will never be able to actually run the remote procedure, so bad credentials are caught by the exception-handling mechanism.

Markup:

 <div>Username: <input type="text" id="username" /></div>
 <div>Password: <input type="password" id="passwd" /></div>
 <div><button type="button" id="send-button">Send</button></div>
 <div id="result"></div>

Javascript:

 function authSuccess (data) {
     if (data && data.length && data[0].success)
         $('#result').text ('Success.');
     else
         $('#result').text ('Format error.');
 }
 
 function authError (err) {
     switch (err.key) {
         case 'SQL:USERUNK':
             $('#result').text ('User unknown.');
             return;
         case 'SQL:REPFAIL':
             $('#result').text ('Bad password.');
             return;
     }
     return true; // Call default error handler.
 }
 
 function sendClicked () {
     charp.credentialsSet ($('#username').val (),
         MD5 ($('#passwd').val ()));
     charp.request ('user_auth', [], {
         success: authSuccess,
         error: authError
     });
 }
 
 $('#send-button').click (sendClicked);

First, the application is downloaded from the web server. This may happen in incremental bundles, and not necessarily in a single step. Now that the application is loaded and running on the Web Client, a transaction is about to happen, so let's traverse the diagram, starting from the top with the User:

Action - The user fires an event by operating on an on-screen element of the application. An event handler (sendClicked) is called, which fires up a transaction (using CHARP.js:request), and registers callbacks to be called when the transaction finalizes (authSuccess and authError).
Request - An XMLHttpRequest is sent (CHARP.js:request) to the Web Server request URI, which lands on request.pl.
Requests contain the name of the resource to be executed (user_auth), the login of the user and the parameters for the resource (in our case, no parameters, an empty array).
request.pl uses its connection to the database to register a new request in the request table (request.pl:request_challenge). DBI::bind_param is used for all database queries to avoid SQL code injections. The client's IP address and current timestamp are added for further security and to delete expired requests.
The information is received by the Database Server through 05-functions.sql:charp_request_create, which checks that the user and the resource exist. The resource name is prepended with the prefix rp_, so only functions starting with that prefix may be called as remote procedures (in our case, rp_user_auth).
All this information is recorded in the request table as a new request, along with a set of 32 random bytes (in hexadecimal) that serve both as the primary key of the new record and as the challenge for the transaction, which is the returned value.
The challenge goes back to CGI-level.
Challenge - goes back through the HTTP connection serialized in JSON.
The client receives the challenge and using the SHA-256 hashing algorithm generates a 32-byte signature using a concatenation of the username, the challenge and the password as source (CHARP.js:requestSuccess/reply).
Response - The challenge response signature is sent back with another XHR to the Web Server reply URI, along with the challenge as identifier. The HTTP request lands again on request.pl, but is now handled by request_reply.
The challenge/signature tuple is sent along with the client IP address to the Database Server for checking and possible execution (request.pl:request_reply).
The Database Server compares the response signature with one it generates on its own using pgcrypto, from the challenge and the registered password of the user associated with the transaction. The originating IP address must match as well (05-functions.sql:charp_request_check). If the response is bad, an exception is raised.
If the response is good, then the requested remote procedure is called with the registered parameters (request.pl:request_reply_do and your own code in 05-functions.sql, in this case, CHARP's own rp_user_auth).
All remote procedures are assumed to return a table as a result. This result is sent back to CGI level.
Result - The resulting table and column names are serialized in a JSON array of arrays and sent back through the HTTP connection (request.pl:request_reply_do). These are deserialized and then received by CHARP.js:replySuccess, where data may be optionally transformed into an array of objects using the column names as keys.
Reaction - finally the registered callback is called (authSuccess or authError depending on the result) and the application gives the user feedback related to the resulting data.
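The optional transformation mentioned in the Result step, from an array of arrays into an array of objects keyed by column name, might look like the sketch below. The wire layout (a fields array next to a data array) and the rowsToObjects name are assumptions made for illustration; the actual format produced by request.pl may differ.

```javascript
// Hypothetical wire format: column names plus rows as an array of arrays.
const payload = '{"fields":["id","name"],"data":[[1,"Ana"],[2,"Luis"]]}';

// Turn each row array into an object keyed by column name.
function rowsToObjects(fields, rows) {
    return rows.map(function (row) {
        const obj = {};
        fields.forEach(function (field, i) {
            obj[field] = row[i];
        });
        return obj;
    });
}

const reply = JSON.parse(payload);
const objects = rowsToObjects(reply.fields, reply.data);
```

Sending the column names once, instead of repeating them in every row, keeps the transfer small; the client pays the cost of rebuilding the objects.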

Software Stack

These are the technologies we currently use:

Web Client: CHARP is mostly developed on Google Chrome, the most modern and efficient web browser. Firefox is a platform that we need to run perfectly as well. It is assumed that other standard browsers, such as those based on Webkit and Opera, should work fine. MSIE 8 and 9 are assumed to work, but your mileage may vary with earlier versions. There is no guarantee for MSIE 6, and it is the lowest of priorities.
Javascript: of course, Javascript. We use a few jQuery calls to implement the client-side CHARP library. jQuery-ui is used to show message dialogs for the default exception handler. We are studying ExtJS as our next UI toolkit, so maybe we'll have an ExtJS version of CHARP.js in the future, but we won't abandon jQuery due to its lightweight nature and because it seems more suitable for quick web projects. jQuery really helps to keep us away from browser compatibility issues.
HTML5/CSS: we recommend the use of HTML5/LocalStorage to keep track of user information that survives from session to session. Some old browsers may not support this technology. We also recommend the use of the HTML5 Canvas to render graphs and so on instead of pre-rendering on the server. HTML layout should be kept to a minimum, so most if not all of your visual description should be achieved through linked CSS.
JSON: JSON is a popular plain-text data format that allows easy serialization of high-level data structures (hashes and arrays of modern languages such as Perl, Python or Javascript). This format is quickly replacing XML as a transport for AJAX calls and web services. We use JSON as our default serialization format for transaction results. JSON-RPC would be a desirable feature for CHARP for better interoperability with other frameworks.
Apache: The Apache HTTPD web server is an excellent choice of proven, stable technology. Our example configuration files are for Apache, but you are free to choose any other web server that suits your needs, as long as it is compatible with FastCGI. For Apache, we use the mod_fcgid module, which has tons of configuration options to tweak the performance of FastCGI.
FastCGI: FastCGI is some really old technology that still gathers the few features required to keep CHARP able to scale. We were initially inclined to use mod_perl, but FastCGI is more secure and allows for alternative web server choices. We really didn't need most of the features that mod_perl provides anyway. FastCGI allows a CGI program to remain running in a main loop, allowing it to keep some resources loaded and ready for the next call. In our case, we take advantage of this to keep a database connection alive, and for a cache of prepared statements.
Perl: Perl was chosen since it is really fast and minimal, and it is universally available. It is also great for coding glue software, such as our RPC broker. You don't need to know any Perl if you are going to use the RPC broker out of the box.
PostgreSQL: Warning: we require version 9.x of PostgreSQL, as the new features in this version allow for more compact and easier plpgsql code. plpgsql was chosen because it's a very natural way to send SQL commands to the engine, with very little cruft getting in the way; it has proven to be a very elegant solution. Sometimes we use plpython to send e-mails and even connect to MySQL databases; clever hacks, those two, with good results.
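The benefit of a long-running process described under FastCGI, keeping prepared statements ready between calls, can be illustrated with a small sketch. The real broker is written in Perl; this Javascript version only shows the caching idea, and the prepare function, the query and the account table are hypothetical stand-ins for a database driver and schema.

```javascript
// Hypothetical stand-in for a database driver's "prepare" call; it counts
// how many times a statement is actually prepared.
let prepareCount = 0;
function prepare(sql) {
    prepareCount += 1;
    return {
        execute: function (params) { return { sql: sql, params: params }; }
    };
}

// The cache lives as long as the process, so it survives from one call to the next.
const statementCache = new Map();

// Prepare a statement only the first time its SQL text is seen.
function cachedStatement(sql) {
    let statement = statementCache.get(sql);
    if (!statement) {
        statement = prepare(sql);
        statementCache.set(sql, statement);
    }
    return statement;
}

// Simulated main loop: three calls reuse the same prepared statement.
function handleCall(login) {
    return cachedStatement('SELECT * FROM account WHERE login = ?').execute([login]);
}

['ana', 'luis', 'ana'].forEach(function (login) { handleCall(login); });
```

With plain CGI, each call would start a new process and pay the preparation cost again; here it is paid once per process.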