oca - networknt/light GitHub Wiki

Today’s software engineering approach has some challenges, and the OCA Framework is designed to address them.

Productivity

The OCA Framework supports and encourages Agile development. Agile software development is a group of software development methods in which requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. It promotes adaptive planning, evolutionary development, early delivery and continuous improvement, and it encourages rapid and flexible response to change. A group of people takes responsibility for the entire life cycle of the software and works with other teams on integration. This makes each team more productive, as decisions happen locally without management overhead. Each team has its own mission: to produce a reusable web component, view or application. The component team needs to be aligned with the view team, and the view team needs to be aligned with the application team. In this sense the teams are loosely coupled but tightly aligned to the same mission.

The framework itself provides many reusable common components, views and applications that are ready to be used or customized. Most applications can therefore be assembled from existing pieces in the OCA store, with only certain customizations needed. Of course, you may need to build your own domain-specific modules, but the existing ones give you examples to follow. The framework also encourages brands and developers to publish their modules. The more brands use your brand’s experience, the more brand value you have. The more developers use your modules, the more support and customization revenue you will earn as a developer.

By using the framework, large projects can be broken down into more manageable pieces, and integration happens continuously so that components, views and applications can grow gradually. This makes the development teams scalable and reduces the risks of large projects.

Quality Assurance

Different teams manage components, views and applications independently, and reusability is the main design goal. All pieces have unit tests and end-to-end tests to give end users confidence. Also, each team has a sample application so end users can play with the product.

On the front end, AngularJS is known as a testable JavaScript framework. The back end doesn’t have any container, so rules can be tested as POJOs. You don’t need to start a server to test your back-end code.
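As a rough illustration of testing a rule without a server, here is a minimal sketch. The framework's rules are Java POJOs; Python is used here only for brevity, and the CreateUserRule class, its execute method and the command fields are hypothetical names, not part of the framework.

```python
class CreateUserRule:
    """Hypothetical rule: a plain class with no container or server dependency."""

    def execute(self, command: dict) -> dict:
        email = command.get("email", "").strip()
        if "@" not in email:
            return {"status": 400, "error": "invalid email"}
        return {"status": 200, "userId": email.lower()}


def test_create_user_rule() -> None:
    # The rule is instantiated and called directly; no server is started.
    rule = CreateUserRule()
    assert rule.execute({"email": "Bob@Example.com"}) == {
        "status": 200,
        "userId": "bob@example.com",
    }
    assert rule.execute({})["status"] == 400


test_create_user_rule()
```

Because the rule is an ordinary object, the test runs in milliseconds and can be part of every build.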

Agile encourages the QA and DEV teams to work together as one team. The developers write the unit test cases and the testers write the e2e test cases.

If your organization won’t allow this, the development team will produce a DIT exit report to assist the QA team with testing.

The report contains some information generated by the tools and some information written by the developers.

• Version number
• Scope of the change
• Unit test cases (generated)
• E2e test cases (generated)
• Complexity and coverage (generated)
• What needs to be tested because it cannot be tested in the dev environment
• Dependencies (related to the scope of testing): which modules depend on the changed module and need to be regression tested

Release Management

The OCA framework is based on event sourcing. Deployment simply means generating an events file from the development environment and replaying those events on DIT, SIT, UAT, PAT and PROD.

Traditionally, releasing a new version of a product is very costly and risky, so some organizations limit the number of releases to 3 or 4 per year. Each release involves many teams and a long testing period, even when the change might be just one line of code ☺. An army of DBAs, system administrators and deployment engineers works together at deployment time, following a document step by step to get the job done. This hurts productivity and makes fixing defects and adding new features too slow to meet business needs in a dynamically changing world.

In the OCA framework, we want the benefits of agile development and continuous integration all the way to production. We encourage more deployments, with high velocity and short cycles, as the path to financial success. This conflicts with the traditional approach, which relies on fewer, bigger, thoroughly tested batch deployments to reach the same financial success.

The two approaches above have the same goal, yet they seem to conflict with each other. How can both lead to financial success? To understand that, we need to understand how risk is calculated.

ALE (Annual Loss Expectancy) = Single Loss Expectancy * Exposure Rate * Annualized Frequency

In our software release world, we can understand it as

Loss = Single loss per error * Percentage of deployment errors * Number of deployments

For example, if one error occurs in 100 deployments, each error costs $5000 and there are 4 deployments per year, then the ALE would be 0.01 * 5000 * 4 = 200.

The traditional approach is to reduce the number of deployments in order to reduce the loss.

Our approach is to increase the number of deployments while reducing both the single loss per error and the percentage of deployment errors. If this can be done, we can avoid financial losses due to downtime, bugs, noncompliance and loss of reputation.
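The comparison between the two approaches can be worked through with a small calculation. This is a minimal sketch: the first scenario uses the figures from the example in the text, while the numbers in the second scenario are purely hypothetical assumptions chosen for illustration.

```python
def annual_loss(single_loss: float, error_rate: float, deployments_per_year: int) -> float:
    """Loss = single loss per error * percentage of deployment errors * deployments."""
    return single_loss * error_rate * deployments_per_year


# Figures from the example in the text: 1 error per 100 deployments, $5000 each, 4 per year.
traditional = annual_loss(5000, 0.01, 4)

# Hypothetical figures for frequent, small deployments (assumptions, not measurements):
# 50 deployments per year, 1 error per 500 deployments, $500 per error.
frequent = annual_loss(500, 0.002, 50)

print(f"traditional: {traditional:.2f}, frequent: {frequent:.2f}")
```

With these assumed numbers the frequent approach has the lower annual loss, but only because both the error rate and the cost per error dropped. Raising the deployment count alone would increase the loss.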

Let’s look at the sources of errors and try to lower the percentage of occurrence.

• Defects in code. These can be addressed by unit test cases and e2e test cases. If we have enough coverage, we can change the code with confidence.

• Errors in assembly or packaging.
  • Fast tests in continuous integration and delivery
  • Fail on slow tests and on violations of architecture and coding standards
  • Clean build of everything from the Git repository
  • Deploy the same way everywhere using events
  • Manage dependencies and versions with a graph database
  • Manage Git branches and trunk through the database to map them to different releases and environments
  Basically, make everything automatic.

• Errors executing changes. Make deployment the same process everywhere by simply replaying a series of events, which include database updates, business rule updates, rule data updates, template updates, app and experience updates, etc. Basically, we don’t need an army for deployment; it is one click at the right time and place.
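The idea of deploying by replaying events can be sketched as follows. This is a simplified illustration under stated assumptions: the event fields (sequence, kind, name, version) and the replay function are hypothetical, and each environment is modelled as a plain dictionary starting from empty rather than a real server.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]
State = Dict[str, Dict[str, Any]]


def replay(events: List[Event]) -> State:
    """Apply events in sequence order to an empty environment and return its state."""
    state: State = {}
    for event in sorted(events, key=lambda e: e["sequence"]):
        # Each event overwrites the named artifact of its kind with a new version.
        state.setdefault(event["kind"], {})[event["name"]] = event["version"]
    return state


# Hypothetical events file exported from the development environment.
events_file = [
    {"sequence": 1, "kind": "rule", "name": "applyDiscount", "version": "1.0.1"},
    {"sequence": 2, "kind": "template", "name": "productPage", "version": "1.0.2"},
    {"sequence": 3, "kind": "rule", "name": "applyDiscount", "version": "1.0.2"},
]

# The same replay runs for every environment, so the resulting states are identical.
environments = {name: replay(events_file) for name in ("DIT", "SIT", "UAT", "PAT", "PROD")}
```

Since every environment goes through the same function with the same input, a deployment that worked on DIT is the same operation that later runs on PROD.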

Now, let’s look at the cost of error and see if we can reduce it.

• Zero-downtime deployment
• Database migrations and schema-less storage (a database change won’t break the previous version of the code)
• Versioned identifiers for assets
• Protocol versioning
• Endpoint versioning
• Decoupled architecture
• Separate data and logic so they can be deployed independently
• Configurable default version for every component
• Let end users choose whether they want to use the updated version
• Employees try out the new version before it becomes the default version
• Users can downgrade if they don’t like the new version

Basically, the user owns the experience.

In order to achieve the above, we have to make our deployment unit as small as possible. Within the framework we have components, views and apps, and each of them can be versioned and deployed independently. Further, they can be broken up into even smaller pieces that are deployed independently.

For example, a component can have the following parts, each of which can be deployed and versioned independently.

  1. AngularJS code (front end)
  2. Template (front end)
  3. Rules (back end)
  4. Rules Data (back end)
  5. Reference and configuration (back end)

For example, suppose only the template gets a new version, 1.0.2, deployed on the server while the other pieces stay at version 1.0.1. We then have a component version 1.0.2. One site can use version 1.0.1 and another site can use 1.0.2, which also allows a site to customize the template for its channel.

Going even further, we can set template 1.0.1 as the default so that all customers get the default template, while we ask our employees to try version 1.0.2 for a while before making it the default.
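The default-version and pilot idea can be sketched like this. It is only an illustration: the DEFAULT_VERSIONS and PILOT_VERSIONS tables, the audience names and the resolve_versions function are hypothetical, and in the framework this information would come from configuration rather than module-level constants.

```python
from typing import Dict, Optional

# Hypothetical version tables; in the framework these would come from configuration.
DEFAULT_VERSIONS = {"angular": "1.0.1", "template": "1.0.1", "rules": "1.0.1"}
PILOT_VERSIONS = {"employee": {"template": "1.0.2"}}


def resolve_versions(audience: str, chosen: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Start from the defaults, then apply pilot overrides, then the user's own choice."""
    versions = dict(DEFAULT_VERSIONS)
    versions.update(PILOT_VERSIONS.get(audience, {}))
    versions.update(chosen or {})
    return versions


customer_versions = resolve_versions("customer")
employee_versions = resolve_versions("employee")
```

Promoting 1.0.2 to the default is then a single data change to DEFAULT_VERSIONS, and a user who prefers the old template can pass it as an explicit choice.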

Although each piece can be deployed independently, the pieces are loaded dynamically at the view level as part of Angular routing. When Angular bootstraps, the providers are saved so they can later be used to lazy load and register controllers, directives, filters, services, factories, providers, etc. When Angular requests a page, a page id and page version are passed to the server (no version means the default version is used). The server checks the dependencies of the page, assembles all the pieces (JavaScript code and templates) and sends them to Angular as the response. This happens only the first time; the next time the same version is requested, the server simply responds with the cached page. The cache is updated only when a piece of the page is changed through an event.
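The server-side assembly and caching described above might look roughly like the following sketch. The PageAssembler class, its method names and the way pieces are stored are all assumptions made for illustration; the real server resolves dependencies rather than reading a prepared list.

```python
from typing import Dict, List, Optional, Tuple

PageKey = Tuple[str, str]  # (page id, page version)


class PageAssembler:
    def __init__(self, pieces: Dict[PageKey, List[str]], default_versions: Dict[str, str]) -> None:
        self.pieces = pieces
        self.default_versions = default_versions
        self.cache: Dict[PageKey, str] = {}
        self.assemblies = 0  # counts real assemblies, to make cache hits visible

    def get_page(self, page_id: str, version: Optional[str] = None) -> str:
        # No version in the request means the default version is used.
        key = (page_id, version or self.default_versions[page_id])
        if key not in self.cache:
            self.assemblies += 1
            self.cache[key] = "\n".join(self.pieces[key])
        return self.cache[key]

    def on_piece_changed(self, key: PageKey, pieces: List[str]) -> None:
        # An event changed a piece of the page: store it and drop the cached page.
        self.pieces[key] = pieces
        self.cache.pop(key, None)


assembler = PageAssembler(
    pieces={("home", "1.0.1"): ["<div>home template</div>", "// home controller"]},
    default_versions={"home": "1.0.1"},
)
first = assembler.get_page("home")
second = assembler.get_page("home", "1.0.1")
```

The second request is served from the cache, so the page is assembled only once until an event invalidates it.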

During the assembly phase, the configuration data can be combined with the logic, and the final page is pre-processed. For example, a dynamic dropdown list for a form component is generated in this phase.

Breaking up a component to this level is not mandatory; it makes sense to package a simple component together and give it a single version. You only need to break it up if your component is complicated and has many moving parts that are configurable and customizable.

Production Configuration

To make the application configurable in production, we need to separate logic and data. The framework has three levels of configuration that can be performed in production, each with a different level of risk.

The first level is reference data configuration. Most applications have reference data such as dropdowns and translations. These are saved into a set of schemas or tables and can be changed through a table maintenance app. The reference data is cached but refreshed after midnight. This is the lowest-risk change in production, as most of the time it only affects the UI look and feel and can be rolled back if it has a negative impact. Of course, a certain level of validation has to be done and an approval process must be in place.

The second level is rules data configuration

All requests are handled by Light Rule Engine rules, and rules are designed in two parts: data and logic. This level addresses changes to rule data. It is low risk because it doesn’t touch the rule logic, and the rule logic can be written to validate the data. For example, the system admin has the right to run a promotion that discounts one product by 10 percent. The 10 percent is the data, and the rule might validate that it falls between 1 and 99, or between 1 and 55. This piece of data is more important than reference data because it affects application logic, but it is isolated from the rules, so it can be changed easily without breaking the application.
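The separation of rule data from rule logic can be illustrated with a small sketch. Real rules in the framework are Java POJOs; this Python version, including the PromotionRule class and its update_data method, is a hypothetical stand-in that uses the 1 to 99 range from the example.

```python
class PromotionRule:
    """Hypothetical rule: the logic is fixed, the discount percentage is rule data."""

    MIN_PERCENT = 1
    MAX_PERCENT = 99

    def __init__(self, discount_percent: int) -> None:
        self.discount_percent = self._validated(discount_percent)

    def _validated(self, percent: int) -> int:
        if not self.MIN_PERCENT <= percent <= self.MAX_PERCENT:
            raise ValueError(
                f"discount must be between {self.MIN_PERCENT} and {self.MAX_PERCENT}"
            )
        return percent

    def update_data(self, percent: int) -> None:
        # A production data change: validated by the rule logic, which stays untouched.
        self.discount_percent = self._validated(percent)

    def apply(self, price: float) -> float:
        return price * (100 - self.discount_percent) / 100


rule = PromotionRule(10)
discounted = rule.apply(200.0)
```

An admin can move the discount from 10 to 20 percent through update_data, while an out-of-range value such as 100 is rejected before it can affect any request.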

The third level is rule logic configuration

The rules are just POJOs and can be updated and deployed through the application interface. This change is bigger, but the risk is still manageable, as you only need to regression test the components, views and apps that depend on the rule. Rules work independently, so if one fails it only impacts one area of the app, and it can easily be rolled back.

Security

API security, or resource security, is handled with JWT tokens. When a user tries to access a protected resource, the server checks whether the access token is in the HTTP request header. If it doesn’t exist, the user is redirected to the login page. The access token is short-lived, lasting up to 30 minutes. Once it expires, a 401 response with token_expired is sent back to the client. The client then uses the refresh token if the user checked “remember me” when logging in; otherwise the login page is shown.
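The decision flow above can be sketched as two small functions. This is not real JWT handling: the claims are assumed to be already decoded and signature-checked, and the Outcome enum, check_access and on_token_expired are hypothetical names used only to show the branching.

```python
from enum import Enum
from typing import Optional

ACCESS_TOKEN_TTL_SECONDS = 30 * 60  # the text's "up to 30 minutes"


class Outcome(Enum):
    ALLOW = "allow"
    REDIRECT_TO_LOGIN = "redirect_to_login"
    TOKEN_EXPIRED = "401 token_expired"


def check_access(claims: Optional[dict], now: float) -> Outcome:
    """Server side: decide what to do with a request to a protected resource."""
    if claims is None:  # no access token in the HTTP request header
        return Outcome.REDIRECT_TO_LOGIN
    if now >= claims["exp"]:  # "exp" is the standard JWT expiry claim (seconds since epoch)
        return Outcome.TOKEN_EXPIRED
    return Outcome.ALLOW


def on_token_expired(has_refresh_token: bool) -> str:
    """Client side: refresh if a refresh token exists, otherwise show the login page."""
    return "refresh_access_token" if has_refresh_token else "show_login_page"


issued_at = 1_000_000.0
claims = {"userId": "u1", "roles": ["user"], "exp": issued_at + ACCESS_TOKEN_TTL_SECONDS}
```

A request one minute after issue is allowed, while a request at or after the 30 minute mark gets the token_expired outcome and the client decides between refreshing and logging in again.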

The access token contains roles and a userId so that the resource server can grant access through role-based or user-based authorization.
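A minimal sketch of combining the two kinds of authorization follows. The claim names roles and userId come from the text; the is_authorized function, its parameters and the rule that either check is sufficient are assumptions for illustration.

```python
from typing import Iterable, Optional


def is_authorized(
    claims: dict,
    allowed_roles: Iterable[str] = (),
    owner_id: Optional[str] = None,
) -> bool:
    """Grant access if the token carries an allowed role or belongs to the resource owner."""
    has_role = bool(set(claims.get("roles", [])) & set(allowed_roles))
    is_owner = owner_id is not None and claims.get("userId") == owner_id
    return has_role or is_owner


admin = {"userId": "u1", "roles": ["admin", "user"]}
visitor = {"userId": "u2", "roles": ["user"]}
```

For example, an admin-only menu would call is_authorized(claims, allowed_roles=["admin"]), while a profile page would pass the profile owner's id as owner_id.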

Visibility control is put in place based on the user’s role. For example, certain menus won’t be shown unless you log in with an admin role, and certain web components show only part of the data when the user role is just anonymous.

The OCA framework server provides another layer of security for the back-end legacy system, because the Angular application does not talk to the back-end API directly. This layer also validates requests before calling the back-end API, so many invalid requests are filtered out.

Performance

Monitoring

Traceability is more important with an Angular application, as it runs in the end user’s browser. The server doesn’t hold the state of the user session; only the Angular application knows it. For this reason, event sourcing is used to log all the events happening on the browser side. Every user action generates an event, which is sent to the server along with the JWT token that identifies the user. The server logs the events into the event store.

An uncaught runtime exception in Angular is logged as an event, and it can easily be reproduced from the series of events leading up to it for the same user in the event store.
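Pulling the trail of events that led to an error could be sketched as below. The event shape (id, userId, type, data) and the events_leading_to function are hypothetical, and the event store is modelled as an in-memory list that is assumed to be ordered by time.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def events_leading_to(event_store: List[Event], user_id: str, error_event_id: int) -> List[Event]:
    """Return the user's events, in stored order, up to and including the error event."""
    trail: List[Event] = []
    for event in event_store:  # assumed to be ordered by time
        if event["userId"] != user_id:
            continue
        trail.append(event)
        if event["id"] == error_event_id:
            return trail
    raise LookupError(f"event {error_event_id} not found for user {user_id}")


event_store = [
    {"id": 1, "userId": "alice", "type": "openPage", "data": "home"},
    {"id": 2, "userId": "bob", "type": "openPage", "data": "home"},
    {"id": 3, "userId": "alice", "type": "submitForm", "data": "signup"},
    {"id": 4, "userId": "alice", "type": "uncaughtException", "data": "TypeError"},
    {"id": 5, "userId": "alice", "type": "openPage", "data": "help"},
]
trail = events_leading_to(event_store, "alice", 4)
```

Replaying the returned trail in a test browser session gives the support team the same sequence of actions the user performed before the exception.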

Server error responses are logged on the server side, as it is known who sent the request. For example, a 404 error response is sent to the client and the support team needs to reproduce it.

A server-side exception is logged with its stack trace, and it can be reproduced along with the events leading to it.

A security violation is logged when the system identifies that a request was not sent from our AngularJS app but is a raw request with missing data or wrong parameters.

System statistics can be viewed from the admin page, with information such as how many users are online and how many requests were served within a period of time.

Health check is an application that checks certain areas of the application, based on configuration data, to make sure the overall system is healthy. For example, it checks connectivity with the legacy system. It is normally called once a new release is deployed and when the system behaves strangely.
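A configuration-driven health check could be sketched as follows. The run_health_checks function, the check names and the UP/DOWN report format are assumptions for illustration; legacy_probe is a stand-in that simulates a failed connectivity test.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run every configured check and report UP or DOWN for each area."""
    report: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            report[name] = "UP" if check() else "DOWN"
        except Exception as exc:  # a failing probe must not abort the other checks
            report[name] = f"DOWN ({exc})"
    return report


def legacy_probe() -> bool:
    # Stand-in for a real connectivity test against the legacy system.
    raise ConnectionError("legacy system unreachable")


report = run_health_checks({"eventStore": lambda: True, "legacySystem": legacy_probe})
```

Keeping the checks in a dictionary means new areas can be added through configuration without changing the runner.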

User behaviour analysis is an app that analyzes users’ online behaviour, which can be very valuable for driving sales. For example, when a customer goes to a bank branch to deposit a check, the salesperson can know that the customer was browsing life insurance products on a mobile phone the day before.

Module update notification monitors whether there are any security updates from the framework and notifies the system admin to take action.

Legacy Coexistence

The framework can work alongside a legacy web application. This may be a requirement for organizations that have invested heavily in a large system and cannot convert everything to the OCA framework in one step. They can switch part of the site to OCA and leave the rest running on the legacy server. When a user logs in, one request is sent to the legacy server to create the session and another is sent to the framework’s Authentication/Authorization server to get an access token. OCA modules use the JWT access token to talk to the OCA server, and the existing pages still talk to the legacy server using the session.

The above assumes that the OCA application and the legacy application have only a routing relationship. If an OCA component is embedded into a legacy page, things become more complicated: we need to manage communication with the legacy components, CSS conflicts, etc. It is doable but not encouraged, as it might be more work than converting the apps page by page.

OCA Server

The most important role of the OCA server is to add another layer of security in front of our legacy system API. Otherwise, our legacy systems would be exposed to the outside world and subject to attacks. The OCA server validates all requests from the browser and makes sure only valid requests reach the resource server, and it is designed to identify attacks or misuse.

The OCA server supports integration with legacy systems. Instead of changing the legacy systems to provide a REST API, we leave them alone and the OCA server acts as a proxy to the existing legacy API. This has two benefits: it avoids costly updates to the legacy system, and it gives our OCA applications and experiences a relatively stable API even if the legacy API changes.
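The shielding effect of the proxy can be illustrated with a small translation function. The legacy field names ACCT_NO and BAL_CENTS, the stable field names and the to_account_summary function are all invented for this sketch; a real proxy would also make the call to the legacy API.

```python
def to_account_summary(legacy_record: dict) -> dict:
    """Translate a legacy record into the stable JSON shape served to OCA apps."""
    return {
        "accountId": legacy_record["ACCT_NO"].strip(),
        "balance": int(legacy_record["BAL_CENTS"]) / 100,
    }


# Hypothetical response from the legacy API.
legacy_record = {"ACCT_NO": " 12345 ", "BAL_CENTS": "150075"}
summary = to_account_summary(legacy_record)
```

If the legacy API later renames a field, only this translation changes; the Angular components keep consuming the same accountId and balance fields.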

The OCA server provides references and configurations for our OCA apps and experiences. This allows us to develop data-driven components with customization in mind. For example, we can have a form component that renders different forms given different form schemas and form configurations. Another example is an account summary whose number of columns and column headers can all be customized.

The OCA server also serves as a distributed cache layer for the legacy system, and data is cached in its final consumption JSON format.

Future proof

AngularJS 1.X vs AngularJS 2.X and ES5 vs ES6

The changes between AngularJS 1.x and 2.x are huge, and there is no clear migration path at the moment. To protect our investment, we should write our code in 2.x style where possible so that the migration won’t be so painful. Also, the JavaScript language is in transition between ES5 and ES6; today we can use some ES6 features through a transpiler such as Traceur or 6to5. Both have grunt and gulp tasks ready to be used.

Background and Attribution

Some of the challenges faced by the OMNI-Channel Architecture group are not unique to TD. Other organizations have used a software engineering approach similar to OCA to deal with these issues. Recently, Spotify consultant Henrik Kniberg assembled two videos describing the music site's engineering culture, which is uncannily similar to our own. Michael T. Nygard's video presentation "Disband the Deployment Army" and lecture deck make a quantifiable call to arms to simplify and cut the costs of software deployments. A few years back, Google employee Steve Yegge wrote a scathing missive within Google that leaked out, lambasting his employer for not embracing the company-as-a-platform model so well monetized by his former employer Amazon. It gives a cursory overview of Amazon's foray into that market and references other platform-first companies. What follows is a breakdown and explanation of the OMNI-Channel Architecture, influenced by the tone and principles defined in Kniberg's videos, Nygard's presentation and Yegge's rant.