LiNK Distributed Network - RadiusDataSystemsLLC/LiNK GitHub Wiki
The LiNK Distributed Network was designed for large scale data processing, AI, and analytics. It's a complete data document and metadata-driven distributed processing infrastructure. It's highly modular to support all levels of customization.
The LiNK Distributed Network was built on top of LiNK Core and LiNK Simple Services, as such it shares many of the design considerations with those portions of the LiNK infrastructure.
The LiNK Distributed Network was designed with the following considerations:
- **Scalability** - To accommodate varying levels of load and flexibility of implementation, it was important that the LiNK Distributed Network support both horizontal and vertical scaling.
- **Reduction of Networking Overhead** - The LiNK Distributed Network utilizes a node announcement pattern for node communication, diagnostics, and simplified triage (self-healing), preventative, and diagnostic-reactive messaging. This pattern reduces network traffic compared to poll-based systems and operations.
- **Modularity** - The LiNK Distributed Network was designed with modularity in mind. Most features pertaining to common services, processing algorithms, and networking scenarios are configuration-only. All custom processes and logic are delegated to an interface-defined module implementation.
- **Data document and metadata-driven operations** - The LiNK Distributed Network was built around the concept of data document-driven operations. The use of data documents allows a single, complete set of operational instructions to be delivered from the end user or application to the LiNK Distributed Network.
- **Freedom of Infrastructure** - Already using Kubernetes or Docker? Want to run the LiNK Distributed Network on Azure, AWS, or Google Cloud? No problem. Infrastructure- and platform-specific APIs aren't used in the LiNK Distributed Network. This makes the system portable and leaves you free to choose, without being locked into a golden cage or tied to a particular platform or infrastructure.
- **Cross-Platform support** - All components of the LiNK Distributed Network were designed against the .NET Standard 2.0 specification. All components and functionality support Windows, CentOS, and Ubuntu unless otherwise noted. The LiNK Distributed Network may also function on macOS, Red Hat Linux, and other Linux distributions, but the platforms listed above are the ones we test on a regular basis.
- **XML and JSON support** - Both XML and JSON are supported. All communication contracts and serializable data structures, including but not limited to metadata-defined operational instructions, are required to support both XML and JSON for improved interoperability with other languages and technologies.
- **Commenting Structure** - The source code of the LiNK Distributed Network was designed to be easy to read in itself. Beyond that, someone who doesn't know C# or Java should be able to read the comments and discern the operating structure. All that's required is an understanding of object-oriented fundamentals.
- **Doesn't require a certain database technology or schema** - While the LiNK Distributed Network supports several different kinds of RDBMS, document, and distributed databases, it doesn't require one to function. It can be entirely configuration-file and directory based if desired.
- **No UI required** - While there is a UI framework (LiNK UI) that may have some corresponding modules for ease of use, and a LiNK Simple Services service module to simplify the creation of metadata-based operational instructions, the LiNK Distributed Network doesn't require a UI to operate. How you use it, and how you generate the specified data document, is up to you.
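The data document and XML/JSON considerations above can be sketched with a toy profile serialized both ways. The field names here are illustrative assumptions, not the actual LiNK schema:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, minimal data-profile structure; the keys are illustrative
# assumptions, not the actual LiNK data-profile schema.
profile = {
    "name": "example-profile",
    "acquisition": {"source": "https://example.com/feed", "method": "http"},
    "delivery": {"destination": "warehouse", "concurrent": True},
}

# JSON form of the same operational instructions.
profile_json = json.dumps(profile)

# Equivalent XML form, built from the same dictionary.
root = ET.Element("DataProfile", name=profile["name"])
for section in ("acquisition", "delivery"):
    node = ET.SubElement(root, section)
    for key, value in profile[section].items():
        node.set(key, str(value))
profile_xml = ET.tostring(root, encoding="unicode")
```

Either representation carries the complete set of instructions, which is what lets a single document drive the whole workflow.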
Each portion of the data profile corresponds to the portion of the LiNK Distributed Network operational workflow in which those instructions are used. See Operational Workflow for more information. The LiNK Distributed Network utilizes a workflow that is split into five operational steps:
- Data Acquisition
- Data Identification
- Data Validation
- Data Processing
- Data Delivery
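As a rough sketch of how a five-step workflow like this might execute, assuming each step is a callable that receives its section of the data profile plus the data produced so far (the step logic below is placeholder, not LiNK's actual implementation):

```python
# Each step receives its parameters from the data profile and the data
# produced by the previous step; all step bodies are placeholders.
def acquire(params, data):
    return {"raw": params.get("source", "inline")}

def identify(params, data):
    return {**data, "format": "audio"}

def validate(params, data):
    return {**data, "valid": True}

def process(params, data):
    return {**data, "result": "transcript"}

def deliver(params, data):
    return {**data, "delivered_to": params.get("destination")}

# The five operational steps, in workflow order.
WORKFLOW = [
    ("DataAcquisition", acquire),
    ("DataIdentification", identify),
    ("DataValidation", validate),
    ("DataProcessing", process),
    ("DataDelivery", deliver),
]

def execute(profile):
    """Run each workflow step in order, feeding it its section of the profile."""
    data = {}
    for step_name, step_fn in WORKFLOW:
        data = step_fn(profile.get(step_name, {}), data)
    return data

result = execute({"DataAcquisition": {"source": "file.mp3"},
                  "DataDelivery": {"destination": "warehouse"}})
```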
Each operational step of the workflow performs the following operations:
- During each stage of the workflow, the parameters defined for that stage in the provided data profile are parsed and executed.
- These parameters are executed either by passing them to an operational module for execution or by utilizing them in a core operation. See Core Operations for more information on core operational functionality. See Operational Modules for more information on what operational modules are currently available.
- A workflow step's functionality can be extended at any time. All one has to do is create an operational module that implements the step's module interface, generate a verification signature from the assembly post-compile, register it with the network, and transmit it alongside the data profile and associated payload; the rest is handled automatically.
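The extension pattern described above (implement the step's interface, generate a verification signature, register with the network) might look roughly like this. The interface, registry, and signature scheme are hypothetical stand-ins for the actual .NET types:

```python
import hashlib
from abc import ABC, abstractmethod

# Hypothetical module interface for the Data Processing step; the real LiNK
# interfaces are .NET types, so this is only a structural sketch.
class ProcessingModule(ABC):
    @abstractmethod
    def execute(self, parameters: dict, data: dict) -> dict: ...

class UppercaseModule(ProcessingModule):
    """Toy module: upper-cases one text field of the incoming data."""
    def execute(self, parameters, data):
        field = parameters.get("field", "text")
        return {**data, field: data.get(field, "").upper()}

def verification_signature(assembly_bytes: bytes) -> str:
    # Stand-in for the post-compile verification signature: a content hash
    # the network could check before loading the module.
    return hashlib.sha256(assembly_bytes).hexdigest()

# Toy in-memory registry standing in for registration with the network.
registry = {}

def register(name, module, signature):
    registry[name] = (module, signature)

sig = verification_signature(b"compiled-module-bytes")
register("uppercase", UppercaseModule(), sig)
```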
The Data Acquisition and Data Delivery stages differ somewhat with regard to concurrency. In their respective steps, data can be acquired from multiple sources or delivered to multiple destinations concurrently, or in an ordinal fashion when data profiles are chained together. See Data Profile Chaining for more information.
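A minimal sketch of concurrent versus ordinal delivery, assuming a hypothetical `deliver()` that pushes one payload to one destination:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder for a real delivery call (HTTP POST, file write, etc.).
def deliver(destination, payload):
    return f"{payload} -> {destination}"

def deliver_all(destinations, payload, concurrent=True):
    if concurrent:
        # Concurrent fan-out: all destinations are written in parallel.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda d: deliver(d, payload), destinations))
    # Ordinal fashion: destinations are written one after another.
    return [deliver(d, payload) for d in destinations]

receipts = deliver_all(["warehouse", "email"], "dataset")
```

`ThreadPoolExecutor.map` preserves input order, so both modes return receipts in the order the destinations were listed.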
"Core operations" refers to functionality that is built natively into the LiNK infrastructure (LiNK Core specifically) and therefore requires no additional investment of time or resources to utilize (no need to create an operational module). These features are available right out of the box.
Core Operations Feature Matrix:
"Operational modules" are modules created to extend the functionality of the LiNK Distributed Network. They are organized by the workflow operational step they are designed to enhance or extend.
These operational modules typically extend a workflow operational step to accommodate complex, industry- or business-specific, and uncommon operations. They range from artificial intelligence-driven analysis to domain- or industry-specific integrations and other operations.
The LiNK Distributed Network was designed by Radius Data Systems to provide an easy-to-use and easy-to-manage infrastructure to power our intelligent products and solutions. As such, we've created the following operational modules, which are not only available for use but also serve as examples of the modularity and flexibility of the platform itself:
All data profiles support "chaining". "Data profile chaining" is the ability to embed multiple child data profiles within a parent profile. During execution, the data profiles are executed in an ordinal fashion, starting with the parent and then moving on to the first child. Once that first child profile has executed, the system moves on to the next child in line, and so on.
At its data delivery step, the parent and each child profile can be independently configured to deliver the resulting data to a provided destination or to pass it on to the next profile in line. If desired, a profile can also be configured to do both concurrently at each delivery step, since both delivery and acquisition support concurrent destinations and sources.
This chaining ability is incredibly useful and can be a very powerful tool to streamline or simplify operations.
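The chaining behavior described above can be sketched as a simple ordinal loop. Here `run_profile` stands in for a full five-step workflow run, and the field names are illustrative assumptions:

```python
# Placeholder for executing one profile's full five-step workflow;
# it just records which profile touched the data.
def run_profile(profile, data):
    return data + [profile["name"]]

def run_chain(parent, initial_data, sink):
    """Execute the parent, then each child in order, honoring delivery flags."""
    profiles = [parent] + parent.get("children", [])
    data = initial_data
    for profile in profiles:
        data = run_profile(profile, data)
        if profile.get("deliver_to_destination"):
            # Deliver a copy of the current result to the configured sink.
            sink.append((profile["name"], list(data)))
        # Output also flows on to the next profile in line.
    return data

delivered = []
final = run_chain(
    {"name": "parent", "deliver_to_destination": False,
     "children": [{"name": "child-1", "deliver_to_destination": True}]},
    [], delivered)
```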
Because of this flexibility and the numerous options available, we've illustrated an example of how the LiNK Distributed Network and its data profiles can be used to power an intelligent application.
The following is a fictional product, but a good example of how the LiNK Distributed Network and data profiles can function, especially when data profiles are chained together:
The user often listens to podcasts and video streams relating to the financial industry and upcoming laws and regulations. While listening, they determine whether the subject matter provides any meaningful or new information and spend time searching for and researching topics or other information discussed. This may involve searching for other media or analysis on topics mentioned, referenced documents, or additional content from social media.
This process is seen as valuable to stay "in the know" and for professional purposes but can be very time-consuming and is left mainly up to their search skills and their known sources of information.
An application that takes a list of video streams and podcasts from popular sites or hosting services, transcribes them, automatically identifies tags that could be indexed based on recognized topics and entities, and finally compiles available references and additional media for review and research purposes.
The application would then allow users to search for topics they are interested in and, once a podcast or video stream is selected, display it along with all associated material.
This would allow for more topic-targeted search and increase access to information while saving time and effort.
We could have easily set up a third data profile in this chaining and operational example that would first crawl the internet and identify podcasts and video streams for processing. However, for the sake of ease of illustration, we only illustrate two chained data profiles and the subsequent operations. We also assume that a video stream or audio file (acquisition location and method specified in data profile or physical file provided alongside data profile) would have been identified prior to execution.
In the example below, two data profiles are chained together. The first (the "parent data profile") takes the audio provided or extracts the audio from the stream, transcribes it, identifies key terms and subjects, and passes all resulting information to the second (the "child data profile"). The second data profile takes the information provided and crawls the internet to locate, verify, and collate those findings into a complete dataset. That dataset is then delivered to a data warehouse or other data storage mechanism for the application to ingest and, if configured to do so, emailed as a data briefing to the configured recipients.
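A chained data profile for this example might look roughly like the following JSON. Every field name here is a hypothetical illustration, not the actual LiNK profile schema:

```json
{
  "name": "transcribe-and-tag",
  "DataAcquisition": { "source": "stream-or-audio-file" },
  "DataProcessing": { "operations": ["extract-audio", "transcribe", "identify-key-terms"] },
  "DataDelivery": { "passToNextProfile": true },
  "children": [
    {
      "name": "research-and-collate",
      "DataProcessing": { "operations": ["crawl", "verify", "collate"] },
      "DataDelivery": {
        "destinations": ["data-warehouse", "email-briefing"],
        "concurrent": true
      }
    }
  ]
}
```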
The LiNK Distributed Network is built upon LiNK Simple Services, which allows it to be deployed and configured in a wide range of supporting architectures to best fit the requirements and scalability of the targeted solution or system.
Below is one such configuration, which illustrates the role of LiNK Simple Services within the LiNK Distributed Network: