Plug ins introduction - czcorpus/kontext GitHub Wiki

Plug-ins introduction

Motivation

Because it is practically impossible to implement KonText so that it meets every possible requirement for integration into existing information systems and servers, some parts of the application are implemented as replaceable components (plug-ins) with a predefined interface.

For example, if you have an existing user database, or if you do not want to bother with user authentication at all, you can implement your own version of the auth plug-in. If you want to store user session data in files instead of a database, all you have to do is rewrite the sessions plug-in accordingly.

You can start by exploring the prepackaged plug-ins located in the lib/plugins directory. Please note that a custom plug-in should always inherit from the predefined abstract plug-in classes (located in lib/plugin_types) because KonText may test which class the plug-in belongs to.

The fastest way to make the plug-ins work

KonText comes with so-called default plug-ins (located in the lib/plugins directory), which provide a complete, working set of components needed to run KonText with all features enabled. All of these plug-ins use the same database interface, defined in plugin_types.general_storage.KeyValueStorage. The interface currently has two implementations available in KonText:

  • sqlite3-based (uses pysqlite library) for testing and small installations
  • Redis-based (uses redis library) for production installations
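
To illustrate the idea of one storage interface with interchangeable backends, here is a minimal sketch. It does not reproduce the real KeyValueStorage API: the class names SimpleKeyValueStorage, Sqlite3Storage and InMemoryStorage and their get/set methods are assumptions made for this example only.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class SimpleKeyValueStorage(ABC):
    """Hypothetical, reduced stand-in for a key-value storage interface."""

    @abstractmethod
    def get(self, key, default=None):
        ...

    @abstractmethod
    def set(self, key, value):
        ...


class Sqlite3Storage(SimpleKeyValueStorage):
    """Stores JSON-encoded values in a single sqlite3 table."""

    def __init__(self, path=':memory:'):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            'CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value TEXT)')

    def get(self, key, default=None):
        row = self._conn.execute(
            'SELECT value FROM data WHERE key = ?', (key,)).fetchone()
        return json.loads(row[0]) if row else default

    def set(self, key, value):
        self._conn.execute(
            'INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)',
            (key, json.dumps(value)))
        self._conn.commit()


class InMemoryStorage(SimpleKeyValueStorage):
    """Dictionary-backed variant, useful only for tests."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value
```

Code written against the abstract class does not need to know which backend is in use, which is why the other plug-ins can stay unchanged when the storage is swapped.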

The initial user data import is up to the administrator, but KonText comes with two helper scripts:

  • plugins/default_auth/scripts/import_users.py (for importing user credentials)
  • plugins/default_auth/scripts/usercorp.py (for making corpora accessible to users)

See the plugins/default_auth/scripts/users.sample.json file to find out how to prepare the initial user data for the import.

Server-side notes

From a programming point of view, a KonText plug-in is a Python object returned from a Python module via the factory function create_instance(). The plug-in object is expected to implement the respective abstract specification. In some cases (e.g. sessions) the abstract specification may come from a third party.

from plugin_types.query_persistence import AbstractQueryPersistence

class MyQueryStorage(AbstractQueryPersistence):
    # implement all the methods
    pass

def create_instance(settings):
    return MyQueryStorage()
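
The factory contract and the class test mentioned earlier can be sketched in a self-contained way. Everything here is hypothetical: AbstractQueryPersistenceSketch stands in for a real abstract class from lib/plugin_types, and load_plugin only imitates what a loader might do; it is not KonText's actual loading code.

```python
from abc import ABC, abstractmethod


class AbstractQueryPersistenceSketch(ABC):
    """Hypothetical stand-in for an abstract plug-in class."""

    @abstractmethod
    def store(self, user_id, query):
        ...


class MyQueryStorage(AbstractQueryPersistenceSketch):
    def __init__(self):
        self._items = []

    def store(self, user_id, query):
        # return the number of stored items as a simple identifier
        self._items.append((user_id, query))
        return len(self._items)


def create_instance(settings):
    return MyQueryStorage()


def load_plugin(factory, settings, expected_type):
    """Call the factory and check the class of the result (assumed behavior)."""
    obj = factory(settings)
    if not isinstance(obj, expected_type):
        raise TypeError('{} does not implement {}'.format(
            type(obj).__name__, expected_type.__name__))
    return obj
```

A plug-in that does not inherit from the expected abstract class would be rejected by such a check, which is the practical reason for the inheritance rule.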

All defined plug-ins can be accessed via plugins.runtime.[UPPERCASE_NAME]. This is used throughout the KonText code. Each property (e.g. plugins.runtime.DB) is actually a wrapper object which is always available, even if the concrete plug-in does not exist (i.e. is not installed).

# the following code is ok even if the "live attributes" plug-in is not installed
data = []
with plugins.runtime.LIVE_ATTRIBUTES as lattrs:
    data = lattrs.get_attr_values(plugin_api, corpus, attr_map, aligned_corpora, autocomplete_attr)

# alternative solution

if plugins.runtime.LIVE_ATTRIBUTES.exists:
    data = plugins.runtime.LIVE_ATTRIBUTES.instance.get_attr_values(
        plugin_api, corpus, attr_map, aligned_corpora, autocomplete_attr)
else:
    data = []
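
One possible way to build such an always-available wrapper is sketched below. This is not KonText's implementation: PluginWrapper, PluginMissing and _MissingProxy are hypothetical names, and the technique shown (a proxy that raises on use plus a context manager that suppresses that one exception) is just one option among several.

```python
class PluginMissing(Exception):
    """Raised when code touches a plug-in that is not installed."""


class _MissingProxy:
    def __getattr__(self, name):
        # any attribute access on a missing plug-in aborts the with-block
        raise PluginMissing(name)


class PluginWrapper:
    def __init__(self, instance=None):
        self._instance = instance

    @property
    def exists(self):
        return self._instance is not None

    @property
    def instance(self):
        return self._instance

    def __enter__(self):
        return self._instance if self.exists else _MissingProxy()

    def __exit__(self, exc_type, exc_value, traceback):
        # returning True suppresses the exception; only PluginMissing is swallowed
        return exc_type is not None and issubclass(exc_type, PluginMissing)
```

With this design, the body of a with-block stops at the first use of a missing plug-in and execution continues after the block, so a default value assigned beforehand stays in place.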

To access other plug-ins, plug-in code may declare its dependencies via the @plugins.inject decorator. KonText will automatically inject them into the create_instance() function as positional arguments. Please note that the settings object is always injected as the first argument:

import plugins

@plugins.inject(plugins.runtime.DB, plugins.runtime.SESSIONS)
def create_instance(settings, db, sessions):
    return MyPlugin(db, sessions)
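
The mechanics of such a decorator can be sketched as follows. This is an assumption-laden illustration, not the real plugins.inject: dependencies are identified by plain strings instead of wrapper objects, and inject, call_factory and the __plugin_deps__ attribute are hypothetical names.

```python
def inject(*deps):
    """Record the requested dependency names on the factory function."""
    def decorator(factory):
        factory.__plugin_deps__ = deps
        return factory
    return decorator


def call_factory(factory, settings, registry):
    """Pass settings first, then the resolved dependencies positionally."""
    deps = getattr(factory, '__plugin_deps__', ())
    return factory(settings, *(registry[name] for name in deps))


class MyPlugin:
    def __init__(self, db, sessions):
        self.db = db
        self.sessions = sessions


@inject('db', 'sessions')
def create_instance(settings, db, sessions):
    return MyPlugin(db, sessions)
```

Because dependencies are resolved from already created plug-ins, the order in which plug-ins are initialized matters for any scheme of this kind.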

Plug-in lifecycle

An important thing to consider is that plug-ins are singleton-like objects which are instantiated during KonText startup and exist for as long as the web server process runs. This means that any state they keep is shared between requests.

The best way to avoid problems is to keep the plug-in object stateless. The current request and related data can be accessed via action.plugin.ctx.PluginCtx, which is often passed as an argument to plug-in methods. The PluginCtx object also lets a plug-in share data with other plug-ins using the set_shared and get_shared methods.

# ... auth plug-in ....
def validate_user(self, plugin_ctx, username, password):
    ans = self.log_in(username, password)
    if ans:
        plugin_ctx.set_shared('user_info', ans['user'])
    # etc...

# ... application bar plug-in ....
def get_contents(self, plugin_ctx, return_url):
    # we know that auth's validate_user is called before this
    # which means we can fetch stored info about a user
    user_info = plugin_ctx.get_shared('user_info', {})
    # etc...
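
A runnable variant of the same pattern is shown below. PluginCtxSketch is a hypothetical stand-in that implements only set_shared and get_shared; the real PluginCtx offers much more, and the two plug-in classes are simplified for illustration.

```python
class PluginCtxSketch:
    """Hypothetical per-request context holding shared data."""

    def __init__(self):
        self._shared = {}

    def set_shared(self, key, value):
        self._shared[key] = value

    def get_shared(self, key, default=None):
        return self._shared.get(key, default)


class AuthSketch:
    USERS = {'alice': 'secret'}  # illustration only

    def validate_user(self, plugin_ctx, username, password):
        if self.USERS.get(username) == password:
            # the state goes to the request context, not to self
            plugin_ctx.set_shared('user_info', {'username': username})
            return True
        return False


class AppBarSketch:
    def get_contents(self, plugin_ctx):
        user_info = plugin_ctx.get_shared('user_info', {})
        return 'Logged in as {}'.format(user_info.get('username', 'anonymous'))
```

Both plug-in objects can live for the whole process lifetime here because all request-specific data travels in the context object, which is created anew for each request.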

Passing general data to the client-side

If the client-side part of your plug-in requires some data (besides the data passed via the defined interface methods), the plug-in may implement an export method:

def export(self, plugin_ctx):
    return {'some': '...JSON serializable data...'}

The optional export method is called by KonText after an action is processed. It is expected to return JSON-serializable data, which is then passed to the current template's client-side model:

// within a page model

import { init as fooInit } from 'plugins/foo/init';

export class MyPageModel {
    private layoutModel:LayoutModel.PageModel;

    constructor(layoutModel:LayoutModel.PageModel) {
        this.layoutModel = layoutModel;
    }

    init(conf):void {
        fooInit(this.layoutModel.getConf<any>('pluginData')['fooPlugin']);
    }
}

Plug-in configuration

Plug-ins are configured in conf/config.xml under the /kontext/global/plugins XML element. KonText uses a set of predefined names for the plug-ins throughout the application (see plugins._Names for the concrete names and also for the order in which plug-ins are initialized).

A concrete implementation of a plug-in can have any name but it is better to follow these rules:

  1. A name should be composed of a custom prefix followed by the general plug-in name (e.g. acme_sessions)
     • default installation plug-ins use prefix default_
     • MySQL/MariaDB installation plug-ins use prefix mysql_
  2. Client-side code (TypeScript) plug-in directory should have the same name only using camel case (e.g. acmeSessions)
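
The naming convention can be expressed as a small helper. The function name to_js_module_name is hypothetical; KonText does not derive the client-side name automatically, so this only shows how the two names relate.

```python
def to_js_module_name(module_name):
    """Convert a snake_case plug-in module name to its camelCase counterpart."""
    head, *tail = module_name.split('_')
    return head + ''.join(part.capitalize() for part in tail)
```

For example, to_js_module_name('acme_sessions') gives 'acmeSessions'.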

Example:

<kontext>
  <global>
    <plugins>
      <live_attributes>
        <module>acme_live_attributes</module>
        <js_module>acmeLiveAttributes</js_module>
        ... additional configuration ...
      </live_attributes>
      ...
    </plugins>
    ...
  </global>
</kontext>
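
To check which implementations a configuration refers to, the relevant subtree can be read with the Python standard library. This is a sketch only: read_plugin_modules is a hypothetical helper, the sample XML is made up for the example, and KonText itself processes its configuration differently.

```python
import xml.etree.ElementTree as ET

SAMPLE_CONFIG = """
<kontext>
  <global>
    <plugins>
      <live_attributes>
        <module>acme_live_attributes</module>
        <js_module>acmeLiveAttributes</js_module>
      </live_attributes>
      <auth>
        <module>default_auth</module>
      </auth>
    </plugins>
  </global>
</kontext>
"""


def read_plugin_modules(xml_text):
    """Map each general plug-in name to its (module, js_module) pair."""
    root = ET.fromstring(xml_text)  # root is the <kontext> element
    ans = {}
    for elm in root.findall('./global/plugins/*'):
        # findtext() returns None when the child element is missing
        ans[elm.tag] = (elm.findtext('module'), elm.findtext('js_module'))
    return ans
```

A server-only plug-in simply has no js_module element, which shows up as None in the result.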

Exporting custom actions

A plug-in can export custom actions to different controller objects to allow specific functionality to be implemented without affecting the core code. The following requirements must be met:

  1. A plug-in cannot overwrite existing actions
  2. An action cannot be defined as an instance method (i.e. either a function or a @classmethod should be used)
  3. The plug-in object provides a method export_actions which returns a dictionary where keys are controller classes (not names) and values are lists of functions (not names) to be exported

Custom action names have no restrictions (as long as they do not violate the first rule above), but to prevent future name collisions, KonText guarantees that it will not use action names with the prefix plugin_.

"""
my plug-in module
"""
from action.control import http_action
from sanic.blueprints import Blueprint

bp = Blueprint('my_live_attributes')

@bp.route('/filter_attributes', methods=['POST'])
@http_action(return_type='json', action_model=CorpusActionModel)
async def filter_attributes(amodel, req, resp):
    return do_stuff()


class MyLiveAttributes(CachedLiveAttributes):
    @staticmethod
    def export_actions():
        return bp

Exporting custom periodic tasks

A typical KonText installation uses a lot of caching (concordance, frequency distribution, collocations) and archiving (subcorpora queries, shortened queries, ...), which requires some regular maintenance. Although you can define a bunch of cron scripts, KonText offers a way to register such tasks via a scheduler (either Rq Scheduler or Celery Beat). In case your plug-in requires some specific maintenance, you can export tasks which will be registered automatically and performed according to the respective settings (see the code example in conf/beatconfig.sample.py or conf/rq-schedule-conf.sample.json).

class MyPlugin(SomeAbstractPlugin):
    def export_tasks(self):
        def foo():
            return "foo"

        def sum(a, b):
            return a + b

        # the return statement must be inside export_tasks
        return (foo, sum)
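
How exported tasks might be picked up on the application side can be sketched as follows. The collect_tasks function and the use of the function name as the task name are assumptions made for this example; the actual registration in KonText depends on the configured scheduler.

```python
class MyPlugin:
    def export_tasks(self):
        def foo():
            return 'foo'

        def add(a, b):
            return a + b

        return (foo, add)


def collect_tasks(plugins):
    """Gather tasks from all plug-ins which provide export_tasks()."""
    tasks = {}
    for plugin in plugins:
        export = getattr(plugin, 'export_tasks', None)
        if callable(export):
            for fn in export():
                # assumed naming scheme: the plain function name
                tasks[fn.__name__] = fn
    return tasks
```

Plug-ins without an export_tasks method are skipped, so the method stays optional.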

👷 TODO

Client-side notes

Although some plug-ins are server-only (i.e. implemented in Python and running on the server), there are many cases where it is necessary to implement client-side functionality too. The client-side logic may just manipulate or analyze the page's contents, or it may generate most of the content by itself (see e.g. the implementation of the default corparch plug-in in public/files/js/plugins/defaultCorparch/). In an extreme case, KonText just passes data between the server side and the client side of the plug-in without knowing any details.

Plug-in structure and initialization

The source code must reside in its own directory within the plugins directory. The name of the directory must be specified in config.xml as the js_module element. The plug-in's entry point module must always be named init (i.e. the file must be init.ts).

The init.ts module should provide a default export of its initialization function:

export class MyPluginClass implements PluginInterfaces.SomePlugin.Plugin {
    // some implementation
}

export default function init(pluginApi:Kontext.PluginApi):MyPluginClass {
    return new MyPluginClass();
}

The plug-in factory should work synchronously. Any asynchronous operations (e.g. data loading) should be triggered by user interaction (e.g. the user opens a widget for the first time, so we load some data from the server).

In KonText client-side application, most of the communication is performed via actions sent by components to models and by side-effect actions generated by models.

Plug-in name remapping

❗ Please note that no matter how you name your plug-in directory, its modules are always referred to by the general plug-in name. E.g. if you write your own implementation of the corparch plug-in and install it in public/files/js/plugins/myCoolPlugIn, KonText will still expect you to use the plugins/corparch module path:

import * as plugin from 'plugins/corparch/init';

// plugin variable will actually point to your
// public/files/js/plugins/myCoolPlugIn/init.js module

The name remapping (aliasing) is handled by Webpack.

Inside a plug-in, it is possible to add a custom module path remapping - see Mixing external and internal dependencies.

Mixing external and internal dependencies

In case you have a local JS/TS file depending on one or more external files loaded by the browser on the fly, Webpack (the build tool) must be informed that these dependencies should be excluded from the production build process (in the devel build process, this is not an issue since there is no JS optimization/merging involved). To achieve that, create a file called build.json within your plug-in directory (public/files/js/plugins/yourPlugin) and add an ignoreModules entry:

{
    "ignoreModules": ["external-vendor/auth", "visit-counter"]
}

Please note that this expects you to export proper URLs and module identifiers via the get_scripts(self, plugin_ctx) method on the server side. This currently works only for the application_bar plug-in.

To use custom modules with specific import paths (e.g. a typical production jQuery file defines a jquery import), you can use a remapModules entry in build.json:

{
    "remapModules": {
        "jquery": "plugins/myTagBuilderPlugin/jquery-3.2.1.min"
    }
}

Please note that the path is relative to public/files/js and there is no suffix in the module path.

Additional notes for the client-side

  • Available 3rd party libraries include:

  • JavaScript code related to a concrete page can be found in public/files/js/pages/pageName.

  • Plug-ins are located in public/files/js/plugins/pluginNameCamelCase

  • New client-side libraries and plug-ins should be written in TypeScript

  • User interface components should be implemented in React with state managed by (included) kombo library

  • Component styling should be done using styled components

Best Practices

Directory structure, files

Each plug-in should be inside a package:

lib/plugins/my_plugin/
lib/plugins/my_plugin/__init__.py
lib/plugins/my_plugin/config.rng
lib/plugins/my_plugin/README.md

The package directory should contain a file named config.rng with a RelaxNG schema describing the plug-in configuration inside the respective subtree of config.xml. Thanks to this, the KonText validation script is able to test the application configuration for errors.

The package directory should also contain a README.md file with a brief description of the plug-in. It is recommended to include the following sections as well:
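
The recommended package layout can be verified with a few lines of Python. The helper missing_package_files is hypothetical and is not part of KonText; it only checks that the expected files exist and does not validate their contents.

```python
from pathlib import Path

# files every plug-in package is expected to contain
EXPECTED_FILES = ('__init__.py', 'config.rng', 'README.md')


def missing_package_files(package_dir):
    """Return the expected files which are absent from a plug-in package."""
    root = Path(package_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling missing_package_files('lib/plugins/my_plugin') returns an empty list when the package is complete.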

  • Exposed HTTP methods to describe whether there is a custom HTTP communication between plug-in client-side part and its server-side
  • Exported tasks describing custom Rq (or Celery) tasks the plug-in defines (e.g. for regular clean-up or some asynchronous processing)
  • Reusability - to let other users know whether the plug-in is designed for general use and what are possible obstacles.
  • External dependencies for both client and server-side so an administrator of a KonText instance is able to update in-house configurations and scripts