Introduction

The main entry point of the program is an object called Parser. Typically, the first step of any operation is to generate a complete syntax tree with the Parser.parse method; the tree can then be queried and modified, and finally output as wikitext.

✅ Available in the Mini and Browser versions.
🌐 Available in the Browser version.

const Parser = require('wikiparser-node'); // CommonJS

import Parser from 'wikiparser-node'; // ES module

import Parser = require('wikiparser-node'); // TypeScript

Properties

config

✅

type: Config | string
The absolute or relative path to the parsing configuration, or a complete configuration object. The default configurations include English Wikipedia (enwiki), Chinese Wikipedia (zhwiki), Japanese Wikipedia (jawiki), Moegirlpedia (moegirl) and LLWiki (llwiki). To customize the parsing configuration for a MediaWiki site, refer to .schema.json.

For MediaWiki sites with Extension:CodeMirror installed, you can use the Parser.fetchConfig method to fetch the parsing configurations automatically.

// config
var config;
// Use the default configuration of Chinese Wikipedia
Parser.config = 'zhwiki';
config = Parser.getConfig();
// Equivalent to the above using relative paths
Parser.config = './config/zhwiki';
assert.deepStrictEqual(Parser.getConfig(), config);

configPaths

✅

version added: 1.25.1

type: string[]
Additional paths to search for parsing configurations. The default value is an empty array. The paths in this array will be searched before the default configuration path.

// configPaths (Node.js)
Parser.configPaths.push('config');
Parser.config = './enwiki';
assert.strictEqual(Parser.getConfig().articlePath, '/wiki/$1');
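
The search order described above can be pictured with a small sketch. It is only an illustration, not the library's resolver: `candidatePaths` and the `default-config` directory name are hypothetical, and the actual file lookup is left out.

```javascript
// Hypothetical helper (not part of wikiparser-node): list the locations that
// would be tried for a configuration name, extra search paths first.
function candidatePaths(name, configPaths, defaultDir) {
	// Strip a leading "./" so that './enwiki' and 'enwiki' are treated alike here
	const base = name.replace(/^\.\//, '');
	return [...configPaths, defaultDir].map(dir => `${dir}/${base}`);
}

console.log(candidatePaths('./enwiki', ['config'], 'default-config'));
// [ 'config/enwiki', 'default-config/enwiki' ]
```

With an empty `configPaths` array, only the default location remains in the list.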

conversionTable

type: Map<string, string>
Used to define unidirectional language variant conversion.

// conversionTable (main)
Parser.conversionTable.set('頁', '页');
assert.strictEqual(Parser.normalizeTitle('首頁').title, '首页');
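
To make "unidirectional" concrete, here is a minimal sketch of applying such a table to a string. `convertVariant` is a hypothetical helper written for this page, not the library's implementation, and it assumes the entries do not interact with each other.

```javascript
// Hypothetical helper (not part of wikiparser-node): apply a one-way
// conversion table, replacing every occurrence of each key with its value.
function convertVariant(text, table) {
	let result = text;
	for (const [from, to] of table) {
		result = result.split(from).join(to);
	}
	return result;
}

const variantTable = new Map([['頁', '页']]);
console.log(convertVariant('首頁', variantTable)); // 首页
console.log(convertVariant('首页', variantTable)); // 首页
```

The second call shows the one-way nature of the table: text already in the target variant is left untouched, and nothing is converted back.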

debugging

type: boolean
Whether to output debugging messages, defaults to false.

i18n

✅

type: string | Record<string, string>
The absolute or relative path to the language file that supplies the linting messages, or an object mapping message keys to messages (as in the last example below). The default language is English; the other preset languages are Simplified Chinese and Traditional Chinese.

// i18n
var message;
Parser.i18n = 'zh-hans';
[{message}] = Parser.parse('<!--').lint();
assert.strictEqual(message, '未闭合的HTML注释');
Parser.i18n = './i18n/zh-hans'; // Equivalent to the above using relative paths
[{message}] = Parser.parse('<!--').lint();
assert.strictEqual(message, '未闭合的HTML注释');
Parser.i18n = {'unclosed-comment': 'unclosed HTML comment'};
[{message}] = Parser.parse('<!--').lint();
assert.strictEqual(message, 'unclosed HTML comment');

lintConfig

✅

version added: 1.22.0

type: LintConfig
See Rules for instructions on how to configure linting rules.

redirects

type: Map<string, string>
Used to define redirects. Note that the page name must be capitalized and spaces must be replaced with underscores.

// redirects (main)
var title;
Parser.redirects.set('main_page', 'project : 首页#EN');
title = Parser.normalizeTitle('main page');
assert.strictEqual(title.title, 'Project:首页');
assert.equal(title, 'Project:首页#EN');

rules

✅

version added: 1.5.1

type: LintError.Rule[]
All linting rules.

// rules (Node.js)
assert.deepStrictEqual(
	Parser.rules,
	[
		'bold-header',
		'format-leakage',
		'fostered-content',
		'h1',
		'illegal-attr',
		'insecure-style',
		'invalid-gallery',
		'invalid-imagemap',
		'invalid-invoke',
		'invalid-isbn',
		'lonely-apos',
		'lonely-bracket',
		'lonely-http',
		'nested-link',
		'no-arg',
		'no-duplicate',
		'no-ignored',
		'obsolete-attr',
		'obsolete-tag',
		'parsing-order',
		'pipe-like',
		'table-layout',
		'tag-like',
		'unbalanced-header',
		'unclosed-comment',
		'unclosed-quote',
		'unclosed-table',
		'unescaped',
		'unknown-page',
		'unmatched-tag',
		'unterminated-url',
		'url-encoding',
		'var-anchor',
		'void-ext',
		'invalid-css',
	],
);

templateDir

version added: 1.10.0

type: string
The absolute path or relative path to the directory of templates used by Token.prototype.expand. In a Windows file system, the colon (:) in the page title needs to be replaced with a modifier letter colon (꞉).
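
The colon replacement for Windows can be sketched as below. `toWindowsSafeTitle` is a hypothetical helper, and the sketch deliberately says nothing about file extensions or directory layout, which are not specified here.

```javascript
// Hypothetical helper (not part of wikiparser-node): swap ":" (U+003A) for
// the modifier letter colon "꞉" (U+A789), which Windows allows in file names.
function toWindowsSafeTitle(title) {
	return title.split(':').join('\uA789');
}

console.log(toWindowsSafeTitle('Template:Lang')); // Template꞉Lang
```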

templates

version added: 1.10.0

type: Map<string, string>
Instead of Parser.templateDir, the templates can also be manually added to this map. The key is the title of the template, and the value is the wikitext of the template.

// templates (main)
Parser.templates.set('template:a', '1');
assert.equal(Parser.parse('{{a}}').expand(), '1');

viewOnly

🌐

version added: 1.9.0

type: boolean
Whether to parse the content in read-only mode, without modifying it; defaults to false. Setting it to true improves the parser's performance.

warning

type: boolean
Whether to output warning messages, defaults to true.

Methods

createLanguageService

✅

version added: 1.16.1

param: object The document object, optional
returns: LanguageService
Create a language service task. Note that calling this method will automatically set viewOnly to true.

fetchConfig

version added: 1.18.4

param: string The site nickname
param: string The script path
param: string The URI for a wiki userpage or the email address of the user, optional
returns: Promise<Config>
Fetch the parsing configurations for the specified MediaWiki site with Extension:CodeMirror installed.

Parser.fetchConfig('frwiki', 'https://fr.wikipedia.org/w/', '[email protected]');

getConfig

✅

returns: Config
Get the parsing configurations.

getWMFSite

version added: 1.22.0

param: string The script path
returns: [string, string]
Get the nickname and hostname of a WMF site.

// getWMFSite (Node.js)
assert.deepStrictEqual(
	Parser.getWMFSite('https://en.wikipedia.org/w/'),
	['enwiki', 'https://en.wikipedia.org'],
);
assert.deepStrictEqual(
	Parser.getWMFSite('https://zh.wiktionary.org/w/'),
	['zhwiktionary', 'https://zh.wiktionary.org'],
);
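
For intuition, the mapping in the two assertions above can be reproduced with the standard URL class. `guessWMFSite` is a hypothetical re-implementation for illustration only; it assumes hosts of the form <lang>.<project>.org and does not reflect how the library handles other WMF sites.

```javascript
// Hypothetical sketch (not the library's code): derive a nickname and origin
// from a script path whose host looks like <lang>.<project>.org.
function guessWMFSite(scriptPath) {
	const {protocol, hostname} = new URL(scriptPath);
	const [lang, project] = hostname.split('.');
	// Wikipedia nicknames end in "wiki"; other projects keep their own name
	const suffix = project === 'wikipedia' ? 'wiki' : project;
	return [`${lang}${suffix}`, `${protocol}//${hostname}`];
}

console.log(guessWMFSite('https://zh.wiktionary.org/w/'));
// [ 'zhwiktionary', 'https://zh.wiktionary.org' ]
```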

isInterwiki

param: string Interwiki link
returns: RegExpExecArray | null
Determine whether it is an interwiki link. Note that when using the default parsing configurations, no interwiki information will be included.

// isInterwiki (main)
Parser.getConfig();
Parser.config.interwiki = ['mw'];
assert.deepStrictEqual(
	Parser.isInterwiki('mw :Main_Page'),
	Object.assign(['mw :', 'mw'], {
		index: 0,
		input: 'mw :Main Page',
		groups: undefined,
		indices: Object.assign([[0, 4], [0, 2]], {groups: undefined}),
	}),
);
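
The shape of the expected value above (an array with index, input, groups and indices) is that of a regular-expression match made with the `d` flag. The sketch below produces the same shape; `matchInterwiki` and its pattern are assumptions made for illustration, not the library's code, and the prefixes are assumed to contain no regular-expression metacharacters.

```javascript
// Hypothetical sketch (not the library's code): match an interwiki prefix
// followed by optional whitespace and a colon. The "d" flag adds `indices`.
function matchInterwiki(title, prefixes) {
	const pattern = new RegExp(`^(${prefixes.join('|')})\\s*:`, 'diu');
	// Underscores become spaces first, mirroring the `input` shown above
	return pattern.exec(title.split('_').join(' '));
}

const match = matchInterwiki('mw :Main_Page', ['mw']);
console.log(match[0], match[1], match.index); // mw : mw 0
console.log(match.indices[0], match.indices[1]); // [ 0, 4 ] [ 0, 2 ]
console.log(matchInterwiki('Main_Page', ['mw'])); // null
```

A title without a known prefix yields null, which is why the return type is RegExpExecArray | null.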

normalizeTitle

✅

param: string Title (with or without namespace prefix)
param: number Namespace, defaults to 0
returns: Title
Normalize the page title. Note that when using the default parsing configurations, no interwiki information will be included.

// normalizeTitle
var title = Parser.normalizeTitle('lang#參考資料', 10);
assert.strictEqual(title.title, 'Template:Lang');
assert.strictEqual(title.fragment, '參考資料');
title = Parser.normalizeTitle('File:<');
assert.ok(!title.valid);

// normalizeTitle (main)
var title;
Parser.getConfig();
Parser.config.interwiki = ['zhwp'];
title = Parser.normalizeTitle('zhwp : 模板 : lang#參考資料');
assert.equal(title, 'zhwp:Template:Lang#參考資料');

parse

✅

param: string Wikitext
param: boolean Whether to be transcluded
param: number | string | string[] Maximum stage of parsing
returns: Token
Parse wikitext. Note that list elements (ul, ol, dl) are not fully parsed; call Token.prototype.buildLists to complete them.

// parse
var wikitext = '<includeonly>i</includeonly><noinclude>n</noinclude>';
assert.strictEqual(Parser.parse(wikitext).text(), 'n');
assert.strictEqual(Parser.parse(wikitext, true).text(), 'i');
wikitext = '{{a}} [[b]] ';
assert.equal(Parser.parse(wikitext, false, 'template').lastChild, ' [[b]] ');
// When there is an unknown stage, the text will be parsed to the last stage
assert.equal(Parser.parse(wikitext, false, ['ext', 'unknown']).lastChild, ' ');

// parse (main)
var wikitext = '*a';
assert.strictEqual(Parser.parse(wikitext).lastChild.type, 'text');
assert.strictEqual(
	Parser.parse(wikitext, false, 'list-range').lastChild.type,
	'list-range',
);

setFunctionHook

version added: 1.22.0

param: string The name of the parser function
param: (token: TranscludeToken, context?: TranscludeToken) => string The handler function
Define a custom parser function hook. The handler function will be called during expansion of the parser function, and the return value will be used as the result of expansion.

// setFunctionHook (main)
Parser.setFunctionHook('invoke', token => `[[${token.module}]]`);
assert.equal(
	Parser.parse('{{#invoke:Foo|Bar}}').expand(),
	'[[Module:Foo]]',
);

setHook

version added: 1.22.0

param: string The name of the extension tag
param: (token: ExtToken) => string The handler function
Define a custom extension tag hook. The handler function will be called during HTML conversion of the extension tag, and its return value will be used as the resulting HTML.

// setHook (main)
Parser.setHook('img', token => `<img src="${token.getAttr('src')}">`);
assert.strictEqual(
	Parser.parse('<img src="http://example.com/Foo.jpg"/>').toHtml(),
	'<p><img src="http://example.com/Foo.jpg">\n</p>',
);

Parser (EN) - bhsd-harry/wikiparser-node GitHub Wiki
