Setup - ticalc-travis/nikkybot GitHub Wiki

Introduction

These are rough notes on the steps needed to set up and run NikkyBot on a GNU/Linux distribution. The instructions were developed on a default installation of a Debian-based distribution; other distros may need adjustments. Familiarity with *nix shell usage is assumed.

System setup

These steps are assumed to be performed from a user account named “nikkybot” with sudo access.

Installing basic development tools

$ sudo apt-get install -y git

(Optional: Install an alternative Python shell such as IPython to make interactive testing/hacking easier.)

Installing NikkyBot dependencies

$ sudo apt-get install -y postgresql python3-psycopg2 python3-twisted pastebinit

(PostgreSQL, Python 3, Twisted, and Psycopg2 are absolute requirements. Pastebinit is needed for the “botchat” feature to work.)

Setting up PostgreSQL

The “nikkybot” role will be used to manage the DB as a normal system user:

$ sudo -u postgres createuser -d -r nikkybot

The “markovmix” role will be used by NikkyBot itself to access the DB:

$ sudo -u postgres createuser markovmix

Create the DB:

$ sudo -u postgres createdb markovmix

Edit “/etc/postgresql/<version>/main/pg_hba.conf” (replace “<version>” with the version of PostgreSQL you're using). On the line after the comment ‘"local" is for Unix domain socket connections only’, replace “peer” at the end with “peer map=default”. This tells PostgreSQL to apply the user map named “default”, which will allow NikkyBot, running under the “nikkybot” system user, to connect to the DB as the role “markovmix”.
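
For anyone who would rather script this edit than make it by hand, the following is a minimal sketch of the same change applied to the text of pg_hba.conf. The function name “add_peer_map” and the sample text are illustrative assumptions; the sketch only transforms a string and leaves reading and writing the real file (which needs root access) to you.

```python
import re

COMMENT_MARKER = '"local" is for Unix domain socket connections only'

def add_peer_map(hba_text, map_name="default"):
    """Append 'map=<map_name>' to the peer line that directly follows
    the Unix-socket comment. Text that is already edited is left alone."""
    lines = hba_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if COMMENT_MARKER in line and line.lstrip().startswith("#"):
            target = lines[i + 1]
            # Only touch a line that still ends in a bare "peer" method.
            if re.search(r"\bpeer\s*$", target):
                lines[i + 1] = target.rstrip() + f" map={map_name}"
            break
    return "\n".join(lines) + "\n"

# Sample text resembling a stock Debian pg_hba.conf entry (an assumption).
sample = (
    '# "local" is for Unix domain socket connections only\n'
    "local   all             all                                     peer\n"
)
print(add_peer_map(sample), end="")
```

Running the function a second time on its own output changes nothing, so it is safe to re-run.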

Edit “/etc/postgresql/*/main/pg_ident.conf”. At the bottom of the file add the line:

default nikkybot markovmix

to set up the user map.

To make it easier to administer the database from your own user login account, you can also map your own login name to the “markovmix” role by adding another line:

default <your_login_name> markovmix
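
Each line in pg_ident.conf has three fields: the map name, the system user name, and the database role. As a rough illustration of how those entries read, here is a small sketch that parses such lines and lists the roles a system user may connect as. The helper names and the login name “alice” are hypothetical, and the sketch ignores quoted names and regular-expression user names, which PostgreSQL also supports.

```python
def parse_ident_map(ident_text):
    """Parse pg_ident.conf text into (map_name, system_user, db_role) tuples."""
    entries = []
    for raw in ident_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(fields))
    return entries

def roles_for(entries, map_name, system_user):
    """List the DB roles that system_user may connect as under map_name."""
    return [role for name, user, role in entries
            if name == map_name and user == system_user]

# "alice" stands in for <your_login_name>; it is only an example.
sample = """\
# MAPNAME  SYSTEM-USERNAME  PG-USERNAME
default    nikkybot         markovmix
default    alice            markovmix
"""
entries = parse_ident_map(sample)
print(roles_for(entries, "default", "nikkybot"))
print(roles_for(entries, "default", "bob"))
```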

Update Postgres with the configuration changes:

$ sudo systemctl reload postgresql.service

Under the “nikkybot” system user, test DB access for the “markovmix” role:

$ psql markovmix markovmix

A psql prompt should appear if successful (press Ctrl-D or use the command “\q” to quit psql).
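
The same check can be made from Python with Psycopg2, which was installed as a dependency above. This is a minimal sketch, assuming a connection over the local Unix socket with peer authentication as configured above; the function names are illustrative and are not part of NikkyBot.

```python
def markovmix_dsn(dbname="markovmix", user="markovmix"):
    """Build a libpq connection string; no host means the local Unix socket."""
    return f"dbname={dbname} user={user}"

def check_db_access(dsn=None):
    """Connect, run a trivial query, and return True on success."""
    import psycopg2  # imported lazily so markovmix_dsn() works without it
    conn = psycopg2.connect(dsn or markovmix_dsn())
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            return cur.fetchone() == (1,)
    finally:
        conn.close()

print(markovmix_dsn())
# To run the actual check, call check_db_access() as the "nikkybot" system user.
```

If the user map is not set up correctly, “psycopg2.connect” raises an exception (typically “psycopg2.OperationalError”) instead of returning a connection.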

Setting up NikkyBot

Get the NikkyBot files:

$ git clone https://github.com/ticalc-travis/nikkybot.git

Rudimentary test of the Markov/AI engine (optional)

$ cd nikkybot

Start a Python shell (e.g. “python3” or “ipython3” commands) and try the following lines:

import nikkyai
n = nikkyai.NikkyAI(debug=False)
m = n.markov
m.add_markov_rows(m.make_markov_rows("Hello! I am NikkyBot!"))
n.markov_forward(('I',))

This adds the sentence “Hello! I am NikkyBot!” to the Markov chain corpus and makes the bot complete part of it. If everything goes well, none of the lines except the last should output any messages, and the last line should result in the output “I am NikkyBot!” (assuming it was not trained on any other text yet).

If this test succeeds, this means that DB access is working and the Markov data is successfully being stored and retrieved.
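
To give a feel for what the test above exercises, here is a toy in-memory word chain that stores one sentence and completes it forward from a given word. It is not NikkyBot's implementation: the real engine keeps its data in PostgreSQL and picks among many candidates, whereas this sketch simply follows the first recorded successor of each word. All names in it are illustrative.

```python
from collections import defaultdict

END = None  # marks the end of a sentence

def add_sentence(chain, sentence):
    """Record each word's successor; the last word is followed by END."""
    words = sentence.split()
    for word, following in zip(words, words[1:] + [END]):
        chain[word].append(following)

def complete_forward(chain, start, max_words=50):
    """Follow the first recorded successor of each word, starting at start."""
    out = [start]
    while len(out) < max_words:
        successors = chain.get(out[-1])
        if not successors or successors[0] is END:
            break
        out.append(successors[0])
    return " ".join(out)

chain = defaultdict(list)
add_sentence(chain, "Hello! I am NikkyBot!")
print(complete_forward(chain, "I"))  # I am NikkyBot!
```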

Training the chat data

The “train.py” script can be used to train the bot with text it will use to generate its chat output. It takes a personality name and reads lines of text from standard input. Each line of text starts with a “nickname” in angle brackets, followed by the text “spoken” by that nickname. The format is a bit like an online chat log, since the bot was made for IRC.

Each personality is trained separately. The personalities are configured in the “personalitiesrc.py” file. The “personality_regexes” dictionary in this file defines regular expressions which specify each personality's nicknames. During training, each input line's speaker nickname will be checked against the nickname regular expression for the personality being trained. If the regular expression matches the speaker nickname, the text on the line will be considered as having been spoken by the person the personality is trying to mimic. The personality will thus be trained on the content of the spoken text so that it can output responses based on it.

If the speaker nickname does not match the personality's nicknames, it will be processed as contextual data, which may be used to try to find more relevant output responses based on keywords that seem to be contextually similar.

Blank lines, or those otherwise not following the expected “<speaker name> spoken text” format, can be used in the input to contextually separate groups of spoken lines. Contextual data will be kept separate between such separating lines, but the content of the separating lines itself will be ignored. Each group of consecutive lines that are “spoken” by the personality will be trained as a single chunk of speech and will be delimited by either these lines or by lines spoken by nicknames other than the personality's.
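
The grouping rules described above can be sketched roughly as follows. This is an illustration of the input format only, not the code in “train.py”: the function name, the “speech”/“context” labels, the use of a full match on the nickname, and the regular expression “nikky” are all assumptions (the real expressions live in “personalitiesrc.py”).

```python
import re

LINE_RE = re.compile(r"^<([^>]+)>\s*(.*)$")

def group_training_input(lines, nick_regex):
    """Split chat-log lines into conversations of (kind, text) items.

    kind is "speech" for a run of consecutive lines whose nickname matches
    nick_regex (joined into one chunk), or "context" for any other speaker.
    Blank or malformed lines end the current conversation and are dropped.
    """
    nick_re = re.compile(nick_regex)
    conversations, current, run = [], [], []

    def flush_run():
        # Close the pending chunk of personality speech, if any.
        if run:
            current.append(("speech", "\n".join(run)))
            run.clear()

    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            flush_run()
            if current:
                conversations.append(current)
                current = []
            continue
        nick, text = match.groups()
        if nick_re.fullmatch(nick):
            run.append(text)
        else:
            flush_run()
            current.append(("context", text))
    flush_run()
    if current:
        conversations.append(current)
    return conversations

sample = [
    "<someone> How are you?",
    "<nikky> Fine.",
    "<nikky> Busy, though.",
    "",
    "<some_dude> Unrelated conversation",
    "<nikky> Hi, I'm back!",
]
for conversation in group_training_input(sample, r"nikky"):
    print(conversation)
```

In the sample, the two consecutive “nikky” lines end up as one chunk of speech, and the blank line starts a second conversation.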

The primary personality is “nikky”. An example of training input for the “nikky” personality looks like this:

<nickname> This is a line of text spoken by “nickname”.
<nikky> This is spoken by “nikky”.
<nikky> These lines will be fed into the nikky personality for training
<someone_else> Somebody else talking
<nikky> Another line of “nikky” corpus data to train

<some_dude> This is an entirely different conversation
<some_dude> It has nothing to do with the above lines
<some_dude> So there's a blank line to let the training program know
<nikky> Hi, I'm back!

This input can be saved to a file or generated by another program and piped to “train.py”. For instance, if it resides in the file “train.txt” in the current directory, then it can be used to train the “nikky” personality like this:

./train.py nikky < train.txt

On each run of “train.py”, the input will be processed and added to the personality's existing knowledge from previous runs. To wipe the personality clean and start training over with a brand new set of input, include the “-r” or “--reset” option:

./train.py nikky -r < train.txt
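
When the training input is generated by another program, it can be piped to “train.py” directly instead of going through a file. The sketch below assumes the messages are available as (nickname, text) pairs, with None standing for a blank separator line; the helper names are hypothetical, and only the “-r” option and the positional personality name come from the description above.

```python
import subprocess

def format_training_lines(messages):
    """Turn (nick, text) pairs into '<nick> text' lines.

    A None entry becomes a blank line, which separates conversations.
    """
    out = []
    for item in messages:
        if item is None:
            out.append("")
        else:
            nick, text = item
            out.append(f"<{nick}> {text}")
    return "\n".join(out) + "\n"

def train(personality, messages, reset=False, script="./train.py"):
    """Pipe the formatted messages to train.py on standard input."""
    cmd = [script, personality]
    if reset:
        cmd.append("-r")  # wipe the personality before training
    subprocess.run(cmd, input=format_training_lines(messages),
                   text=True, check=True)

messages = [("someone", "How are you?"), ("nikky", "Fine."), None,
            ("some_dude", "Unrelated conversation")]
print(format_training_lines(messages), end="")
# train("nikky", messages)  # run from the nikkybot directory
```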

Configuring and starting the bot

When running the bot, make sure the PATH environment variable includes the directory containing the NikkyBot Python files and executables. The bot needs this to find them when it runs certain functions in an external process (such as the “bot chat” function); otherwise, those functions will fail when called.

The “start_nikkybot.py” script is run to start the bot. It should only be launched with appropriate arguments specifying the IRC configuration. (Although options have default values, many are intended as examples only and should not actually be used unless the argument-parsing code in “start_nikkybot.py” itself is modified to provide suitable defaults.)

“start_nikkybot.py” should be called with the following options specified:

-s [SERVER [SERVER ...]], --servers [SERVER [SERVER ...]]
                      A list of IRC servers to connect to in host:port
                      format (example: myserver.net:6667)

--real-name REAL_NAME
                      "Real name" to provide to IRC server

-n [NICK [NICK ...]], --nicks [NICK [NICK ...]]
                      List of nicks to use, in descending order of
                      preference

-c [CHANNEL [CHANNEL ...]], --channels [CHANNEL [CHANNEL ...]]
                      List of channels to join, including initial
                      character such as “#” (the channel names may need
                      to be escaped from the shell)

--client-version CLIENT_VERSION
                      Client version response to give to CTCP VERSION
                      requests

--admin-hostmasks [ADMIN_HOSTMASK [ADMIN_HOSTMASK ...]]
                      Trusted hostmasks to accept special admin commands
                      from

The following options can be specified if desired (the defaults for these are generally reasonable to use):

--max-line-length MAX_LINE_LENGTH
                      Maximum characters to send per line in messages

--min-send-time MIN_SEND_TIME
                      Minimum allowed time in seconds between message
                      lines sent

--nick-retry-wait NICK_RETRY_WAIT
                      Seconds to wait before trying to reclaim preferred
                      nick

--initial-reply-delay INITIAL_REPLY_DELAY
                      Seconds to wait before first line sent

--simulated-typing-speed SIMULATED_TYPING_SPEED
                      Seconds per character to delay message (simulated
                      typing delay)

--direct-response-time DIRECT_RESPONSE_TIME
                      Seconds to search for responses to highlight
                      messages

--random-response-time RANDOM_RESPONSE_TIME
                      Seconds to search for responses to non-highlight
                      messages

--state-cleanup-interval STATE_CLEANUP_INTERVAL
                      Seconds to do AI state housekeeping/cleanup

--channel-check-interval CHANNEL_CHECK_INTERVAL
                      Seconds to check joined channels and rejoin if
                      necessary

--max-user-threads MAX_USER_THREADS
                      Maximum threads invoked from untrusted commands
                      to run simultaneously

Once started with the necessary options, the bot connects to IRC and automatically reconnects if the connection fails. To shut NikkyBot down, press Ctrl-C (if run in a terminal) or send SIGINT to the Python process running it.

Example:

PATH=/path/to/nikkybot:$PATH /path/to/nikkybot/start_nikkybot.py \
    -s my.irc.server:6667 my.alt.server:6667 --real-name "Some name for the bot" \
    -n primarynick altnick altnick2 \
    -c "#somechannel" "#anotherchannel" \
    --client-version "Nikkybot blah blah" \
    --admin-hostmasks '[email protected]'
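
The same launch can be wrapped in a small Python script, for example to start the bot from a supervisor of your own. This is a sketch under stated assumptions: the helper names are hypothetical, the path “/path/to/nikkybot” and the hostmask “*!*@example.invalid” are placeholders, and only the command-line options listed above are used.

```python
import os
import signal
import subprocess

def build_command(bot_dir, servers, real_name, nicks, channels,
                  client_version, admin_hostmasks):
    """Assemble the start_nikkybot.py argument list (no shell escaping needed)."""
    return [
        os.path.join(bot_dir, "start_nikkybot.py"),
        "-s", *servers,
        "--real-name", real_name,
        "-n", *nicks,
        "-c", *channels,
        "--client-version", client_version,
        "--admin-hostmasks", *admin_hostmasks,
    ]

def build_env(bot_dir, base_env=None):
    """Copy the environment and put bot_dir at the front of PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH")
    env["PATH"] = bot_dir + os.pathsep + old_path if old_path else bot_dir
    return env

def start_bot(bot_dir, **options):
    """Launch the bot as a child process and return the Popen object."""
    return subprocess.Popen(build_command(bot_dir, **options),
                            env=build_env(bot_dir))

def stop_bot(proc):
    """Ask the bot to shut down with SIGINT and wait for it to exit."""
    proc.send_signal(signal.SIGINT)
    return proc.wait()

options = dict(
    servers=["my.irc.server:6667", "my.alt.server:6667"],
    real_name="Some name for the bot",
    nicks=["primarynick", "altnick", "altnick2"],
    channels=["#somechannel", "#anotherchannel"],
    client_version="Nikkybot blah blah",
    admin_hostmasks=["*!*@example.invalid"],  # placeholder hostmask
)
print(build_command("/path/to/nikkybot", **options))
```

Because the arguments are passed as a list rather than through a shell, channel names such as “#somechannel” need no quoting here.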