GSoC Ideas List - abotkit/abotkit GitHub Wiki

Ideas List: Google Summer of Code 2021

abotkit is a collection of tools and resources for developers at every level to efficiently build and deploy chatbots.

We strongly believe that the current set of open source tools is amazing, but also prohibitively hard to combine into something you actually want to use. Sure, there are commercial offerings, but no end-to-end FOSS solution. Let's change that!

Idea 1: A head full of bots

Chatbots are a huge UX revolution (not a new one though!). The small restaurant in your area, your train service provider or even COVID restrictions - a great natural language interface could transform our daily interactions with a wide variety of topics and entities.

abotkit currently only provides two example bots: Charlotte and Robert. They showcase the different backends we support, and they are very limited. We need people to use abotkit more and to build and deploy practical bots that highlight areas of improvement in our tooling. Every new bot helps, especially combined with a well-written article that doubles as a guide for new users.

What we want to see

  • Usable bots that we can deploy with the student and use to showcase the project
  • Strong data management skills
  • Python notebooks
  • UX affinity

Prerequisites

  • python intermediate experience
  • being comfortable with some web scraping, pandas, ...
  • spacy.io or generic transformer-based NLP would be awesome

Idea 2: Cloning Clementine: An ultimate integration

Not every bot is useful as a standalone website or can be embedded in a company's website. In fact most developers already use bots in their day-to-day work, usually in some chat platform. We want to provide these integrations, without the headache of rewriting half your bot or making it impossible to deploy across multiple ecosystems.

Clementine is an abstract repository, which means you get an empty shell into which you insert your integration-specific hooks. We already support website integrations and are working on an Instagram integration, but we need many more:

  • Telegram
  • Slack
  • Facebook
  • Mail
  • etc.
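
The "empty shell plus integration-specific hooks" idea can be sketched in a few lines. Clementine itself targets node.js, so the Python below only illustrates the pattern and is not Clementine's actual API. All names (`Integration`, `parse_incoming`, `send`, `InMemoryIntegration`) and the payload shape are hypothetical.

```python
from abc import ABC, abstractmethod


class Integration(ABC):
    """Empty shell: a platform-specific subclass only fills in the two hooks."""

    def __init__(self, bot):
        # Assumption: the bot is any callable that maps a text to a reply.
        self.bot = bot

    @abstractmethod
    def parse_incoming(self, payload):
        """Hook: extract (user_id, text) from a platform-specific payload."""

    @abstractmethod
    def send(self, user_id, text):
        """Hook: deliver a reply through the platform."""

    def handle(self, payload):
        # Shared flow that stays identical for every platform.
        user_id, text = self.parse_incoming(payload)
        reply = self.bot(text)
        self.send(user_id, reply)
        return reply


class InMemoryIntegration(Integration):
    """Toy integration that records replies instead of calling a real API."""

    def __init__(self, bot):
        super().__init__(bot)
        self.outbox = []

    def parse_incoming(self, payload):
        # Hypothetical payload shape, chosen only for this example.
        return payload["from"], payload["message"]

    def send(self, user_id, text):
        self.outbox.append((user_id, text))
```

With this split, a Telegram or Slack integration would only need to replace the two hooks, while the bot and the `handle` flow stay untouched.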

What we want to see

  • Solid integrations that abstract away the delicate details
  • API design should be your strength. Documentation, tests, ...

Prerequisites

  • node.js intermediate level
  • prior experience with any integration (Facebook, Slack) is a bonus

Idea 3: Teddy: Script Kiddies are welcome

When you build a bot for an existing website, it is cumbersome to transfer the existing (semi-) structured data into a format your bot library of choice will understand. Most websites follow some basic building blocks:

  • Pricing lists
  • Tables
  • Image galleries
  • FAQs
  • ...

Teddy is the abotkit crawler. The goal of Teddy is to crawl any website and generate the training data (e.g. for Rasa) directly. The tool itself is written in Elixir with Phoenix, but the crawled data can be edited in any language. Teddy not only offers a nice user interface but also operates highly efficiently, thanks to the BEAM VM. 😊
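
As a rough illustration of the data transformation step, the sketch below turns an FAQ block into question/answer records using only the Python standard library. It assumes the FAQ is marked up as a definition list (`<dt>`/`<dd>`), which real websites often are not. `FaqParser` and `extract_faq` are hypothetical names, not part of Teddy. Mapping the records onto a concrete training format is deliberately left out.

```python
from html.parser import HTMLParser


class FaqParser(HTMLParser):
    """Collects (question, answer) pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current_tag = None  # "dt" or "dd" while inside one of them
        self._question = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._current_tag = tag
            self._buffer = []

    def handle_data(self, data):
        # Text of nested tags such as <b> is collected as well.
        if self._current_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current_tag:
            return
        # Collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if tag == "dt":
            self._question = text
        elif self._question is not None:
            self.pairs.append((self._question, text))
            self._question = None
        self._current_tag = None


def extract_faq(html):
    """Return a list of {"question": ..., "answer": ...} dicts."""
    parser = FaqParser()
    parser.feed(html)
    parser.close()
    return [{"question": q, "answer": a} for q, a in parser.pairs]
```

Pricing lists, tables and galleries would each need their own extraction rule, which is where a UI for picking HTML selectors becomes useful.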

What we want to see

  • Data engineering interests
  • Basic knowledge of HTTP(s), browsers, HTML selectors etc
  • Python data transformation tooling
  • You'll be building an important web-based UI to simplify web scraping

Prerequisites

  • elixir and phoenix would be awesome, but aren't strictly necessary
  • rails is a plus if you have never used phoenix
  • generic crawling knowledge is a plus
  • data transformation, including NLP is a plus

Idea 4: Ten second Robert

Our dream would be to offer a "blank" bot that you can train by talking to it! In order to have some basic understanding of what the person is saying, we are using pretrained transformer models and some logic on top.

Robert is based on BERT, but for now it only acts like a (smart) hashmap. We use the transformer to find the nearest example for any text input, infer the intent and execute its action. That way we can train a bot by giving it a few examples and have the bot ask the human for guidance if no matching example is present. Robert does not yet remember the last sentences you wrote, so the bot is not really "smart". We need some kind of story ability to guide a conversation and some form of memory.
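
The "smart hashmap" behaviour can be sketched as nearest-example lookup with cosine similarity. This is a minimal illustration, not Robert's code: `embed` is a toy bag-of-words stand-in for a real transformer embedding, and the `ExampleBot` class, its method names and the `threshold` value are all assumptions made for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a transformer embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ExampleBot:
    def __init__(self, threshold=0.5):
        # Hypothetical cut-off below which the bot asks for guidance.
        self.threshold = threshold
        self.examples = []  # list of (embedding, intent)

    def teach(self, text, intent):
        """Store one labelled example; this is all the 'training' there is."""
        self.examples.append((embed(text), intent))

    def infer(self, text):
        """Return the intent of the nearest example, or None if none is close."""
        query = embed(text)
        best_intent, best_score = None, 0.0
        for vector, intent in self.examples:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent if best_score >= self.threshold else None
```

A `None` result is the point where the bot would ask the human what was meant and then call `teach` with the answer. Note that nothing here keeps track of earlier turns, which is exactly the missing memory described above.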

This could easily be the most exciting project if done correctly.

What we want to see

  • Curiosity. Build a generic bot core you can train on the fly!
  • Ability to reason about transformers, cosine-distance retrieval etc

Prerequisites

  • python advanced knowledge
  • nlp experience would be good to have
  • transformers basics

Idea 5: (Code)Mirror of Erised

More advanced users may want to add custom actions. Currently we have a bunch of action scripts. Our idea is to add CodeMirror so that actions can be implemented from within Dolores.

If we truly want to deliver on our promise, people should be able to build and deploy their own bots using their browser. Integrating coding tools, hosting options and storage would be great.

Prerequisites

  • javascript knowledge
  • Frontend experience

Idea 6: abotkit on 🔥!base

Deployment is still one of our biggest issues. We have multiple moving parts that move closer and closer together, but still depend on some data exchange service.

Currently we deploy abotkit on AWS, GCP or Azure using vanilla Kubernetes. This is a really general approach, but it's also really expensive. We could also use Firebase, which would be a cheap solution for now and at the same time allow us to grow. However, our services need some changes to run on Firebase.

What we are looking for

  • An easier deployment story for abotkit
  • Your work directly impacts every other project. Solid database understanding required!

Prerequisites

  • kubernetes and docker
  • firebase
  • node.js intermediate knowledge would be good

Idea 7: Electrons calling

Some users just want to use abotkit on their PC/Linux/Mac to test and train chatbots. Electron should help us build executables for each platform. We also need a CI/CD flow to generate those executables on the fly.

Ideally we want a version that is well-integrated into a user's desktop environment and can even perform basic OS functions like search.

What we are looking for

  • Detail-oriented, meaningful integrations. We have a cool project, let us use that on our computer
  • Offer more actions that integrate with the OS. Go beyond the web as a platform
  • Deliver great UX

Prerequisites

  • node.js
  • electron would be good to know but can be learned during the project

Idea 8: The control room

If you love Westworld like we do, you may dream of the detailed user experience of the bot creators and wonder how we can guide people on their journey to their own conversational machine.

Dolores has a statistics page showing how accurately inputs are matched to intents. But we also need a feedback loop that lets users add examples, add new intents and cluster outliers, so that they can adjust the chatbot directly from the statistics page.

Prerequisites

  • node.js
  • electron would be good to know but can be learned during the project