GSoC Ideas List - abotkit/abotkit GitHub Wiki
Ideas List: Google Summer of Code 2021
abotkit is a collection of tools and resources for developers at every level to efficiently build and deploy chatbots.
We strongly believe that the current set of open source tools is amazing - but also prohibitively hard to combine into something you actually want to use. Sure, there are commercial offerings, but there is no end-to-end FOSS solution. Let's change that!
Idea 1: A head full of bots
Chatbots are a huge UX revolution (not a new one though!). The small restaurant in your area, your train service provider or even COVID restrictions - a great natural language interface could transform our daily interactions with a wide variety of topics and entities.
abotkit currently provides only two example bots: Charlotte and Robert. They showcase the different backends we support, but they are very limited. We need people to use abotkit more and to build and deploy practical bots that highlight areas of improvement for our tooling. Every new bot helps, especially combined with a well-written article that doubles as a guide for new users.
What we want to see
- Usable bots that we can deploy with the student and use to showcase the project
- Strong data management skills
- Python notebooks
- UX affinity
Prerequisites
- python intermediate experience
- being comfortable with some web scraping, pandas, ...
- spacy.io or generic transformer-based NLP would be awesome
Idea 2: Cloning Clementine: An ultimate integration
Not every bot is useful as a standalone website or can be embedded in a company's website. In fact, most developers already use bots in their day-to-day work, usually in some chat platform. We want to provide these integrations without the headache of rewriting half your bot or making it impossible to deploy across multiple ecosystems.
Clementine is an abstract repository, which means you get an empty shell where you insert your integration-specific hooks. We already support website integrations and are working on an Instagram integration, but we need many more:
- Telegram
- Slack
- etc.
What we want to see
- Solid integrations that abstract away the delicate details
- API design should be your strength. Documentation, tests, ...
Prerequisites
- node.js intermediate level
- prior knowledge of any integrations (facebook, slack) is a bonus
Idea 3: Teddy: Script Kiddies are welcome
When you build a bot for an existing website, it is cumbersome to transfer the existing (semi-)structured data into a format your bot library of choice will understand. Most websites are made up of a few basic building blocks:
- Pricing lists
- Tables
- Image galleries
- FAQs
- ...
Teddy is the abotkit crawler. The goal of Teddy is to crawl any website and generate the training data (e.g. for rasa) directly. The tool itself is written in Elixir with Phoenix, but the crawled data can be edited in any language. Teddy not only offers a nice user interface but also operates highly efficiently, thanks to the BEAM VM. 😊
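To make the crawl-to-training-data idea more concrete, here is a minimal Python sketch. It is not Teddy's actual code (Teddy is written in Elixir). It assumes the FAQ on a page is marked up as `<dt>`/`<dd>` pairs, which real sites often do not do. The names `FaqParser`, `extract_faq` and `to_training_data`, as well as the `faq_N` intent naming, are hypothetical, and the output is a generic structure rather than rasa's real training format.

```python
from html.parser import HTMLParser


class FaqParser(HTMLParser):
    """Collect (question, answer) pairs from <dt>/<dd> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.pairs = []       # finished (question, answer) tuples
        self._tag = None      # "dt" or "dd" while inside one, else None
        self._question = ""   # last question seen, waiting for its answer
        self._text = []       # text fragments of the current element

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # ignore nested tags such as <b> and unrelated closers
        text = " ".join("".join(self._text).split())  # normalise whitespace
        if tag == "dt":
            self._question = text
        elif self._question:
            self.pairs.append((self._question, text))
            self._question = ""
        self._tag = None


def extract_faq(html):
    """Return all (question, answer) pairs found in an HTML string."""
    parser = FaqParser()
    parser.feed(html)
    return parser.pairs


def to_training_data(pairs):
    """Turn FAQ pairs into a generic intent/examples/response structure."""
    return [
        {"intent": f"faq_{i}", "examples": [question], "response": answer}
        for i, (question, answer) in enumerate(pairs, start=1)
    ]


SAMPLE = """
<dl>
  <dt>When are you open?</dt>
  <dd>Monday to Friday, 9:00 to 17:00.</dd>
  <dt>Do you deliver?</dt>
  <dd>Yes, within <b>5 km</b>.</dd>
</dl>
"""

faq = extract_faq(SAMPLE)
training_data = to_training_data(faq)
```

Each entry in `training_data` carries only one example sentence per intent. A real pipeline would still need paraphrases, plus a mapping step into whatever format the chosen bot library expects.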
What we want to see
- Data engineering interests
- Basic knowledge of HTTP(S), browsers, HTML selectors etc.
- Python data transformation tooling
- You'll be building an important web-based UI to simplify web scraping
Prerequisites
- elixir and phoenix would be awesome, but aren't strictly necessary
- rails is a plus if you have never used phoenix
- generic crawling knowledge is a plus
- data transformation, including NLP, is a plus
Idea 4: Ten second Robert
Our dream is to offer a "blank" bot that you can train by talking to it! To get a basic understanding of what the person is saying, we use pretrained transformer models and some logic on top.
Robert is based on BERT, but for now it only acts like a (smart) hashmap. We use the transformer to find the nearest example for any text input, infer the intent and execute its action. That way we can train a bot by giving a few examples and have the bot ask the human for guidance if no matching example is present. Robert does not yet remember the previous sentences you wrote, so the bot is not really "smart". We need some kind of story ability to guide a conversation and some form of memory.
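The nearest-example lookup described above can be sketched in a few lines of Python. This is an illustration under loud assumptions, not Robert's implementation: `embed` is a toy bag-of-words stand-in for a transformer sentence embedding, and `ExampleBot`, `teach`, `infer` and the `threshold` value of 0.5 are hypothetical names and settings chosen for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a transformer embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ExampleBot:
    """Maps text to the intent of its nearest stored example."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed cut-off below which the bot asks for help
        self.examples = []          # list of (embedding, intent)

    def teach(self, text, intent):
        """Store one example sentence for an intent."""
        self.examples.append((embed(text), intent))

    def infer(self, text):
        """Return the nearest example's intent, or None if nothing is close enough."""
        query = embed(text)
        best_score, best_intent = 0.0, None
        for vector, intent in self.examples:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_intent = score, intent
        return best_intent if best_score >= self.threshold else None


bot = ExampleBot()
bot.teach("what is the weather today", "weather")
bot.teach("tell me a joke", "joke")

known = bot.infer("what is the weather like")   # close to a stored example
unknown = bot.infer("book a table for two")      # nothing similar yet

if unknown is None:
    # This is where the bot would ask the human for guidance and learn from it.
    bot.teach("book a table for two", "reservation")
learned = bot.infer("book a table for two")
```

Swapping `embed` for real sentence embeddings would keep the rest of the loop unchanged. The sketch has no memory of earlier turns, which is exactly the gap this idea is meant to close.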
This could easily be the most exciting project if done correctly.
What we want to see
- Curiosity. Build a generic bot core you can train on the fly!
- Ability to reason about transformers, cosine-distance retrieval etc
Prerequisites
- python advanced knowledge
- nlp experience would be good to have
- transformers basics
Idea 5: (Code)Mirror of Erised
More advanced users may want to add custom actions. Currently we have a bunch of action scripts, but our idea is to add CodeMirror so that users can implement actions using Dolores.
If we truly want to deliver on our promise, people should be able to build and deploy their own bots using their browser. Integrating coding tools, hosting options and storage would be great.
Prerequisites
- javascript knowledge
- Frontend experience
Idea 6: abotkit on 🔥!base
Deployment is still one of our biggest issues. We have multiple moving parts that move closer and closer together, but still depend on some data exchange service.
Currently we deploy abotkit on AWS, GCP or Azure using vanilla Kubernetes. This is a very general approach, but it is also really expensive. We could use Firebase instead, which could be a cheap solution for now that still allows us to grow. However, our services need some changes before they can run on Firebase.
What we are looking for
- An easier deployment story for abotkit
- Your work directly impacts every other project. Solid database understanding required!
Prerequisites
- kubernetes and docker
- firebase
- node.js intermediate knowledge would be good
Idea 7: Electrons calling
Some users just want to use abotkit on their PC/Linux/Mac to test and train chatbots. Electron should help us build executables for each platform. We also need a CI/CD flow to generate those executables on the fly.
Ideally we want a version that is well-integrated into a user's desktop environment and can even perform basic OS functions like search.
What we are looking for
- Detail-oriented, meaningful integrations. We have a cool project, so let us use it on our computers
- Offer more actions that integrate with the OS. Go beyond the web as a platform
- Deliver great UX
Prerequisites
- node.js
- electron would be good to know but can be learned during the project
Idea 8: The control room
If you love Westworld like we do, you may dream of the detailed user experience of the bot creators and wonder how we can guide people on their journey to their own conversational machine.
Dolores has a statistics page showing how accurately inputs are matched to intents. But we also need a feedback loop that lets users add examples, add new intents and cluster outliers to adjust the chatbot directly from the statistics page.
Prerequisites
- node.js
- electron would be good to know but can be learned during the project