1. Data analysis - adriannaziel/EmojiProject GitHub Wiki

Data analysis for collected tweets

Data collecting

In our project we considered the following 45 emojis. We chose them because they are the 45 most popular and widely used emojis according to Emojipedia.

0:😂 1:❤ 2:🤣 3:😊 4:🙏 5:💕 6:😭 7:😘 8:💜 9:😔 10:😎 11:😇 12:🌹 13:🤦 14:🎉 15:💞 16:✌ 17:✨ 18:🤷 19:😱 20:😌 21:🌸 22:🙌 23:😋 24:💗 25:💚 26:😏 27:💛 28:🙂 29:💓 30:👍 31:😅 32:👏 33:😁 34:🔥 35:💔 36:💖 37:😢 38:🤔 39:😆 40:🙄 41:💪 42:😉 43:👌 44:🤗

Data from Twitter was collected over the 8-month period 04/2019 - 11/2019, at hours chosen randomly during scraping. We sampled times from across the whole day because tweets have different characteristics depending on the time of day (for example, at night there are mostly comments, while official tweets are posted during working hours), and we wanted to gather reliable data.

Data was scraped using the twint library, giving the emoji, day and starting time as search keywords. We performed the scraping procedure equally for all emojis.
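The sampling scheme described above can be sketched as follows. This is only an illustration using the standard library, not the project's scraping script: the function name `sample_queries`, the exact day boundaries (1 April to 30 November 2019), the whole-hour start times and the number of queries per emoji are all assumptions. Each resulting triple would then be handed to the scraper as search parameters.

```python
import random
from datetime import date, datetime, time, timedelta

# Hypothetical subset; the project used all 45 emojis listed above.
EMOJIS = ["😂", "🤣", "😭"]

def sample_queries(emojis, start, end, per_emoji, seed=0):
    """Return (emoji, day, start_time) triples, the same number for every emoji."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    n_days = (end - start).days + 1
    queries = []
    for emoji in emojis:
        for _ in range(per_emoji):
            # Pick a random day in the period and a random hour of that day.
            day = start + timedelta(days=rng.randrange(n_days))
            start_time = time(hour=rng.randrange(24))
            queries.append((emoji, day, start_time))
    return queries

# Assumed boundaries of the 04/2019 - 11/2019 collection period.
queries = sample_queries(EMOJIS, date(2019, 4, 1), date(2019, 11, 30), per_emoji=5)
for emoji, day, start_time in queries[:3]:
    print(emoji, datetime.combine(day, start_time).strftime("%Y-%m-%d %H:%M:%S"))
```

Drawing the same number of (day, hour) pairs for every emoji is one way to keep the procedure equal across emojis while still covering all hours of the day.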

We collected over 514 000 tweets.

Number of collected tweets containing a given emoji:

As the chart shows, some emojis have more tweets in the dataset than others. This is because a tweet often contains more than one emoji, so it is counted for each emoji it contains.
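A minimal sketch of such a per-emoji count, assuming tweets are available as plain strings (the sample tweets, the emoji subset and the function name `count_tweets_per_emoji` are made up for illustration):

```python
from collections import Counter

EMOJIS = ["😂", "🤣", "😭", "🔥"]  # hypothetical subset of the 45 emojis

# Made-up example tweets.
tweets = [
    "so funny 😂🤣",
    "this track is 🔥",
    "😂😂 I can't stop laughing 😭",
]

def count_tweets_per_emoji(tweets, emojis):
    """Count, for each emoji, the number of tweets that contain it at least once."""
    counts = Counter()
    for text in tweets:
        for emoji in emojis:
            if emoji in text:
                counts[emoji] += 1
    return counts

counts = count_tweets_per_emoji(tweets, EMOJIS)
print(sum(counts.values()), "emoji-tweet pairs from", len(tweets), "tweets")
```

Here three tweets yield five emoji-tweet pairs, which is why the per-emoji counts can add up to more than the number of collected tweets.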

Preprocessing

The next step was data preprocessing, for which the ekphrasis library was used. Preprocessing included tokenization, removal of noisy tokens (links, mentions, emails, phone numbers, etc.) and separation of emojis from the text.
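To make these steps concrete, here is a simplified stand-in written with plain regular expressions. It does not use ekphrasis and is not the project's pipeline; the patterns, the emoji code-point ranges and the function name `preprocess` are assumptions chosen for illustration, and they cover far fewer cases than a dedicated library.

```python
import re

# Rough patterns for noisy tokens (illustrative, not exhaustive).
URL = re.compile(r"https?://\S+|www\.\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
MENTION = re.compile(r"@\w+")
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

# Approximate emoji ranges; enough for the emojis considered here.
EMOJI_CLASS = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
EMOJI = re.compile(EMOJI_CLASS)
TOKEN = re.compile(EMOJI_CLASS + r"|\w+(?:'\w+)?")

def preprocess(text):
    """Return (words, emojis): lowercased word tokens and the emojis, separated."""
    # Emails are removed before mentions so that "me@example.com" is dropped whole.
    for pattern in (URL, EMAIL, MENTION, PHONE):
        text = pattern.sub(" ", text)
    tokens = TOKEN.findall(text.lower())
    words = [t for t in tokens if not EMOJI.fullmatch(t)]
    emojis = [t for t in tokens if EMOJI.fullmatch(t)]
    return words, emojis

sample = "Thanks @anna!! Call +48 123 456 789 or mail me@example.com 😂😂 https://t.co/abc"
words, emojis = preprocess(sample)
print(words)
print(emojis)
```

Keeping words and emojis in separate lists makes the later analyses straightforward: the emoji list can be used for co-occurrence counts and the word list for word frequencies.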

Analysis

We analyzed the collected dataset to see how emojis are used in tweets, which emojis are frequently used together, and which words are used with each emoji.

Most common co-occurrences of emojis:

The top 15 emojis most commonly used in tweets together with a given emoji, ordered starting from the most common co-occurrence.

😂 ['🤣', '😅', '😆', '😭', '😁', '💔', '😉', '🙄', '🤔', '🤦', '👌', '👍', '🤷', '👏', '🔥']

❤ ['😘', '🤗', '💖', '💕', '🙏', '💜', '🔥', '💪', '💚', '💛', '😭', '💞', '👏', '👍', '😊']

🤣 ['😂', '😅', '😆', '😭', '😁', '😉', '👍', '🤦', '👌', '🤔', '👏', '🙄', '🤷', '🔥', '😘']

😊 ['👍', '🤗', '😘', '🙏', '💖', '😉', '😁', '💕', '👌', '👏', '❤', '🌹', '😂', '💜', '💪']

🙏 ['👍', '😇', '💪', '🌹', '🤗', '💖', '👏', '🙌', '😊', '❤', '🔥', '😘', '😭', '👌', '💕']

💕 ['💖', '💞', '💗', '💓', '😘', '😭', '💜', '🤗', '🌸', '✨', '❤', '💚', '🌹', '😊', '🙏']

😭 ['💔', '💖', '😂', '😢', '💕', '💜', '💗', '🤣', '💞', '💓', '🙏', '❤', '🔥', '😅', '💚']

😘 ['🤗', '💖', '💕', '❤', '🌹', '😊', '😉', '💞', '😂', '🙏', '💜', '👍', '😁', '👌', '🔥']

💜 ['💚', '💛', '💖', '💕', '💗', '💞', '😭', '❤', '🤗', '💓', '😘', '✨', '🙏', '🌸', '😊']

😔 ['💔', '😭', '😢', '💖', '✌', '💕', '😂', '🙏', '👌', '💗', '💞', '💓', '💜', '🙄', '❤']

😎 ['👍', '🔥', '💪', '👌', '😂', '😉', '😁', '✌', '🙏', '🤗', '👏', '😊', '😘', '🙌', '🤣']

😇 ['🙏', '😊', '😂', '🤗', '😘', '💕', '💖', '❤', '👍', '😁', '🙌', '💪', '😎', '😉', '😭']

🌹 ['😘', '🙏', '💖', '💕', '🤗', '🌸', '👍', '❤', '😊', '💞', '💜', '✨', '🔥', '👌', '💗']

🤦 ['😂', '🤷', '🤣', '😭', '🙄', '💔', '😅', '🤔', '😆', '😢', '😱', '🔥', '😁', '👍', '😉']

🎉 ['👏', '💖', '💕', '😘', '🤗', '🙌', '✨', '💜', '👍', '🔥', '❤', '😊', '😁', '💪', '🙏']

💞 ['💖', '💕', '💗', '💓', '💜', '😘', '😭', '❤', '💚', '🤗', '💛', '✨', '🌸', '🌹', '🙏']

✌ ['😂', '👍', '🙏', '😎', '💪', '👌', '❤', '😁', '😊', '👏', '🔥', '😔', '😘', '😉', '💖']

✨ ['💖', '💕', '🌸', '💜', '💞', '🙏', '💛', '💗', '😭', '🔥', '🤗', '💓', '🌹', '🎉', '🙌']

🤷 ['😂', '🤦', '🤣', '🤔', '😅', '🙄', '😆', '😉', '😁', '😭', '😏', '✌', '🔥', '👍', '💔']

😱 ['😂', '😭', '🔥', '🤣', '🤔', '😢', '👍', '😅', '😁', '👏', '😆', '😉', '🙏', '💜', '🙄']

😌 ['😂', '✨', '💖', '👌', '💕', '🙏', '🤗', '😭', '✌', '😏', '😉', '😘', '😁', '🔥', '😊']

🌸 ['💕', '💖', '✨', '🌹', '💜', '💗', '😊', '💞', '😘', '🙏', '🤗', '❤', '💓', '💛', '😭']

🙌 ['👏', '🔥', '🙏', '💪', '😂', '👍', '👌', '😁', '❤', '🤗', '🎉', '😊', '💖', '😭', '✨']

😋 ['😂', '😘', '😉', '😁', '🤗', '😊', '🔥', '👌', '👍', '🤣', '😎', '😆', '😅', '💕', '💖']

💗 ['💖', '💕', '💞', '💓', '💜', '😭', '💚', '💛', '❤', '🌸', '😘', '✨', '🤗', '😊', '🙏']

💚 ['💛', '💜', '💖', '💕', '❤', '💗', '💞', '💓', '😭', '😘', '🙏', '🤗', '😊', '💪', '👏']

😏 ['😂', '😉', '🙄', '🤔', '😁', '🤣', '😎', '🔥', '😘', '😅', '👌', '😌', '😭', '👍', '😆']

💛 ['💚', '💜', '💖', '❤', '💕', '💗', '💞', '💓', '😭', '✨', '🤗', '😘', '🙏', '😊', '💪']

🙂 ['👍', '😂', '😊', '💔', '😉', '😁', '🙏', '❤', '👌', '🤗', '😘', '👏', '😅', '💕', '😭']

💓 ['💖', '💕', '💞', '💗', '💜', '😭', '💛', '💚', '❤', '😘', '✨', '🌸', '🤗', '😂', '🙏']

👍 ['👏', '👌', '💪', '😁', '😊', '😂', '😎', '🙏', '😉', '🤗', '😘', '🌹', '✌', '❤', '🤣']

😅 ['😂', '🤣', '😆', '😁', '😭', '😊', '😉', '🤔', '😘', '🤷', '🤦', '🤗', '👌', '👍', '😎']

👏 ['👍', '👌', '💪', '🙌', '😂', '🙏', '🎉', '🔥', '😊', '❤', '😘', '😁', '💕', '🤣', '😎']

😁 ['😂', '👍', '😊', '🤣', '😉', '😆', '😘', '🤗', '😅', '👌', '😎', '🙏', '❤', '👏', '💪']

🔥 ['💪', '👌', '🙌', '😂', '😎', '🙏', '❤', '👏', '😭', '😘', '👍', '💕', '💖', '✨', '💜']

💔 ['😭', '😂', '😢', '😔', '🙏', '❤', '💜', '💕', '🤦', '🤣', '💖', '💚', '💛', '💞', '💗']

💖 ['💕', '💞', '💗', '💓', '😭', '💜', '😘', '✨', '💛', '💚', '❤', '🤗', '🙏', '🌸', '😊']

😢 ['😭', '💔', '🙏', '😂', '😔', '💕', '❤', '💜', '💖', '😱', '😘', '💗', '🤣', '🤦', '😊']

🤔 ['😂', '🤣', '😉', '😏', '😁', '🤷', '🙄', '👍', '😅', '😱', '😎', '😆', '🤦', '😭', '😊']

😆 ['😂', '🤣', '😅', '😁', '😉', '👍', '😊', '😘', '💕', '😭', '🤔', '😎', '💜', '👏', '🤷']

🙄 ['😂', '😏', '😭', '🤣', '🤦', '🤔', '🤷', '😆', '😁', '😅', '😉', '😘', '😔', '😱', '😢']

💪 ['👍', '🔥', '👏', '🙏', '😎', '😂', '👌', '❤', '🙌', '😊', '✌', '😁', '😘', '💜', '😉']

😉 ['😂', '👍', '😘', '😊', '😁', '🤣', '👌', '😎', '🤗', '😆', '😏', '💪', '🤔', '🙏', '❤']

👌 ['👍', '😂', '👏', '🔥', '😎', '💪', '🙏', '😊', '😘', '😉', '😁', '❤', '🙌', '✌', '💕']

🤗 ['😘', '😊', '💕', '❤', '💖', '🙏', '😂', '💜', '👍', '🌹', '💞', '😁', '😉', '😇', '💛']
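Rankings like the ones above can be produced with a simple pair count. The sketch below assumes each tweet has already been reduced to the list of emojis it contains, and that a pair is counted once per tweet regardless of repetitions; whether the project counted per tweet or per occurrence is not stated, so treat that as an assumption. The function name `top_cooccurrences` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def top_cooccurrences(tweet_emojis, k=15):
    """Map each emoji to its k most frequent co-occurring emojis, most common first."""
    pair_counts = defaultdict(Counter)
    for emojis in tweet_emojis:
        # Deduplicate so a pair is counted once per tweet, in both directions.
        for a, b in permutations(sorted(set(emojis)), 2):
            pair_counts[a][b] += 1
    return {
        emoji: [other for other, _ in counter.most_common(k)]
        for emoji, counter in pair_counts.items()
    }

# Made-up emoji lists, one per tweet.
tweet_emojis = [
    ["😂", "🤣"],
    ["😂", "🤣", "😅"],
    ["😂", "😅"],
    ["😂", "🤣", "😭"],
]
top = top_cooccurrences(tweet_emojis, k=2)
print(top["😂"])
```

In this toy data 😂 appears three times with 🤣 and twice with 😅, so those two lead its ranking.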

Word clouds for emojis

The word clouds show the words most frequently used in tweets containing a given emoji. By analyzing their content we can see which words are most commonly used with each emoji.

Word cloud for 😂

Word cloud for ❤

Word cloud for 🤣

Word cloud for 😊

Word cloud for 🙏

Word cloud for 💕

Word cloud for 😭

Word cloud for 😘

Word cloud for 💜

Word cloud for 😔

Word cloud for 😎

Word cloud for 😇

Word cloud for 🌹

Word cloud for 🤦

Word cloud for 🎉

Word cloud for 💞

Word cloud for ✌

Word cloud for ✨

Word cloud for 🤷

Word cloud for 😱

Word cloud for 😌

Word cloud for 🌸

Word cloud for 🙌

Word cloud for 😋

Word cloud for 💗

Word cloud for 💚

Word cloud for 😏

Word cloud for 💛

Word cloud for 🙂

Word cloud for 💓

Word cloud for 👍

Word cloud for 😅

Word cloud for 👏

Word cloud for 😁

Word cloud for 🔥

Word cloud for 💔

Word cloud for 💖

Word cloud for 😢

Word cloud for 🤔

Word cloud for 😆

Word cloud for 🙄

Word cloud for 💪

Word cloud for 😉

Word cloud for 👌

Word cloud for 🤗
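A word cloud is driven by a table of word frequencies. The sketch below shows one way such a table could be built per emoji, assuming every tweet is available as a pair of word tokens and emojis; the stop-word list, the sample tweets and the function name `word_frequencies` are hypothetical, and the rendering step itself is left out.

```python
from collections import Counter, defaultdict

# Tiny hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "the", "to", "you", "and", "so", "my", "is"}

def word_frequencies(tweets, stop_words=STOP_WORDS):
    """tweets: iterable of (words, emojis) pairs. Returns {emoji: Counter of words}."""
    freqs = defaultdict(Counter)
    for words, emojis in tweets:
        kept = [w for w in words if w not in stop_words]
        # Credit the tweet's words to every distinct emoji it contains.
        for emoji in set(emojis):
            freqs[emoji].update(kept)
    return freqs

# Made-up tokenized tweets.
tweets = [
    (["happy", "birthday", "to", "you"], ["🎉"]),
    (["congratulations", "and", "happy", "birthday"], ["🎉", "👏"]),
    (["so", "sad", "miss", "you"], ["💔"]),
]
freqs = word_frequencies(tweets)
print(freqs["🎉"].most_common(3))
```

Each per-emoji counter can then be passed to whatever tool renders the cloud, with word size proportional to frequency.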

It appears that many emojis are used with similar, universal words, but some have a distinctive vocabulary, like 🎉, which is apparently used for birthday wishes and congratulations, or 💔, which is used for expressing loss and sadness.