Group the related TED talks - UMKCNSF/UMKC--HACKATHON GitHub Wiki
Use case id: 15-UM-TedTalks
Title: Group the related TED talks
Presenter: Vijaya Kumari Yeruva
Problem statement:
TED talks are inspiring and motivate many people, and they are available online for free. By June 2011, the combined view count of TED Talks stood at more than 500 million [1], and by November 2012 the talks had been watched over one billion times worldwide [2].
Someone who wants to watch or listen to talks on a specific topic currently has to search through all the TED talks, and finding only the top-rated talks within that topic takes further searching. To help these users, we want to group related TED talks based on the title and description of each talk, and sort the talks in each group by number of views. Whenever a new talk is added to the list, it should automatically be assigned to one of the existing groups.
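The grouping and sorting requirement can be illustrated with a minimal Python sketch. It assumes each talk already carries a group label; the `topic` field, the `group_and_sort` helper, and the sample records are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def group_and_sort(talks):
    """Group talks by topic and sort each group by views, highest first."""
    groups = defaultdict(list)
    for talk in talks:
        groups[talk["topic"]].append(talk)
    # Topics are ordered alphabetically so the output is stable.
    return {
        topic: sorted(members, key=lambda t: t["views"], reverse=True)
        for topic, members in sorted(groups.items())
    }


# Made-up records for illustration only.
talks = [
    {"title": "Talk A", "topic": "education", "views": 1200},
    {"title": "Talk B", "topic": "technology", "views": 5400},
    {"title": "Talk C", "topic": "education", "views": 9800},
]

for topic, members in group_and_sort(talks).items():
    print(topic)  # topic name as the header
    for talk in members:
        print(f"  {talk['views']:>6}  {talk['title']}")
```

In the real application the `topic` value would come from the clustering step rather than being supplied by hand.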
Datasets:
https://data.world/owentemple/ted-talks-complete-list
Application Specifics:
- Cluster the related TED talks
- Whenever encountering a new TED talk, classify the new talk into one of the existing groups
- Develop a Web/Mobile application that will display related TED talks together in a sorted order based on the number of views with the topic name as the header (optional)
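The first two items can be sketched as follows. This is one possible approach, not a prescribed one: it assumes TF-IDF vectors built from the talk text and k-means clustering, with a new talk assigned to the nearest existing centroid. It is written in plain Python to stay self-contained; all function names and the sample texts are hypothetical, and the initial centroids are chosen by hand to keep the run deterministic.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text, idf):
    """L2-normalised sparse TF-IDF vector; terms outside the vocabulary are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def fit_tfidf(docs):
    """Return (idf weights, one vector per document)."""
    df = Counter(term for d in docs for term in set(tokenize(d)))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return idf, [vectorize(d, idf) for d in docs]


def sq_dist(a, b):
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b))


def nearest(vec, centroids):
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def mean(vectors):
    total = Counter()
    for v in vectors:
        for k, w in v.items():
            total[k] += w
    return {k: w / len(vectors) for k, w in total.items()}


def kmeans(vectors, seeds, max_iter=20):
    """Plain k-means; `seeds` are indices of the vectors used as initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = mean(members)
    return centroids, labels


# Made-up title/description snippets, not from the dataset.
docs = [
    "The future of classroom learning and teaching",
    "How schools can improve teaching for every student",
    "Robots and artificial intelligence in daily life",
    "Artificial intelligence will change how robots work",
]
idf, vectors = fit_tfidf(docs)
centroids, labels = kmeans(vectors, seeds=[0, 2])
print(labels)

# A new talk is placed in the group whose centroid is closest.
new_talk = "Teaching students with classroom technology"
print(nearest(vectorize(new_talk, idf), centroids))
```

On a real dataset the number of groups and the initialisation would need tuning, and one of the machine learning tools listed in the next section would likely replace these hand-written helpers.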
Model Implementation and Evaluation:
1. Develop and implement a model on your datasets using any machine learning tools (TensorFlow, Spark MLlib, Weka, R, etc.)
2. Perform training, testing and n-fold cross-validation
3. Evaluate the clustering model with the Sum of Squared Errors (SSE)
4. Evaluate the classification model with a Confusion Matrix, Accuracy, F-measure and the Receiver Operating Characteristic (ROC) curve
5. Visualize the cluster points using any machine learning visualization library
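The evaluation measures named above can be sketched in a few lines of plain Python. The helper names and toy inputs are hypothetical, the fold assignment is one simple scheme among many, and ROC is left out because it needs predicted scores rather than hard labels. In practice the chosen machine learning tool would normally supply these metrics.

```python
from collections import Counter


def sse(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, centroids[l]))
        for p, l in zip(points, labels)
    )


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def f_measure(matrix, i):
    """F1 score for the class at index i."""
    tp = matrix[i][i]
    fp = sum(row[i] for row in matrix) - tp
    fn = sum(matrix[i]) - tp
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; fold f tests every n_folds-th sample starting at f."""
    for fold in range(n_folds):
        test = list(range(fold, n_samples, n_folds))
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


# Toy inputs for illustration only.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 4.0)]
print(sse(points, [0, 0, 1], [(0.0, 1.0), (4.0, 4.0)]))  # 2.0

y_true = ["edu", "edu", "tech", "tech"]
y_pred = ["edu", "tech", "tech", "tech"]
matrix = confusion_matrix(y_true, y_pred, ["edu", "tech"])
print(matrix)                # [[1, 1], [0, 2]]
print(accuracy(matrix))      # 0.75
print(f_measure(matrix, 1))  # 0.8

for train, test in k_fold_indices(6, 3):
    print(train, test)
```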
Additional points for innovation
References:
- [1] "TED profile". Mashable.com. June 27, 2011. Retrieved December 20, 2014.
- [2] "TED reaches its billionth video view!". TED Blog. November 13, 2012. Retrieved December 20, 2014.
Questions:
Please create an issue on GitHub that includes the use case id, or send an email to: [email protected]