ZZZ_Technologies Used - GetRecced/IR670_Spring2018 GitHub Wiki
Because CSCE 670 (Information Storage and Retrieval) is a graduate course in Computer Science and Engineering (CSE) at Texas A&M University, our project submission required us to write code.
Below is a high-level outline of the programming languages and libraries we used, along with the purpose each one served.
Programming Languages
Python
- Implementation of Baseline Approaches
- Scripts to split datasets into train/test sets based on our custom requirements
- Scripts to scrape Amazon website to reverse lookup product details from an ASIN (Amazon Standard Identification Number) code
- Data Analysis and Exploration of our datasets
- Code to convert .json files to .votes files as required by HFT
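
As an illustration of the last item, the conversion could be sketched roughly as follows. This is a minimal sketch, not our actual script: the JSON field names (`reviewerID`, `asin`, `overall`, `unixReviewTime`, `reviewText`) and the output layout of one review per line as `user item rating time word-count words...` are assumptions, and the function names are hypothetical.

```python
import json
import re


def review_to_votes_line(review):
    """Turn one review dict into a single whitespace-separated line.

    Assumed layout: user item rating time word_count word_1 ... word_n
    """
    # Lowercase the text and keep only alphanumeric tokens, so every
    # word is a single field with no embedded whitespace or punctuation.
    words = re.findall(r"[a-z0-9]+", review.get("reviewText", "").lower())
    fields = [
        review["reviewerID"],                # assumed user ID field
        review["asin"],                      # assumed item ID field
        str(float(review["overall"])),       # star rating
        str(int(review["unixReviewTime"])),  # review timestamp
        str(len(words)),                     # number of words that follow
    ]
    return " ".join(fields + words)


def convert_json_lines(json_lines):
    """Convert an iterable of JSON strings, one review per line."""
    return [
        review_to_votes_line(json.loads(line))
        for line in json_lines
        if line.strip()  # skip blank lines
    ]


# Hypothetical sample record, for illustration only.
sample = [
    '{"reviewerID": "A1", "asin": "B000TEST01", "overall": 5.0, '
    '"unixReviewTime": 1400000000, "reviewText": "Great product, works well!"}'
]
votes_lines = convert_json_lines(sample)
print(votes_lines[0])
```

In practice the input lines would come from a `.json` file opened for reading, and the resulting lines would be written out to a `.votes` file.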
C++
- Running HFT models with custom modifications
Libraries/Packages:
Library Name | Usage |
---|---|
Pandas | Reading and writing data as DataFrames in Python |
scikit-learn, SciPy, NumPy | Matrix operations, calculation of RMSE, MAE, Pearson correlation, etc. |
Surprise | Implementation of SVD Approach |
Matplotlib | Plotting data for analysis |
Requests | Sending HTTP requests for ASIN lookup |
BeautifulSoup | HTML Parsing |
pyLDAvis | Visualization of LDA topics |
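
For reference, the evaluation metrics named in the table can be spelled out in plain Python. This is a minimal sketch of the standard formulas rather than our evaluation code, and the function names and sample ratings are illustrative assumptions; library routines such as `sklearn.metrics.mean_absolute_error` and `scipy.stats.pearsonr` compute the same quantities.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up ratings, for illustration only.
actual_ratings = [4, 3, 5, 2]
predicted_ratings = [3, 3, 4, 4]

print("RMSE:", rmse(actual_ratings, predicted_ratings))
print("MAE:", mae(actual_ratings, predicted_ratings))
print("Pearson:", pearson(actual_ratings, predicted_ratings))
```

Note that `pearson` is undefined (division by zero) when either sequence is constant, so a real script would need to guard against that case.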