Home - FarhaKousar1601/DATA-SCIENCE-AND-ITS-APPLICATION-LABORATORY-21AD62- GitHub Wiki
Data Science and Its Applications Laboratory (21AD62)
Module 1:
Python/R Installation: Install the Python or R language and the Visual Studio Code editor, then demonstrate the setup using Kaggle datasets.
Programming in Python/R: Write and execute programs in Python/R using Visual Studio Code or PyCharm.
Study Hours vs. Exam Performance: Plot a line chart to visualize the effect of study hours on exam scores.
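A minimal sketch of the line chart exercise, assuming matplotlib is installed. The numbers below are placeholder values made up for illustration, not measured data; the output file name is likewise an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt

# Placeholder values invented for illustration; replace with the lab's data.
study_hours = [1, 2, 3, 4, 5, 6, 7, 8]
exam_scores = [35, 42, 50, 58, 63, 71, 78, 85]

fig, ax = plt.subplots()
# One point per student, joined by a line to show the trend.
(line,) = ax.plot(study_hours, exam_scores, marker="o")
ax.set_xlabel("Study hours")
ax.set_ylabel("Exam score")
ax.set_title("Study hours vs. exam performance")
ax.grid(True)
fig.savefig("study_hours_vs_scores.png")
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` displays the chart in a window instead of saving it.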
Histogram of Miles per Gallon: Use the mtcars.csv dataset to plot a histogram showing the frequency distribution of 'mpg' (miles per gallon).
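The histogram exercise could be approached as below. This is a sketch that assumes matplotlib is installed and that mtcars.csv has a header row with an `mpg` column; the helper names `read_mpg` and `plot_mpg_histogram`, the bin count, and the output file name are illustrative choices.

```python
import csv

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt


def read_mpg(csv_path):
    """Return the 'mpg' column of a CSV file as a list of floats."""
    with open(csv_path, newline="") as handle:
        return [float(row["mpg"]) for row in csv.DictReader(handle)]


def plot_mpg_histogram(mpg_values, out_path="mpg_hist.png", bins=10):
    """Draw a frequency histogram of mpg values and save it to out_path."""
    fig, ax = plt.subplots()
    # ax.hist returns the per-bin counts and the bin edges it used.
    counts, edges, _ = ax.hist(mpg_values, bins=bins, edgecolor="black")
    ax.set_xlabel("Miles per gallon (mpg)")
    ax.set_ylabel("Frequency")
    ax.set_title("Distribution of mpg")
    fig.savefig(out_path)
    plt.close(fig)
    return counts, edges
```

With the dataset in the working directory, `plot_mpg_histogram(read_mpg("mtcars.csv"))` writes the chart to `mpg_hist.png`.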
Module 2:
Books Dataset Analysis: Using the BL-Flickr-Images-Book.csv dataset, perform the following steps:
Import data into a DataFrame.
Drop irrelevant columns.
Change DataFrame index.
Clean fields like the publication date using regular expressions.
Use string methods and NumPy to clean columns.
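The regular-expression step above can be sketched with the standard library alone. The sample strings in the comments are assumed examples of messy date values, and `extract_year` is a hypothetical helper name; the rule "keep the first four-digit run" is one possible cleaning policy, not necessarily the one the lab expects.

```python
import re

# First run of four consecutive digits is treated as the publication year.
YEAR_PATTERN = re.compile(r"\d{4}")


def extract_year(raw_date):
    """Return the first four-digit year found in raw_date, or None."""
    if not isinstance(raw_date, str):
        return None  # missing values (None, NaN) carry no year
    match = YEAR_PATTERN.search(raw_date)
    return int(match.group()) if match else None


# Assumed examples of messy inputs:
#   extract_year("1879 [1878]")  -> 1879
#   extract_year("[1897?]")      -> 1897
#   extract_year("unknown")      -> None
```

Once the data is in a pandas DataFrame, the same function could be applied to a column with `Series.map`, for example `df["Date of Publication"].map(extract_year)`; the column name here is an assumption about the CSV header.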
Module 3:
Logistic Regression on Iris Dataset: Train a regularized logistic regression classifier on the iris dataset using sklearn and report the best classification accuracy.
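One way to approach the logistic regression exercise is a cross-validated search over the regularization strength. This sketch assumes scikit-learn is installed; the 70/30 split, the grid of `C` values, and the feature scaling step are illustrative choices rather than requirements of the lab.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# LogisticRegression applies L2 regularization by default;
# C is the inverse regularization strength (smaller C = stronger penalty).
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10, 100]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_C = search.best_params_["logisticregression__C"]
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
print(f"Best C: {best_C}")
print(f"Cross-validated accuracy: {search.best_score_:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Reporting accuracy on a held-out test set, rather than the cross-validation score used to pick `C`, avoids an optimistic estimate.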
SVM Classifier on Iris Dataset: Train an SVM classifier on the iris dataset using different kernels and hyperparameters, and report the best classification accuracy and support vectors.
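The SVM exercise could follow the same pattern, searching over kernels and their hyperparameters. This is a sketch assuming scikit-learn is installed; the grid values and the train/test split are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One sub-grid per kernel so each kernel is only paired with
# the hyperparameters that affect it.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1]},
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3]},
]

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

best_svm = search.best_estimator_
svm_accuracy = best_svm.score(X_test, y_test)
print(f"Best parameters: {search.best_params_}")
print(f"Held-out test accuracy: {svm_accuracy:.3f}")
print(f"Support vectors per class: {best_svm.n_support_}")
print(f"Support vector array shape: {best_svm.support_vectors_.shape}")
```

The fitted `SVC` exposes the support vectors themselves in `support_vectors_` and their row indices in the training data in `support_`.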
Module 4:
Decision Tree ID3 Algorithm: Demonstrate the ID3 algorithm on a given dataset.
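A compact sketch of ID3 for categorical attributes, using only the standard library. It picks the attribute with the highest information gain at each node and recurses until the labels are pure or no attributes remain. The tiny weather-style table is a made-up toy dataset for illustration, since the source does not specify which dataset the lab uses.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attributes, target):
    """Build a decision tree as nested dicts; leaves are class labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure node
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority vote
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Toy data invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = id3(rows, ["outlook", "windy"], "play")
print(tree)
```

On this toy table the root split is `outlook`, because it yields a larger information gain than `windy`; only the `rainy` branch needs a second split.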
Clustering Methods on Spiral Dataset: Analyze the spiral.txt dataset using K-means, single-link hierarchical, and complete-link hierarchical clustering. Compute the Rand index for each method and visualize the dataset to determine which algorithm performs best.
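The Rand index measures how often two labelings agree on whether a pair of points belongs together: it is the fraction of point pairs that are either grouped together in both labelings or separated in both. A minimal standard-library sketch follows; the function name `rand_index` is an illustrative choice, and the quadratic pair loop is meant for clarity rather than speed.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which two labelings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("Both labelings must cover the same points.")
    if len(labels_true) < 2:
        raise ValueError("At least two points are needed to form a pair.")
    agreements = 0
    n_pairs = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        agreements += same_true == same_pred  # both together or both apart
        n_pairs += 1
    return agreements / n_pairs


# Cluster IDs are arbitrary, so a relabeled but identical partition scores 1.0.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

If scikit-learn is available, `sklearn.metrics.rand_score` computes the same quantity and can be used to compare each clustering method's labels against a reference labeling.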
Module 5:
Web Scraping Mini Project: Implement a simple web scraping project focusing on social media data.
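The parsing half of such a project can be sketched with the standard library's `html.parser`. The markup below is a hypothetical snippet written for this example (real sites use different structures), and the `post` class name and `PostExtractor` helper are assumptions; fetching a live page is left out.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the text of every <p> element whose class list includes 'post'."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._inside_post = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and "post" in classes:
            self._inside_post = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._inside_post:
            self.posts.append("".join(self._buffer).strip())
            self._inside_post = False

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) is captured as well.
        if self._inside_post:
            self._buffer.append(data)


# Hypothetical page content, invented for illustration.
SAMPLE_HTML = """
<html><body>
  <p class="post">Loving the new <b>pandas</b> release!</p>
  <p class="ad">Buy now</p>
  <p class="post featured">Data science lab today.</p>
</body></html>
"""

parser = PostExtractor()
parser.feed(SAMPLE_HTML)
posts = parser.posts
print(posts)
```

For a real page, the HTML string would come from an HTTP request (for example via `urllib.request.urlopen`), and the tag and class checks would need to match that site's actual markup.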