Home - FarhaKousar1601/DATA-SCIENCE-AND-ITS-APPLICATION-LABORATORY-21AD62- GitHub Wiki

Data Science and Its Applications Laboratory (21AD62)

Module 1:

  1. Python/R Installation: Install the Python or R language and the Visual Studio Code editor, then demonstrate the setup using Kaggle datasets.
  2. Programming in Python/R: Write and execute programs in Python/R using Visual Studio Code or PyCharm.
  3. Study Hours vs. Exam Performance: Plot a line chart to visualize the effect of study hours on exam scores.
  4. Histogram of Miles per Gallon: Use the mtcars.csv dataset to plot a histogram showing the frequency distribution of 'mpg' (miles per gallon).
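
The two plotting exercises above can be sketched with matplotlib as follows. This is only a minimal sketch: the function names, the study-hours values, and the placeholder mpg values are made up for illustration, and it assumes matplotlib is installed. For the real exercise, the mpg values would come from the 'mpg' column of mtcars.csv.

```python
import matplotlib
matplotlib.use("Agg")  # draw to files, so no display window is required
import matplotlib.pyplot as plt


def plot_study_hours(hours, scores):
    """Line chart of exam score against study hours; returns the figure."""
    fig, ax = plt.subplots()
    ax.plot(hours, scores, marker="o")
    ax.set_xlabel("Study hours")
    ax.set_ylabel("Exam score")
    ax.set_title("Study hours vs. exam score")
    return fig


def plot_mpg_histogram(mpg, bins=10):
    """Histogram of miles-per-gallon values; returns the figure and bin counts."""
    fig, ax = plt.subplots()
    counts, _edges, _patches = ax.hist(mpg, bins=bins, edgecolor="black")
    ax.set_xlabel("Miles per gallon (mpg)")
    ax.set_ylabel("Frequency")
    ax.set_title("Distribution of mpg")
    return fig, counts


# Made-up values, for illustration only.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 65, 70, 78, 85]
plot_study_hours(hours, scores).savefig("study_hours.png")

# Placeholder values; for the lab, load the real column instead, e.g.
#   mpg = pandas.read_csv("mtcars.csv")["mpg"]   (assumes the file is present)
mpg = [15.0, 18.5, 21.0, 21.0, 22.8, 24.4, 30.4]
fig, counts = plot_mpg_histogram(mpg, bins=5)
fig.savefig("mpg_hist.png")
```

Returning the figure rather than calling plt.show() keeps the functions usable both in scripts and in notebooks.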

Module 2:

  1. Books Dataset Analysis: Using the BL-Flickr-Images-Book.csv dataset:
    • Import data into a DataFrame.
    • Drop irrelevant columns.
    • Change the DataFrame index.
    • Clean fields such as the publication date using regular expressions.
    • Use string methods and NumPy to clean columns.
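
The cleaning steps listed above might look like the sketch below. It runs on a tiny stand-in DataFrame rather than the real file: the column names ("Identifier", "Place of Publication", "Date of Publication", "Title", "Shelfmarks"), the sample rows, and the choice of which column counts as irrelevant are all assumptions, so check them against the actual CSV header before reusing the code.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("BL-Flickr-Images-Book.csv").
# Column names and rows below are assumptions made for illustration.
raw = pd.DataFrame({
    "Identifier": [206, 216, 218, 472],
    "Place of Publication": ["London", "London; Virtue & Yorston",
                             "Oxford, Clarendon Press", " Paris "],
    "Date of Publication": ["1879 [1878]", "1868", "[1897?]", "n.d."],
    "Title": ["Title A", "Title B", "Title C", "Title D"],
    "Shelfmarks": ["shelf-1", "shelf-2", "shelf-3", "shelf-4"],
})


def clean_books(df):
    """Apply the listed cleaning steps and return a new DataFrame."""
    # Drop columns that are not needed for analysis (choice is hypothetical).
    df = df.drop(columns=["Shelfmarks"])
    # Use the unique identifier as the index.
    df = df.set_index("Identifier")
    # Regular expression: keep the first four-digit run as the year.
    year = df["Date of Publication"].str.extract(r"(\d{4})", expand=False)
    df["Date of Publication"] = pd.to_numeric(year)  # unmatched rows become NaN
    # String methods plus np.where: normalise the place names.
    place = df["Place of Publication"].str.strip()
    df["Place of Publication"] = np.where(
        place.str.contains("London", na=False), "London",
        np.where(place.str.contains("Oxford", na=False), "Oxford", place),
    )
    return df


cleaned = clean_books(raw)
print(cleaned)
```

Dates with no four-digit year are left as missing values rather than guessed.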

Module 3:

  1. Logistic Regression on Iris Dataset: Train a regularized logistic regression classifier on the iris dataset using sklearn and report the best classification accuracy.
  2. SVM Classifier on Iris Dataset: Train an SVM classifier on the iris dataset using different kernels and hyperparameters, and report the best classification accuracy and support vectors.
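
One possible way to approach both classifiers is a cross-validated grid search in scikit-learn, sketched below. The train/test split, the scaling step, and the parameter grids are assumptions chosen for illustration; the exercise itself does not prescribe them.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Logistic regression is L2-regularized by default; C is the inverse
# regularization strength, so smaller C means stronger regularization.
log_search = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    {"logisticregression__C": [0.01, 0.1, 1, 10, 100]},
    cv=5,
)
log_search.fit(X_train, y_train)
log_acc = log_search.score(X_test, y_test)
print("Logistic regression:", log_search.best_params_, "test accuracy", log_acc)

# SVM: search over kernels and hyperparameters (grid values are assumptions).
svm_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    {
        "svc__kernel": ["linear", "rbf", "poly"],
        "svc__C": [0.1, 1, 10],
        "svc__gamma": ["scale", "auto"],
    },
    cv=5,
)
svm_search.fit(X_train, y_train)
svm_acc = svm_search.score(X_test, y_test)
print("SVM:", svm_search.best_params_, "test accuracy", svm_acc)

# Support vectors of the best SVM (in the scaled feature space).
best_svc = svm_search.best_estimator_.named_steps["svc"]
support_vectors = best_svc.support_vectors_
n_support = best_svc.n_support_  # number of support vectors per class
print("Support vectors per class:", n_support)
```

The best cross-validated accuracy for each search is also available as best_score_, which is separate from the held-out test accuracy printed here.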

Module 4:

  1. Decision Tree ID3 Algorithm: Demonstrate the ID3 algorithm on a given dataset.
  2. Clustering Methods on Spiral Dataset: Analyze the spiral.txt dataset using K-means, single-link hierarchical, and complete-link hierarchical clustering methods. Compute the Rand index for each method and visualize the dataset to determine which algorithm performs best.
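
For the ID3 exercise, the core idea is to split on the attribute with the highest information gain and recurse. The sketch below is a minimal pure-Python version; the toy weather-style rows and the attribute names are hypothetical, since the exercise only says "a given dataset".

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attrs, target):
    """Build a decision tree as nested dicts: {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # pure node: return the class label
        return labels[0]
    if not attrs:                      # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {value: id3([r for r in rows if r[best] == value], rest, target)
                   for value in sorted({r[best] for r in rows})}}


# Hypothetical toy dataset, for illustration only.
rows = [
    {"outlook": "sunny",    "windy": "no",  "play": "no"},
    {"outlook": "sunny",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no",  "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy",    "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "yes", "play": "no"},
]
tree = id3(rows, ["outlook", "windy"], "play")
print(tree)
```

This version handles categorical attributes only and does no pruning, which matches the basic form of ID3.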
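
The clustering comparison could be set up as in the sketch below. Because the layout of spiral.txt is not described here, the code generates a synthetic two-arm spiral as a stand-in; the generator, the number of clusters, and all parameter values are assumptions. It relies on scikit-learn's KMeans, AgglomerativeClustering, and rand_score.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import rand_score


def make_spirals(n_per_arm=150, noise=0.05, seed=0):
    """Two interleaved spiral arms; a synthetic stand-in for spiral.txt."""
    rng = np.random.default_rng(seed)
    t = np.linspace(1.0, 3 * np.pi, n_per_arm)
    arm = np.column_stack((t * np.cos(t), t * np.sin(t)))
    X = np.vstack((arm, -arm))                 # second arm is the first rotated by 180 degrees
    X = X + rng.normal(scale=noise, size=X.shape)
    y = np.repeat([0, 1], n_per_arm)           # ground-truth arm labels
    return X, y


X, y_true = make_spirals()

models = {
    "k-means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "single-link": AgglomerativeClustering(n_clusters=2, linkage="single"),
    "complete-link": AgglomerativeClustering(n_clusters=2, linkage="complete"),
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # Rand index: fraction of point pairs on which the clustering
    # and the ground truth agree (same cluster vs. different cluster).
    scores[name] = rand_score(y_true, labels)
    print(f"{name}: Rand index = {scores[name]:.3f}")
```

To visualize each result, a scatter plot such as plt.scatter(X[:, 0], X[:, 1], c=labels) with matplotlib shows which points each method grouped together.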

Module 5:

  1. Web Scraping Mini Project: Implement a simple web scraping project focusing on social media data.
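
As a starting point for the mini project, the parsing half of a scraper can be sketched with the Python standard library alone. The HTML snippet, the "post" class name, and the hashtag analysis below are hypothetical; a real page will have its own structure, and the snippet stands in for HTML that would normally be downloaded first.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded social media page.
SAMPLE_HTML = """
<div class="feed">
  <p class="post">Hello #datascience</p>
  <p class="ad">Buy now</p>
  <p class="post">Lab day #python #datascience</p>
</div>
"""


class PostParser(HTMLParser):
    """Collect the text of every <p class="post"> element."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._in_post = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "p" and ("class", "post") in attrs:
            self._in_post = True
            self.posts.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_post = False

    def handle_data(self, data):
        if self._in_post:
            self.posts[-1] += data


parser = PostParser()
parser.feed(SAMPLE_HTML)
posts = parser.posts

# Count hashtags across all collected posts.
hashtag_counts = Counter(
    tag.lower() for post in posts for tag in re.findall(r"#(\w+)", post)
)
print(posts)
print(hashtag_counts.most_common())
```

In a full project, the HTML string would come from a download step such as urllib.request.urlopen, and the parser would be adapted to the tags and classes of the target page.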