Chapter 5 problem set 1 - UCD-pbio-rclub/python_problems GitHub Wiki
Enter your problems below. Remember to use markdown formatting.
Julin
Q1.
- Import the "Tomato.csv" data set from this repository
- Make a new data frame that retains the "trt", "hyp", and "species" columns.
- Now subset the new data frame so that it only has the S. chilense data.
- Calculate the mean of the hyp column separately for the "H" and "L" treatments.
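
One possible approach to Q1, as a minimal sketch rather than the intended answer. The tiny data frame below is a made-up stand-in for "Tomato.csv" so that the example runs on its own. Its values, the extra `other` column, and the exact spelling of the species label ("S. chilense") are assumptions, so check them against the real file.

```python
import pandas as pd

# Made-up stand-in for Tomato.csv (values are illustrative only).
# With the real file you would use: tomato = pd.read_csv("Tomato.csv")
tomato = pd.DataFrame({
    "trt": ["H", "L", "H", "L", "H", "L"],
    "hyp": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "other": [1, 2, 3, 4, 5, 6],  # hypothetical extra column to be dropped
    "species": ["S. chilense", "S. chilense", "S. pennellii",
                "S. pennellii", "S. chilense", "S. chilense"],
})

# Keep only the three columns of interest.
small = tomato[["trt", "hyp", "species"]]

# Boolean mask: keep rows whose species matches the (assumed) label.
chilense = small[small["species"] == "S. chilense"]

# Mean of hyp, computed separately for each treatment.
hyp_means = chilense.groupby("trt")["hyp"].mean()
print(hyp_means)
```

`groupby("trt")` returns one mean per treatment level, so the same line still works if the real data has more than two treatments.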
Min-Yao
Q2.
- Import the data that we used last week as a DataFrame.
- We want to only focus on wyo_leaf_FPsc samples, so please slice out these samples.
- In addition, assume that wyo_leaf_FPsc_02_052 is our control sample. We want to keep only the genes whose expression level in wyo_leaf_FPsc_02_052 is > 1. Please slice out a new DataFrame based on these criteria.
- How many genes and samples do we have in this new DataFrame?
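
A hedged sketch of how Q2 could be approached. It assumes the data has genes as rows and samples as columns. The small data frame, the gene IDs, and the `other_sample` column are hypothetical stand-ins for last week's file.

```python
import pandas as pd

# Made-up stand-in for last week's data (genes as rows, samples as columns).
cpm = pd.DataFrame(
    {
        "wyo_leaf_FPsc_02_052": [0.5, 2.0, 3.5, 1.0],
        "wyo_leaf_FPsc_04_141": [1.5, 0.2, 4.0, 2.0],
        "other_sample": [9.0, 9.0, 9.0, 9.0],  # hypothetical non-target sample
    },
    index=["geneA", "geneB", "geneC", "geneD"],  # hypothetical gene IDs
)

# Keep only the columns whose name contains "wyo_leaf_FPsc".
leaf = cpm.filter(like="wyo_leaf_FPsc", axis=1)

# Keep only the genes expressed above 1 in the control sample.
expressed = leaf[leaf["wyo_leaf_FPsc_02_052"] > 1]

# shape is (number of rows, number of columns) = (genes, samples).
n_genes, n_samples = expressed.shape
print(n_genes, n_samples)
```

Note that `> 1` is strict, so a gene with an expression level of exactly 1 in the control is dropped.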
Rie
Q3.
- Import the Brapa_cpm.csv dataset that we used last week.
- Extract the wyo_leaf_FPsc_04_141, wyo_leaf_FPsc_04_170, and wyo_leaf_FPsc_04_174 samples, assuming those samples are biological replicates.
- Make a new data frame (I call it dataA) with these 3 samples. Get averages for each gene. Append the averages to dataA.
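
A minimal sketch for Q3, reading "append the averages" as adding a new column to dataA (an assumption, since appending could be interpreted other ways). The values, the gene IDs, and the column name `average` are made up for illustration.

```python
import pandas as pd

# Made-up stand-in for Brapa_cpm.csv (genes as rows, samples as columns).
# With the real file you would use something like pd.read_csv("Brapa_cpm.csv").
cpm = pd.DataFrame(
    {
        "wyo_leaf_FPsc_04_141": [1.0, 4.0],
        "wyo_leaf_FPsc_04_170": [2.0, 4.0],
        "wyo_leaf_FPsc_04_174": [3.0, 4.0],
        "other_sample": [7.0, 8.0],  # hypothetical non-replicate sample
    },
    index=["geneA", "geneB"],  # hypothetical gene IDs
)

reps = ["wyo_leaf_FPsc_04_141", "wyo_leaf_FPsc_04_170", "wyo_leaf_FPsc_04_174"]

# .copy() makes dataA independent of cpm, so adding a column is safe.
dataA = cpm[reps].copy()

# axis=1 averages across the columns, giving one value per gene (row).
dataA["average"] = dataA[reps].mean(axis=1)
print(dataA)
```

Computing the mean over `dataA[reps]` rather than over all of `dataA` keeps the result the same if the line is run a second time, because the `average` column is never included in its own calculation.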
Ruijuan
https://github.com/cuttlefishh/python-for-data-analysis/blob/master/assignments/assignment6.md