Rain Dev Diary - TheEvergreenStateCollege/bioinformatics GitHub Wiki

08/17-08/23

From Paul, about Rain's hours

  • +1 hour on Tuesday for Smarty Plants meeting and screenshare, ended at 11am.
  • +1.5 hours on Friday.
  • +2.5 hours total for Smarty Plants work recorded to Gusto.
  • Thanks for your help with Smarty Plants.

08/10-08/16

Work Log

Monday

4 hours - updating suffix tree to use d3-zoom for panning and zooming

Tuesday

2 hours - working on the appearance of the tree and trying to improve clarity and readability

Notes

The display of the suffix tree is more difficult to get right than I expected. I want to find the right balance between label spacing, clarity, and the appearance of the tree.

For instance, these are the same tree with the spacing adjusted: image image

Overall, I'd say the bottom one is better, but the last few links get very cramped. With larger strings, spacing will become even more important.

With more complex strings, I worry that readability will suffer, especially for parent nodes.

image image

Ignore that only the last character is being highlighted...

For comparison, here is the Visualization of Ukkonen's Algorithm. Its tree is slightly different because it includes a termination character, but I think the beginning parts of it are more readable. image

Summer

06/20/24

  • What did you do or think about yesterday?

    The web interface and how best to display the data

  • how are you feeling today?

    Good

  • what will you do or think about today?

    Start creating the web interface, getting it set up so I can work on displaying the data as soon as it's available

06/19/24

  • What did you do or think about yesterday?

    I went through the notes and resources Taylor shared with me on de Bruijn graphs and Eulerian walks

  • how are you feeling today?

    Good, I feel like I have a lot of questions but I'm figuring things out

  • what will you do or think about today?

    My focus today is on the web interface. I want to think about how to best design it for the end user and what format to display the data in.

04/09/24

This week

What I want to do: Start coding for the project. My goals may change after our meeting tomorrow, but at the moment I plan to focus on rewriting the file-reading program written in the last meeting. I plan to add functionality such as cutting out all the other information in the files so that only the mRNA is output, and chopping the mRNA into smaller chunks.
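
A rough sketch of what the extract-and-chunk step could look like in Rust. The file format is an assumption here: the sketch treats the input as FASTA-style text, where header lines start with `>` and every other line is sequence. The function names `extract_sequence` and `chunk_sequence`, the sample record, and the chunk size of 4 are all made up for illustration and are not from the project's code.

```rust
/// Keep only the sequence lines from FASTA-style text.
/// Assumption: header lines start with '>'; everything else is sequence.
fn extract_sequence(text: &str) -> String {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('>'))
        .collect()
}

/// Split a sequence into consecutive chunks of at most `size` characters.
/// The last chunk may be shorter than `size`.
fn chunk_sequence(seq: &str, size: usize) -> Vec<&str> {
    assert!(size > 0, "chunk size must be positive");
    seq.as_bytes()
        .chunks(size)
        .map(|c| std::str::from_utf8(c).expect("sequence should be ASCII"))
        .collect()
}

fn main() {
    // Hypothetical record, inlined so the sketch runs without a file.
    let text = ">hypothetical_record_1 example header\nAUGGCC\nUUAGC\n";
    let seq = extract_sequence(text);
    for chunk in chunk_sequence(&seq, 4) {
        println!("{chunk}");
    }
}
```

Reading from a real file would only change how `text` is obtained, for example with `std::fs::read_to_string`.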

In addition, I want to work on suffix trees and try implementing that in Rust.
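
As a starting point, here is a minimal sketch of a naive suffix trie in Rust. It is not Ukkonen's algorithm and not a compressed suffix tree: it inserts every suffix one character at a time, which takes quadratic time and space, so it is only practical for short strings. The names `Node`, `SuffixTrie`, and `contains` are hypothetical and chosen for this example.

```rust
use std::collections::HashMap;

/// One trie node; each outgoing edge is labeled with a single character.
#[derive(Default)]
struct Node {
    children: HashMap<char, Node>,
}

/// Naive suffix trie: every suffix of the text is inserted character by character.
struct SuffixTrie {
    root: Node,
}

impl SuffixTrie {
    fn new(text: &str) -> Self {
        let mut root = Node::default();
        let chars: Vec<char> = text.chars().collect();
        for start in 0..chars.len() {
            // Walk from the root, creating missing nodes along this suffix.
            let mut node = &mut root;
            for &c in &chars[start..] {
                node = node.children.entry(c).or_default();
            }
        }
        SuffixTrie { root }
    }

    /// A pattern occurs in the text exactly when it is a prefix of some suffix,
    /// i.e. when it can be followed from the root without falling off the trie.
    fn contains(&self, pattern: &str) -> bool {
        let mut node = &self.root;
        for c in pattern.chars() {
            match node.children.get(&c) {
                Some(next) => node = next,
                None => return false,
            }
        }
        true
    }
}

fn main() {
    let trie = SuffixTrie::new("banana");
    println!("contains \"nan\": {}", trie.contains("nan"));
    println!("contains \"nab\": {}", trie.contains("nab"));
}
```

A real suffix tree would merge chains of single-child nodes into one edge and could be built in linear time, but the lookup idea stays the same.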

Finally, I want to do more research in order to better understand the graph algorithms being discussed.

Hours I plan to spend on this: at least 9

Last week

What I did: Over the weekend I focused a lot on learning Rust. I came into this project knowing no Rust, so I feel like it's the area where I'm weakest. I'm trying to get myself up to speed so I can understand and contribute more to programming discussions. I did this by reading through the Rust textbook and working ahead on Rustlings. I also found a tutorial on Rust that's meant for C++ programmers, which I've been working through and have found really helpful.

I've also been reading through resources posted in the Discord and the README and trying to understand them. I've had mixed success with that. Some of them have definitely gone over my head, even after reading them multiple times.

Hours Spent: ~6

Questions

How are suffix trees going to be used for data preprocessing? I understand that suffix trees are good for finding patterns in DNA, but I'm confused about how they will be used for data preprocessing. Once patterns are found, are they then sent over to the graph algorithm?

What are our community goals and expectations?