Important links
GitHub: https://github.com/swapnasrita/Bioinformatics
Codecademy: https://www.codecademy.com/enrolled/courses/learn-r
Rosalind: https://rosalind.info/problems/list-view/
YouTube, Introduction to Bioinformatics: https://youtu.be/eZfyWdHnzR0?list=PLuiPz6iU5SQ-PAjlyz4b3EIIQ6dpZ2DE_
Some context: I am using problem-based learning to approach Bioinformatics. I am picking up the biology and computer science from this course on YouTube. Meanwhile, I am brushing up on my R and solving a few problems a day from Rosalind. I generated a problem statement using ChatGPT. My aim is to understand the problem, execute it and write a report on my experience.
Day 12 - I am learning how to do differential expression analysis using DESeq2. I grossly underestimated how complex it was going to be. Next week, I plan to explore the mathematics behind each step in more depth and stabilise my fundamental understanding.
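As a first look at that mathematics, here is a minimal Python sketch of the median-of-ratios idea that DESeq2 uses to estimate size factors when normalising counts. It is not DESeq2 itself: the function name and the toy counts are made up for illustration, and a real analysis should rely on the package.

```python
import math
from statistics import median

def size_factors(counts):
    """Estimate per-sample size factors with the median-of-ratios idea.

    counts: list of rows, one row per gene, one column per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean is zero and the ratio is undefined.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c == 0 for c in row):
            continue
        # Geometric mean of this gene across samples, computed in log space.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor of a sample is the median ratio over the genes kept.
    return [math.exp(median(r)) for r in log_ratios]

# Toy data (hypothetical): sample B was sequenced twice as deeply as sample A.
toy_counts = [
    [10, 20],
    [20, 40],
    [30, 60],
    [0, 5],   # skipped: contains a zero
]
factors = size_factors(toy_counts)
normalised = [[c / f for c, f in zip(row, factors)] for row in toy_counts]
```

After dividing by the size factors, the first three genes have the same normalised count in both samples, which is the point of the step: differences in sequencing depth should not look like differences in expression.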
Day 11 - This is my first time doing RNA-seq. I learnt the RNA-seq workflow.
Going on a slightly philosophical tangent: someone, someday, used programming to automatically enrich the RNA and perform differential analysis to understand the differences, a genuine amalgamation of experimental and computational techniques. Today, we are seeing the true power of computational modelling through ChatGPT and GenAI. Behind it all lies the original philosophy: the need to answer why.
Day 10 - This Rosalind shortest superstring problem is driving me crazy. The wording itself is difficult. I have given up trying to solve the problem myself. Now I am just trying to understand the solution. Even that is difficult. Clearly, this is a complex one, because there are articles about why it is complex, for example, https://noobest.medium.com/rosalind-genome-assembly-as-shortest-superstring-1db2c7408a64
Tomorrow I will either finish this or move on.
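For orientation, a minimal Python sketch of one common way to attack this problem: greedily merge the two reads with the largest suffix-prefix overlap until one string remains. This is a heuristic, not a guaranteed shortest superstring in general; it relies on the assumption (as the Rosalind statement is usually read) that the reads overlap by more than half their length. The function names are illustrative and this is not the solution from the linked article.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_superstring(reads):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Glue read j onto read i, writing the shared part only once.
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads[0]

sample_reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
print(greedy_superstring(sample_reads))  # ATTAGACCTGCCGGAATAC
```

The double loop makes this quadratic in the number of reads per merge, which is fine for Rosalind-sized inputs but not for real sequencing data.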
Day 09 - Trying to use Bioconductor. But understanding the basics by doing Rosalind problems also helps. Still stuck at trying to find the shortest superstring.
Day 08 - I am stuck at trying to find the shortest superstring. I do not really know where exactly I am stuck!!!!
But on a better note, I got access to some pretty nice bioinformatics courses, thanks to wonderful professors at Maastricht University.
So alongside my other knowledge gathering, I am learning how to use Bioconductor in R. It seems like this is what will give me the best chance to solve my RNA-seq problem statement. But I will still continue using Rosalind. I still need to understand how these packages were developed in the first place. It is truly like 'standing on the shoulders of giants'.
Day 07 - We are getting into the fun part. Comparing substrings, creating an average DNA for common ancestry, finding motifs in DNA - all of which, I believe, come down to learning about indexing.
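To make the indexing point concrete, here is a minimal Python sketch of two of these tasks: finding every position of a motif in a DNA string and building a consensus ("average") string from equal-length sequences. The function names are illustrative and this is not the code uploaded to GitHub.

```python
from collections import Counter

def find_motif(dna, motif):
    """Return 1-based start positions of motif in dna, overlaps included."""
    return [i + 1
            for i in range(len(dna) - len(motif) + 1)
            if dna[i:i + len(motif)] == motif]

def consensus(sequences):
    """Most common base in each column of equal-length sequences.

    Ties are broken by whichever base appears first in the column.
    """
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*sequences))

print(find_motif("GATATATGCATATACTT", "ATAT"))  # [2, 4, 10]
print(consensus(["ATCG", "ATGG", "TTCG"]))      # ATCG
```

The `i + 1` is the whole indexing lesson in one place: Python slices count from 0, while Rosalind reports positions counting from 1.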
In the YouTube lesson today, I learnt how BLAST and BLAST2 work. I expect there are newer versions now, since 13 years have passed since the YouTube lesson was updated.
Day 06 - Created my own local alignment code in Python. It would have taken much longer with R, and I know that packages must exist to do this already.
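This is not the code from that day, just a minimal sketch of the usual local alignment idea (Smith-Waterman): fill a score table where no cell may drop below zero, then trace back from the highest-scoring cell until a zero is reached. The scores (match +2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + pair,  # align both letters
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score falls to zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))

print(local_align("GGACGTGG", "CCACGTCC"))  # (8, 'ACGT', 'ACGT')
```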
Day 05 - Probability has always been the hardest topic for me and it still is. AAAAAAAAAAAAACCCGGGG (get it?!!!). But I am learning about dominant and recessive genes. Also, today's video course (YouTube) was about local and global alignment and how to use the table to find the optimal alignment (trace back through the node you came from).
Horizontal move = the horizontal gene's letter is paired with a gap. Diagonal move = the two letters are aligned (match or mismatch). Vertical move = the vertical gene's letter is paired with a gap.
Account for these gaps when moving backward through the table to build the alignment.
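The traceback rules above can be sketched in Python as a small global alignment (Needleman-Wunsch style). This is a minimal sketch, not the lecture's exact method: the scores (match +1, mismatch -1, gap -1) and the function name are assumptions, and when several alignments tie it simply returns one of them.

```python
def global_align(horizontal, vertical, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned horizontal gene, aligned vertical gene)."""
    rows, cols = len(vertical) + 1, len(horizontal) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap          # first column: only vertical moves
    for j in range(1, cols):
        table[0][j] = j * gap          # first row: only horizontal moves
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if vertical[i - 1] == horizontal[j - 1] else mismatch
            table[i][j] = max(table[i - 1][j - 1] + pair,
                              table[i - 1][j] + gap,
                              table[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the top-left corner.
    i, j = len(vertical), len(horizontal)
    top, bottom = [], []
    while i > 0 or j > 0:
        pair = None
        if i > 0 and j > 0:
            pair = match if vertical[i - 1] == horizontal[j - 1] else mismatch
        if pair is not None and table[i][j] == table[i - 1][j - 1] + pair:
            top.append(horizontal[j - 1])   # diagonal: letters aligned
            bottom.append(vertical[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and table[i][j] == table[i][j - 1] + gap:
            top.append(horizontal[j - 1])   # horizontal: gap under the letter
            bottom.append("-")
            j -= 1
        else:
            top.append("-")                 # vertical: gap above the letter
            bottom.append(vertical[i - 1])
            i -= 1
    return table[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))

score, top, bottom = global_align("GATTACA", "GATACA")
print(score)   # 5
print(top)
print(bottom)
```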
Day 04 - Today I learnt about FASTA, GC content and Hamming distance, which counts the point mutations separating two sequences of equal length. The lecture today was about sequence alignment and how to avoid recursion by filling in a table (dynamic programming).
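A minimal Python sketch of these three ideas, with illustrative function names (not the code uploaded to GitHub): reading FASTA text into a dictionary, computing GC content as a percentage, and counting mismatched positions between two equal-length strings.

```python
def parse_fasta(text):
    """Map each record name (the text after '>') to its joined sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def gc_content(dna):
    """Percentage of bases in dna that are G or C."""
    return 100 * sum(base in "GC" for base in dna) / len(dna)

def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))

print(parse_fasta(">seq1\nACGT\nGG\n>seq2\nTTAA\n"))
print(gc_content("AGCTATAG"))                                     # 37.5
print(hamming_distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")) # 7
```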
Day 03 - Continuing to learn R through Codecademy, but now they are pushing their Pro and Plus plans (the names are too confusing! One should be able to tell the difference from the names). I learnt aggregating data, grouping data and then summarizing. I also tried to solve 2 problems in Rosalind - the Fibonacci sequence and a more complex version. I got very frustrated when the numeric precision in Jupyter Notebook was not enough (a lot of other students had the same problem) and I could not increase it because the gmp package would not install. So now I have RStudio and everything works.
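For comparison, a minimal Python sketch of the two recurrences, assuming the "more complex version" is the Rosalind variant where rabbits die after a fixed number of months. Python integers have arbitrary precision, so the overflow problem does not appear here; in R the same logic would need something like gmp, as described above. The function names are illustrative.

```python
def rabbit_pairs(months, litter_size):
    """Pairs alive after `months` months (months >= 1) when every mature
    pair produces `litter_size` new pairs each month and nobody dies."""
    previous, current = 1, 1  # months 1 and 2
    for _ in range(months - 2):
        previous, current = current, current + litter_size * previous
    return current

def mortal_rabbit_pairs(months, lifespan):
    """Pairs alive after `months` months when each pair lives `lifespan`
    months and every pair aged one month or more produces one new pair."""
    ages = [1] + [0] * (lifespan - 1)  # ages[k] = pairs that are k months old
    for _ in range(months - 1):
        newborns = sum(ages[1:])       # all mature pairs reproduce
        ages = [newborns] + ages[:-1]  # everyone ages; the oldest die
    return sum(ages)

print(rabbit_pairs(5, 3))         # 19
print(mortal_rabbit_pairs(6, 3))  # 4
print(rabbit_pairs(100, 1))       # 354224848179261915075
```

The last value is larger than a 64-bit integer can hold, which is the kind of number that triggers the precision problem.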
Day 02 - Learning R through Codecademy. I learnt ggplot and data cleaning. I also tried to solve 3 problems in Rosalind - counting nucleotides, transcribing DNA to RNA, and complementing a strand of DNA. Uploaded them to GitHub.
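For reference, a minimal Python sketch of those three problems (the versions on GitHub are the real solutions; the function names here are illustrative). The complementing problem is taken to mean the reverse complement, as in the Rosalind statement.

```python
def count_nucleotides(dna):
    """Counts of A, C, G and T, in that order."""
    return tuple(dna.count(base) for base in "ACGT")

def transcribe(dna):
    """Transcribe DNA to RNA by replacing every T with U."""
    return dna.replace("T", "U")

def reverse_complement(dna):
    """Complement each base (A<->T, C<->G), then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

print(count_nucleotides("AGCTTTTCATTCTGAC"))  # (3, 4, 2, 7)
print(transcribe("GATGGAACTTGACTACGTAAATT"))  # GAUGGAACUUGACUACGUAAAUU
print(reverse_complement("AAAACCCGGT"))       # ACCGGGTTTT
```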
Day 01 - Learning R through Codecademy. It is a free course. I have learnt R before <- so this is a refresher. I want to do the project in R.