COMP 555 - Spring 2022

Announcements

[Zoom Lectures Link] [Office Hours Zoom Link]

  • May 7: All course grades are now posted and visible via your course login. Have a good summer.
  • May 5: The final exam can be downloaded from this link after 12:00pm and must be turned in before 3pm. I will be in the classroom during that time and available on Zoom at this link. If you are on Zoom and have questions, send a message in the chat.
  • April 26: The time-warp is now fixed, Lectures are matched to their proper day and number.
  • April 21: Today's lecture is from 4/26. I will fix the time warp this weekend. Also I need to move today's office hour to 4:30-5:30.
  • April 19: We are still living in the future. Today I will cover the material from Lecture 25 that was scheduled for 4/21. To make things even more confusing, the slides and notebook downloads are labeled Lecture 24. This is so you have all the background materials to complete PS#6. Also, the submissions page for PS#6 is now open.
  • April 11: Problem Set #6 is now available and is due on 4/21. This one is going to be tricky. I'll explain in class tomorrow.
  • March 31:  Problem Set #5 is now available and is due on 4/14. No office hours today.
  • March 23: I need to cancel tomorrow's class. Problem Set #4 is still due before midnight tomorrow. I will also hold office hours from 3:30-4:30, via Zoom only.
  • March 22: I may need to cancel Thursday's lecture (3/24). Monitor this website for updates. 
  • March 11: Problem Set #4 is now available and is due on 3/24. 
  • March 8: I am moving the due date for problem set #3 from tonight, 3/8, to Thursday night 3/10. The midterm has been graded and should be visible when logged in to your course website.
  • March 6: The midterm can be downloaded from here after 2:00pm and it must be turned in using the link at the bottom of the exam before 3:20. Press "Save and Checkpoint" before submitting. You can submit the exam as many times as you wish, but only the last version is kept. 
  • March 1: I was unable to find a good time and place for the Midterm review session. I will therefore use today's lecture period as a review session. I will not be recording today's lecture or supplying lecture notes and/or notebooks. I ask that no one attempt to take photos of, or record by any means, the practice questions that I will present. I will update the course schedule to reflect this change over the break.
  • February 23: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 11.37%.
  • February 17: Problem Set #3 is now available and is due on 3/8, which is after the midterm. However, this material will be on the midterm. Also, I will be holding an additional office hour today from 3:30-4:30.
  • February 16: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is ~12%.
  • February 15: The grades for Problem Set #1 are now posted. You can see the overall score on a button that appears on your Setup page. If you press the button, you will see the grading for each problem.
  • February 10: I will be holding extra office hours today from 3:30-4:30, and on Thursdays from this day forward (excluding holidays and exams).
  • February 9: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 12.94%.
  • February 9: There is an issue with Problem #4 on Problem Set #2. The last line of the grading cell should be:
                         %time HammingMotifSearch(DNA, 11)
  • February 4: Problem Set #2 is now available online and is due on 2/17.
  • February 3: The video of the Python Tutorial session has been posted. See the link in the schedule below.
  • February 2: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 13.87%.
  • February 2: Something has come up and I need to shorten my office hours today to 2pm-3pm. I will add additional office hours from 3:30pm-4:30pm tomorrow. I realize that this comes just hours before the Thursday 11:59:59pm deadline for problem set #1, and I have been warning you that it is, in general, unwise to start a problem set the night before the deadline. The bottom line is that you certainly should make some effort on the problem set before tomorrow afternoon. However, this particular problem set is only time consuming with regard to learning concepts; it is not an issue of execution time.
  • January 22: A Jupyter notebook for tonight's Python Tutorial.
  • January 22: Sena Atay has created a Comp555 GroupMe chat that you can subscribe to at this link.
  • January 22: Tonight's Python Tutorial session will be held online only via this Zoom Link. Come with a Jupyter notebook up and running.
  • January 21: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 14.97%.
  • January 20: Problem Set #1 is now available. It is due before midnight on 2/3/2022.
  • January 19: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 13.76%.
  • January 13: Last day to fill out the roster form
  • January 12: My SARS-CoV-2 PCR test was negative (not detected). The campus positivity rate is 13.18%.
  • January 11: First class meeting. See you there. Class Roster
  • January 10: I am vaccinated (3/10/21) and boosted (10/25/21). I intend to be tested regularly (weekly) as long as the positivity rate on campus is over 5%. My last test on 12/28/2021 was negative.

Course Description


Computational methods are fueling a revolution in the biological sciences. Computers are already nearly as indispensable as microscopes for understanding, analyzing, and interpreting biological systems. As a result, two new multidisciplinary fields, bioinformatics and computational biology, have emerged. This course will explore the computational methods and algorithmic principles driving this revolution. It will cover basic topics in molecular biology, genetics, and proteomics. The course also addresses basic computational theory and algorithms including asymptotic notation, recursion, divide-and-conquer approaches, graph algorithms, dynamic programming, randomized algorithms, and greedy algorithms. These fundamental concepts from computer science will be taught within a context of motivating problems drawn from contemporary biology. Example biological topics include sequence alignment, motif finding, gene rearrangement, DNA sequencing, protein peptide sequencing, phylogeny, and gene expression analysis.

This course is suitable for both computer science and biology students at both undergraduate and graduate levels. Students who wish to take this course should have some programming experience in a modern programming language. Knowledge of data structures, algorithm design, and biology is helpful but not required. There will be 6 problem sets, each with short programming assignments. No late problem sets will be accepted; however, I will drop the lowest problem set score when calculating the course grade. The grade will be computed as follows: the best 5 of 6 problem sets (each worth 8%, 40% in total), a midterm (worth 25%), a final exam (worth 25%), and many unannounced in-class exercises (in total worth 10%, with the lowest 2 dropped).
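As an illustration of the weighting described above, here is a minimal Python sketch of the grade arithmetic. It is not the official grading code. The function name `course_grade` is hypothetical, and the sketch assumes that every score is a percentage between 0 and 100 and that the in-class exercises kept after dropping the lowest 2 are averaged with equal weight; neither assumption is stated on this page.

```python
def course_grade(problem_sets, midterm, final, exercises):
    """Estimate a course grade (0-100) from percentage scores.

    Assumptions (not confirmed by the syllabus): all scores are
    percentages, and the kept in-class exercises are weighted equally.
    """
    # Keep the best 5 problem sets; each is worth 8% of the grade.
    best_ps = sorted(problem_sets, reverse=True)[:5]
    ps_part = 0.08 * sum(best_ps)

    # Drop the 2 lowest in-class exercises; the rest share 10% of the grade.
    keep = max(len(exercises) - 2, 0)
    kept_ex = sorted(exercises, reverse=True)[:keep]
    ex_part = 0.10 * (sum(kept_ex) / len(kept_ex)) if kept_ex else 0.0

    # The midterm and the final exam are each worth 25%.
    return ps_part + 0.25 * midterm + 0.25 * final + ex_part


# Example: one missed problem set and two missed exercises are dropped.
example = course_grade(
    problem_sets=[0, 100, 100, 100, 100, 100],
    midterm=80,
    final=90,
    exercises=[0, 0, 100, 100],
)
print(round(example, 2))
```

In this example the missed problem set and the two missed exercises do not count against the grade, so only the midterm and final scores pull the result below 100.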

A syllabus for this offering of Comp555 can be downloaded from here.

COVID-19 Considerations


Classes will be held face-to-face this semester so long as it is prudent. All students must wear masks and maintain social distancing from each other and the instructor. Lectures will be simulcast via Zoom, and you should attend via Zoom instead of in person under any of the following conditions.

  • If you experience any COVID-19 symptoms

  • If you have tested positive for COVID-19 in the 5 days prior to a class meeting.

  • If you have been in close contact with anyone who has tested positive in the past 5 days and you have not yet been tested. 

Should the instructor test positive or develop COVID-19 symptoms, he will attempt to teach remotely via Zoom if possible. In any case, notification of whether and how the class will be taught will be posted in the "Announcements" section of this website at least one hour before the start of class.

Office hours will be held on Zoom, at least for the start of the semester. If any student is unable to attend an exam or submit a problem set due to severe symptoms, exceptions to the late problem set policy and/or alternate exam times will be considered on a case-by-case basis. Please attempt to notify the instructor at least 24 hrs in advance, and be prepared to provide documentation. The path to a successful semester relies on every person in our community taking responsible actions that include mask-wearing, social distancing, frequent hand washing, and getting vaccinations. 

 

Book, Course Information, and Prerequisites


This semester I will not be using a book. I will be teaching from my notes and I plan to add at least two modules of new material.

Credit Hours: 3
Location: SN014
Time: TTh 2:00-3:15
URL: http://www.csbio.unc.edu/mcmillan/?run=Courses.Comp555S22
Prerequisites: COMP 410, Math 381, or equivalents

Course Instructor


 

Instructor:  Leonard McMillan
Office:  SN316
email:  mcmillan@cs.unc.edu
Office Hours:  Wednesdays 2pm-4pm

 

Schedule


Date Topic Homework
January 11 Lecture 1. Introduction (slides)  
January 13 Lecture 2. Jumping into Genomes (slides) (notebook)
(SARS-COV-2Wuhan.fasta) (SARS-COV-2Omicron.fasta)
 
January 18 Lecture 3. Finding Patterns in DNA (slides) (notebook)
(Human Genome) (Chr4.seq)
 
January 20 Lecture 4. Finding frequent DNA patterns (slides) (notebook)
(LTR14A.fa)
PS #1
January 25 Lecture 5. Searching for Shared Patterns (slides) (notebook)  
January 27 Lecture 6. Finding Motifs in our Lifetime (slides) (notebook)  
January 27 Optional Python Tutorial 5:00pm-6:30pm (Zoom Link) (notebook) (video)
February 1 Lecture 7. Assembling a Genome (slides) (video)  
February 3 Lecture 8. Finding Paths in Graphs (slides) (notebook) (video) PS #2, PS #1 due
February 8 Lecture 9. Realities of Genome Assembly (slides) (notebook) (video)  
February 10 Lecture 10. Combinatorial Pattern Matching (slides) (notebook) (video)
February 15 Lecture 11. Suffix Arrays and BWTs (slides) (notebook) (video)  
February 17 Lecture 12. Suffix Arrays and BWTs continued (slides) (video)  PS #3, PS #2 due
February 22 Lecture 13. Multi-string BWTs (slides) (notebook) (video)  
February 24 Lecture 14. Adventures in Dynamic Programming (slides) (notebook)
March 1 Midterm Study Session  
March 3 Midterm Exam covering Lectures 1-13 (open notes, open internet)
March 8 Lecture 15. Comparing Sequences (slides) (notebook)

March 10 Lecture 16. Sequence Alignments (slides) (notebook)  PS #4, PS #3 due
March 15 Spring Break
March 17 Spring Break
March 22 Lecture 17. Advanced Sequence Alignment  (slides) (notebook)

March 24 Lecture 18. Divide and Conquer (slides)
PS #4 due
March 29 Lecture 19. Determining a Peptide's Sequence (slides) (notebook)
March 31 Lecture 20. Scaling Up Peptide Sequencing (slides) (notebook)  PS #5
April 5 Lecture 21. Hidden Markov Models (slides) (notebook)  
April 7 Lecture 22. Inferring Ancestry using HMMs (slides) (notebook)  PS #6
April 12 Lecture 23. Genome Rearrangements (slides) (notebook)  
April 14 Wellness Day
April 19 Lecture 24. Transforming Genomes (slides) (notebook)  
April 21 Lecture 25. Randomized Algorithms (slides) (notebook) PS #6 due
April 26 Lecture 26. Evolutionary Trees (slides)  
May 5 (Thurs) Final Exam (zoom) 12:00pm-3:00pm

 

Resources

Jupyter

All coding examples, problem sets, and exams will use Jupyter Notebooks and Python3. If you do not have a Jupyter Notebook environment set up, I recommend that you use Anaconda as follows:

  • Go to https://docs.anaconda.com/anaconda/install/
  • Follow the installation instructions for your operating system
  • Open the Navigator
  • Select Launch under Jupyter Notebook
  • A screen like the Jupyterhub should appear in your browser
  • Create a folder for the class
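Once a notebook is running, a quick check like the one below can confirm that the kernel is Python 3. This is only a suggested sanity check, not a course requirement. The helper name `environment_summary` is hypothetical, and the sketch uses only the Python standard library.

```python
import platform
import sys


def environment_summary():
    """Return basic facts about the Python interpreter running this cell."""
    return {
        "python_version": platform.python_version(),
        "is_python3": sys.version_info.major >= 3,
        "executable": sys.executable,
    }


summary = environment_summary()
print(summary)
```

If `is_python3` is not `True`, the notebook is probably using a different interpreter than the one Anaconda installed; the `executable` path shows which one is in use.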


Site built using pyWeb version 1.10
© 2010 Leonard McMillan, Alex Jackson and UNC Computational Genetics