DNA Library Part 2
The DNA library we made last time had a few disadvantages. In this assignment, we will hopefully fix some of them, as well as add some additional features.
As a reminder, you are limited to the following methods on strings:
- indexing (x[i]) and slicing (x[i:j])
- append (with the += operator)
- len, in, not in
- equality comparison (with the == operator)
but you may use other structures provided by Python (tuples, dicts, sets, etc.) and any methods/functions associated with them.
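To illustrate working within these restrictions, here is a minimal sketch that replaces a single character using only slicing and +=, rather than a string method such as replace(). The helper name replace_at is hypothetical and is not part of the assignment.

```python
def replace_at(s, i, ch):
    """Return a copy of s with the character at index i replaced by ch."""
    result = s[:i]        # slicing: everything before index i
    result += ch          # append with +=
    result += s[i + 1:]   # slicing: everything after index i
    return result


print(replace_at("ACGT", 2, "T"))  # prints ACTT
```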
Assignment
Your first goal will be to convert the DNA library we made last time into a class-based library. Start by writing the following class:
class DNASeq(object):
    def __init__(self, seq):
        self.sequence = seq
Then convert each of the methods from the previous assignment into class methods. The __init__() method should call isValid() and return None if the result is not a valid DNASeq.
All the other methods we implemented last time should be turned into methods that modify the DNASeq object (except for isValid() and countBases()). As a reminder, these are the methods from last time:
isValid()
addBase()
countBases()
extendDNA()
insertBase()
removeBase()
Remember that class methods must take self as the first argument when you are writing the function.
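As a rough illustration of what this conversion looks like, the sketch below turns a free function into a method. The name seqLength is hypothetical and is not one of the required methods. The isValid() check in __init__() is also left out to keep the example short.

```python
# Before: a free function that receives the sequence as an argument.
def seqLength(seq):
    return len(seq)


class DNASeq(object):
    def __init__(self, seq):
        self.sequence = seq

    # After: a method. self is the object the method was called on,
    # so the sequence is read from self.sequence instead of a parameter.
    def seqLength(self):
        return len(self.sequence)


dna = DNASeq("ACGT")
print(seqLength("ACGT"))  # prints 4
print(dna.seqLength())    # prints 4; Python passes dna as self
```

Note that the caller never passes self explicitly. Writing dna.seqLength() supplies dna as the first argument automatically.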
Then, add the following methods to the class:
myFind()
myFind takes two arguments: self, and a base (A, C, G, or T). It returns a list of all indices in self.sequence where that base occurs. That is, the following code should never print anything:
for i in dna.myFind('A'):
    if dna.sequence[i] != 'A':
        print("Something went wrong!")
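One possible way to satisfy this check is sketched below. It uses only indexing, len, and == on the string, and collects the results in a list. This is a sketch rather than the required implementation, and the isValid() check in __init__() is omitted for brevity.

```python
class DNASeq(object):
    def __init__(self, seq):
        self.sequence = seq

    def myFind(self, base):
        indices = []
        # Visit every index and keep the ones whose base matches.
        for i in range(len(self.sequence)):
            if self.sequence[i] == base:
                indices.append(i)
        return indices


dna = DNASeq("ACGAGCA")
print(dna.myFind("A"))  # prints [0, 3, 6]
```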
myReverse()
myReverse takes self as its only argument, and reverses the DNA sequence in-place.
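Python strings are immutable, so "in-place" here is assumed to mean that the method updates self.sequence and returns nothing. A minimal sketch under that assumption, using only indexing, len, and +=, with the isValid() check omitted for brevity:

```python
class DNASeq(object):
    def __init__(self, seq):
        self.sequence = seq

    def myReverse(self):
        reversed_seq = ""
        # Walk from the last index down to 0, appending each base.
        for i in range(len(self.sequence) - 1, -1, -1):
            reversed_seq += self.sequence[i]
        # Rebind the attribute; the method itself returns nothing.
        self.sequence = reversed_seq


dna = DNASeq("ACGT")
dna.myReverse()
print(dna.sequence)  # prints TGCA
```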
myComplement()
myComplement takes self as its only argument, and does not modify self. Instead, it generates the complement sequence of DNA, puts it in a new DNASeq object, and returns that. The DNA complement sequence is the original sequence subjected to the following rules:
- All occurrences of A in the original sequence become T in the complement
- All occurrences of T in the original sequence become A in the complement
- All occurrences of C in the original sequence become G in the complement
- All occurrences of G in the original sequence become C in the complement
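The four rules above map naturally onto a dict lookup, which is allowed since dicts are not strings. The following is a sketch of that idea, not the required implementation. The constant name COMPLEMENT is hypothetical, and the isValid() check in __init__() is omitted for brevity.

```python
# The pairing rules stored as a dict: original base -> complement base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


class DNASeq(object):
    def __init__(self, seq):
        self.sequence = seq

    def myComplement(self):
        comp = ""
        # Build the complement one base at a time with indexing and +=.
        for i in range(len(self.sequence)):
            comp += COMPLEMENT[self.sequence[i]]
        # Return a new object; self.sequence is left untouched.
        return DNASeq(comp)


dna = DNASeq("ACGT")
other = dna.myComplement()
print(other.sequence)  # prints TGCA
print(dna.sequence)    # prints ACGT
```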
main()
Finally, write a main() function that does the following actions, in order:
- Generates the DNA string "ACGAGCATGGACTACTGACGAGGAACCCTTTT"
- Checks that this is a valid string
- Appends "A"
- Appends "G"
- Appends "T"
- Prints how many "C"s are in the string
- Extends the DNA string with "AGCTAGGAT"
- Inserts "C" at index 4
- Removes 4 bases from the end of the string
- Reverses the sequence
- Prints the complement sequence
and call this from the top level of your program.
Insight
Compare the difficulty of writing the main() function in this assignment versus the last assignment. Did you find it easier or harder? Why? Write one or two sentences summarizing your thoughts.
Submission
Submit a single file named DNA2.py on Canvas. Your file needs to compile and run. It should also have a header with the following information (this goes in your source file, not in the program output):
# File: DNA2.py
# Student:
# Course: Intro to Programming
#
# Date:
# Description of Program:
# Was the main() function easier or harder to write? Why?
The description should be a short (1-3 sentence) description of what the program does. Do not describe how it's written!