      Bioinformatics: Command Line/R Prerequisites 2020

Part 3: Flow Control

Outline


If, else if, else.

if True:
    print("Hi there!")

if 22:
    print("Hello world")

if True or False:
    print("Binary operators work here.")


if 0:
    print("This won't be printed")
elif 1:
    print("Any number other than 0 is treated as True")
else:
    print("This won't be printed")

>>> if True:
...     print("Hi there!")
...
Hi there!
>>> if 22:
...     print("Hello world")
...
Hello world
>>> if True or False:
...     print("Binary operators work here.")
...
Binary operators work here.
>>>
>>> if 0:
...     print("This won't be printed")
... elif 1:
...     print("Any number other than 0 is treated as True")
... else:
...     print("This won't be printed")
...
Any number other than 0 is treated as True
>>> # Be sure to include a couple of enters here so it all executes!
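The conditions above (if 22, if 0, elif 1) rely on Python's truthiness rules, which also cover non-numeric values. As a small sketch, bool() applies the same test an if statement uses; the samples list is an illustrative assumption and not part of the workshop material:

```python
# Values chosen for illustration only.
samples = [0, 1, 22, -1, 0.0, "", "DNA", [], ["A"], {}, None]

for value in samples:
    # bool() applies the same test that `if value:` uses.
    print(repr(value), "->", bool(value))

# Split the samples by how an if statement would treat them.
falsy = [value for value in samples if not value]
truthy = [value for value in samples if value]
```

Zero, empty containers, empty strings, and None end up in falsy; everything else is truthy.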

For loops

Iterating through lists

wizard_list = ["Harry", "Ron", "Hermione", "Fred", "George"]
for name in wizard_list:
    print(name + " says Wingardium Leviosa")

>>> wizard_list = ["Harry", "Ron", "Hermione", "Fred", "George"]
>>> for name in wizard_list:
...     print(name + " says Wingardium Leviosa")
...
Harry says Wingardium Leviosa
Ron says Wingardium Leviosa
Hermione says Wingardium Leviosa
Fred says Wingardium Leviosa
George says Wingardium Leviosa

Iterating through dictionaries

wizard_dict = {"Harry": "Lumos", "Ron": "Alohomora", "Hermione": "Wingardium Leviosa", "Fred": "Riddikulus", "George": "Sectumsempra"} 
for wizard in wizard_dict.keys():
    print(wizard + " says " + wizard_dict[wizard]) 


for spell in wizard_dict.values():
    print(f"A wizard says {spell}")

>>> wizard_dict = {"Harry": "Lumos", "Ron": "Alohomora", "Hermione": "Wingardium Leviosa", "Fred": "Riddikulus", "George": "Sectumsempra"}
>>> for wizard in wizard_dict.keys():
...     print(wizard + " says " + wizard_dict[wizard])
...
Harry says Lumos
Ron says Alohomora
Hermione says Wingardium Leviosa
Fred says Riddikulus
George says Sectumsempra
>>>
>>> for spell in wizard_dict.values():
...     print(f"A wizard says {spell}")
...
A wizard says Lumos
A wizard says Alohomora
A wizard says Wingardium Leviosa
A wizard says Riddikulus
A wizard says Sectumsempra
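When both the key and the value are needed, looking up wizard_dict[wizard] inside the loop works, but dict.items() yields the pairs directly. A minimal sketch reusing the dictionary above; the lines list is added here only so the result can be inspected afterwards:

```python
wizard_dict = {"Harry": "Lumos", "Ron": "Alohomora", "Hermione": "Wingardium Leviosa",
               "Fred": "Riddikulus", "George": "Sectumsempra"}

lines = []
for wizard, spell in wizard_dict.items():
    # Each iteration unpacks one (key, value) pair.
    line = f"{wizard} says {spell}"
    lines.append(line)
    print(line)
```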

List Comprehension

>>> wizard_dict = {"Harry Potter": "Lumos", "Ron Weasley": "Alohomora", "Hermione Granger": "Wingardium Leviosa", \
...     "Fred Weasley": "Riddikulus", "George Weasley": "Sectumsempra"}
>>> weasley_list = [wizard for wizard in wizard_dict.keys() if "Weasley" in wizard]
>>> weasley_list
['Ron Weasley', 'Fred Weasley', 'George Weasley']
>>> h_list = [wizard if "H" in wizard else "Voldemort" for wizard in wizard_dict.keys()]
>>> h_list
['Harry Potter', 'Voldemort', 'Hermione Granger', 'Voldemort', 'Voldemort']
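Each comprehension above is shorthand for a loop that builds a list. The sketch below spells out both forms as ordinary for loops; the weasley_loop and h_loop names are hypothetical and exist only for comparison. Note where the condition sits: a trailing if filters items out, while an if/else placed before the for keeps every item and chooses what to store.

```python
wizard_dict = {"Harry Potter": "Lumos", "Ron Weasley": "Alohomora",
               "Hermione Granger": "Wingardium Leviosa",
               "Fred Weasley": "Riddikulus", "George Weasley": "Sectumsempra"}

# Filtering form: same result as
# [wizard for wizard in wizard_dict.keys() if "Weasley" in wizard]
weasley_loop = []
for wizard in wizard_dict:
    if "Weasley" in wizard:
        weasley_loop.append(wizard)

# Conditional-expression form: same result as
# [wizard if "H" in wizard else "Voldemort" for wizard in wizard_dict.keys()]
h_loop = []
for wizard in wizard_dict:
    if "H" in wizard:
        h_loop.append(wizard)
    else:
        h_loop.append("Voldemort")

print(weasley_loop)
print(h_loop)
```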

Challenge Questions:


While and Iterators

Iterators

While loops

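# Pitfall: next() is called twice per pass, so the value tested in the
# if statement is not the value that gets printed.
h_iter = iter(h_list)
cont = True
while cont:
    if "Voldemort" == next(h_iter):
        cont = False
    else:
        print(next(h_iter))

# Fix: call next() once per pass and reuse the stored value.
h_iter = iter(h_list)
cont = True
while cont:
    value = next(h_iter)
    if "Voldemort" == value:
        cont = False
    else:
        print(value)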

>>> h_iter = iter(h_list)
>>> cont = True
>>> while cont:
...     if "Voldemort" == next(h_iter):
...         cont = False
...     else:
...         print(next(h_iter))
...
Voldemort
Voldemort
>>> h_iter = iter(h_list)
>>> cont = True
>>> while cont:
...     value = next(h_iter)
...     if "Voldemort" == value:
...         cont = False
...     else:
...         print(value)
...
Harry Potter
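Every call to next() consumes one item from the iterator, which is why calling it twice per pass in the first loop above skips values. A short sketch of that behaviour, assuming the h_list built in the list comprehension section. Passing a default as the second argument to next() is one way to avoid a StopIteration error once the iterator is exhausted:

```python
h_list = ['Harry Potter', 'Voldemort', 'Hermione Granger', 'Voldemort', 'Voldemort']

h_iter = iter(h_list)
first = next(h_iter)    # consumes the first item
second = next(h_iter)   # the iterator has moved on to the second item

# Drain the remaining items; None is returned instead of raising StopIteration.
remaining = []
value = next(h_iter, None)
while value is not None:
    remaining.append(value)
    value = next(h_iter, None)

print(first, second, remaining)
```

Using None as the sentinel is safe here only because h_list never contains None itself.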

Break, continue, pass, try, except

Break vs Continue vs Pass

a = [0, 1, 2]
for element in a:
    if not element:
        pass
    print(element)


for element in a:
    if not element:
        continue
    print(element)


for element in a:
    if not element:
        break
    print(element)

>>> a = [0, 1, 2]
>>> for element in a:
...     if not element:
...         pass
...     print(element)
...
0
1
2
>>> for element in a:
...     if not element:
...         continue
...     print(element)
...
1
2
>>> for element in a:
...     if not element:
...         break
...     print(element)
...
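To compare the three statements side by side, the sketch below repeats the loops above but collects the values that reach the end of each iteration instead of printing them. The list names are illustrative assumptions.

```python
a = [0, 1, 2]

with_pass, with_continue, with_break = [], [], []

for element in a:
    if not element:
        pass            # does nothing; execution falls through
    with_pass.append(element)

for element in a:
    if not element:
        continue        # skip the rest of this iteration
    with_continue.append(element)

for element in a:
    if not element:
        break           # leave the loop entirely
    with_break.append(element)

print(with_pass, with_continue, with_break)
```

Because 0 is the first element, break stops the last loop before anything is collected.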

Try and Except:
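A minimal sketch of try and except, assuming a hypothetical safe_int helper that is not part of the workshop material. The code in the try block runs first; if it raises the named exception, the except block runs instead of the program stopping.

```python
def safe_int(text):
    """Convert text to an int, returning None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as "ACGT".
        print(f"Could not convert {text!r} to an integer")
        return None

print(safe_int("42"))
print(safe_int("ACGT"))
```

Catching only ValueError, rather than using a bare except, lets unrelated errors still surface.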


Simple Functions: parameters, scope, returning, passing

Let's define two functions

The two functions below differ only in whether their parameter has a default argument: simple_function requires some_string, while default_function falls back to "Default string" when it is called without one.

def simple_function(some_string):
    string_passed = True
    if some_string:
        final_string = "You passed the following string: %s" % some_string
    else:
        string_passed = False
        final_string = None
    return string_passed, final_string

def default_function(some_string="Default string"):
    string_passed = True
    if some_string:
        final_string = "You passed the following string: %s" % some_string
    else:
        string_passed = False
        final_string = None
    return string_passed, final_string


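simple_function()            # raises TypeError: some_string is required
simple_function("Testing")
default_function()           # falls back to "Default string"
default_function("Testing")

# Argument unpacking is a more advanced topic, but worth a brief introduction:
# ** unpacks a dictionary into keyword arguments, * unpacks a list into positional arguments.
default_function(**{"some_string": "Dictionary_testing"})
default_function(*["some_string"])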

>>> def simple_function(some_string):
...     string_passed = True
...     if some_string:
...         final_string = "You passed the following string: %s" % some_string
...     else:
...         string_passed = False
...         final_string = None
...     return string_passed, final_string
...
>>> def default_function(some_string="Default string"):
...     string_passed = True
...     if some_string:
...         final_string = "You passed the following string: %s" % some_string
...     else:
...         string_passed = False
...         final_string = None
...     return string_passed, final_string
...
>>>
>>> simple_function()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: simple_function() missing 1 required positional argument: 'some_string'
>>> simple_function("Testing")
(True, 'You passed the following string: Testing')
>>> default_function()
(True, 'You passed the following string: Default string')
>>> default_function("Testing")
(True, 'You passed the following string: Testing')
>>>
>>> # Argument unpacking is a more advanced topic, but worth a brief introduction
>>> default_function(**{"some_string": "Dictionary_testing"})
(True, 'You passed the following string: Dictionary_testing')
>>> default_function(*["some_string"])
(True, 'You passed the following string: some_string')
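Both functions return two values, which Python packs into a tuple, and the variables created inside them are local to the function. A short sketch of unpacking that tuple and of local scope, reusing default_function from above. The names passed, message, empty_result, and scope_is_local are assumptions chosen for illustration:

```python
def default_function(some_string="Default string"):
    string_passed = True
    if some_string:
        final_string = "You passed the following string: %s" % some_string
    else:
        string_passed = False
        final_string = None
    return string_passed, final_string

# The returned tuple can be unpacked into separate names.
passed, message = default_function("Testing")
print(passed, message)

# An empty string is falsy, so the else branch runs.
empty_result = default_function("")
print(empty_result)

# Names defined inside the function are local and not visible out here.
scope_is_local = False
try:
    final_string
except NameError:
    scope_is_local = True
print(scope_is_local)
```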

Group Exercises (~30 mins)

  1. Use the DNA sequence below to answer the following questions:
    seq = "CGGTAAGGCACTTGATATATTAATGTTCAGGCGGACATCGGCAGGGTATATTGCGTTAGGATCTAATTAATCTATCAGACTACTTGAGGATTGTCACCTGATGGTAATAGCGTAGTGGGGATCGCTAATCCTACTGTCGATAGACGGCTGCGGTTAAACTAAGCATCTTGCTTCCGGACGGTGGAACCGATTCAAGCGTTAGCAAATGTCAGGTTCACACTAAAGAATCAGCGGGTCTCCCCTACATCTTGAGTTTTATGGCTAACCCTATATCTGTCGATAGCATGGCAGGTTCAATTTAATCACAGTGCTTGCACTGACCTCGCCTACCGGAAGCCCCGCCGCCCAAAGTGACACCAGAGTCTTCGTCACGCAGATAACGCCCCGTGCTATCTGTCCCCCGTCCTTCGAGCAGGAGTTTGGTCTGACGCCTACTTCGGCGAACGTAACCCCTCCTTGTCTGTATTAAACTGCCCCGGATGTCGACTGGTAACAAGACGGACTACAAATAATGTGTACCTTTAGACGTTCTTTAGTGATATATTGTGACCACGTATTACAGAGATACACGACATCTCCTTTATAGGTACACATAACCTAGGATCCGTAGCACCGGGCCGGATTCTGCCCAGCAAGGTGCATCCCATAAGCGTAAACACTGCACGGCGGTCAGAACTCCCCACAATCGATACGGTGTCAACTATGGTACGGATGTCAGTCGGTCCTGTAGATGCACTTGGAAGTGTACCGCGTAGCGTCGCAGGAGCGTAAACATCACCTGGACGGTCCTTTCCAGTAATGAGGAACTCAAACATATAGGGCAGG"
    
    • What is the complement of this strand of DNA?
    • What is the reverse complement of this strand of DNA?
  2. Use the codon table and RNA sequence below to answer the following questions:
    table = { 
         'AUA':'I', 'AUC':'I', 'AUU':'I', 'AUG':'M',
         'ACA':'T', 'ACC':'T', 'ACG':'T', 'ACU':'T',
         'AAC':'N', 'AAU':'N', 'AAA':'K', 'AAG':'K',
         'AGC':'S', 'AGU':'S', 'AGA':'R', 'AGG':'R',
         'CUA':'L', 'CUC':'L', 'CUG':'L', 'CUU':'L',
         'CCA':'P', 'CCC':'P', 'CCG':'P', 'CCU':'P',
         'CAC':'H', 'CAU':'H', 'CAA':'Q', 'CAG':'Q',
         'CGA':'R', 'CGC':'R', 'CGG':'R', 'CGU':'R',
         'GUA':'V', 'GUC':'V', 'GUG':'V', 'GUU':'V',
         'GCA':'A', 'GCC':'A', 'GCG':'A', 'GCU':'A',
         'GAC':'D', 'GAU':'D', 'GAA':'E', 'GAG':'E',
         'GGA':'G', 'GGC':'G', 'GGG':'G', 'GGU':'G',
         'UCA':'S', 'UCC':'S', 'UCG':'S', 'UCU':'S',
         'UUC':'F', 'UUU':'F', 'UUA':'L', 'UUG':'L',
         'UAC':'Y', 'UAU':'Y', 'UAA':'Stop', 'UAG':'Stop',
         'UGC':'C', 'UGU':'C', 'UGA':'Stop', 'UGG':'W',
     }
    rna = "AUGAGCGCGAGUCUUGGUCUCGCAUGGCCCGUGGUCGGGACAGCUGCACAUGGGGGACCAGUUAUCCUCAGUAUAACUUUCUGCACCGAGACAAUGACUCAGUUACAGCGUAAUUUCAUUCUCCCUUCGCCAAAUGAUCCCUGGCUAAGUAACCAAAUUAAGCCCAAGCCUUGGACCGGCAGGAGUGAGCUGCUCUUUCAUAGGGGCAAAUCCAGCACGCCGCGGACGUCGAAGCCGUUGAAUCCACCACCGACGGUAUCGGGCUCCGGCAAUACACCAAUAAUUCCCUGCUUAACCCGUCCGUCUACUUCAGCUGUGACCCCAACGCGUGAUGACUGGCUGCGCCACGAGCACCCAGGACUCCGGAACCUUGAGCAUGGAAUGAUAAAAUUGGGCUCAGCCUGCUCAGUUCAAAGGGGCCCUUCAAGCUUACGCCCGAUCAGACAUUACAGCACUUCGAGGGAUCGUAUAAGAGCUUGUCUCCAUAACACGACCCGAUGGGGCCAACAGACACUACUUCGAACUGCCGGGGGUCGAGAAUUAAUUCUACUGACUCCGUCCAGACACUGGGGGUUCCAAACUCAUUGGCUUCUAUGUGCGCCAGAGUCGUGGUGUAGGAUUUUUACUCCAUCUAUGAUGCUCGCUCUAGCACCACGCUUAUCCUACGUUAAAGCGUCAACUUCUGCUCUCACCCGCUUCUAUAGAACCAAUAGGAUGACCUGUUACCUUAAUAUAUUCUGCGCAGGGGGUGAUAAGAGACCGAAGAGAUAUCAGACCCUUUAUCAGACCUUGGCGCCCGCUAGCGCGACAUGUAAUACCCCGAUCGGCCCAAUACAUUCGCCUCCCCACGUGGGCCCUCCCCGUGCCACUCACCCCCACCCCCUGUUAUUACAGCUCCGAUUCUGCUAUUUUACCAUGCUCCAUCCCUUUAGAACAUCCUGGCGAACCAGCUAUUCUGUUCCUCUGGGGACAAGUCACGGAGUUCUCCCCCCGACCCCAGCCGCGUUUGCCCGUAUUACGAUUAGGUAUCGUCGCUUGAUGCUCAUAUGCCACAGGUCCACGAAAAGUCGCCUAUUUCCAAUCAGAGACCCGAAGGCUGUACGUUGGUUUAUGACAAUGAUCUUGAGCAUCGUUUCGAGCGCAGAGACUGGAGAUUUCCCUUGUCUAGACCAGAUGACGGGCGAUACGAUACCCUGGUUCAUGAAUGCAAAACUCGUUUCGACCUUUCUAUGGAGAUCGACUGGAGAAGCGGGCGUAUUCUGUAAAACUCGCAAGACCUACUACUUGACUCAGUUGAGCAUAUAUCGCAUCAGCCGGAUUGGUCCACAGAAAGGUUGUAAACCCACAGCGCUCUUAGAAGGUAAAAGGCAUUACCUUAUCCUAUCAAGACCACAUGUUCUAUUUGCCAACCGAACACUCUGCAAAAGCCGUAAGUCGAGCGCUUUGAGGGGAUCAGGGUGCUCCCUUGUCGACUAUGCGGCGGCGACGUUCAUCGUGGACUGGAGUCUCUAUCGAGUUGUCGAGCAUGAGGAGUUUUUUGUCCCUCUCAAGUUGUACGGCGUAGAUUGGUCGGUGGAUCAACUCCGUGUCACUCGCCCCCCCACAGAAGGGAUUCGGCAAGUGUGUUAUAUGGGAAAUGUACAAGAAGAUAUGUCAGUCGUCGACCUAUCUACAAAGAACCGUAGGCGUCGUGAAUCUUCUAACACGCGGCCAAGAAACGAAGAUCUCGUGGUUGUUCACUCUCCGGGCCUAACUACAGACAGUUCCGGACCUUGUCCUCAUUGUACGGUGAUACCGUCAUCGCACACUCACCGUUUUUCCGCGCCAGGGAGACCGGGCGAUUCGCGCACAGCAUAUGGCAAGGUGAGAUGUCUGUCUCACCGCCCUCGCAUCACUGACCCUAGCGUAUUCUCACCGAAAUCACGUCGUUGCUUAAGGCUAGGUCAUUCAUCCACGCGACGGGUACCAGCAAGUCCUCGCAGCGCUGCC
GUGCCACACCAAACACGUGUGGACGUACCAAGGGGAGGCCAAAGGCAUCAGAGCGGUCGGUCCCUCGUUUACAUUUAUCAGAAUAAUCCAAACGCAAUGCUUCUUGCAGGCCAAUGGCGAUGCCCCGCGUAUGCCGCACUAUUGCAAGCCCCCUUUCCGUGGUGGAUAAGCUUGUCUUGUUGGUUACUCAAUGUAACUUCACAUCCAUUGCUUGCCGUUAUUAAUAGUACUUACCUGGGUGUCUUUUAUAUGACGCCUCGAAACCAGUCAAGUCGGGCAGGCGAGAGACUACGUAGAUAUCCAGUGGUUACUGCUGUGCUUCCCAGGUUAAUCUUGAUUGUUAUCCGAUUUUCAGACUUCUGCGCCGCAAAGUGGAAGAGUAAGUGGCUCUAUUCUGAUGGGUGUCAAUUGGCAGAAGCAAAAGAGGGCCACAGCGCCGAUCGUUCCCCAACAAGGUCGAUGUUCAAAGUGCAAAAGGCGGAUUUGGUCACUCUGGGAGAAAACGGCGCCGUAAGUCUUUGGUCAUUUGGGCCCAGUGGCAACCGGUACAUAACCACUGGCUAUCGUAUCAACGCGACUUGCACACGGGCUAGUAGUGAGAGUGCGCACCUCUCGAGGGUGGGCGGGCUUCUGAAGGGGUGUGAUAACUGUGAGCUUCCCUCAUGCUUGUACGUUCUCUUACCCUCUCUUUCUUUGUGUCAUCAAUUUCCCAGUGAAUGGCAUGAGCGCAUACAAUCCAAGUGGUUACGCACUCCCAAAACCCCGGAUGAACUAGAACCGCCGGAGCCUCAAAUUGGUGUGGGCAGCACGACGCGCCCACUAGAGAUUUGUACUUUUAUCCGGAAGCAUCGUAUAGGAACCGCAUUAUGGAUUGUGUUCCGUGAACCCUACCAGGUAGCACGGCUUAACUCGUGUUCCGCGCCCGAAGCUCUAAACGAGUUGCCACAUCCUCUCUUCAAACUUGAAAGUAUAUUAGAAGCAGCCCUGCAAACACGCUCAAGCACGGAACAGUAUCUUAUACGAAGUAGCGGAUACGGAGAUAUUCGAUCUGGCCACGCGAUUUCCAUCGGACCAGACCUGCAAAAUGUGGAAGUGGUGUCCGUAUGGUAUCUCCAACGACAGUGGCGCCACAAGAGGUCAGGCUCUCCGCUUAGACUGCAUUCUUCAUUUCUAUGGAAGAAUGGGUUGGCUUCUCCAGCGCAGUCACUUAUAACAGCCUCGUGUGAGGAAGCUAAGAGGUGUCCUCAUAAGCGGUCGCACCUAUUGGAUCUUAAGCUAUAUAAGUUGACCGCGAAGGCGGACUUGCCACCGUUACCAGCAUCGCAACCGAUUAUACCGCCUGUCGACGACGGGUACACCCAUGUCCCUUGUCGCUCUUAUCAGGGGCGUGGUCGAGGACGGAACGUAGCGGUUGGUCCCUUCCCCAGUACCGGAAACCCUACUCUGCAUCAGUCCGUUCACUCACACACUCGAUCUCACGUACUCCCAACCCAAGGAUGUUGGACGGGGUCUAAAGGCACUAUCGCUCCGGCUAUGCAGUUCAUAGGGUUGAGGCGUCAUCAUGCGUUCAGGGUGCACCAACACCCUAAAUCUUCUUACAGCCAUUGCGGUUCCUCGCUAGAGCCAGGGUACGUCGUAGGCGUGCCGAUAAUUGGGUCAGUCCUGGCCGCGGUUACUAGAGGGAAUGGAAGAGAGUAUAAAAAAGGUAAUGCCUGGUCGCGUUACCUCACGACCGAACGUUUGGCGGUGCGAAUAGCGAGGGAUUCAGGGGCUAUUCCCGUGGCGAGGGUAAGAGUUACACUUAACGGAGAGGAUCGAAUUCGGAUCCCCUUGGCCUUACGCGUGUAUGCCUUUCCGCCUGUCCCAUCAGCCAAGUUACCGUCCAAAACGGAAUCCACCUAUAGCUUUGCCAUGCGCUGUACUGACUGUUUGUGCGUAGUGAGCUUUACCGGAUCACUGUGGUGGAGGGCCCGACUAUCCAAGAGUGGCGACUAUUUCCU
GCAUUCCCCUCAGAAUGCACGGUCGGGGACCUUGACCGGUCAUUCCCGCAUUCUGUGGGUUCUUACGUCUUUACGAAAUCCCCUCUUCACGGUCGUUCCCGGCCAAUUCAACAGUAUUAAAGGACGACCGACCGAGCACCUAUCAGGGGUAAGGUCUAGACCACUCCUGGAACUCGCACGCAAGACGCCAGGCCAAGGGAUUUUCUCAUUAGGCCCUUACGGCGGGCCCCAAUAUUUGAAUGCACUGAAAUCCUCAUUAGCCAGCAGACAACAUCUCACCGUUAACGCGGGGCAGGUGGGUGAUCGAGCUCUAACCUACGGUAUAGCUGAGUCGCACUCCAAAAGUAGCAGCAACUUUCAGUCGCUCCGGCGGCACGAACGGGAUGGCUCAAAAGACAUUCUUACACCGACGAGCGCAAAUUACAUUCGGAAGCCCACGUGCUGGCUAGUCGCGGGUAAGGUGUGCUAUUUUAGACAAACAAUUCUAGUCCCCGCAGUAAUGCGCACAUCGGCUGUGCGUCCACUAAAACAGAUCUCCUAUGAUCAUUAUUUUUGGGAGUGGAAAUGCUAUUAUAUAUGGCCAACUGGCCCCUGCUCUGUCCCUGACUGGGUACGGGAUGACUCCCCAACUCUGUGGAGAAUCGCUUUUCUAGCUUGCCUACGGAUCUGUUCAAGCGUUUUUCCCGUCAAUAUACGGGGACCCGAUAUAAUACAAAUCUGCAAGGGGUUUGAAAAAGGGCACAUGCAUCAAAAUCACGCGAAGCGCUCACUGAGCGCCAAUAACCGAAACCUAAGGAUACUCUCCGGCGCUGAAAGGCGGGCGAUGAUGGAUUCCCACAGUGGCCUAGCGAGGGUAAGGGCGUUGCGUCGACAUAUGGAAGGUGCGCGAUGCGACGUAAGGGGCGUAACUCAUGGGUCACAAUAUAGCUGCGGUAUACCUCAUCAACUAAUGGAACGGCGUACCUCAGAAUACUGCCUAACGCCGAAGGAGGCCUGCCUUAGUCGGUUCAUUAGAUUUGCCUUACCGGGGAGUCAUAUUCUGCAAGUUACUCCUGAGUUAGGGCCGGUCAUCGUCCCAGAUCAGAACUCUACGAGCAGCACAGGGGCCCUCCGGCCAGCCAACCCAAAUUGGAUCAAACUCGUUGGCGUCGUAGUUUACAAACCUGGAUGUGGAAUGCUCGUUUACUUGUCCGCAAGCGGACAGCAUAUCGGACCUGACAACCGAAACGGAACCAAGAACCGAGCUCAUCGCACACGGGUGUUCUCCUACCUGAUCCUAACUAUAACACGGCACCCGUCUGGACUGUGCUUUCAGGGGGACCCUAGGUCCAUCAAUUUGGCUGCCGACUGGAAGUCGGGUUCUGUGAUCAUCUCAUUGCAGUAUAGGUUCUCGACACGCUCUCCACUGUUAUCAUCAUAUCAGUUUCUUCAAGGCUCAACUCCCGUCAGAGCCGAUGUACUCAGAACCGGCAAGAGGACCUCGGGUGGGUUGUCUCCCAUCGAUCGGCUGGUCACGCCUGGGUCCGGUUGUCCGCCCCCUAUCCGUACGGACAUCCCCGAACGAGACACCGUCGCGUCGACGUCUGGUUUGGUUAUAAACAUUCCAUGUACAGCAUGUGCGACUCGAAGGACUGAUCAUUUGCGUAGUACCAAAAAGUGGGAGUCUGGGUGUUGCCCUGCUGGAAGCCCCCGUUAUCCCCGGUCUGAUAUUUCGUUGCCACGGAAUGUAGAAGUCCUGCAGUUGUGUAUUGCCAUCAAACUCAUUACCCACAAGGCUUGUAUAUGCUAUUUUCUGGCGCCUCUGGUUAGCCUGGAUACAAUUGCUCAAUCAAGGUAUGUUCCAUCGGAGUCGAAGACUCCUGAGGUAAUGUCCCACUCUGACGAUACCCACAGUUUCUCAUCUCUAGUAUCAAACGAAAUAUCCUCUACACUAAAAGCUUCUUACAAAGCAGGUGACUUAUCAUCAGUAGCCUGCUACAUACUGCCCUUAAACCCCAGUC
UGAGUUCUGUUGAAGGCUGCACAACCUUCUGUCUAUCGGGGUUACUGCAAGGUGGGGUACACUCUGCAGGGGCACAUCAAGGUAGUAUACUAGAUCGGAUUAGUUCUUGGAGCAUUCAUUUUGCAAUGUGCAUCAAAUCCCUCUAUUGUCAGAUCCACAAGUUUAUAUUCAUGCACCCCGACGCUAAUAGGAAAAACCAAGGUUCUUUGAUCCCAGGUCAUGUUUGGGCGCGCGCUUGGGCCACAAAAUGCUGCAUCUGCUUCUGCGUCGGAGUGGACGGGACUAAUAGCAGGGAGGUGAGCUUGGGUCAAAUGCCUGCCAUGGAACCUAUUGCGUCGAGAGGAGCUACAUCAAGACGCAUAACUCUGUAUUGCAGCAAUGAAGUGAGUGACAUGACGGCGUUUUACUUCUCCGCGAGGAGUCUACCGGUAAUGAAAAACAGACCGCAACUAUGGAGGGAAUUAUGUAAGUCCCCCUCCGAGUUAUACGUAUGCCGGCUCUACGCCGGGGCUAUUCGUCGAAUUCGAAAUUUGGAAUACAACUCUUGUAAGAGGACCCGGAGCGAUUGCAAAGCGUAUGCCCGAGCGAGCAUAAUACGGUGUGCCCCGCUCCCAAUGGUGUAUUCUCACCUUCAGGCGGGGUGGAAUAAACUCUGUGGCCAUUGGAGGCGAGGAGUUUGUGCGGCGAACUGGUGCAUAUAUAGACCCCCCUUAUGCACGUGCACCAAGCCCUUGUGUACCAGGGCCGCUGAUCCAAAGAACAAGGACGAUAGCGGUCUUUCUUUUCUAGGUUACGUUCGGCCGUCCGAAGAAUCCACCUCGGCAGCAGGCAAAGUGCCUGGUACAAAGGUCCGACAUUGUCACUCAAGGCUAUCCACGGAUGAGCAGAUGACUGGCCUAACUGGCAUGCCGCUGCCCUGGUCUGAAGUACACUCCAGGAGUUUUCGUAGUCAUGUAGGGUGGCCCCAGUCAAGGGCUAUGCCGCUGCGACUCCCCCUCACUAGCGGCGAAGGUGGUCAUGUCUUAUCUACUCGCUCUGCCAGACCACAGAAUAGGCUUGGUAUUUCCCACGUCCCCGCGAUCUUACCACAUGUGUCCAAACAGGAAGUCCGUUCCAAGCAGUCAUUUGCUCGAGGGAUGGCUAAAAAAAUAGUCGGAGUUGGCGGGAAAGCUAACGCCCGGCAUUCACUCGCGUGCAUGCUGGAUUUACACCUUUGUGCGGUCUAUUGCGACAAAGCCGAUUCUUUGCUAAUCGCAAAGAAAGCCGAUGAACUAACGCAGUACAUCACAGGGCAAAGAAAGCCUGACUCCGUGAAGGCUUGUAGCGCAACAUCUAUCCUAGAUGAAAGACCGAUUAAAGCAUUACACAGCUACGGACACAGCGCGGUCCCUCGCAUACCCGCCUCCAACAUGUCCGCCGGCGCUUCUAGACACUUGGGCUCACCGUGCCGCAUUGCGUCUUCAAUUUUGAGUAGUCCAUUCUAUGCGAGAAAUUAUCUGAGGGUUGCCACUAACCCAUUCCAAGUGAGGUCCCGCUCCCUCAAAACACCAAUAUUCUCCAAUCCCCUUGAUACUAUGGGGCGACUUUGCCGGCAGAACCCACCCGUCUGCACUUCCGGGGAAGUAAUUAGUACGGAGGUUCGGAGCUCCUUCCUGGCGAUGCAUACUCUCAUUUCGCUCCAGCUCUCUCAACCGUUUCACUACGGGCAGAACAACGGUCACGAUGUAAAUGUGCCCCCAGCAUCUGAAGCACAUUUUACGCGCAGCGUAGAAUUCAGGACGUGGUGCCGCUUGCCCUGUGUAUUCAAUGAGGUAUUAAAGCAAAACUAUCAUUGUUUCGAUCAUUUACGCUCGCGCCUCUGGACUGAAGUCCGACACUGUGAGGCCAUCAACAUCCGGAUAUACUUUUAUCAUAUUAAGGGCACUCCCGUGCCGCGCUUACAUUAUCAUUCGAUUACCACCCUCUUUGAGACACAGAAAUUUGUGAGACUGCUCCCAUCC
CUUUGCGGUCAUUUUUGUGCGAGCUUCGAGUUGCACAGUUAUUGUUUACGUUGCAGGUCCACCUGGAUUGUAGGUAAGAGUCGAGAGAUGGGGUUGAUGCGAGCCGCCCAUAACACAUCUUGCGGUGGCAUGCCCAUCCGAACGCAGUAUCGAGGGAGCGUUGUACUGCCAGGCCGGUCCGACAUGUGGAUUUUACACCUCUGCACUAUGCGGAAGAAUGAAGGCACCACCGGCUUCUUCUGGGCAUACUUUAUUCUAGUCUGUGGCAUAUGCUUUUCAUCCAUAGAAGCAACCCGCACAAUGAGAAGACGAGACCGCAGAAUACGUUUAAGGGUGUUAGCACUGAAGCCUUCGGAACUGGUAACUGAUAAGCUACACUCUCAAUCACCGACUUCAGAUCUGUCUUGUAAACAGAAUCCGAUAGAUCCUACAGCCCUGGCCUCCUUCAGGACUCGCGUCACCCUCCGUAAGGCCGCUGCAAGCUGCAUUAUGGGCCCCUUGGAGCAGGGAAGUAAGGCCCAGAUUGCCAGGUCCGUGCUUAUAAAAUUUGGUGGGCCUUUCAAGCAAUUAACGAUGUGUAACAUCUCGGCCAGUAGUAGGGACCAGUGCGCCGGACCCACGUACGCCUUAGUUGGGACUCAUGGAGAGGUGCAUGGACAUGUGCUUGUCAGACACGAGCAACCCAGACAUAUGUUCGCUUUUACUUUACUGUGGUCCCAUCGAGUAUACUUAGGAAGUCCUCGUCGUUUUUACAUGCUCUGCCGGCGCCGUGCGCCGCUAUCUUUUGUAGAACGCAGAGUACUGCAAACAGCAGUUGAGUACUCACCAACCGGCGCACUACUCUCAUGGUUGGGCCGGUCACUACUCUGGCUCCCACACACCGCUAUGGGUGUAGGGAAGCGAUUUGCCACAAACAGAGCGGAGACUUUAAUUCAUUCAGGGUAUCCCGCAGACCGAGUCCAGUCUGGCAAUGGUCAUAUUAGUAUUAGCGAUUCCAAUGGGUCCCUUGCGCCUAGUGGUAGGGAUUCUAAUCUAUCAUGCAAACGGUUUUCUACGCCUCCCUCGGUUUGCUUCGCCCUAUCAAGUUAUGGAACUGCAAGUUACAAAUUGACAAUAGCUCGACCCACAGUAAUCGCCGGCACUCCUAGAUCACUUGUAGCUAAAGGUAGUUACUUCAUCCCCAGGGCGUGGGAAGUCGGAACUGCGAUGGGAUGUAGAUCAACCUACUGCGAGGACGGAAGCUUCAAGGUUCGUCAACAUAUGGGUGAGCUGGCGUUCACGCCAAUGGCGUCGAAGCUUGCCCCGCGACUCCAUCAUUUGUAUCUAGGGAUCUCGACGUAG"
    Note: the rna string above is wrapped over several lines; remove the line breaks so it sits on a single line before running it in Python.
    
    • What protein does this RNA encode?
    • OPTIONAL: What are all the proteins that can be translated from this RNA string, starting from any position?
  3. Use the two sequences below to answer the following questions:
    str1 = "GCAGTGTCCACGGGATGTAAACCCCGTTTGGGACCTGCCAGGCTTCCGCACTACAGTGTGTTGCGCGTACTAGACCTTATGCCCATATGAACCTAGCTCGGGCTACTTGGATCGAGGAACTCGATACTTCCCCTACTGCAGCCCAACAATATCGTAGAAGGCAATCGAATCGCCTAAAAATTTATCGCCGCATTTTACGTATATGCCGGCTGTGGCGTATAGCAGAAAACCGCTCCTCGCAAAGTCTGGAGTATTGGATGAACAGTACCCTGGATAGAGTTAAAGGGCCAGAAAAAGCCCAAGACCATGGCTTACCAAGTTCCGTCCTATTCAATATAACCTCAGGCAGCTATGCGGAACTTTCAATAAAATACCAAGACATGTCTCTGTCTGCTCCAGAGCGAGTGGATAGGCCCGATTATTTATGTGTCGGAGGACCATGGTTCGCAATTACACTCAAGCGAGCGATTTTTTAACTGTCCGTTCACCTCACGGACCTCGGGAGGATACAGAATCGGGTGGTAATATAACGAACATAACCGTTTGTGATCCTGGAAATACACAAGTTCATAAAAAGTGTGACCCGAAAGTGGTTTCACTATATAGGTATTCCCCCTAGTATGGCAATCTCGGTCAGTCAGCACTGGCCGATGGCTCAAGGCAACATTAGGGGTCGGCGAACGCAGTTGCTCCAAATACAAGTGTGGTTGCAAGTAGCATACGGCACACCGCGGTCTGGGTATGAGCCATACGTGTGATTTTGGTCAATTACCTAAGCAGCTTGTAGTCATAGCTATCTTACCGATTAGCGAACCCAATGAAGAGTTCCAGCATTCGCGAATGGGGGGTACAAGTAGGCTCTCGGCCGCCTACTAGACCGGGCGGTAAGGTGGCGTGTAGACCAAATCCTTA"
    str2 = "AACTAATCAAACTGGCTGGGATTCCTTCGGCTAAATACAGTCCCATCCAAGTATAGTCTGTTATTTGTGCTACGCCTCAAGCCTCTGATATGCGGTACCCTTGGATTTGGAGTGTAGGCCACCAGATCTGTTTTGATCCGGCCGGATCGTATCGAAAGATACTGAAGGAGCTCATAAACATTGAGCGTGTCAGAGTGGCCGAACGCCGGCGGTCGCTACAACAACAGGTCGTCGACTCCCTAACGCGCTGACATAGGTCTAATAGTCACGTGTATGGTTCTATCTGTGGAACAAAAACTAAAGCTCTTGGCTTAAGATGATCGTACCTGACGGATGTAAGATTCGACAACTAAACCGTTTTATCGTTAAAACATGCCGACCTGACTGTTGCTTCTACAGAGCGCGTTCAAAATCGTGCCTATCCTACAGTCGTACGCTGAGTGAACGCTCTTAAAATCTAGGAAGAGTAATCTTGCATATCCGAACCGAGCTTAGCCCTCCAATGAATAGAAGTACGCGCCTTTGTATTCCGAACATTTTCGTCTGCGATACTGGAAACATGTATGATCGTAAACACTGGGACCCGTCATTAGGTCTTTGATAGTATCGGTGCCTATATGGTGTCATTCTTATTCGGGCTAAACGAACGCTTGGCTCTGCGAATTGTAACCTGAGCTGAGGCGCACTCATAACACAACAAACGGAGCCTGGATATATCATAATGATTCTCCTGGGATGGGTACACGAGGCACCTTAGACTGCGCATTGCTCTAAGCACACCTAAGAGCCTGGTCAAGAACAGCTATTTGCGTACGCAAAGATGCGTTGATTCAATGGGGAACGCCGGGATCTTCTAGACTCGCAGCGCCCTCACTGCCGGGTGCGTAGCCTTGCGCCTGTCCATGATCACCA"
    
    • How many matches are there between the two strings?
    • How many mismatches are there between the two strings?
    • What is the GC content of the first string?
    • What is the GC content of the second string?
  4. If you have free time after these tasks, take a moment to check out Rosalind. It is a great platform for continuing to challenge your Python and bioinformatics skills!