BioInfo 4U

Introduction to Bioinformatics -- Lab Syllabus

Spring 2009 Laboratory Section:
Wednesdays from 2:30 to 5:00 PM in Dirac 152.

Course lectures include demonstrations of biocomputing techniques. In our experience, however, students learn more readily when they work with real data in real biocomputing software. In the labs, students apply theory learned in lecture to experimental settings, yielding a deeper understanding of evolution, form, and function.

Steve Thompson is available to assist students in using their own laboratory and office computers or the SC and Biology Computing Lab computers for server access, and to help with their term projects throughout the semester.

The order of the labs roughly follows the order of lectures in the course. Exceptions are required to maintain the project-like progress of the labs, with each tutorial building on the previous.

Lab Reports are to be completed online using the provided form each week, and are due anytime before the subsequent week's lab session.

Lab 1, Wed. Jan. 7, 2009:
An introduction to the computing platforms on which the course is taught (pdf) (Lab Report #1).
This includes background information on computers in general, all forms of remote computing, text editing, basics of the UNIX operating system, and the X environment, as well as requesting your new FSU HPC account.
Lab 2, Wed. Jan. 14, 2009:
Molecular databases and how they are organized and accessed (pdf) (supplemental lecture pdf) (Lab Report #2).
Internet sequence and structural databases will be reviewed, along with a brief introduction to the Wisconsin Package (aka Genetics Computer Group or GCG), its graphical user interface (GUI) SeqLab, and the on-site GCG sequence databases. Access methods, both those available on the WWW, including NCBI's Entrez, and those available locally, such as GCG's LookUp, will be emphasized, but data entry and format conversion are also covered.
Lab 3, Wed. Jan. 21, 2009:
Unknown DNA -- rational probe design and analysis -- the "guessmer" (pdf) (Lab Report #3).
How to design and analyze oligonucleotide primers for discovering a gene in an organism where it has not yet been identified, when the sequence of the gene's encoded protein is known from other organisms. Techniques used include basic multiple sequence alignment, consensus creation, back translation, and primer discovery and evaluation.
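The back-translation step can be illustrated with a minimal sketch. It is not the GCG procedure used in the lab; the function name `back_translate` and the use of IUPAC ambiguity codes to represent the degenerate oligo are assumptions made for illustration. For amino acids whose codons differ at more than one position (Leu, Arg, Ser), the ambiguity-coded oligo also covers some codons that do not encode that residue, which is one reason short, low-degeneracy regions are preferred.

```python
from math import prod

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(CODONS, AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)

# IUPAC ambiguity codes, keyed by the sorted set of bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def back_translate(peptide):
    """Return (degenerate oligo, pool size) for a peptide (hypothetical helper).

    The pool size is the number of distinct sequences the ambiguity-coded
    oligo represents, i.e. the product of the base choices at each position.
    """
    oligo = []
    choices = []
    for aa in peptide.upper():
        codons = CODONS_FOR[aa]  # raises KeyError for unknown residues
        for pos in range(3):
            bases = "".join(sorted({c[pos] for c in codons}))
            oligo.append(IUPAC[bases])
            choices.append(len(bases))
    return "".join(oligo), prod(choices)


if __name__ == "__main__":
    for pep in ("MDW", "L"):
        print(pep, back_translate(pep))
```

Peptide stretches rich in Met and Trp, which each have a single codon, give the least degenerate probes.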
Lab 4, Wed. Jan. 28, 2009:
DNA fragment contig assembly (GCG's SeqMerge) and restriction enzyme mapping (pdf) (Lab Report #4).
How to get sequencing fragment data from an automated sequencer into the computer and assembled into a contiguous sequence (contig) using GCG's SeqMerge, and then how to perform restriction enzyme mapping and compositional analysis on that contig for subcloning and other purposes.
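As a rough sketch of what restriction mapping computes, the following locates recognition sites in a contig and derives the fragment lengths of a linear digest. It is not GCG's mapping program; the function names and the three-enzyme table are illustrative assumptions (the recognition sequences and cut positions listed are the well-known ones for EcoRI, BamHI, and HindIII). Because these sites are palindromic, scanning one strand is sufficient.

```python
import re

# Recognition site and top-strand cut offset (bases to the left of the cut
# within the site) for a few common enzymes, e.g. EcoRI cuts G^AATTC.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def find_cuts(sequence, enzymes=ENZYMES):
    """Return {enzyme: [cut positions]} for a linear sequence.

    A cut position is the number of bases to the left of the cut
    on the top strand.
    """
    seq = sequence.upper()
    cuts = {}
    for name, (site, offset) in enzymes.items():
        # A lookahead pattern also reports overlapping occurrences.
        cuts[name] = [m.start() + offset
                      for m in re.finditer(f"(?={site})", seq)]
    return cuts


def fragment_lengths(seq_length, cut_positions):
    """Return fragment lengths, left to right, for a linear molecule."""
    bounds = [0] + sorted(set(cut_positions)) + [seq_length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    contig = "AAGAATTCTTGGATCCAA"  # made-up example sequence
    cuts = find_cuts(contig)
    print(cuts)
    all_cuts = [p for positions in cuts.values() for p in positions]
    print(fragment_lengths(len(contig), all_cuts))
```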
Lab 5, Wed. Feb. 4, 2009:
Database similarity searching and the dynamic programming algorithm (pdf) (supplemental lecture pdf) (Lab Report #5).
What's available, the methods and algorithms, their limitations, and the significance of their finds. When dealing with coding sequences, you should never search DNA against DNA; compare at the protein level instead, using six-frame 'blind' translation. Searching methodology -- motifs, substitution matrices, hashing and heuristics, homology versus similarity, dot matrix analysis, pair-wise comparisons, and significance testing.
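Six-frame 'blind' translation can be sketched in a few lines. This is only an illustration of the idea, not the implementation inside any search program; the function names are hypothetical, the standard genetic code is assumed, and codons containing characters other than A, C, G, or T are rendered as "X".

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Return translations keyed by frame label: +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in sorted(six_frame("ATGGCCTAA").items()):
        print(label, protein)
```

Each of the six conceptual proteins can then be compared against a protein database, which is what makes the search 'blind' to the true reading frame.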
Lab 6, Wed. Feb. 11, 2009:
Gene finding strategies: how are coding sequences recognized in genomic DNA? (pdf) (supplemental lecture pdf) (Lab Report #6).
Searching by signal (transcriptional/translational regulatory sites and exon/intron splice sites) versus searching by content ('nonrandomness' and codon usage), plus homology inference. Understanding the concepts and limitations of the methods and differentiating between the approaches.
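The simplest signal-based approach, scanning for a start codon followed by an in-frame stop codon, can be sketched as below. This is a deliberately crude illustration with a hypothetical function name; it scans only the forward strand, ignores introns and splice sites, and is no substitute for the gene finding programs used in the lab.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) spans of ATG...stop ORFs, forward strand.

    Spans are 0-based and half-open, and include the stop codon.
    min_codons is the minimum number of codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # resume looking for the next ATG
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

Content-based methods would go further and ask whether the codon usage inside each candidate span looks like that of known genes in the organism.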
Lab 7, Wed. Feb. 18, 2009:
Multiple sequence alignment, expectation maximization, profiles, and Markov models (pdf) (supplemental lecture pdf) (Lab Report #7).
Lab covers: 1) using MEME to discover hidden motifs; 2) running the progressive, pairwise alignment program ClustalW with the SeqLab editor to develop a multiple sequence alignment, and refining that alignment with SeaView and MAFFT; 3) understanding traditional Gribskov profiles and using HMM profiles for remote similarity searching and further alignment; 4) visualization and annotation techniques for multiple sequence alignments.
Lab 8, Wed. Feb. 25, 2009:
Molecular evolutionary phylogenetic inference (pdf) (Lab Report #8).
How to use PAUP* (Phylogenetic Analysis Using Parsimony [and Other Methods]), PHYLIP (PHYLogeny Inference Package), and other tools to infer and draw phylogenetic trees from multiple sequence alignment datasets. Emphasis is placed on the reliability, congruence, and accuracy of model-based approaches, especially Maximum Likelihood methods, though time limits restrict the lab to quicker methods.
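To give a feel for what 'model-based' means, here is a minimal sketch of the Jukes-Cantor correction, the simplest substitution model, applied to the observed proportion of differing sites between two aligned DNA sequences. The function names are hypothetical, and this is not how PAUP* or PHYLIP are invoked; it only shows the arithmetic d = -(3/4) ln(1 - 4p/3).

```python
from math import log


def p_distance(a, b):
    """Proportion of differing sites between two aligned DNA sequences.

    Columns where either sequence has a gap or ambiguity code are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a, b):
    """Jukes-Cantor corrected distance in substitutions per site."""
    p = p_distance(a, b)
    if p >= 0.75:
        return float("inf")  # saturation: the correction is undefined
    return -0.75 * log(1 - 4 * p / 3)


if __name__ == "__main__":
    s1, s2 = "ACGTACGT", "ACGTACGA"  # made-up aligned sequences
    print(p_distance(s1, s2), jukes_cantor(s1, s2))
```

The corrected distance is always at least as large as the observed proportion, since it accounts for multiple substitutions at the same site.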
Lab 9, Wed. Mar. 4, 2009:
Estimating protein secondary structure and physical attributes (pdf) (Lab Report #9).
The various methods, their usefulness, and their limitations are all covered. This includes proteolytic digestion mapping, molecular weight and amino acid composition determination, isoelectric point estimation, hydrophobicity and hydrophobic moment determinations, surface probability and antigenicity mapping, and secondary structure prediction, particularly using methods based on homology inference (e.g. PredictProtein).
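As one example of these physical attribute calculations, a sliding-window hydropathy profile can be sketched as follows. The Kyte-Doolittle scale is used here as an assumption, since the syllabus does not name a scale, and the function name and default window size are illustrative.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of each full window along the protein."""
    values = [KD[aa] for aa in protein.upper()]  # KeyError on unknown residues
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the protein length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Made-up peptide: a hydrophobic stretch flanked by charged residues.
    profile = hydropathy_profile("KKDEILVLIVALLKKDE", window=5)
    print([round(v, 2) for v in profile])
```

Sustained positive stretches in such a profile are the classic hint of a membrane-spanning or buried segment.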
Spring Break! Wed. Mar. 11, 2009
Lab 10, Wed. Mar. 18, 2009:
Molecular modelling and visualization (pdf) (Lab Report #10).
Homology modelling combines sequence analysis and molecular modelling to predict three-dimensional structure. Students pick a homologue of their chosen protein whose structure has not yet been solved and use the SwissModel WWW resource to model the molecule. The theoretical structure is then visualized with RasMol and Swiss PDB View to gain insight into how its structure relates to its function. Color coding of physical attributes such as residue charge, hydrophobicity, and secondary structure elements; different representation models, such as alpha-carbon traces; and superposition of the model on an actual structure all assist in the interpretation.

After students have had their introduction to basic UNIX concepts, utility operations, editing procedures, and molecular databases within the first couple weeks, they decide on a protein of current interest from a list of molecules for which complete structural coordinates and the corresponding genomic sequences are known. They then perform all of the laboratory computer exercises upon that particular molecule. This way they are able to gain experience in all aspects of biocomputing in the course in a project-oriented fashion using the same natural progression as would be used in an actual experimental setting.

Predictions derived from sequence analysis will no doubt conflict with aspects of the known structural data, but elements of truth will also be found. In this way the strengths and weaknesses of each approach can be better understood, along with a greater appreciation of the tremendous problems encountered in the all-too-common case of a newly sequenced gene product with no structural information available. With this approach to computerized molecular biology, students will "come full circle," gaining appreciation for the full biocomputing spectrum available.

This structured tutorial sequence lasts for the first two thirds of the semester (ten weeks). Once the tutorial portion of the course is complete, students devote the scheduled lab sessions to working on their individual research projects. Students should begin a dialogue with their instructors about their project topic early in the semester, and are required to submit a one-page project proposal no later than March 4. Students are encouraged to choose term projects related to their academic research; this helps to ensure excellence by giving them a vested interest.

© 2013 Steven M. Thompson, acknowledgements and thanks to the Florida State University Biology Department for generously extending Web hosting and e-mail services beyond my FSU tenure.