Large Datasets in Python


This lesson has two parts. The first covers accessing and storing data from personal genomic DNA sequencing results using the pandas.DataFrame structure. The second covers finding shared motifs in single-cell PCR sequencing results.

To follow along, visit the IPython notebook.

The generic_gdna.txt file contains sample output from a personal gDNA chip sequencing run.
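
As a rough sketch of the first part, the snippet below loads gDNA-style records into a pandas.DataFrame and filters them. The layout of generic_gdna.txt is not described here, so the column names (rsid, chromosome, position, genotype), the tab separator, the "#" comment lines, and the sample rows are all assumptions made for illustration; adjust them to match the real file. The helper name load_gdna is hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for generic_gdna.txt. The real file's layout may
# differ; these rows are invented purely to make the sketch runnable.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\tAG\n"
    "rs0000003\tX\t3000\tCC\n"
)

# Assumed column names; not taken from the actual file.
COLUMNS = ["rsid", "chromosome", "position", "genotype"]


def load_gdna(source):
    """Read tab-separated records into a DataFrame, skipping '#' lines."""
    return pd.read_csv(
        source,
        sep="\t",
        comment="#",      # drop comment/header lines starting with '#'
        header=None,
        names=COLUMNS,
        # Keep chromosome as text so labels such as 'X' and '1' coexist.
        dtype={"rsid": str, "chromosome": str, "genotype": str},
    )


gdna = load_gdna(io.StringIO(SAMPLE))

# Boolean indexing selects the rows for a single chromosome.
chr1 = gdna[gdna["chromosome"] == "1"]
print(chr1)
```

To try this on the real data, pass the file path instead of the in-memory sample, for example load_gdna("generic_gdna.txt"), once the assumed layout has been checked against the file.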

The generic_tcr.txt file is a tab-separated plain text file containing 10 sample single-cell PCR sequencing results of the T cell receptor.
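
One simple way to look for shared motifs, sketched here under stated assumptions, is to collect every substring of length k (a k-mer) from each sequence and keep the ones present in all of them. This is not necessarily the approach taken in the notebook. The two-column layout (cell_id, sequence), the absence of a header row, and the sample sequences are invented for illustration, and the helper names kmers and shared_motifs are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for generic_tcr.txt: one cell per row, tab-separated.
# The column layout and the sequences are assumptions, not real data.
SAMPLE = (
    "cell_1\tTGTGCCAGCAGT\n"
    "cell_2\tAATGTGCCAGCA\n"
    "cell_3\tGGTGTGCCAGTT\n"
)


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_motifs(sequences, k):
    """Return the k-mers that occur in every sequence."""
    sets = [kmers(s, k) for s in sequences]
    return set.intersection(*sets) if sets else set()


tcr = pd.read_csv(
    io.StringIO(SAMPLE),
    sep="\t",
    header=None,
    names=["cell_id", "sequence"],  # assumed column names
)

motifs = shared_motifs(tcr["sequence"], k=6)
print(sorted(motifs))  # ['GTGCCA', 'TGCCAG', 'TGTGCC']
```

Exact k-mer intersection only finds motifs that match character for character at a fixed length; tolerating mismatches or variable lengths would need a different technique.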