
Data Wrangling And Visualization

Biologists working in the laboratory perform many essential biochemical tasks to prepare and run molecular analyses on their specimens and samples. Similarly, biologists working on data they collected, or on aggregated data collected by many researchers, must perform essential tasks such as cleaning, reshaping, and transforming their data so that they can explore and visualize it. The interdisciplinary field of data science integrates tools from scientific computing, data visualization, communication, and other fields to help biologists and other knowledge workers perform these tasks and extract insights from data. This three-credit course provides a brief introduction to data science for biologists and to wrangling, transforming, exploring, and visualizing data with scripting languages such as R, Python, and Julia. Students will have the opportunity to wrangle, explore, and visualize datasets from a variety of fields in biology as well as datasets of their own choosing. Alongside the tools of data science, the course will also introduce the tools required to document, maintain, share, and replicate data analyses and visualizations. More broadly, these tools help constitute the paradigm of "literate programming" and aid in the production of "reproducible research," in which replicable, publication-quality research products are generated directly from underlying source files in one integrated workflow.

Prefix: BIO
Course Number: 540
Credits: 3.0