We intend to identify differentially regulated genes using The Cancer Genome Atlas (TCGA) dataset, which contains gene expression data and associated metadata for a set of tumor samples. We will perform an exploratory data analysis to mine interesting patterns in the dataset by integrating gene and RNA-seq metadata. We then plan to summarise the results using effective visualization methods. For context, we will present an overview of breast cancer in the U.S. from 1999 to 2015 using data from the United States Cancer Statistics (USCS/CDC).