PacktLib: Bioinformatics with R Cookbook

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

Starting Bioinformatics with R

Introduction

Getting started and installing libraries

Reading and writing data

Filtering and subsetting data

Basic statistical operations on data

Generating probability distributions

Performing statistical tests on data

Visualizing data

Working with PubMed in R

Retrieving data from BioMart

Introduction to Bioconductor

Introduction

Installing packages from Bioconductor

Handling annotation databases in R

Performing ID conversions

The KEGG annotation of genes

The GO annotation of genes

The GO enrichment of genes

The KEGG enrichment of genes

Bioconductor in the cloud

Sequence Analysis with R

Introduction

Retrieving a sequence

Reading and writing the FASTA file

Getting the detail of a sequence composition

Pairwise sequence alignment

Multiple sequence alignment

Phylogenetic analysis and tree plotting

Handling BLAST results

Pattern finding in a sequence

Protein Structure Analysis with R

Introduction

Retrieving a sequence from UniProt

Protein sequence analysis

Computing the features of a protein sequence

Handling the PDB file

Working with the InterPro domain annotation

Understanding the Ramachandran plot

Searching for similar proteins

Working with the secondary structure features of proteins

Visualizing the protein structures

Analyzing Microarray Data with R

Introduction

Reading CEL files

Building the ExpressionSet object

Handling the AffyBatch object

Checking the quality of data

Generating artificial expression data

Data normalization

Overcoming batch effects in expression data

An exploratory analysis of data with PCA

Finding the differentially expressed genes

Working with the data of multiple classes

Handling time series data

Fold changes in microarray data

The functional enrichment of data

Clustering microarray data

Getting a co-expression network from microarray data

More visualizations for gene expression data

Analyzing GWAS Data

Introduction

The SNP association analysis

Running association scans for SNPs

The whole genome SNP association analysis

Importing PLINK GWAS data

Data handling with the GWASTools package

Manipulating other GWAS data formats

The SNP annotation and enrichment

Testing data for the Hardy-Weinberg equilibrium

Association tests with CNV data

Visualizations in GWAS studies

Analyzing Mass Spectrometry Data

Introduction

Reading the MS data of the mzXML/mzML format

Reading the MS data of the Bruker format

Converting the MS data in the mzXML format to MALDIquant

Extracting data elements from the MS data object

Preprocessing MS data

Peak detection in MS data

Peak alignment with MS data

Peptide identification in MS data

Performing protein quantification analysis

Performing multiple groups' analysis in MS data

Useful visualizations for MS data analysis

Analyzing NGS Data

Introduction

Querying the SRA database

Downloading data from the SRA database

Reading FASTQ files in R

Reading alignment data

Preprocessing the raw NGS data

Analyzing RNAseq data with the edgeR package

The differential analysis of NGS data using limma

Enriching RNAseq data with GO terms

The KEGG enrichment of sequence data

Analyzing methylation data

Analyzing ChipSeq data

Visualizations for NGS data

Machine Learning in Bioinformatics

Introduction

Data clustering in R using k-means and hierarchical clustering

Visualizing clusters

Supervised learning for classification

Probabilistic learning in R with Naïve Bayes

Bootstrapping in machine learning

Cross-validation for classifiers

Measuring the performance of classifiers

Visualizing an ROC curve in R

Biomarker identification using array data

Useful Operators and Functions in R

Useful R Packages

Index