HDInsight Essentials

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

Hadoop and HDInsight in a Heartbeat

Big Data – hype or real?

Apache Hadoop concepts

Summary

Deploying HDInsight on Premise

HDInsight and Hadoop relationship

Deployment options for on-premise

Single-node install

Multinode planning and preparation

Multinode installation

Managing HDInsight services

Uninstalling HDInsight

Summary

HDInsight Azure Cloud Service

HDInsight Service on Azure

Provision your cluster

HDInsight management dashboard

Verify the cluster and run sample jobs

Monitor your cluster

Azure storage integration

Remove your cluster

Summary

Administering Your HDInsight Cluster

Cluster status

Distributed filesystem health

MapReduce health

Key files

Summary

Ingesting Data to Your Cluster

Loading data using Hadoop commands

Loading data using Azure Storage Vault (ASV)

Loading data using interactive JavaScript

Shipping data to Azure

Loading data using Sqoop

Summary

Transforming Data in Cluster

Transformation scenario

MapReduce solution

Hive solution

Pig solution

Summary

Analyzing and Reporting Your Data

Analyzing and reporting using Excel

Hive for ad hoc queries

Interactive JavaScript for analysis and reporting

Other business intelligence tools

Summary

Project Planning Tips and Resources

Architectural considerations

Project planning

Summary

Index