Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010301271 | QA76.9.D5 H37 2013 | Open Access Book | Book |
Summary
Publisher's Note: Products purchased from Third Party sellers are not guaranteed by the publisher for quality, authenticity, or access to any online entitlements included with the product.
Boost your Big Data IQ! Gain insight into how to govern and consume IBM's unique in-motion and at-rest Big Data analytic capabilities
Big Data represents a new era of computing--an inflection point of opportunity where data in any format may be explored and utilized for breakthrough insights--whether that data is in-place, in-motion, or at-rest. IBM is uniquely positioned to help clients navigate this transformation. This book reveals how IBM is infusing open source Big Data technologies with IBM innovation that manifests in a platform capable of "changing the game."
The four defining characteristics of Big Data--volume, variety, velocity, and veracity--are discussed. You'll understand how IBM is fully committed to Hadoop and integrating it into the enterprise. Hear about how organizations are taking inventories of their existing Big Data assets, with search capabilities that help organizations discover what they could already know, and extend their reach into new data territories for unprecedented model accuracy and discovery.
In this book you will also learn not just about the technologies that make up the IBM Big Data platform, but when to leverage its purpose-built engines for analytics on data in-motion and data at-rest. And you'll gain an understanding of how and when to govern Big Data, and how IBM's industry-leading InfoSphere integration and governance portfolio helps you understand, govern, and effectively utilize Big Data. Industry use cases are also included in this practical guide.
Author Notes
Paul C. Zikopoulos, B.A., M.B.A., is the Director of Technical Professionals for IBM Software Group's Information Management division and additionally leads the World-Wide Competitive Database and Big Data Technical Sales Acceleration teams.
Dirk deRoos, B.Sc., B.A., is IBM's World-Wide Technical Sales Leader for IBM InfoSphere BigInsights. He spent the past two years helping customers with BigInsights and Apache Hadoop, identifying architecture fit, and advising early stage projects in dozens of customer engagements.
Krishnan Parasuraman, B.Sc., M.Sc., is part of IBM's Big Data industry solutions team and serves as the CTO for Digital Media. In his role, Krishnan works very closely with customers in an advisory capacity, driving Big Data solution architectures and best practices for the management of Internet-scale analytics.
Thomas Deutsch, B.A., M.B.A., is a Program Director for IBM's Big Data team. He played a formative role in the transition of Hadoop-based technology from IBM Research to IBM Software Group and continues to be involved with IBM Research around Big Data.
David Corrigan, B.A., M.B.A., is currently the Director of Product Marketing for IBM's InfoSphere portfolio, which is focused on managing trusted information. His primary focus is driving the messaging and strategy for the InfoSphere portfolio of information integration, data quality, master data management (MDM), data lifecycle management, and data privacy and security.
James Giles, BSEE, B.Math, MSEE, Ph.D., is an IBM Distinguished Engineer and currently a Senior Development Manager for the IBM InfoSphere BigInsights and IBM InfoSphere Streams Big Data products.
Table of Contents
Foreword | p. xvii |
Preface | p. xxi |
Acknowledgments | p. xxv |
About This Book | p. xxvii |
Part I The Big Deal About Big Data | |
1 What Is Big Data? | p. 3 |
Why Is Big Data Important? | p. 3 |
Now, the "What Is Big Data?" Part | p. 4 |
Brought to You by the Letter V: How We Define Big Data | p. 9 |
What About My Data Warehouse in a Big Data World? | p. 15 |
Wrapping It Up | p. 19 |
2 Applying Big Data to Business Problems: A Sampling of Use Cases | p. 21 |
When to Consider a Big Data Solution | p. 21 |
Before We Start: Big Data, Jigsaw Puzzles, and Insight | p. 24 |
Big Data Use Cases: Patterns for Big Data Deployment | p. 26 |
You Spent the Money to Instrument It--Now Exploit It! | p. 26 |
IT for IT: Data Center, Machine Data, and Log Analytics | p. 28 |
What, Why, and Who? Social Media: Analytics | p. 30 |
Understanding Customer Sentiment | p. 31 |
Social Media Techniques Make the World Your Oyster | p. 33 |
Customer State: Or, Don't Try to Upsell Me When I Am Mad | p. 34 |
Fraud Detection: "Who Buys an Engagement Ring at 4 a.m.?" | p. 36 |
Liquidity and Risk: Moving from Aggregate to Individual | p. 38 |
Wrapping It Up | p. 39 |
3 Boost Your Big Data IQ: The IBM Big Data Platform | p. 41 |
The New Era of Analytics | p. 41 |
Key Considerations for the Analytic Enterprise | p. 43 |
The Big Data Platform Manifesto | p. 45 |
IBM's Strategy for Big Data and Analytics | p. 49 |
1 Sustained Investments in Research and Acquisitions | p. 49 |
2 Strong Commitment to Open Source Efforts and a Fostering of Ecosystem Development | p. 50 |
3 Support Multiple Entry Points to Big Data | p. 52 |
A Flexible, Platform-Based Approach to Big Data | p. 56 |
Wrapping It Up | p. 59 |
Part II Analytics for Big Data at Rest | |
4 A Big Data Platform for High-Performance Deep Analytics: IBM PureData Systems | p. 63 |
Netezza's Design Principles | p. 66 |
Appliance Simplicity: Minimize the Human Effort | p. 66 |
Hardware Acceleration: Process Analytics Close to the Data Store | p. 67 |
Balanced, Massively Parallel Architecture: Deliver Linear Scalability | p. 67 |
Modular Design: Support Flexible Configurations and Extreme Scalability | p. 67 |
What's in the Box? The Netezza Appliance Architecture Overview | p. 68 |
A Look Inside the Netezza Appliance | p. 69 |
The Secret Sauce: FPGA-Assisted Analytics | p. 72 |
Query Orchestration in Netezza | p. 73 |
Platform for Advanced Analytics | p. 77 |
Extending the Netezza Analytics Platform with Hadoop | p. 79 |
Customers' Success Stories: The Netezza Experience | p. 81 |
T-Mobile: Delivering Extreme Performance with Simplicity at the Petabyte Scale | p. 82 |
State University of New York: Using Analytics to Help Find a Cure for Multiple Sclerosis | p. 83 |
NYSE Euronext: Reducing Data Latency and Enabling Rapid Ad-Hoc Searches | p. 84 |
5 IBM's Enterprise Hadoop: InfoSphere BigInsights | p. 85 |
What the Hadoop! | p. 87 |
Where Elephants Come From: The History of Hadoop | p. 88 |
Components of Hadoop and Related Projects | p. 89 |
Hadoop 2.0 | p. 89 |
What's in the Box: The Components of InfoSphere BigInsights | p. 90 |
Hadoop Components Included in InfoSphere BigInsights 2.0 | p. 91 |
The BigInsights Web Console | p. 92 |
The BigInsights Development Tools | p. 93 |
BigInsights Editions: Basic and Advanced | p. 94 |
Deploying BigInsights | p. 94 |
Ease of Use: A Simple Installation Process | p. 94 |
A Low-Cost Way to Get Started: Running BigInsights on the Cloud | p. 95 |
Higher-Class Hardware: IBM PowerLinux Solution for Big Data | p. 96 |
Cloudera Support | p. 96 |
Analytics: Exploration, Development, and Deployment | p. 97 |
Advanced Text Analytics Toolkit | p. 98 |
Machine Learning for the Masses: Deep Statistical Analysis on BigInsights | p. 99 |
Analytic Accelerators: Finding Needles in Haystacks of Needles? | p. 99 |
Apps for the Masses: Easy Deployment and Execution of Custom Applications | p. 100 |
Data Discovery and Visualization: BigSheets | p. 100 |
The BigInsights Development Environment | p. 103 |
The BigInsights Application Lifecycle | p. 105 |
Data Integration | p. 106 |
The Analytics-Based IBM PureData Systems and DB2 | p. 107 |
JDBC Module | p. 108 |
InfoSphere Streams for Data in Motion | p. 109 |
InfoSphere DataStage | p. 109 |
Operational Excellence | p. 110 |
Securing the Cluster | p. 110 |
Monitoring All Aspects of Your Cluster | p. 112 |
Compression | p. 113 |
Improved Workload Scheduling: Intelligent Scheduler | p. 117 |
Adaptive MapReduce | p. 118 |
A Flexible File System for Hadoop: GPFS-FPO | p. 120 |
Wrapping It Up | p. 122 |
Part III Analytics for Big Data in Motion | |
6 Real-Time Analytical Processing with InfoSphere Streams | p. 127 |
The Basics: InfoSphere Streams | p. 128 |
How InfoSphere Streams Works | p. 132 |
What's a Lowercase "stream"? | p. 132 |
Programming Streams Made Easy | p. 135 |
The Streams Processing Language | p. 145 |
Source and Sink Adapters | p. 147 |
Operators | p. 149 |
Streams Toolkits | p. 152 |
Enterprise Class | p. 155 |
High Availability | p. 155 |
Integration Is the Apex of Enterprise Class Analysis | p. 157 |
Industry Use Cases for InfoSphere Streams | p. 158 |
Telecommunications | p. 158 |
Enforcement, Defense, Surveillance, and Cyber Security | p. 159 |
Financial Services Sector | p. 160 |
Health and Life Sciences | p. 160 |
And the Rest We Can't Fit in This Book | p. 161 |
Wrapping It Up | p. 162 |
Part IV Unlocking Big Data | |
7 If Data Is the New Oil--You Need Data Exploration and Discovery | p. 165 |
Indexing Data from Multiple Sources with InfoSphere Data Explorer | p. 167 |
Connector Framework | p. 167 |
The Data Explorer Processing Layer | p. 169 |
User Management Layer | p. 173 |
Beefing Up InfoSphere BigInsights | p. 174 |
An App with a View: Creating Information Dashboards with InfoSphere Data Explorer Application Builder | p. 175 |
Wrapping It Up: Data Explorer Unlocks Big Data | p. 177 |
Part V Big Data Analytic Accelerators | |
8 Differentiate Yourself with Text Analytics | p. 181 |
What Is Text Analysis? | p. 183 |
The Annotated Query Language to the Rescue! | p. 184 |
Productivity Tools That Make All the Difference | p. 188 |
Wrapping It Up | p. 190 |
9 The IBM Big Data Analytic Accelerators | p. 191 |
The IBM Accelerator for Machine Data Analytics | p. 192 |
Ingesting Machine Data | p. 193 |
Extract | p. 194 |
Index | p. 196 |
Transform | p. 196 |
Statistical Modeling | p. 197 |
Visualization | p. 197 |
Faceted Search | p. 198 |
The IBM Accelerator for Social Data Analytics | p. 198 |
Feedback Extractors: What Are People Saying? | p. 200 |
Profile Extractors: Who Are These People? | p. 200 |
Workflow: Pulling It All Together | p. 201 |
The IBM Accelerator for Telecommunications Event Data Analytics | p. 203 |
Call Detail Record Enrichment | p. 205 |
Network Quality Monitoring | p. 207 |
Customer Experience Indicators | p. 207 |
Wrapping It Up: Accelerating Your Productivity | p. 208 |
Part VI Integration and Governance in a Big Data World | |
10 To Govern or Not to Govern: Governance in a Big Data World | p. 211 |
Why Should Big Data be Governed? | p. 212 |
Competing on Information and Analytics | p. 214 |
The Definition of Information Integration and Governance | p. 216 |
An Information Governance Process | p. 217 |
The IBM Information Integration and Governance Technology Platform | p. 220 |
IBM InfoSphere Business Information Exchange | p. 221 |
IBM InfoSphere Information Server | p. 224 |
Data Quality | p. 228 |
Master Data Management | p. 229 |
Data Lifecycle Management | p. 230 |
Privacy and Security | p. 232 |
Wrapping It Up: Trust Is About Turning Big Data into Trusted Information | p. 234 |
11 Integrating Big Data in the Enterprise | p. 235 |
Analytic Application Integration | p. 236 |
IBM Cognos Software | p. 236 |
IBM Content Analytics with Enterprise Search | p. 237 |
SPSS | p. 237 |
SAS | p. 238 |
Unica | p. 238 |
Q1 Labs: Security Solutions | p. 238 |
IBM i2 Intelligence Analysis Platform | p. 239 |
Platform Symphony MapReduce | p. 239 |
Component Integration Within the IBM Big Data Platform | p. 240 |
InfoSphere BigInsights | p. 240 |
InfoSphere Streams | p. 241 |
Data Warehouse Solutions | p. 241 |
The Advanced Text Analytics Toolkit | p. 241 |
InfoSphere Data Explorer | p. 242 |
InfoSphere Information Server | p. 242 |
InfoSphere Master Data Management | p. 243 |
InfoSphere Guardium | p. 243 |
InfoSphere Optim | p. 244 |
WebSphere Front Office | p. 244 |
WebSphere Decision Server: iLog Rule | p. 245 |
Rational | p. 245 |
Data Repository-Level Integration | p. 245 |
Enterprise Platform Plug-ins | p. 246 |
Development Tooling | p. 246 |
Analytics | p. 246 |
Visualization | p. 246 |
Wrapping It Up | p. 247 |