
The Ultimate Data Infrastructure Architect Bundle

Add to Cart - $36
$684
94% off
Courses
10
Enrolled
102

What's Included

Product Details

Access
Lifetime
Content
3 hours
Lessons
35

Learning ElasticSearch 5.0

Store, Search & Analyze Your Data with Ease Using ElasticSearch 5.0

By Packt Publishing | in Online Courses

Learn how to use ElasticSearch in combination with the rest of the Elastic Stack to ship, parse, store, and analyze logs! You'll start by getting an understanding of what ElasticSearch is, what it's used for, and why it's important before being introduced to the new features of ElasticSearch 5.0.

  • Access 35 lectures & 3 hours of content 24/7
  • Go through each of the fundamental concepts of ElasticSearch such as queries, indices, & aggregation
  • Add more power to your searches using filters, ranges, & more
  • See how ElasticSearch can be used w/ other components like LogStash, Kibana, & Beats
  • Build, test, & run your first LogStash pipeline to analyze Apache web logs
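The query DSL concepts listed above (term queries, ranges, filters) combine into a single JSON request body. Here is a minimal sketch in Python, using a hypothetical `logs` index with `status` and `bytes` fields (names chosen for illustration, not taken from the course); in practice you would POST this body to an endpoint such as `http://localhost:9200/logs/_search`:

```python
import json

# A bool query combining a term match with a range filter,
# in the style of the "Power Your Searches with DSL" lessons.
# Index and field names here are hypothetical.
query = {
    "query": {
        "bool": {
            "must": [{"term": {"status": 404}}],
            "filter": [{"range": {"bytes": {"gte": 1024}}}],
        }
    },
    "size": 10,
}

body = json.dumps(query, indent=2)
print(body)
```

The `filter` clause is scored-free and cacheable, which is why ranges usually go there rather than in `must`.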
Ethan Anthony is a San Francisco-based Data Scientist who specializes in distributed, data-centric technologies. He is also the Founder of XResults, where the vision is to harness the power of data to innovate and deliver intuitive customer-facing solutions, largely to non-technical professionals. Ethan has over 10 combined years of experience in cloud-based technologies such as Amazon Web Services and OpenStack, as well as the data-centric technologies of Hadoop, Mahout, Spark, and ElasticSearch. He began using ElasticSearch in 2011 and has since delivered solutions based on the Elastic Stack to a broad range of clientele. Ethan has also consulted worldwide, speaks fluent Mandarin Chinese, and is insanely curious about human cognition as it relates to cognitive dissonance.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Getting Started with ElasticSearch
    • The Course Overview (3:08)
    • What Is ElasticSearch? (4:01)
    • Installing ElasticSearch (6:34)
    • Goal of ElasticSearch (4:16)
    • What’s New in Version 5.0? (3:36)
    • Why Use ElasticSearch? (5:10)
  • Dichotomy of ElasticSearch
    • What Is an Index? (2:31)
    • Documents in ElasticSearch (4:20)
    • What Is a Cluster? (6:51)
    • Setting Shards and Replicas (6:30)
  • Get Going with Documents and Indices
    • Adding and Deleting an Index (8:09)
    • Adding and Deleting Documents (5:33)
    • Using Bulk API (8:44)
  • Power Your Searches with DSL
    • Introduction to DSL (4:10)
    • Understanding DSL (3:09)
    • Term Queries and Boosting (7:13)
    • Range Query (2:27)
    • Exist Query (3:02)
    • Aggregation Based Analytics (6:41)
    • Aggregations: Implementation (4:45)
  • Querying with RESTful API
    • Introduction to REST API (3:21)
    • Using REST API to Search (6:34)
    • Using REST API to Update (6:39)
  • What ElasticSearch is NOT
    • Myths about ElasticSearch (8:39)
  • Getting More with ElasticStack
    • What Is ElasticStack? (1:47)
    • Kibana (5:24)
    • Logstash (3:49)
    • X-Pack (4:58)
    • Beats (1:53)
  • Apache Log Analysis
    • Preparing for Log Analysis (5:40)
    • Running Log Analysis (9:26)
  • Advanced ElasticSearch Queries
    • Sorting in ElasticSearch (4:33)
    • Geo Searching (2:51)
    • Getting into Synonyms (4:24)
  • ElasticSearch versus Apache Solr
    • Choosing between ElasticSearch and Apache Solr (4:45)



Access
Lifetime
Content
5.5 hours
Lessons
45

Apache Spark 2 for Beginners

Take the First Steps In Developing Large-Scale Distributed Data Processing Applications

By Packt Publishing | in Online Courses

Apache Spark is one of the most widely used large-scale data processing engines and runs at extremely high speeds. It's a framework with tools that are equally useful for app developers and data scientists. This course starts with the fundamentals of Spark 2 and covers the core data processing framework and API, installation, and application development setup.

  • Access 45 lectures & 5.5 hours of content 24/7
  • Learn the Spark programming model through real-world examples
  • Explore Spark SQL programming w/ DataFrames
  • Cover the charting & plotting features of Python in conjunction w/ Spark data processing
  • Discuss Spark's stream processing, machine learning, & graph processing libraries
  • Develop a real-world Spark application
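The Spark programming model mentioned above builds a chain of lazy transformations and triggers them with an action. As a rough mental model only (plain Python, no Spark installation required; the PySpark equivalent is shown in the comment), the same shape looks like this:

```python
from functools import reduce

# Pure-Python analogue of a Spark RDD pipeline:
# transformations (filter/map) describe the chain, an action (reduce) runs it.
# In PySpark this would be roughly:
#   sc.parallelize(lines).filter(...).map(len).reduce(lambda a, b: a + b)
lines = ["error disk full", "info started", "error timeout"]

errors = filter(lambda l: l.startswith("error"), lines)   # transformation
lengths = map(len, errors)                                # transformation
total = reduce(lambda a, b: a + b, lengths)               # action

print(total)
```

In real Spark the transformations are distributed across the cluster and nothing executes until the action is called.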
Rajanarayanan Thottuvaikkatumana, Raj, is a seasoned technologist with more than 23 years of software development experience at various multinational companies. He has lived and worked in India, Singapore, and the USA, and is presently based out of the UK. His experience includes architecting, designing, and developing software applications. He has worked on various technologies including major databases, application development platforms, web technologies, and big data technologies. Since 2000, he has been working mainly in Java-related technologies, and does heavy-duty server-side programming in Java and Scala. He has worked on highly concurrent, highly distributed, and high-transaction-volume systems. Currently, he is building a next-generation Hadoop YARN-based data processing platform and an application suite built with Spark using Scala.

Raj holds a master's degree in Mathematics and a master's degree in Computer Information Systems, and has many certifications in ITIL and cloud computing to his credit. Raj is the author of Cassandra Design Patterns - Second Edition, published by Packt.

When not working on the assignments his day job demands, Raj is an avid listener of classical music and watches a lot of tennis.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Spark Fundamentals
    • The Course Overview (4:30)
    • An Overview of Apache Hadoop (5:50)
    • Understanding Apache Spark (5:13)
    • Installing Spark on Your Machines (13:48)
  • Spark Programming Model
    • Functional Programming with Spark and Understanding Spark RDD (8:44)
    • Data Transformations and Actions with RDDs (5:21)
    • Monitoring with Spark (4:01)
    • The Basics of Programming with Spark (20:30)
    • Creating RDDs from Files and Understanding the Spark Library Stack (6:38)
  • Spark SQL
    • Understanding the Structure of Data and the Need of Spark SQL (9:38)
    • Anatomy of Spark SQL (5:08)
    • DataFrame Programming (12:00)
    • Understanding Aggregations and Multi-Datasource Joining with SparkSQL (8:32)
    • Introducing Datasets and Understanding Data Catalogs (7:53)
  • Spark Programming with R
    • The Need for Spark and the Basics of the R Language (8:09)
    • DataFrames in R and Spark (2:57)
    • Spark DataFrame Programming with R (4:42)
    • Understanding Aggregations and Multi-Datasource Joins in SparkR (4:11)
  • Spark Data Analysis with Python
    • Charting and Plotting Libraries and Setting Up a Dataset (3:59)
    • Charts, Plots, and Histograms (5:36)
    • Bar Chart and Pie Chart (7:45)
    • Scatter Plot and Line Graph (4:53)
  • Spark Stream Processing
    • Data Stream Processing and Micro Batch Data Processing (8:36)
    • A Log Event Processor (16:22)
    • Windowed Data Processing and More Processing Options (7:26)
    • Kafka Stream Processing (10:43)
    • Spark Streaming Jobs in Production (9:09)
  • Spark Machine Learning
    • Understanding Machine Learning and the Need of Spark for it (6:22)
    • Wine Quality Prediction and Model Persistence (10:43)
    • Wine Classification (5:57)
    • Spam Filtering (7:07)
    • Feature Algorithms and Finding Synonyms (6:54)
  • Spark Graph Processing
    • Understanding Graphs with Their Usage (4:35)
    • The Spark GraphX Library (10:09)
    • Graph Processing and Graph Structure Processing (9:44)
    • Tennis Tournament Analysis (5:34)
    • Applying PageRank Algorithm (3:30)
    • Connected Component Algorithm (4:39)
    • Understanding GraphFrames and Its Queries (9:31)
  • Designing Spark Applications
    • Lambda Architecture (4:47)
    • Micro Blogging with Lambda Architecture (7:13)
    • Implementing Lambda Architecture and Working with Spark Applications (8:19)
    • Coding Style, Setting Up the Source Code, and Understanding Data Ingestion (9:09)
    • Generating Purposed Views and Queries (5:53)
    • Understanding Custom Data Processes (6:12)



Access
Lifetime
Content
2 hours
Lessons
19

Designing AWS Environments

Design & Create Robust & Resilient Distributed Solutions with AWS

By Packt Publishing | in Online Courses

Amazon Web Services (AWS) provides trusted, cloud-based solutions to help businesses meet all of their needs. Running solutions in the AWS Cloud can help you (or your company) get applications up and running faster while providing the security needed to meet your compliance requirements. This course leaves no stone unturned in getting you up to speed with administering AWS.

  • Access 19 lectures & 2 hours of content 24/7
  • Familiarize yourself w/ the key capabilities to architect & host apps, websites, & services on AWS
  • Explore the available options for virtual instances & demonstrate launching & connecting to them
  • Design & deploy networking & hosting solutions for large deployments
  • Focus on security & important elements of scalability & high availability
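The networking units in this course (CIDR, subnets, route tables) come down to address-block arithmetic, which you can explore with Python's standard `ipaddress` module. The 10.0.0.0/16 VPC range below is just a common illustrative choice, not taken from the course:

```python
import ipaddress

# A hypothetical VPC CIDR block, split into four /18 subnets,
# e.g. one per Availability Zone for a highly available design.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=18))

for s in subnets:
    print(s, "-", s.num_addresses, "addresses")

# Check whether an instance's private IP falls in the first subnet.
print(ipaddress.ip_address("10.0.1.25") in subnets[0])
```

Each /18 carves out a quarter of the /16; AWS additionally reserves a handful of addresses per subnet, so the usable count is slightly lower than the raw block size.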
Wayde Gilchrist started moving customers of his IT consulting business into the cloud and away from traditional hosting environments in 2010. In addition to consulting, he delivers AWS training for Fortune 500 companies, government agencies, and international consulting firms. When he is not out visiting customers, he is delivering training virtually from his home in Florida.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Installation and Setup
    • The Course Overview (3:08)
    • Opening an AWS Account (2:56)
    • The Free Tier (1:48)
    • The Management Console (2:36)
  • Launching an EC2 Instance
    • Amazon Machine Images (5:05)
    • EC2 Instance Types (5:12)
    • EC2 Storage Options (7:54)
    • Security Groups (4:59)
  • Logging in to EC2 Instances
    • Key Pairs
    • Logging In to Linux Instances (6:48)
    • Logging In to Windows Instances (3:42)
  • Networking on AWS
    • Classless Inter-Domain Routing (4:44)
    • EC2 IP Addressing (6:04)
    • Subnets and Route Tables (4:48)
  • Creating a VPC
    • Getting Started with VPCs (7:12)
    • Creating a VPC Demo (9:07)
    • Connecting to a VPC (4:31)
    • Securing Your VPC (12:29)
    • Highly Available Architectures (14:59)



Access
Lifetime
Content
40 hours
Lessons
64

Learning MongoDB

A Comprehensive Guide to Using MongoDB for Fast, Fault Tolerant Management of Big Data

By Packt Publishing | in Online Courses

Businesses today have access to more data than ever before, and a key challenge is ensuring that data can be easily accessed and used efficiently. MongoDB makes it possible to store and process large sets of data in ways that drive up business value. Learning MongoDB will give you the flexibility of unstructured storage, combined with robust querying and post-processing functionality, making you an asset to any enterprise with Big Data needs.

  • Access 64 lectures & 40 hours of content 24/7
  • Master data management, queries, post-processing, & essential enterprise redundancy requirements
  • Explore advanced data analysis using both MapReduce & the MongoDB aggregation framework
  • Delve into SSL security & programmatic access using various languages
  • Learn about MongoDB's built-in redundancy & scale features, replica sets, & sharding
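The aggregation framework covered in this course pushes documents through pipeline stages such as `$match` and `$group`. As a rough mental model (plain Python, not a MongoDB driver call; the documents and collection name are hypothetical), here is what a `$group`-with-`$sum` stage computes:

```python
from collections import defaultdict

# Hypothetical documents, as they might sit in an "orders" collection.
orders = [
    {"customer": "a", "total": 10},
    {"customer": "b", "total": 25},
    {"customer": "a", "total": 5},
]

# Equivalent MongoDB pipeline, for reference:
#   db.orders.aggregate([{"$group": {"_id": "$customer",
#                                    "spent": {"$sum": "$total"}}}])
spent = defaultdict(int)
for doc in orders:
    spent[doc["customer"]] += doc["total"]

print(dict(spent))
```

The real pipeline runs server-side and can chain further stages (`$sort`, `$project`, and so on) after the grouping.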
Daniel Watrous is a 15-year veteran of designing web-enabled software. His focus on data store technologies spans relational databases, caching systems, and contemporary NoSQL stores. For the last six years, he has designed and deployed enterprise-scale MongoDB solutions in semiconductor manufacturing and information technology companies. He holds a degree in electrical engineering from the University of Utah, focusing on semiconductor physics and optoelectronics. He also completed an MBA from the Northwest Nazarene University. In his current position as senior cloud architect with Hewlett Packard, he focuses on highly scalable cloud-native software systems.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Getting Started
    • Downloading and Installing on Linux (10:08)
    • Downloading and Installing on Windows (6:55)
    • Configuring Startup with a System (Service Integration) (6:07)
    • Using the Command-line Tool (7:35)
    • Graphical User Interfaces (GUI) (6:24)
  • JSON and Data Operations
    • An Overview of JSON (5:57)
    • Schemaless but Structured (5:57)
    • Adding Data to MongoDB (5:03)
    • Querying Data in MongoDB (6:45)
    • Advanced Queries, Regex, Projection, and Fields (5:38)
  • Working with Databases
    • Pruning Data from MongoDB (5:04)
    • Backing Up a Database (5:18)
    • Restoring a Database (4:20)
    • Other Redundancy Mechanisms (3:56)
    • Security (4:35)
  • MapReduce
    • MapReduce Overview and Background (5:51)
    • Creating a Map Function (6:17)
    • Creating a Reduce Function (6:55)
    • Advanced MapReduce Functionality (5:04)
    • When to Use MapReduce (2:16)
  • The Aggregation Framework
    • An Overview of the Aggregation Framework (3:05)
    • Single Purpose Aggregation (4:32)
    • Pipeline Components (6:25)
    • Example Usage (6:11)
    • Expression Operators (5:29)
  • SSL Security and Programmatic Access
    • SCons and Memory Requirements Used to Build MongoDB (3:17)
    • Verifying and Distributing the Build (5:55)
    • Authentication and Authorization (5:13)
    • Accessing MongoDB Using PHP (3:21)
    • Accessing MongoDB Using Python (4:31)
  • Replica Sets and Scaling
    • Types of Nodes (2:30)
    • Building a Replica Set (4:17)
    • Verifying Failovers (3:22)
    • Write Concern (3:33)
    • ReadPreference and Load Balancing (3:51)
  • Advanced Topics and Hosting
    • Sharding and Ultrascale (3:21)
    • Sharding Example (5:34)
    • MMS Setup (6:38)
    • Reviewing the MMS Feature (3:23)
    • Caching MongoDB (5:09)



Access
Lifetime
Content
1.5 hours
Lessons
19

Learning Hadoop 2

Introduce Yourself to Storing, Structuring, & Analyzing Data at Scale with Hadoop

By Packt Publishing | in Online Courses

Hadoop emerged in response to the proliferation of masses of data collected by organizations, offering a strong solution to store, process, and analyze what has commonly become known as Big Data. It comprises a comprehensive stack of components designed to enable these tasks on a distributed scale, across multiple servers and thousands of machines. In this course, you'll learn Hadoop 2, introducing yourself to the powerful system synonymous with Big Data.

  • Access 19 lectures & 1.5 hours of content 24/7
  • Get an overview of the Hadoop component ecosystem, including HDFS, Sqoop, Flume, YARN, MapReduce, Pig, & Hive
  • Install & configure a Hadoop environment
  • Explore Hue, the graphical user interface of Hadoop
  • Discover HDFS to import & export data, both manually & automatically
  • Run computations using MapReduce & get to grips working w/ Hadoop's scripting language, Pig
  • Siphon data from HDFS into Hive & demonstrate how it can be used to structure & query data sets
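The "Word Count" lessons in this course follow the classic MapReduce pattern: a map phase emits (word, 1) pairs and a reduce phase sums the counts per word. A minimal in-process sketch in Python (a real Hadoop job would distribute both phases across the cluster; the sample lines are invented):

```python
from collections import Counter
from itertools import chain

lines = ["big data is big", "data at scale"]

# Map phase: each line -> a stream of (word, 1) pairs.
mapped = chain.from_iterable(((w, 1) for w in line.split()) for line in lines)

# Shuffle + reduce phase: sum the counts for each word.
counts = Counter()
for word, n in mapped:
    counts[word] += n

print(dict(counts))
```

Hadoop's shuffle step groups the pairs by key between the two phases, which is what the `Counter` accumulation stands in for here.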
Randal Scott King is the Managing Partner of Brilliant Data, a consulting firm specializing in data analytics. In his 16 years of consulting, Scott has amassed an impressive list of clientele, from mid-market leaders to Fortune 500 household names. Scott lives just outside Atlanta, GA, with his children.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • The Hadoop Ecosystem
    • The Course Overview (1:51)
    • Overview of HDFS and YARN (7:24)
    • Overview of Sqoop and Flume (3:17)
    • Overview of MapReduce (3:38)
    • Overview of Pig (3:04)
    • Overview of Hive (6:33)
  • Installing and Configuring Hadoop
    • Downloading and Installing Hadoop (2:53)
    • Exploring Hue (5:24)
  • Data Import and Export
    • Manual Import (4:33)
    • Importing from Databases Using Sqoop (6:27)
    • Using Flume to Import Streaming Data (5:07)
  • Using MapReduce and Pig
    • Coding "Word Count" in MapReduce (5:55)
    • Coding "Word Count" in Pig (2:30)
    • Performing Common ETL Functions in Pig (8:48)
    • Using User-defined Functions in Pig (5:58)
  • Using Hive
    • Importing Data from HDFS into Hive (4:57)
    • Importing Data Directly from a Database (2:23)
    • Performing Basic Queries in Hive (6:58)
    • Putting It All Together (2:15)



ElasticSearch 5.x Cookbook eBook

Over 170 Advanced Recipes to Search, Analyze, Deploy, Manage, & Monitor Data Effectively

By Packt Publishing | in Online Courses

ElasticSearch is a Lucene-based distributed search server that allows users to index and search petabytes of unstructured data. Through this ebook, you'll be guided through comprehensive recipes covering what's new in ElasticSearch 5.x as you create complex queries and analytics. By the end, you'll have an in-depth knowledge of how to implement the ElasticSearch architecture and be able to manage data efficiently and effectively.

  • Access 696 pages of content 24/7
  • Perform index mapping, aggregation, & scripting
  • Explore the modules of Cluster & Node monitoring
  • Understand how to install Kibana to monitor a cluster & extend Kibana for plugins
  • Integrate your Java, Scala, Python, & Big Data apps w/ ElasticSearch
Alberto Paro is an engineer, project manager, and software developer. He currently works as a freelance trainer/consultant on big data technologies and NoSQL solutions. He loves to study emerging solutions and applications mainly related to big data processing, NoSQL, natural language processing, and neural networks. He began programming in BASIC on a Sinclair Spectrum when he was eight years old, and to date, has collected a lot of experience using different operating systems, applications, and programming languages.

In 2000, he graduated in computer science engineering from Politecnico di Milano with a thesis on designing multiuser and multidevice web applications. He assisted professors at the university for about a year. He then came in contact with The Net Planet Company and loved their innovative ideas; he started working on knowledge management solutions and advanced data mining products. In summer 2014, his company was acquired by a big data technologies company, where he worked until the end of 2015, mainly using Scala and Python on state-of-the-art big data software (Spark, Akka, Cassandra, and YARN). In 2013, he started freelancing as a consultant for big data, machine learning, Elasticsearch, and other NoSQL products. He has created or helped to develop big data solutions for business intelligence, financial, and banking companies all over the world. A lot of his time is spent teaching how to efficiently use big data solutions (mainly Apache Spark), NoSQL datastores (Elasticsearch, HBase, and Accumulo), and related technologies (Scala, Akka, and Playframework). He is often called to present at big data or Scala events. He is an evangelist on Scala and Scala.js (the transcompiler from Scala to JavaScript).

In his spare time, when he is not playing with his children, he likes to work on open source projects. When he was in high school, he started contributing to projects related to the GNOME environment (gtkmm). One of his preferred programming languages is Python, and he wrote one of the first NoSQL backends on Django for MongoDB (Django-MongoDBengine). In 2010, he began using Elasticsearch to provide search capabilities to some Django e-commerce sites and developed PyES (a Pythonic client for Elasticsearch), as well as the initial part of the Elasticsearch MongoDB river. He is the author of Elasticsearch Cookbook as well as a technical reviewer of Elasticsearch Server-Second Edition, Learning Scala Web Development, and the video course, Building a Search Server with Elasticsearch, all of which are published by Packt Publishing.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Elasticsearch 5.x Cookbook - Third Edition
    • Elasticsearch 5.x Cookbook - Third Edition



Fast Data Processing with Spark 2 eBook

Learn How to Use Spark to Process Big Data At Speed & Scale

By Packt Publishing | in Online Courses

Compared to Hadoop, Spark is a significantly simpler way to process Big Data at speed. It is increasing in popularity with data analysts and engineers everywhere, and in this ebook you'll learn how to use Spark with minimum fuss. Starting with the fundamentals, this ebook will help you take your Big Data analytical skills to the next level.

  • Access 274 pages of content 24/7
  • Get to grips w/ some simple APIs before investigating machine learning & graph processing
  • Learn how to use the Spark shell
  • Load data & build & run your own Spark applications
  • Discover how to manipulate RDD
  • Understand useful machine learning algorithms w/ the help of Spark MLlib & R
Krishna Sankar is a Senior Specialist—AI Data Scientist with Volvo Cars, focusing on autonomous vehicles. His earlier stints include Chief Data Scientist at http://cadenttech.tv/, Principal Architect/Data Scientist at Tata America Intl. Corp., Director of Data Science at a bioinformatics startup, and Distinguished Engineer at Cisco. He has spoken at various conferences, including ML tutorials at Strata SJC and London 2016, Spark Summit, Strata-Spark Camp, OSCON, PyCon, and PyData, and writes about Robots Rules of Order, Big Data Analytics—Best of the Worst, predicting the NFL, Spark, data science, machine learning, and social media analysis. He has also been a guest lecturer at the Naval Postgraduate School. His occasional blogs can be found at https://doubleclix.wordpress.com/. His other passions are flying drones (he is working toward an FAA UAS pilot license) and Lego Robotics; you will find him at the St. Louis FLL World Competition as a Robot Design Judge.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Fast Data Processing with Spark 2
    • Fast Data Processing with Spark 2



MongoDB Cookbook: Second Edition eBook

Harness the Latest Features of MongoDB 3 with A Collection of 80 Recipes

By Packt Publishing | in Online Courses

MongoDB is a high-performance, feature-rich NoSQL database that forms the backbone of the systems powering many organizations. Packed with easy-to-use features that have become essential for a variety of software professionals, MongoDB is a vital technology to learn for any aspiring data scientist or systems engineer. This cookbook contains many solutions to the everyday challenges of MongoDB, as well as guidance on effective techniques to extend your skills and capabilities.

  • Access 274 pages of content 24/7
  • Initialize the server in three different modes w/ various configurations
  • Get introduced to programming language drivers in Java & Python
  • Learn advanced query operations, monitoring, & backup using MMS
  • Find recipes on cloud deployment, including how to work w/ Docker containers alongside MongoDB
Amol Nayak is a MongoDB certified developer and has been working as a developer for over 8 years. He is currently employed with a leading financial data provider, working on cutting-edge technologies. He has used MongoDB as a database for various systems at his current and previous workplaces to support enormous data volumes. He is an open source enthusiast and supports it by contributing to open source frameworks and promoting them. He has made contributions to the Spring Integration project; his contributions include the adapters for JPA, XQuery, MongoDB, push notifications to mobile devices, and Amazon Web Services (AWS). He has also made some contributions to the Spring Data MongoDB project. Apart from technology, he is passionate about motor sports and is a race official at Buddh International Circuit, India, for various motor sports events. Earlier, he authored Instant MongoDB, published by Packt.

Cyrus Dasadia has enjoyed tinkering with open source projects since 1996. He has been working as a Linux system administrator and part-time programmer for over a decade. He works at InMobi, where he loves designing tools and platforms. His love for MongoDB started in 2013, when he was amazed by its ease of use and stability. Since then, almost all of his projects have been written with MongoDB as the primary backend. Cyrus is also the creator of an open source alert management system called CitoEngine. He likes spending his spare time trying to reverse engineer software, playing computer games, or increasing his silliness quotient by watching reruns of Monty Python.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • MongoDB Cookbook - Second Edition
    • MongoDB Cookbook - Second Edition



Learning Apache Kafka: Second Edition eBook

Learn How to Administer Apache Kafka Effectively for Messaging

By Packt Publishing | in Online Courses

Apache Kafka is simple to describe at a high level but has an immense amount of technical detail when you dig deeper. This step-by-step, practical guide will help you take advantage of the power of Kafka to handle hundreds of megabytes of messages per second from multiple clients.

  • Access 120 pages of content 24/7
  • Set up Kafka clusters
  • Understand basic building blocks like producers, brokers, & consumers
  • Explore additional settings & configuration changes to achieve more complex goals
  • Learn how Kafka is designed internally & what configurations make it most effective
  • Discover how Kafka works w/ other tools like Hadoop, Storm, & more
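Setting up a Kafka cluster, as the first bullet above describes, starts with each broker's `server.properties` file. A minimal sketch of the core settings for the ZooKeeper-based Kafka of this book's era (paths and hostnames are placeholders):

```properties
# Unique id for this broker within the cluster
broker.id=0

# Where the broker stores its commit log segments
log.dirs=/tmp/kafka-logs

# Default number of partitions for newly created topics
num.partitions=2

# ZooKeeper ensemble the brokers coordinate through
zookeeper.connect=localhost:2181
```

Each additional broker in the cluster gets its own `broker.id` and `log.dirs`, while sharing the same `zookeeper.connect` string.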
Nishant Garg has over 14 years of software architecture and development experience in various technologies, such as Java Enterprise Edition, SOA, Spring, Hadoop, Hive, Flume, Sqoop, Oozie, Spark, Shark, YARN, Impala, Kafka, Storm, Solr/Lucene, NoSQL databases (such as HBase, Cassandra, and MongoDB), and MPP databases (such as GreenPlum).

He received his MS in software systems from the Birla Institute of Technology and Science, Pilani, India, and is currently working as a technical architect for the Big Data R&D Group with Impetus Infotech Pvt. Ltd. Previously, Nishant has enjoyed working with some of the most recognizable names in IT services and financial industries, employing full software life cycle methodologies such as Agile and SCRUM.

Nishant has also undertaken many speaking engagements on big data technologies and is the author of HBase Essentials, published by Packt.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Learning Apache Kafka - Second Edition
    • Learning Apache Kafka - Second Edition



Apache Flume: Distributed Log Collection for Hadoop: Second Edition eBook

Design & Implement A Series of Flume Agents to Send Streamed Data Into Hadoop

By Packt Publishing | in Online Courses

Apache Flume is a distributed, reliable, and available service used to efficiently collect, aggregate, and move large amounts of log data. It's commonly used to stream logs from application servers to HDFS for ad hoc analysis. This ebook starts with an architectural overview of Flume and its logical components, then pulls everything together into a real-world, end-to-end use case encompassing simple and advanced features.

  • Access 178 pages of content 24/7
  • Explore channels, sinks, & sink processors
  • Learn about sources & channels
  • Construct a series of Flume agents to dynamically transport your stream data & logs from your systems into Hadoop
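A Flume agent wires a source, a channel, and a sink together through a properties file. A minimal sketch, adapted from the standard netcat-to-logger example in the Flume documentation (agent and component names are arbitrary; a real log-shipping agent would swap in an HDFS sink):

```properties
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for lines on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer events in memory between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Sink: log events to the console (an HDFS sink ships them into Hadoop)
a1.sinks.k1.type = logger

# Wire source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Chaining agents, as the ebook describes, means pointing one agent's sink (e.g. Avro) at the next agent's source.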
Steve Hoffman has 32 years of experience in software development, ranging from embedded software development to the design and implementation of large-scale, service-oriented, object-oriented systems. For the last 5 years, he has focused on infrastructure as code, including automated Hadoop and HBase implementations and data ingestion using Apache Flume. Steve holds a BS in computer engineering from the University of Illinois at Urbana-Champaign and an MS in computer science from DePaul University. He is currently a senior principal engineer at Orbitz Worldwide (http://orbitz.com/).

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Apache Flume: Distributed Log Collection for Hadoop - Second Edition
    • Apache Flume: Distributed Log Collection for Hadoop - Second Edition




Terms

  • Instant digital redemption

15-Day Satisfaction Guarantee

We want you to be happy with every course you purchase! If you're unsatisfied for any reason, we will issue a store credit refund within 15 days of purchase.