Airbnb Tech Talk: Jay Kreps - Building LinkedIn's Real-time Pipeline

submitted by kevincarter on 10/20/13

"Building LinkedIn's Real-time Data Pipeline"

At the core of many of LinkedIn's analytics applications is a real-time data pipeline built on top of Apache Kafka. This system handles over 10 billion message writes per day for thousands of production processes. This talk will cover some of the challenges of building and scaling this data pipeline for log data, system metrics, and other high-volume data streams. It will also cover some details of the design of Kafka, as well as some of the particular requirements of Hadoop data loads and real-time processing applications.

Jay Kreps is the technical lead for LinkedIn's data team, which is responsible for the site's core data technologies, including storage systems, data pipelines, Hadoop, search, social graph, and recommendation systems. He is an original author of several open source projects, including Apache Kafka, a real-time distributed messaging system, and Project Voldemort, a distributed key-value store. He has a master's degree in computer science from UC Santa Cruz, where he studied machine learning.

www.airbnb.com/techtalks
