Amazon Announces Real-Time Big Data Processing Service -- ADTmag

By David Ramel
November 15, 2013

Amazon yesterday introduced its Kinesis service for real-time processing of never-ending Big Data streams to facilitate instant business decision-making.

Like many other recent products in the exploding Big Data market, Amazon Kinesis aims to improve on batch-processing solutions such as Apache Hadoop and databases or data warehouses that work with discrete, static chunks of data. "Database and MapReduce technologies are good at handling large volumes of data," said AWS executive Terry Hanold in an announcement. "But they are fundamentally batch-based, and struggle with enabling real-time decisions on a never-ending--and never fully complete--stream of data."

The real-time data analytics service allows quick decision-making based on data streams such as server logs or other infrastructure information, social media content, financial market data feeds, Web site clickstreams, mobile game interactions, online machine learning and more.

The scalable service is designed to plug into the AWS data processing ecosystem, sending data to the Amazon Simple Storage Service (Amazon S3) data storage infrastructure, the managed NoSQL database service Amazon DynamoDB and the petabyte-scale data warehouse service, Amazon Redshift.

On Hacker News, several readers compared the new service to the open source Apache Kafka distributed, persistent messaging system and the Storm realtime computation system. One reader described it as "Amazon's version of Kafka/Storm with pay as you go minus the headaches of maintaining the cluster." Another reader said: "This is essentially a hosted Kafka. Given the complexity of operating a distributed persistent queue, this could be a compelling alternative for AWS-centric environments. (We run a large Kafka cluster on AWS, and it is one of our highest-maintenance services.)"

The Kinesis processing model incorporates a "producer" component and a "consumer" side, according to the Amazon Web Services blog. The producer side stores data in a stream, while the consumer side sequentially reads through "shards," for which customers specify desired data capacity. "Each shard has the ability to handle 1,000 write transactions (up to 1 megabyte per second--we call this the ingress rate) and up to 20 read transactions (up to 2 megabytes per second--the egress rate)," the blog said. "You can scale a stream up or down at any time by adding or removing shards without affecting processing throughput or incurring any downtime, with new capacity ready to use within seconds."
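The per-shard limits quoted above lend themselves to a quick sizing calculation. The following is a minimal Python sketch, not part of any AWS API: the function name `shards_needed` and its parameters are hypothetical, and it assumes that every quoted limit is a per-second rate and that a stream needs enough shards to satisfy whichever limit is tightest.

```python
import math

# Per-shard limits as quoted from the AWS blog (assumed here to be per-second rates).
WRITE_TPS_PER_SHARD = 1000
INGRESS_MB_PER_SHARD = 1.0
READ_TPS_PER_SHARD = 20
EGRESS_MB_PER_SHARD = 2.0


def shards_needed(write_tps, ingress_mb_s, read_tps=0, egress_mb_s=0.0):
    """Return the smallest shard count that satisfies every quoted limit."""
    requirements = [
        write_tps / WRITE_TPS_PER_SHARD,
        ingress_mb_s / INGRESS_MB_PER_SHARD,
        read_tps / READ_TPS_PER_SHARD,
        egress_mb_s / EGRESS_MB_PER_SHARD,
    ]
    # Assumption: a stream has at least one shard; round the tightest limit up.
    return max(1, math.ceil(max(requirements)))


if __name__ == "__main__":
    # Hypothetical workload: 2,500 writes/s totaling 1.5 MB/s, read back at 3 MB/s.
    print(shards_needed(write_tps=2500, ingress_mb_s=1.5, read_tps=10, egress_mb_s=3.0))
```

In this hypothetical workload the write transaction rate is the binding limit, so the bandwidth figures alone would underestimate the number of shards required.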

Amazon is providing a Kinesis client library, written in Java, to ease the handling of details such as load balancing, coordination and errors. A Kinesis Service API is also available for custom-built applications in any programming language.

Applications can feed real-time dashboards, capture exceptions and generate alerts about them, and help drive recommendations for business decisions, Amazon said. Data streams can be created and managed with the AWS Management Console, the Kinesis APIs and the AWS Command Line Interface.

Pricing for the service was described in terms of a mobile game scenario in which each gaming device transmits a 2-kilobyte message about player interaction every 5 seconds, with a peak of 10,000 devices sending simultaneous messages. While data streams can scale up and down on the go, at a simplified constant data rate such a scenario would require 20 shards. The pay-as-you-go pricing model charges $0.028 for each 1 million PUT operations and $0.015 per shard per hour. To collect 1 hour of game data in this example scenario, the cost would be $1.31.
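The arithmetic behind those figures can be reproduced with a short Python sketch. It rests on assumptions not spelled out in the article: that the "simplified constant data rate" means a steady 10,000 messages per second (the peak figure), that 1 megabyte is counted as 1,000 kilobytes, and that the shard count is set by the stricter of the quoted per-shard limits (1,000 writes and 1 megabyte per second). The function name `hourly_cost` is hypothetical.

```python
import math

PUT_PRICE_PER_MILLION = 0.028  # USD per 1 million PUT operations (from the article)
SHARD_PRICE_PER_HOUR = 0.015   # USD per shard per hour (from the article)
WRITE_TPS_PER_SHARD = 1000     # quoted per-shard write limit (assumed per second)
INGRESS_KB_PER_SHARD = 1000    # assumption: 1 MB/s per shard, with 1 MB = 1,000 KB


def hourly_cost(messages_per_second, message_kb):
    """Estimate the shard count and one-hour cost at a constant message rate."""
    by_bandwidth = math.ceil(messages_per_second * message_kb / INGRESS_KB_PER_SHARD)
    by_transactions = math.ceil(messages_per_second / WRITE_TPS_PER_SHARD)
    shards = max(by_bandwidth, by_transactions)

    puts_per_hour = messages_per_second * 3600
    put_cost = puts_per_hour / 1_000_000 * PUT_PRICE_PER_MILLION
    shard_cost = shards * SHARD_PRICE_PER_HOUR
    return shards, put_cost + shard_cost


if __name__ == "__main__":
    # Assumed steady state: 10,000 messages of 2 KB each, every second.
    shards, cost = hourly_cost(messages_per_second=10_000, message_kb=2)
    print(f"{shards} shards, ${cost:.2f} per hour")
```

Under these assumptions the sketch yields 20 shards and about $1.31 for the hour: roughly $1.01 for 36 million PUT operations plus $0.30 for shard hours. Bandwidth, not transaction count, is the binding limit here, since 10,000 writes per second alone would need only 10 shards.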

Announced at the AWS re:Invent conference in Las Vegas, Amazon Kinesis is being offered in a limited preview that requires sign-up.

About the Author

David Ramel is an editor and writer at Converge 360.
