A customer is collecting clickstream data using Amazon Kinesis and is grouping the events by IP address into 5-minute chunks stored in Amazon S3.
Many analysts in the company use Hive on Amazon EMR to analyze this data. Their queries always reference a single IP address. Data must be optimized for querying based on IP address using Hive running on Amazon EMR.
What is the most efficient method to query the data with Hive?
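One approach often considered for this kind of requirement is Hive-style partitioning, where the S3 object key encodes `key=value` path segments so that Hive can prune everything except the partition for the requested IP address. The sketch below is only an illustration of that idea, not a confirmed answer to the question: the key layout (`source=`, `year=`, `month=`, `day=`, `hour=`), the function names, and the sample file name are all assumptions introduced here.

```python
from datetime import datetime, timezone


def floor_to_five_minutes(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 5-minute chunk."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def partitioned_key(ip_address: str, chunk_start: datetime, filename: str) -> str:
    """Build a Hive-style partitioned S3 key (hypothetical layout).

    Each `name=value` path segment can be declared as a partition column
    in a Hive external table, so a query filtering on a single IP address
    only needs to read objects under that `source=<ip>` prefix.
    """
    return (
        f"source={ip_address}/"
        f"year={chunk_start:%Y}/month={chunk_start:%m}/"
        f"day={chunk_start:%d}/hour={chunk_start:%H}/"
        f"{filename}"
    )


if __name__ == "__main__":
    event_time = datetime(2020, 1, 2, 3, 17, 42, tzinfo=timezone.utc)
    chunk = floor_to_five_minutes(event_time)
    # Illustrative file name derived from the chunk start time.
    print(partitioned_key("203.0.113.7", chunk, f"events-{chunk:%H%M}.gz"))
```

With a layout like this, a Hive external table partitioned on the same columns would let a query with a filter on `source` skip all other prefixes. Whether partitioning on IP address is practical depends on how many distinct addresses there are, since each one becomes its own partition.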