An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data.
Which solution meets these requirements?
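
To make the staleness in the scenario concrete, here is a rough sketch of how long newly delivered data can wait before the next scheduled crawl registers it in the Data Catalog. It assumes the staleness comes from objects (for example, new time-based prefixes) that land in S3 between crawler runs, which the question implies but does not state. The function names, the `first_crawl` anchor time, and the sample timestamps are hypothetical; only the 8-hour interval comes from the question.

```python
from datetime import datetime, timedelta

# The 8-hour crawler schedule stated in the question.
CRAWL_INTERVAL = timedelta(hours=8)


def next_crawl_after(arrival: datetime,
                     first_crawl: datetime,
                     interval: timedelta = CRAWL_INTERVAL) -> datetime:
    """Return the first scheduled crawl at or after `arrival`.

    `first_crawl` is a hypothetical anchor: any known past crawl start time.
    """
    if arrival <= first_crawl:
        return first_crawl
    elapsed = arrival - first_crawl
    # Ceiling division: whole intervals needed to reach or pass `arrival`.
    runs = -(-elapsed // interval)
    return first_crawl + runs * interval


def catalog_lag(arrival: datetime,
                first_crawl: datetime,
                interval: timedelta = CRAWL_INTERVAL) -> timedelta:
    """How long data arriving at `arrival` waits for the next crawl."""
    return next_crawl_after(arrival, first_crawl, interval) - arrival


if __name__ == "__main__":
    anchor = datetime(2024, 1, 1, 0, 0)  # hypothetical crawl start
    for arrival in (datetime(2024, 1, 1, 0, 1),
                    datetime(2024, 1, 1, 7, 59),
                    datetime(2024, 1, 1, 8, 0)):
        print(arrival, "waits", catalog_lag(arrival, anchor))
```

Because deliveries follow no schedule, the wait can be anywhere from zero up to just under the full interval, which matches the analysts seeing stale data only occasionally. Any fix therefore has to tie catalog updates to data arrival rather than to the clock.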