Your organization has a petabyte of application logs stored as Parquet files in Cloud Storage. You need to quickly perform a one-time SQL-based analysis of the files and join them to data that already resides in BigQuery. What should you do?
A.
Create a Dataproc cluster, and write a PySpark job to join the data from BigQuery to the files in Cloud Storage.
B.
Launch a Cloud Data Fusion environment, use plugins to connect to BigQuery and Cloud Storage, and use the SQL join operation to analyze the data.
C.
Create external tables over the files in Cloud Storage, and perform SQL joins to tables in BigQuery to analyze the data.
D.
Use the bq load command to load the Parquet files into BigQuery, and perform SQL joins to analyze the data.
The correct answer is C. Creating external tables over the Parquet files in Cloud Storage lets BigQuery query the data in place with standard SQL and join it directly to native BigQuery tables, avoiding the time and cost of loading a petabyte of data. That makes it the most direct fit for a quick, one-time analysis. Option A (Dataproc/PySpark) and option B (Cloud Data Fusion) both require provisioning and configuring additional services, adding setup complexity that a one-time SQL analysis does not justify. Option D (bq load) would work, but loading a petabyte of Parquet into BigQuery is slow and incurs unnecessary storage cost for data that only needs to be queried once.
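As a minimal sketch of option C (the bucket path, dataset, table, and column names below are hypothetical), you would define an external table over the Parquet files and then join it to an existing native table in a single query:

```sql
-- Define an external table over Parquet files in Cloud Storage.
-- No data is loaded; BigQuery reads the files in place at query time.
CREATE EXTERNAL TABLE mydataset.app_logs_ext
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-log-bucket/logs/*.parquet']
);

-- One-time analysis: join the external logs to a native BigQuery table.
SELECT
  u.user_id,
  COUNT(*) AS error_count
FROM mydataset.app_logs_ext AS l
JOIN mydataset.users AS u
  ON l.user_id = u.user_id
WHERE l.severity = 'ERROR'
GROUP BY u.user_id;
```

Because the external table is just metadata pointing at the files, it can be created in seconds and dropped after the analysis, which is exactly what a one-time job calls for.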