Exam Professional Data Engineer topic 1 question 101 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 101
Topic #: 1

You need to copy millions of sensitive patient records from a relational database to BigQuery. The total size of the database is 10 TB. You need to design a solution that is secure and time-efficient. What should you do?

  • A. Export the records from the database as an Avro file. Upload the file to GCS using gsutil, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
  • B. Export the records from the database as an Avro file. Copy the file onto a Transfer Appliance and send it to Google, and then load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
  • C. Export the records from the database into a CSV file. Create a public URL for the CSV file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the CSV file into BigQuery using the BigQuery web UI in the GCP Console.
  • D. Export the records from the database as an Avro file. Create a public URL for the Avro file, and then use Storage Transfer Service to move the file to Cloud Storage. Load the Avro file into BigQuery using the BigQuery web UI in the GCP Console.
Suggested Answer: A 🗳️
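As a rough illustration of the workflow in option A, the sketch below builds the two command lines involved: a parallel `gsutil cp` upload and a load into BigQuery. It swaps the BigQuery web UI named in the question for the `bq` command-line tool, purely so the step can be shown as text. The bucket, path, and table names are hypothetical placeholders, and nothing is executed; the commands are only printed.

```python
import shlex


def build_commands(local_glob: str, bucket_uri: str, table: str):
    """Return (upload, load) argument lists for the option A workflow.

    All three arguments are hypothetical placeholders supplied by the caller.
    """
    # -m asks gsutil to run the copy with parallel threads/processes.
    upload = ["gsutil", "-m", "cp", local_glob, bucket_uri]
    # bq load accepts a wildcard URI, so every uploaded Avro shard is loaded.
    source_uri = bucket_uri.rstrip("/") + "/*.avro"
    load = ["bq", "load", "--source_format=AVRO", table, source_uri]
    return upload, load


upload_cmd, load_cmd = build_commands(
    "export/*.avro",                  # assumed local export location
    "gs://example-bucket/patients/",  # assumed bucket and prefix
    "example_dataset.patients",       # assumed dataset.table
)
print(shlex.join(upload_cmd))
print(shlex.join(load_cmd))
```

Printing rather than running keeps the sketch side-effect free; in practice the upload would run from a host with suitable credentials and enough bandwidth.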

Comments

Ganshank
Highly Voted 4 years ago
You are transferring sensitive patient information, so C & D are ruled out. Choice comes down to A & B. Here it gets tricky. How to choose Transfer Appliance: (https://cloud.google.com/transfer-appliance/docs/2.0/overview) Without knowing the bandwidth, it is not possible to determine whether the upload can be completed within 7 days, as recommended by Google. So the safest and most performant way is to use Transfer Appliance. Therefore my choice is B.
upvoted 60 times
tprashanth
3 years, 10 months ago
https://cloud.google.com/solutions/migration-to-google-cloud-transferring-your-large-datasets The table shows for 1Gbps, it takes 30 hrs for 10 TB. Generally, corporate internet speeds are over 1Gbps. I'm inclined to pick A
upvoted 4 times
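The 30-hour figure can be sanity-checked with simple arithmetic. The sketch below assumes decimal terabytes and a hypothetical `efficiency` factor for protocol overhead and link contention; the 75% value is an illustrative assumption, not a number taken from Google's table.

```python
def transfer_hours(size_tb: float, bandwidth_bps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move size_tb decimal terabytes over a link."""
    if bandwidth_bps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (bandwidth_bps * efficiency)  # usable throughput only
    return seconds / 3600


ideal = transfer_hours(10, 1e9)                     # 10 TB over a fully used 1 Gbps link
derated = transfer_hours(10, 1e9, efficiency=0.75)  # assumed 75% usable throughput
print(f"{ideal:.1f} h ideal, {derated:.1f} h at 75% efficiency")
```

At full line rate the estimate is roughly 22 hours, and at an assumed 75% efficiency it is close to the 30 hours quoted above.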
BigQuery
2 years, 5 months ago
SAY MY NAME! You need to transfer sensitive patient information, and you shouldn't do that over a public ISP.
upvoted 3 times
...
forepick
11 months, 3 weeks ago
If you transfer 10 TB over the wire, your network will be blocked for the entire transfer time. This isn't something a company would be happy to swallow.
upvoted 2 times
...
...
TNT87
3 years, 7 months ago
Answer is B. gsutil has a limit of 1 TB according to Google documentation; if the data is more than 1 TB, then we have to use Transfer Appliance.
upvoted 17 times
Yiouk
2 years, 9 months ago
The answer is clearly seen here: https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#transfer-options
upvoted 8 times
...
...
AzureDP900
1 year, 4 months ago
B is the right answer.
upvoted 4 times
...
...
SSV
Highly Voted 3 years, 10 months ago
Answer should be B. A is also correct, but it has its own limit: it allows only 5 TB of data to be uploaded to Cloud Storage at a time. https://cloud.google.com/storage/quotas I will go with B.
upvoted 8 times
VASI
3 years, 4 months ago
5 TB "for individual objects". Create smaller Avro files.
upvoted 2 times
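The point about smaller files can be sketched as a shard plan: split the export so that no single object approaches the per-object limit. The 5 TiB constant reflects the Cloud Storage object size limit; the 1 GB shard size is an arbitrary illustrative choice, and `plan_shards` is a hypothetical helper, not part of any Google tool.

```python
MAX_OBJECT_BYTES = 5 * 2**40  # Cloud Storage per-object size limit (5 TiB)


def plan_shards(total_bytes: int, target_shard_bytes: int) -> list[int]:
    """Return the size of each shard when total_bytes is split into
    pieces no larger than target_shard_bytes."""
    if total_bytes < 0 or target_shard_bytes <= 0:
        raise ValueError("sizes must be non-negative and the shard size positive")
    if target_shard_bytes > MAX_OBJECT_BYTES:
        raise ValueError("shard size exceeds the per-object limit")
    full, remainder = divmod(total_bytes, target_shard_bytes)
    sizes = [target_shard_bytes] * full
    if remainder:
        sizes.append(remainder)  # the last shard holds whatever is left over
    return sizes


shards = plan_shards(10 * 10**12, 10**9)  # 10 TB split into 1 GB shards (assumed size)
print(len(shards), "shards, largest is", max(shards), "bytes")
```

Many smaller shards also suit a parallel upload and a wildcard load into BigQuery.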
...
VASI
3 years, 4 months ago
AVRO compression can reduce file size to a tenth
upvoted 3 times
...
...
Naresh_4u
Most Recent 2 weeks, 1 day ago
Selected Answer: B
To transfer the data securely, and given the size of the data, B is the correct option.
upvoted 1 times
...
GCanteiro
3 months, 2 weeks ago
Selected Answer: A
IMO "A" is the most suitable option, since it could take 25 days to receive the Transfer Appliance and then another 25 days to ship it back and have the data available. https://cloud.google.com/transfer-appliance/docs/4.0/overview#transfer-speeds
upvoted 1 times
...
TVH_Data_Engineer
4 months, 4 weeks ago
Selected Answer: B
Given the sensitivity of the patient records and the large size of the data, using Google's Transfer Appliance is a secure and efficient method. The Transfer Appliance is a hardware solution provided by Google for transferring large amounts of data. It enables you to securely transfer data without exposing it over the internet.
upvoted 1 times
...
rocky48
5 months, 2 weeks ago
Selected Answer: B
Option B combines security, efficiency, and ease of use, making it a suitable choice for transferring sensitive patient records to BigQuery.
upvoted 1 times
...
spicebits
6 months, 1 week ago
Selected Answer: A
10 TB is nothing. With a single 10 Gbps interconnect you could transfer the data in 3 hours, and even at 1 Gbps without an interconnect you could transfer it in one weekend. With Transfer Appliance it takes 25 days to get the appliance and then another 25 days of waiting for the data to be available, which is not "time-efficient" at all. I go with A instead of B.
upvoted 3 times
spicebits
6 months, 1 week ago
I got the 25 days + 25 days from here: https://cloud.google.com/transfer-appliance/docs/4.0/overview#transfer-speeds
upvoted 2 times
...
...
A_Nasser
7 months, 3 weeks ago
Selected Answer: A
Transfer Appliance will take more time than gsutil. Also, the question does not say whether there is a Google data centre near the organization's location.
upvoted 3 times
...
DineshVarma
9 months ago
Selected Answer: D
As per Google's recommendation, for transfers above 1 TB from on-premises, from Google Cloud, or from other cloud storage such as S3, we need to use Storage Transfer Service.
upvoted 1 times
...
arien_chen
9 months ago
Transfer Appliance has an expected turnaround time of 20 days. https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#expected%20turnaround:~:text=The%20expected%20turnaround%20time%20for%20a%20network%20appliance%20to%20be%20shipped%2C%20loaded%20with%20your%20data%2C%20shipped%20back%2C%20and%20rehydrated%20on%20Google%20Cloud%20is%2020%20days. The best answer would be A. If gsutil can use 100 Mbps, the transfer would take 12 days, which is more time-efficient than B. This is a reasonable assumption. https://cloud.google.com/static/architecture/images/big-data-transfer-how-to-get-started-transfer-size-and-speed.png
upvoted 1 times
...
Colourseun
9 months ago
I will go with "A" because of the transit time needed to get the Transfer Appliance to Google, which also depends on the organisation's location. gsutil works anywhere the internet is available.
upvoted 1 times
...
NeoNitin
10 months ago
Brother, hear my point once and then do your own research. Option A, because yes, it is 10 TB, but don't overlook that the file is being compressed into Avro, which compresses by 90%-92%, so in the end we have 1 TB or less of data to transfer. Now tell me, why should I use Transfer Appliance? Transfer Appliance comes in 40 TB and 300 TB categories. Why go offline, when it will take 7 days or more for your data to come online? If you use gsutil, even running at only 100 MB/s without dedicated bandwidth, it will bring the 1 TB online within 1 day, because Avro has already taken the file from 10 TB down to 1 TB. So gsutil is the best. The question may not say cost-effective, but look at the time as well.
upvoted 3 times
...
aewis
10 months, 1 week ago
Selected Answer: B
A will take a crazy amount of time if the organization doesn't have a dedicated link.
upvoted 1 times
...
ZZHZZH
10 months, 1 week ago
Selected Answer: A
Transfer Appliance is not as time-efficient when you have enough bandwidth. https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#transfer_appliance_for_larger_transfers
upvoted 3 times
...
WillemHendr
10 months, 4 weeks ago
Selected Answer: B
There is no "cost effective" requirement. If this is not a clear case for the appliance, then what is?
upvoted 1 times
...
Ender_H
11 months, 1 week ago
Selected Answer: A
A is the answer. The question states the following facts:
- Total size of the database: 10 TB.
- The solution needs to be:
  * Secure
  * Time-efficient

Total size of the database: it will be significantly reduced by Avro file compression (up to 90% compression).

Secure transfer: even though we are dealing with sensitive data, the data is encrypted in transit when using `gsutil cp` to upload it to GCS. https://cloud.google.com/storage/docs/gsutil/addlhelp/SecurityandPrivacyConsiderations#transport-layer-security

Time-efficient: gsutil could upload 10 TB of data in 30 hours (or in 3 hours if it is first compressed to 1 TB with Avro).
upvoted 5 times
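The compression argument above amounts to a linear scaling, which the sketch below spells out. The 90% reduction is the commenter's own estimate and depends heavily on the data, and the 30 hours per 10 TB reference is the figure quoted in this thread; both are treated here as assumptions, not measurements.

```python
def compressed_size_tb(raw_tb: float, reduction: float) -> float:
    """Size after compression, where reduction is the fraction removed (0.9 = 90%)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return raw_tb * (1 - reduction)


def scaled_hours(size_tb: float, reference_hours: float, reference_tb: float) -> float:
    """Scale a known transfer time linearly to a different data size."""
    return reference_hours * size_tb / reference_tb


size = compressed_size_tb(10, 0.9)                               # assumed 90% reduction
hours = scaled_hours(size, reference_hours=30, reference_tb=10)  # thread's 30 h per 10 TB
print(f"{size:.1f} TB to upload, about {hours:.1f} h")
```

If the real compression ratio is lower, the estimate grows proportionally, so the 3-hour figure is best read as a lower bound.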
...
dgteixeira
11 months, 1 week ago
Selected Answer: A
It has to be gsutil. This documentation: https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#transfer-options states that if it meets the project's deadline, use gsutil. Also, for Transfer Appliance, here (https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#transfer_appliance_for_larger_transfers), it states: "The expected turnaround time for a network appliance to be shipped, loaded with your data, shipped back, and rehydrated on Google Cloud is 20 days." Even at 100 Mbps, 10 TB takes 12 days, almost half the Transfer Appliance turnaround. It's, of course, option A.
upvoted 1 times
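The comparison above can be turned around into a break-even question: what sustained bandwidth makes an online upload take exactly as long as the 20-day appliance turnaround? The sketch below assumes decimal terabytes and an ideal link with no overhead, so a real network would need somewhat more nominal bandwidth than the result.

```python
SECONDS_PER_DAY = 86_400


def break_even_mbps(size_tb: float, turnaround_days: float) -> float:
    """Sustained bandwidth (in Mbps) at which an online transfer of size_tb
    takes exactly turnaround_days."""
    if size_tb <= 0 or turnaround_days <= 0:
        raise ValueError("size and turnaround must be positive")
    bits = size_tb * 1e12 * 8
    seconds = turnaround_days * SECONDS_PER_DAY
    return bits / seconds / 1e6


threshold = break_even_mbps(10, 20)  # 10 TB against the quoted 20-day turnaround
print(f"Online transfer wins above roughly {threshold:.1f} Mbps sustained")
```

For 10 TB the threshold comes out a little under 50 Mbps, which is consistent with the 100 Mbps case finishing well inside the appliance turnaround.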
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other