You want to rebuild your batch pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over twelve hours to run. To expedite development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting speed and processing requirements?