You want to rebuild your batch pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over twelve hours to run. To expedite development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting speed and processing requirements?
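One approach that appears to fit the "serverless" and "SQL syntax" constraints is to ingest the Cloud Storage files into BigQuery and rewrite the PySpark transformations as BigQuery SQL that writes to a new table. The sketch below only builds the two SQL statements as strings; it does not call any Google Cloud API. The bucket, dataset, table, and column names are hypothetical placeholders, and the aggregation stands in for whatever the real PySpark job does.

```python
# Hypothetical names throughout: the bucket, dataset, table, and column
# names are placeholders, not taken from the question.


def build_load_statement(table: str, uri: str, fmt: str = "PARQUET") -> str:
    """Return a BigQuery LOAD DATA statement that ingests files from Cloud Storage."""
    return (
        f"LOAD DATA INTO `{table}`\n"
        f"FROM FILES (format = '{fmt}', uris = ['{uri}']);"
    )


def build_transform_statement(source: str, target: str) -> str:
    """Return a CREATE TABLE ... AS SELECT standing in for a PySpark groupBy/agg step."""
    return (
        f"CREATE OR REPLACE TABLE `{target}` AS\n"
        f"SELECT customer_id, SUM(amount) AS total_amount\n"
        f"FROM `{source}`\n"
        f"GROUP BY customer_id;"
    )


if __name__ == "__main__":
    # Step 1: load the raw files that already sit in Cloud Storage.
    print(build_load_statement("my_dataset.raw_orders", "gs://my-bucket/raw/*.parquet"))
    # Step 2: express the transformation in SQL and write it to a new table.
    print(build_transform_statement("my_dataset.raw_orders", "my_dataset.order_totals"))
```

The generated statements could then be run in the BigQuery console or as a scheduled query, so no cluster needs to be provisioned or managed.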