You want to rebuild your batch pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over twelve hours to run. To expedite development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting speed and processing requirements?