Exam DP-203 All Questions

View all questions & answers for the DP-203 exam

Exam DP-203 topic 2 question 127 discussion

Actual exam question from Microsoft's DP-203

Question #: 127
Topic #: 2

HOTSPOT
-

You have an Azure Data Lake Storage account that contains one CSV file per hour for January 1, 2020, through January 31, 2023. The files are partitioned by using the following folder structure.

You need to query the files by using an Azure Synapse Analytics serverless SQL pool. The solution must return the row count of each file created during the last three months of 2022.

How should you complete the query? To answer, select the appropriate options in the answer area.

Show Suggested Answer

Suggested Answer:

by Lewiasskick at Jan. 8, 2024, 9:33 p.m.

Comments

Submit Cancel

Lewiasskick

Highly Voted 1 year, 5 months ago

correct: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/query-specific-files

upvoted 6 times

...

Pey1nkh

Most Recent 4 months ago

wrong! r.filepath() : returns full path ( /year=2022/month=10/day=01/file1.csv ) r.filepath(1) : returns year (2022) r.filepath(2) : returns month which we need.

upvoted 2 times

...

WayBacK

10 months, 4 weeks ago

Correct Answer My reason: from the script provided, particularly from here "From OPENROWSET ( BULK 'csv/system1/2022/*/*.csv', The search on the file path uses a wildcard that begins after the year 2022 (i.e. the asterisks) and the first parameter is the month after 2022. Hence, r.filepath(1) is correct.

upvoted 1 times

...

ageorgieva

1 year ago

ChatGPT: Given the typical folder structure in your case, which is csv/system1/{year}/{month}/{filename}.csv, you need to access specific segments of this path to filter data by month: r.filepath(0) would return the first segment after the initial folder, likely csv. r.filepath(1) would then return system1. r.filepath(2) refers to the next segment, which, according to your folder structure, represents the year (2022). So answer is r.filepath(2).

upvoted 1 times

...

MohamedBI12

1 year, 3 months ago

Correct answer

upvoted 2 times

...

mghf61

1 year, 4 months ago

r.filepath(2) rerurn year (2022)

upvoted 1 times

...

Azure_2023

1 year, 5 months ago

Correct. The term "first wildcard" refers to the asterisk that represents the month in the file path placeholder. For this specific example.

upvoted 3 times

...

[Removed]

1 year, 5 months ago

Answers are correct: As per next example, parameter (1) refers to year, and paremeter (2) to month. SELECT r.filepath() AS filepath ,r.filepath(1) AS [year] ,r.filepath(2) AS [month] ,COUNT_BIG(*) AS [rows] FROM OPENROWSET( BULK 'csv/taxi/yellow_tripdata_*-*.csv', DATA_SOURCE = 'SqlOnDemandDemo', FORMAT = 'CSV', PARSER_VERSION = '2.0', FIRSTROW = 2 ) WITH ( vendor_id INT ) AS [r] WHERE r.filepath(1) IN ('2017') AND r.filepath(2) IN ('10', '11', '12') GROUP BY r.filepath() ,r.filepath(1) ,r.filepath(2) ORDER BY filepath;

upvoted 4 times

Gikan

1 year, 4 months ago

Just to be clear: The right answer is filepath(1), because In your example the 2. parameter (star) is the month, but in the answer the year is fixed and the month is the first parameter.

upvoted 4 times

...

zodraz

1 year, 5 months ago

Second one is r.filepath(2) asper: When called without parameter, returns the full file path that a row originates from. When called with parameter, it returns part of path that matches the wildcard on position specified in the parameter. For example, parameter value 1 would return part of path that matches the first wildcard.

upvoted 3 times

...