exam questions

Exam DP-100 All Questions

View all questions & answers for the DP-100 exam

Exam DP-100 topic 7 question 4 discussion

Actual exam question from Microsoft's DP-100
Question #: 4
Topic #: 8
[All DP-100 Questions]

HOTSPOT -
You need to identify the methods for dividing the data according to the testing requirements.
Which properties should you select? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Show Suggested Answer Hide Answer
Suggested Answer:
Scenario: Testing -
You must produce multiple partitions of a dataset based on sampling using the Partition and Sample module in Azure Machine Learning Studio.

Box 1: Assign to folds -
Use Assign to folds option when you want to divide the dataset into subsets of the data. This option is also useful when you want to create a custom number of folds for cross-validation, or to split rows into several groups.
Not Head: Use Head mode to get only the first n rows. This option is useful if you want to test a pipeline on a small number of rows, and don't need the data to be balanced or sampled in any way.
Not Sampling: The Sampling option supports simple random sampling or stratified random sampling. This is useful if you want to create a smaller representative sample dataset for testing.

Box 2: Partition evenly -
Specify the partitioner method: Indicate how you want data to be apportioned to each partition, using these options:
✑ Partition evenly: Use this option to place an equal number of rows in each partition. To specify the number of output partitions, type a whole number in the
Specify number of folds to split evenly into text box.
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/partition-and-sample

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
Arend78
10 months, 1 week ago
"You must create three equal partitions for cross-validation. You must also configure the cross-validation process so that the rows in the test and training datasets are divided evenly by properties that are near each city's main river. You must complete this task before the data goes through the sampling process." Considering "divided evenly by properties that are near each city's main river", shouldn't the sampling process be stratified? That would make the correct answer "Partition with custom partitions"?
upvoted 1 times
michaelmorar
8 months, 1 week ago
Makes sense, but there doesn't seem to be an option for custom partitioning based on a value a certain field value. Custom partition only allows us to play with the proportions in each split. Stratified Split is not selectable in the question (not sure why)... From Microsoft's reference: Partition with customized proportions: Use this option to specify the size of each partition as a comma-separated list. For example, assume that you want to create three partitions. The first partition will contain 50 percent of the data. The remaining two partitions will each contain 25 percent of the data. In the List of proportions separated by comma box, enter these numbers: .5, .25, .25.
upvoted 1 times
...
...
JTWang
1 year, 6 months ago
Partition evenly: Use this option to place an equal number of rows in each partition. To specify the number of output partitions, enter a whole number in the Specify number of folds to split evenly into box. https://docs.microsoft.com/en-us/azure/machine-learning/component-reference/partition-and-sample#split-data-into-partitions
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...
exam
Someone Bought Contributor Access for:
SY0-701
London, 1 minute ago