Exam DP-100 topic 2 question 33 discussion

Actual exam question from Microsoft's DP-100
Question #: 33
Topic #: 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are analyzing a numerical dataset which contains missing values in several columns.
You must clean the missing values using an appropriate operation without affecting the dimensionality of the feature set.
You need to analyze a full dataset to include all values.
Solution: Use the Last Observation Carried Forward (LOCF) method to impute the missing data points.
Does the solution meet the goal?

  • A. Yes
  • B. No
Suggested Answer: B

Comments

timosi
Highly Voted 3 years, 1 month ago
Yes, it meets the goal of keeping the dimensionality of the feature set. Is it the right approach? Without context it is difficult to say, but if the observations are independent, the answer would be no.
upvoted 17 times
...
David_Tadeu
Highly Voted 2 years, 1 month ago
Selected Answer: A
ID | NAME | T1 | T2 | T3
1  | Joe  | 37 | 89 | -
2  | Pip  | 39 | -  | 80
3  | Guy  | 89 | 99 | 45
4  | Bane | 33 | -  | -
5  | Sue  | 77 | -  | -

If we applied LOCF, it would turn into

ID | NAME | T1 | T2 | T3
1  | Joe  | 37 | 89 | 89
2  | Pip  | 39 | 39 | 80
3  | Guy  | 89 | 99 | 45
4  | Bane | 33 | 33 | 33
5  | Sue  | 77 | 77 | 77

Hence it would:
- CLEAN THE MISSING DATA
- MAINTAIN THE DIMENSIONALITY
(Whether it would make sense to do so is a different question.)

The LOCF method can be used when data are longitudinal (i.e. repeated measures have been taken per subject by time point). So, it would make sense to use it in a dataset where each row contains an Olympic athlete and the columns contain their distance attempts for long jump, but it would not make sense to use it in a dataset where each row contains an Olympic athlete and the columns contain their time attempts for 100m, 400m, and 1000m.
upvoted 8 times
...
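The row-wise LOCF described in David_Tadeu's comment can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original comment: the `locf` helper and the `scores` layout are assumptions made for the example, and `None` stands in for the `-` cells. In pandas, the equivalent operation would typically be `DataFrame.ffill(axis=1)`.

```python
def locf(values):
    """Return a copy of `values` with each None replaced by the last
    non-missing value seen before it (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical encoding of the table: name -> [T1, T2, T3], None = missing.
scores = {
    "Joe": [37, 89, None],
    "Pip": [39, None, 80],
    "Guy": [89, 99, 45],
    "Bane": [33, None, None],
    "Sue": [77, None, None],
}

imputed = {name: locf(row) for name, row in scores.items()}

for name, row in imputed.items():
    print(name, row)

# Compare the number of rows and columns before and after imputation.
same_shape = len(imputed) == len(scores) and all(
    len(imputed[name]) == len(scores[name]) for name in scores
)
print("shape preserved:", same_shape)
```

Because each gap is filled from the value to its left, a row whose first column is missing would stay incomplete under this sketch.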
ManuelHenriques
Most Recent 1 month, 3 weeks ago
Selected Answer: A
Yes, LOCF maintains numerical dimensionality
upvoted 1 times
...
84b1989
3 months, 2 weeks ago
Selected Answer: B
Last Observation Carried Forward (LOCF): The LOCF method fills missing values by carrying forward the last observed value. This method is typically used in time-series data where the order of observations matters. However, it may not be appropriate for non-time-series data or datasets where the assumption of continuity (i.e., the last observed value is a good estimate for missing values) does not hold.

Impact on Dimensionality: While LOCF does not reduce the dimensionality of the dataset, it may introduce bias or inaccuracies if the missing values are not related to the last observed values.

Better Alternatives: For numerical datasets, other imputation methods such as mean, median, or mode imputation, or more advanced techniques like k-nearest neighbors (KNN) imputation, might be more appropriate and effective.
upvoted 2 times
...
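The mean and median alternatives mentioned in the comment above could look roughly like the following. This is a sketch using only the Python standard library; the `impute_column` helper and the `columns` data are hypothetical and exist only for illustration.

```python
from statistics import mean, median


def impute_column(values, strategy=mean):
    """Replace None entries with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    fill = strategy(observed)
    return [fill if v is None else v for v in values]


# Hypothetical numeric columns; None marks a missing value.
columns = {
    "age": [25, None, 31, 40, None],
    "income": [40, 52, None, 61, 1000],
}

by_mean = {name: impute_column(col, mean) for name, col in columns.items()}
by_median = {name: impute_column(col, median) for name, col in columns.items()}

print("mean imputation:  ", by_mean)
print("median imputation:", by_median)
```

Each column keeps its length, so the number of rows and columns is unchanged. The `income` column contains one extreme value, which pulls the mean fill far more than the median fill; that is one reason median imputation is often preferred for skewed data.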
Ahmed_Gehad
9 months, 1 week ago
Selected Answer: B
The Last Observation Carried Forward (LOCF) method is a technique that is used to impute missing values by carrying forward the last observed value for that variable. This can be useful for imputing missing values in time series data. However, the LOCF method can be problematic if there are a lot of missing values in a column. This is because the LOCF method will simply carry forward the last observed value, even if that value is an outlier. This can skew the results of your analysis. In this case, the goal is to clean the missing values using an appropriate operation without affecting the dimensionality of the feature set. However, the LOCF method can affect the dimensionality of the feature set, as it will create new rows in the dataset to store the imputed values. Therefore, the LOCF method is not the best solution for this problem.
upvoted 6 times
...
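The concern about outliers raised in the comment above can be illustrated with a small sketch. The `locf` helper and the `readings` values are assumptions made for this example, not data from the question.

```python
from statistics import mean


def locf(values):
    """Carry the last non-missing value forward over None entries."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical readings: one outlier (500) followed by a long gap.
readings = [10, 11, 500, None, None, None, None, 12]

filled = locf(readings)
observed = [v for v in readings if v is not None]

print("length before/after:", len(readings), len(filled))
print("mean of observed values:", mean(observed))
print("mean after LOCF:", mean(filled))
```

In this sketch the outlier is copied into every gap that follows it, so the mean after imputation is much larger than the mean of the observed values. The list has the same length before and after, since gaps are filled in place and no entries are added.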
PI_Team
9 months, 3 weeks ago
Selected Answer: B
The correct answer is B. No. The Last Observation Carried Forward (LOCF) method is commonly used for imputing missing values in time series data, where the last observed value is carried forward to replace the missing values. However, this method may not be appropriate for non-time series data or when the missing values are not expected to have a linear or sequential relationship. To clean the missing values in a numerical dataset without affecting the dimensionality of the feature set, other methods such as mean imputation, median imputation, or regression imputation could be considered. These methods calculate the mean, median, or use regression models to estimate the missing values based on the available data without changing the dimensionality of the feature set. SaM
upvoted 3 times
phydev
9 months, 2 weeks ago
Yes, ChatGPT agrees.
upvoted 1 times
...
...
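Regression imputation, mentioned in PI_Team's comment, might be sketched as follows for the simplest case of one predictor column. The `fit_line` helper and the `rows` data are hypothetical; a real pipeline would more likely use a library model and several predictors.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical rows of (x, y); y is None where the value is missing.
rows = [(1, 2.0), (2, 4.0), (3, None), (4, 8.0), (5, None)]

# Fit the line on complete rows only, then predict the missing y values.
complete = [(x, y) for x, y in rows if y is not None]
intercept, slope = fit_line([x for x, _ in complete], [y for _, y in complete])

imputed = [(x, intercept + slope * x if y is None else y) for x, y in rows]

for row in imputed:
    print(row)
```

Unlike LOCF, the fill value here depends on every complete row rather than on a single neighbouring observation, and the number of rows and columns is again unchanged.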
DrewSeeSharp
1 year, 2 months ago
No, the solution does not meet the goal of analyzing a full dataset to include all values. While the LOCF method can be useful for imputing missing data points in time series data, it is not appropriate for all types of data and may not accurately reflect the true values of the missing data points. Additionally, imputing missing data using LOCF will not result in a full dataset as the missing values will still exist in the original dataset. Other imputation methods such as mean imputation, median imputation, or multiple imputation may be more appropriate depending on the characteristics of the dataset and the nature of the missing data.
upvoted 4 times
...
phdykd
1 year, 2 months ago
Yes, the solution meets the goal of analyzing a full dataset that includes all values by using the Last Observation Carried Forward (LOCF) method to impute the missing data points. With LOCF, the missing value in a given time series is filled with the most recent non-missing value in the same series. This method is simple to implement and can be useful in situations where missing values are few and the time series is relatively smooth. However, it is important to note that this method can introduce bias if the missing values are not missing at random and can lead to artificially smoothed data.
upvoted 3 times
...
Edriv
1 year, 4 months ago
Agree Yes
upvoted 1 times
...
Amyreads
1 year, 5 months ago
Selected Answer: B
The question also suggests "you are analyzing the full data set" to impute the missing values and therefore MICE would be correct
upvoted 5 times
...
JTWang
1 year, 6 months ago
Selected Answer: B
The question requires analyzing the full dataset, using all values to fill in the missing ones, so LOCF does not meet the requirements. LOCF only fills a missing value with the last observed value.
upvoted 2 times
silva_831
1 year, 6 months ago
As per your answer, the option should be A instead of B.
upvoted 1 times
...
...
ning
1 year, 10 months ago
Selected Answer: A
It is absolutely possible to replace missing data with LOCF and keep the same number of rows and columns as the original dataset! That is the only requirement of the question!
upvoted 4 times
...
FU_User
1 year, 11 months ago
Selected Answer: B
"You need to analyze a full dataset to include all values." This requirement is not fulfilled as only the last observation is used -> B
upvoted 2 times
...
azurecert2021
2 years, 10 months ago
Not sure, but I was not able to find LOCF on the link below: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data
upvoted 3 times
...
snac
2 years, 10 months ago
I think this should be a yes. LOCF will keep the dimensionality as much as MICE. And dimensionality was the only real requirement.
upvoted 4 times
...
prashantjoge
2 years, 11 months ago
The answer should be yes regardless. How can MICE reduce dimensionality?
upvoted 1 times
...