Exam AI-100 topic 1 question 20 discussion

Actual exam question from Microsoft's AI-100
Question #: 20
Topic #: 1

You plan to implement a new data warehouse for a planned AI solution.
You have the following information regarding the data warehouse:
✑ The data files will be available in one week.
✑ Most queries that will be executed against the data warehouse will be ad-hoc queries.
✑ The schemas of data files that will be loaded to the data warehouse will change often.
✑ One month after the planned implementation, the data warehouse will contain 15 TB of data.
You need to recommend a database solution to support the planned implementation.
What two solutions should you include in the recommendation? Each correct answer is a complete solution.
NOTE: Each correct selection is worth one point.

  • A. Apache Hadoop
  • B. Apache Spark
  • C. A Microsoft Azure SQL database
  • D. An Azure virtual machine that runs Microsoft SQL Server
Suggested Answer: AB

Comments

exam_taker5
Highly Voted 5 years, 10 months ago
I believe the answer should be Hadoop and Spark. Both of these are intended for unstructured data, and the question specifies that the schema will be changing constantly. Both also excel with big data (15 TB after the first month qualifies).
upvoted 22 times
CodeAnant
5 years, 9 months ago
I think Hadoop is correct, not SQL and not Spark, as Spark is for analytics, not for storage.
upvoted 5 times
Piraat
5 years, 3 months ago
But neither Hadoop nor Spark is a database system, right? So that doesn't answer the question.
upvoted 1 times
...
...
...
JCM
Highly Voted 5 years, 7 months ago
You ask for two solutions but the answer is only SQL.
upvoted 12 times
...
rveney
Most Recent 1 year, 12 months ago
The two recommended solutions for supporting the planned data warehouse implementation are B. Apache Spark and C. A Microsoft Azure SQL database.
upvoted 1 times
...
Derin_tade
3 years, 11 months ago
With Azure SQL Database, you can create a highly available and high-performance data storage layer for the applications and solutions in Azure. SQL Database can be the right choice for a variety of modern cloud applications because it enables you to process both relational data and non-relational structures, such as graphs, JSON, spatial, and XML. The hyperscale service tier for single databases enables you to scale to 100 TB, with fast backup and restore capabilities. https://docs.microsoft.com/en-us/azure/azure-sql/database/sql-database-paas-overview
upvoted 1 times
...
timosi
3 years, 11 months ago
The correct answers are B and D.
upvoted 1 times
...
Cornholioz
4 years, 4 months ago
By asking for a database solution, the question asks for a combination given the requirements. Part of the requirements can be addressed by SQL and part by NoSQL or Big Data. Spark is only for analytics but can still be used here. However, I would go with Hadoop and Azure SQL as a combined solution to address the 4 requirements. I may be wrong, but this is just how I see the "solution", since none of these offerings/services/products can cater to all requirements on its own. Also, I am wondering why it says Azure SQL DB and not Azure SQL DWH (or Synapse, but this is an old question from before Synapse was born).
upvoted 2 times
Cornholioz
4 years, 4 months ago
Update: Then again, looking at the Spark connector available for SQL Server and Azure SQL DB, I'm thinking Spark is still the right one instead of Hadoop. Check the link: https://docs.microsoft.com/en-us/sql/connect/spark/connector?view=sql-server-ver15 This is kind of what the question is asking about, ad-hoc queries etc. However, the Hyperscale service tier of Azure SQL DB can grow to 100 TB. SQL Server on a VM can do it easily, and the Spark connector is available for that too. Changing my answer to Spark... unsure about SQL DB or SQL Server VM. Think I'll go with SQL DB.
upvoted 3 times
allanm
4 years ago
It can't be SQL, as the schema of the data keeps changing. Hadoop and Spark have data storage systems that cater to the requirements of the question.
upvoted 1 times
...
...
...
alanblack
4 years, 4 months ago
AB is correct
upvoted 1 times
...
duytran216
4 years, 8 months ago
I think A&C. The first two requirements point to SQL, because the queries are ad-hoc and the data is available in 1 week, so not much storage is needed there. The remaining requirements are for the data warehouse. That is Hadoop: 15 TB.
upvoted 2 times
...
Nova077
4 years, 9 months ago
"The schemas of data files that will be loaded to the data warehouse will change often." This rules out SQL. It should be a NoSQL option. Hadoop stores data in the HDFS file system. Spark also has its own storage. So it will be Spark and Hadoop.
upvoted 5 times
codingorca
4 years, 7 months ago
The schema of the data files will change often, but not the schema of the data warehouse. These two are separate.
upvoted 1 times
...
...
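The disagreement above turns on whether a schema has to be fixed before the data is loaded. As a rough illustration only, the sketch below shows the "schema-on-read" idea in plain Python: records are parsed without a declared schema, and the fields are resolved when an ad-hoc query runs. It does not use Hadoop, Spark, or any Azure API; the sample records, field names, and the helpers `read_records` and `ad_hoc_total` are all hypothetical.

```python
import json

# Hypothetical sample data: two "files" whose schemas differ,
# standing in for data files whose schema changes between loads.
FILE_WEEK_1 = '{"id": 1, "amount": 10.0}\n{"id": 2, "amount": 5.5}\n'
FILE_WEEK_2 = '{"id": 3, "amount": 7.0, "region": "EU"}\n'


def read_records(*raw_files):
    """Parse JSON-lines text without declaring a schema up front."""
    for raw in raw_files:
        for line in raw.splitlines():
            if line.strip():
                yield json.loads(line)


def ad_hoc_total(records, field, where=None):
    """Sum `field` over records that match the optional `where` predicate.

    Records that lack `field` are skipped instead of failing the load.
    """
    return sum(
        rec[field]
        for rec in records
        if field in rec and (where is None or where(rec))
    )


records = list(read_records(FILE_WEEK_1, FILE_WEEK_2))
total_all = ad_hoc_total(records, "amount")
total_eu = ad_hoc_total(records, "amount", where=lambda r: r.get("region") == "EU")
print(total_all, total_eu)
```

The new `region` field in the second file needs no migration step here; older records simply do not match a filter on it. A relational table would instead need a schema change (or a flexible column type) before the same load could succeed, which is the trade-off the comments above are weighing.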
sayak17
4 years, 9 months ago
C is a correct option based on the link provided. Also, the Hyperscale option in SQL Database supports up to 100 TB. Don't know what the other answer might be.
upvoted 1 times
...
cramau
4 years, 10 months ago
Is there any consensus on this??
upvoted 5 times
shiranai
4 years, 5 months ago
I am wondering the same thing!!! Seems like no agreement yet
upvoted 1 times
...
...
BobjonesBob
4 years, 11 months ago
The question asks us to “recommend a DATABASE solution”. Spark and Hadoop are not databases, which means both SQL answers (C&D) would be the correct choices. Please correct me if I am wrong.
upvoted 9 times
codingorca
4 years, 7 months ago
I agree that C&D should be the answer. Neither Spark nor Hadoop is a database, or at least Spark is not. Azure SQL Hyperscale supports up to 100 TB.
upvoted 2 times
...
...
robiciccio
4 years, 11 months ago
I think the original answer provided is correct, based on the solution described at https://docs.microsoft.com/en-us/azure/sql-database/saas-multitenantdb-adhoc-reporting. Since it uses elastic query, there is only one answer: SQL database.
upvoted 4 times
...
troj
5 years, 1 month ago
Hadoop in Azure HDInsight could be a right answer: https://azure.microsoft.com/en-in/blog/azure-hdinsight-interactive-query-simplifying-big-data-analytics-architecture-and-operations/
upvoted 1 times
...
Atanu
5 years, 1 month ago
A and B
upvoted 3 times
...
avestabrzn
5 years, 2 months ago
Answer is AB
upvoted 2 times
...
bizolus
5 years, 2 months ago
Azure SQL can support more than 15 TB: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-service-tier-hyperscale
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other