A Generative AI Engineer is developing a RAG application and would like to experiment with different embedding models to improve the application's performance.
Which strategy for picking an embedding model should they choose?
A. Pick an embedding model with multilingual support to handle potential multilingual user questions
B. Pick the most recent and most performant open LLM released at the time
C. Pick an embedding model trained on related domain knowledge
D. Pick the embedding model ranked highest on the Massive Text Embedding Benchmark (MTEB) leaderboard hosted by HuggingFace
The most effective strategy is C: pick an embedding model trained on related domain knowledge, because it directly targets retrieval quality by aligning the embedding space with the application's semantic context. Domain-specific models capture nuanced relationships that general-purpose models miss, yielding more relevant retrieved documents and better RAG outputs.
However, D is a close second and serves as a practical fallback or complementary approach, especially during experimentation. If the domain is broad, unclear, or lacks specialized models, starting with a top MTEB-ranked model gives a strong baseline. The engineer can browse MTEB rankings on HuggingFace and test models such as E5 or BGE, which are well documented and widely supported.
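In practice the two strategies combine well: shortlist a few strong MTEB models, then compare them on a small sample of your own domain data. Below is a minimal sketch, assuming the sentence-transformers package and a toy set of query-to-passage pairs; the model names are illustrative MTEB entries, not recommendations.

```python
# Minimal sketch: compare candidate embedding models on a small, domain-flavored
# evaluation set instead of relying on leaderboard rank alone.
# Assumes the `sentence-transformers` package; model names are illustrative.
from sentence_transformers import SentenceTransformer, util

candidates = [
    "BAAI/bge-small-en-v1.5",  # general-purpose MTEB entry
    "intfloat/e5-small-v2",    # note: E5 models expect "query: "/"passage: " prefixes per their model card
]

# Toy evaluation set: each query is paired with the passage it should retrieve.
# In practice, sample real user questions and documents from your domain.
eval_pairs = [
    ("What is the maximum daily dose of ibuprofen?",
     "Adults should not exceed 1200 mg of over-the-counter ibuprofen per day."),
    ("How do I reset the device to factory settings?",
     "Hold the power button for ten seconds to restore factory defaults."),
]
corpus = [passage for _, passage in eval_pairs]

for name in candidates:
    model = SentenceTransformer(name)
    corpus_emb = model.encode(corpus, normalize_embeddings=True)
    hits = 0
    for i, (query, _) in enumerate(eval_pairs):
        q_emb = model.encode(query, normalize_embeddings=True)
        scores = util.cos_sim(q_emb, corpus_emb)[0]
        if int(scores.argmax()) == i:  # top-1 retrieval hit
            hits += 1
    print(f"{name}: top-1 accuracy = {hits / len(eval_pairs):.2f}")
```

Top-1 accuracy over a handful of pairs is only a smoke test; a real comparison would use recall@k or MRR over a few hundred labeled pairs. Even this rough check, though, can reveal when a leaderboard leader underperforms on specialized vocabulary.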