Databricks Exam Databricks Certified Generative AI Engineer Associate Topic 6 Question 15 Discussion

Actual exam question for Databricks's Databricks Certified Generative AI Engineer Associate exam

Question #: 15
Topic #: 6

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

A. Number of customer inquiries processed per unit of time

B. Energy usage per query

C. Final perplexity scores for the training of the model

D. HuggingFace Leaderboard values for the base LLM

Suggested Answer: A

When deploying an LLM application for customer service inquiries, the primary focus is on measuring the operational efficiency and quality of the responses. Here's why A is the correct metric:

Number of customer inquiries processed per unit of time: This metric tracks the throughput of the customer service system, reflecting how many customer inquiries the LLM application can handle in a given time period (e.g., per minute or hour). High throughput is crucial in customer service applications where quick response times are essential to user satisfaction and business efficiency.

Real-time performance monitoring: Monitoring the number of queries processed is an important part of ensuring that the model is performing well under load, especially during peak traffic times. It also helps ensure the system scales properly to meet demand.
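As a rough illustration of what tracking this metric could look like, here is a minimal Python sketch that buckets inquiry completion times into one-minute windows and reports the busiest window. The function names and the sample timestamps are hypothetical, not part of any Databricks API; in a real deployment the timestamps would come from the application's own request logs.

```python
from collections import Counter
from datetime import datetime, timedelta


def inquiries_per_minute(timestamps):
    """Count completed inquiries in each one-minute bucket.

    `timestamps` is any iterable of datetime objects marking when an
    inquiry was answered (hypothetical input, e.g. parsed from logs).
    """
    buckets = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    # Return buckets in chronological order for easier reading.
    return dict(sorted(buckets.items()))


def peak_throughput(timestamps):
    """Return the highest per-minute inquiry count, or 0 if there is no data."""
    counts = inquiries_per_minute(timestamps)
    return max(counts.values()) if counts else 0


if __name__ == "__main__":
    # Made-up sample data: six inquiries completed over about two minutes.
    start = datetime(2025, 1, 1, 9, 0, 0)
    sample = [start + timedelta(seconds=s) for s in (5, 20, 50, 65, 70, 130)]
    for minute, count in inquiries_per_minute(sample).items():
        print(minute.strftime("%H:%M"), count)
    print("peak per minute:", peak_throughput(sample))  # peak per minute: 3
```

Fixed one-minute buckets are only one possible choice; a sliding window or a longer interval may suit lower-traffic applications better.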

Why other options are not ideal:

B. Energy usage per query: While energy efficiency is a consideration, it is not the primary concern for a customer-facing application where user experience (i.e., fast and accurate responses) is critical.

C. Final perplexity scores for the training of the model: Perplexity measures how well the model predicts held-out text during training and evaluation. The final training value is fixed once training ends, so it cannot reflect how the deployed application is performing in production.

D. HuggingFace Leaderboard values for the base LLM: The HuggingFace Leaderboard is useful during model selection and benchmarking, but its scores describe the base model on generic benchmarks, not the application's performance on real customer service traffic in production.

Focusing on throughput (inquiries processed per unit time) ensures that the LLM application is meeting business needs for fast and efficient customer service responses.

by Brinda at Apr 30, 2025, 09:52 PM

Blondell

27 days ago

I think we should consider both A) and C) to get a comprehensive view of the performance of the LLM application.

upvoted 0 times

...

Alline

1 month ago

I believe monitoring C) Final perplexity scores for the training of the model is also important to ensure the accuracy of the responses.

upvoted 0 times

...

Matthew

1 month ago

I agree with Alfred. That metric will show us how efficient the LLM application is in handling customer inquiries.

upvoted 0 times

...

Tricia

1 month ago

The correct answer is clearly A - number of customer inquiries processed. Unless they're running this thing on a potato, the energy usage is probably not a concern. And who cares about the leaderboard when you've got customers to serve?

upvoted 0 times

Mable

1 day ago

Energy usage per query is not as important as ensuring efficient customer service.

upvoted 0 times

...

Laurel

18 days ago

I agree, monitoring the number of customer inquiries processed is crucial for the success of the application.

upvoted 0 times

...

Aracelis

2 months ago

Haha, energy usage per query? What is this, a green AI challenge? I think the Generative AI Engineer needs to focus on the actual business metrics, not how much electricity the model is chugging.

upvoted 0 times

...

A: I agree, tracking the number of customer inquiries processed per unit of time is crucial for monitoring the performance of the LLM application.

upvoted 0 times

...