
Databricks Exam Databricks-Certified-Professional-Data-Engineer Topic 5 Question 17 Discussion

Actual exam question for Databricks's Databricks-Certified-Professional-Data-Engineer exam
Question #: 17
Topic #: 5
[All Databricks-Certified-Professional-Data-Engineer Questions]

A team of data engineers is adding tables to a DLT pipeline that contain repetitive expectations for many of the same data quality checks.

One member of the team suggests reusing these data quality rules across all tables defined for this pipeline.

What approach would allow them to do this?

Suggested Answer: A

Maintaining data quality rules in a centralized Delta table allows the same rules to be reused across all tables in a DLT (Delta Live Tables) pipeline, and even across pipelines. By storing the rules outside the pipeline's target schema and passing that schema name in as a pipeline parameter, the team can load one shared set of checks and apply it to every table the pipeline defines. This keeps data quality validations consistent and avoids replicating the same rules in each DLT notebook or file.
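As a rough illustration of that pattern, the sketch below loads rules from a shared Delta table and applies them with dlt.expect_all_or_drop. The table name data_quality_rules, its name/constraint/tag columns, the rules_schema pipeline parameter, the orders tag, and the source tables are all assumptions made for the example, not details from the question.

```python
import dlt
from pyspark.sql.functions import col

# `spark` is the ambient SparkSession in a Databricks / DLT notebook.

def get_rules(tag):
    """Load expectations for one tag from a shared (hypothetical) rules table.

    The table lives outside the pipeline's target schema; the schema name is
    supplied as the pipeline parameter `rules_schema`.
    """
    rules_schema = spark.conf.get("rules_schema")
    rules_df = (
        spark.read.table(f"{rules_schema}.data_quality_rules")
        .filter(col("tag") == tag)
    )
    # Build the {rule_name: SQL constraint} dict that dlt.expect_all_* accepts.
    return {row["name"]: row["constraint"] for row in rules_df.collect()}

@dlt.table
@dlt.expect_all_or_drop(get_rules("orders"))   # shared rules, table #1
def orders_clean():
    return spark.read.table("samples.tpch.orders")

@dlt.table
@dlt.expect_all_or_drop(get_rules("orders"))   # same shared rules, table #2
def orders_by_customer():
    return dlt.read("orders_clean").groupBy("o_custkey").count()
```

Because the rules live in one Delta table, adding or tightening a check there changes the behaviour of every table that loads the same tag, with no edits to the individual table definitions.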


Databricks Documentation on Delta Live Tables: Delta Live Tables Guide

Contribute your Thoughts:

Luke
2 months ago
I'm just picturing the team arguing over which option is best, like a bunch of data ninjas fighting over the perfect data quality kata.
upvoted 0 times
Fernanda
2 months ago
D) Maintain data quality rules in a separate Databricks notebook that each DLT notebook or file can import as a library.
upvoted 0 times
Anglea
2 months ago
A) Maintain data quality rules in a Delta table outside of this pipeline's target schema, providing the schema name as a pipeline parameter.
upvoted 0 times
Alba
2 months ago
Option D is the one for me! Keeping the data quality rules in a separate notebook is like a data engineer's version of 'Keep Calm and Carry On.'
upvoted 0 times
Sharita
2 months ago
I agree, having a separate notebook for data quality rules makes it easier to manage.
upvoted 0 times
Nieves
2 months ago
Option D is a good choice. It helps keep things organized.
upvoted 0 times
Zachary
3 months ago
Using global Python variables (option B) feels a bit hacky. I'd prefer a more structured approach like option A or D.
upvoted 0 times
Lashawnda
2 months ago
D) Maintain data quality rules in a separate Databricks notebook that each DLT notebook or file can import as a library.
upvoted 0 times
Leanora
2 months ago
A) Maintain data quality rules in a Delta table outside of this pipeline's target schema, providing the schema name as a pipeline parameter.
upvoted 0 times
Jose
3 months ago
I agree with Val, option A seems like the most practical solution.
upvoted 0 times
Doug
3 months ago
I'm feeling option C. Adding constraints through an external job with access to the pipeline config seems like a robust solution.
upvoted 0 times
Kandis
2 months ago
Let's go with option C then, it seems like the most practical approach.
upvoted 0 times
Becky
2 months ago
It would definitely streamline the process and make it easier to manage.
upvoted 0 times
Dominque
2 months ago
I agree, having an external job handle the constraints seems efficient.
upvoted 0 times
Delbert
2 months ago
Option C sounds like a good idea. It would centralize the data quality rules.
upvoted 0 times
Val
3 months ago
But with option A, we can easily maintain and update the data quality rules.
upvoted 0 times
Mike
3 months ago
I disagree, I believe option D would be more efficient.
upvoted 0 times
Val
3 months ago
I think option A is the best approach.
upvoted 0 times
Antonio
3 months ago
Option A is the way to go! Maintaining data quality rules in a separate Delta table is a clean and organized approach.
upvoted 0 times
Lennie
3 months ago
That sounds like a smart solution to ensure consistency and efficiency in the data quality checks.
upvoted 0 times
Detra
3 months ago
I agree, it would make it easier to manage and update the data quality rules for all tables in the pipeline.
upvoted 0 times
Tequila
3 months ago
Option A is the way to go! Maintaining data quality rules in a separate Delta table is a clean and organized approach.
upvoted 0 times
