Databricks

Machine Learning Professional — Databricks Certified Machine Learning Professional Study Guide

82 practice questions · Updated 2026-02-18 · $19 (70% off) · HTML + PDF formats

Machine Learning Professional Exam Overview

Prepare for the Databricks Machine Learning Professional certification exam with our comprehensive study guide. This study material contains 82 practice questions sourced from real exams and expert-verified for accuracy. Each question includes the correct answer and a detailed explanation to help you understand the material thoroughly.

The Machine Learning Professional exam — Databricks Certified Machine Learning Professional — is offered by Databricks. Our study materials were last updated on 2026-02-18 to reflect the most recent exam objectives and content.

What You Get

82 Practice Questions

Complete question bank covering all exam domains and objectives.

HTML + PDF Formats

Interactive HTML file (recommended) for screen study and a print-ready PDF.

Instant Download

Access your study materials immediately after purchase.

Email with Permanent Download Links

You will receive a confirmation email with permanent download links, so you can re-download the files at any time.

Why Choose CheapestExamDumps?

Lowest Price Available

Only $19 per exam — competitors charge $50-$300 for similar content.

Updated Monthly

Study materials refreshed within 30 days of any exam content changes.

Free Preview

Try 15 real practice questions before you buy — no signup required.

Instant Access

Download HTML + PDF immediately after payment. No waiting, no account needed.

$63 $19

One-time payment · HTML + PDF · Instant download · 82 questions

Free Sample — 15 Practice Questions

Preview 15 of 82 questions from the Machine Learning Professional exam. Try before you buy — purchase the full study guide for all 82 questions with answers and explanations.

Question 65

A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model. Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?

A. Staging, Production, Archived
B. Production
C. None, Staging, Production, Archived
D. Staging, Production
E. None, Staging, Production
Show Answer
Correct Answer: D
Explanation:
With MLflow Model Serving enabled, only model versions in the Staging and Production stages are automatically deployed and can be queried. Versions in None or Archived stages are not served.

Question 16

A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on the training set logged with the model, but a few of the feature values in the column spend have since been updated and are present in the customer-level Spark DataFrame spark_df. The customer_id column is the primary key of both spark_df and the training set used when training and logging the model. Which code block can be used to compute predictions for the training set while overwriting its old spend values with the new spend values from spark_df?

A. fs.score_batch(model_uri, spark_df)
B. fs.score_model(model_uri, spark_df)
C. df = fs.get_updated_feature(spark_df, model_uri) fs.score_batch(model_uri, df)
D. df = fs.get_updated_features(spark_df) fs.score_batch(model_uri, df)
Show Answer
Correct Answer: A
Explanation:
FeatureStoreClient.score_batch uses the feature lookups logged with the model and joins them using the primary key. If the input DataFrame already contains a feature column (such as spend), those provided values override the stored feature values, enabling batch inference on the training set with updated spend values. The other options reference methods that do not exist or are unnecessary.

Question 21

A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client. Which code block can they use to accomplish the task?

A.
B.
C.
D.
Show Answer
Correct Answer: D
Explanation:
In MLflow, moving a registered model version between stages is done with `MlflowClient.transition_model_version_stage()`. This method explicitly transitions a given model version, identified by its registered model name and version number, to a target stage such as `Production`. The other options either use incorrect APIs or unnecessary/invalid parameters.

Question 66

Which of the following is a benefit of logging a model signature with an MLflow model?

A. The model will have a unique identifier in the MLflow experiment
B. The schema of input data can be validated when serving models
C. The model can be deployed using real-time serving tools
D. The model will be secured by the user that developed it
E. The schema of input data will be converted to match the signature
Show Answer
Correct Answer: B
Explanation:
Logging a model signature in MLflow records the expected schema of the model’s inputs and outputs. During model serving, MLflow can use this signature to validate incoming data against the defined schema, helping catch mismatches early and ensuring the model receives data in the correct format.

Question 82

A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client. Which of the following code blocks can they use to accomplish the task?

A.
B.
C.
D.
E.
Show Answer
Correct Answer: C
Explanation:
In MLflow, promoting a registered model version to another stage is done with MlflowClient.transition_model_version_stage(), which requires only the model name, version number, and the target stage. There is no need to specify the source stage. The code in option C correctly calls this method with name="model", version=model_version, and stage="Production", which moves the model version from Staging (or any current stage) to Production.

Question 26

Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?

A. All of these reasons
B. JS is not normalized or smoothed
C. None of these reasons
D. JS is more robust when working with large datasets
E. JS does not require any manual threshold or cutoff determinations
Show Answer
Correct Answer: C
Explanation:
None of the listed statements is a valid reason to prefer Jensen–Shannon (JS) distance over the Kolmogorov–Smirnov (KS) test. JS is in fact normalized and smoothed (so B is false), it is not inherently more robust than KS on large datasets (D is false), and although JS yields a bounded score, practitioners still must choose thresholds to flag drift (E is false). Therefore, none of the given options correctly explains a reason to use JS over KS.
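To make the comparison concrete, here is a small sketch contrasting the two approaches on synthetic data. The bin count and any drift threshold are arbitrary choices made for illustration, which is exactly the point: JS yields a bounded score but still leaves the cutoff to the practitioner.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5000)  # reference (training-time) sample
live = rng.normal(0.5, 1.0, 5000)   # shifted production sample

# KS works directly on the two samples and returns a p-value.
ks_stat, p_value = ks_2samp(train, live)

# JS distance needs both samples binned into distributions first.
bins = np.histogram_bin_edges(np.concatenate([train, live]), bins=50)
p, _ = np.histogram(train, bins=bins, density=True)
q, _ = np.histogram(live, bins=bins, density=True)
js = jensenshannon(p, q, base=2)  # bounded in [0, 1]; scipy normalizes p and q

# Flagging drift from the JS score still requires a manually chosen
# threshold (e.g. js > 0.1), so option E above is false.
```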

Question 52

A data scientist has developed a model to predict ice cream sales using the expected temperature and expected number of hours of sun in the day. However, the expected temperature is dropping beneath the range of the input variable on which the model was trained. Which of the following types of drift is present in the above scenario?

A. Label drift
B. None of these
C. Concept drift
D. Prediction drift
E. Feature drift
Show Answer
Correct Answer: E
Explanation:
The issue described is a change in the distribution of an input feature: the expected temperature is moving outside the range seen during training. This is a classic case of feature (data/covariate) drift, where the input feature values shift, even if the relationship between inputs and target has not necessarily changed.
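One simple way to catch this kind of feature drift in practice is a range check against the training data. The sketch below uses made-up temperature values purely for illustration:

```python
import numpy as np

# Training-time range of the temperature feature (illustrative values).
train_temps = np.array([12.0, 18.0, 25.0, 31.0, 35.0])
train_min, train_max = train_temps.min(), train_temps.max()

def out_of_range_fraction(new_values, lo, hi):
    """Fraction of incoming feature values outside the training range."""
    new_values = np.asarray(new_values)
    return float(np.mean((new_values < lo) | (new_values > hi)))

# Incoming temperatures have dropped below anything seen in training.
incoming = np.array([5.0, 7.5, 11.0, 14.0])
frac = out_of_range_fraction(incoming, train_min, train_max)
# frac is 0.75: three of the four values fall below the training minimum.
```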

Question 38

Which of the following MLflow Model Registry use cases requires the use of an HTTP Webhook?

A. Starting a testing job when a new model is registered
B. Updating data in a source table for a Databricks SQL dashboard when a model version transitions to the Production stage
C. Sending an email alert when an automated testing Job fails
D. None of these use cases require the use of an HTTP Webhook
E. Sending a message to a Slack channel when a model version transitions stages
Show Answer
Correct Answer: E
Explanation:
MLflow Model Registry HTTP webhooks are designed to notify or trigger external systems when registry events occur, such as model version creation or stage transitions. Sending a message to a Slack channel on a stage transition requires calling an external service (Slack) via HTTP, which is exactly what webhooks provide. The other options can be handled internally within Databricks jobs or alerting mechanisms without requiring an MLflow registry webhook.

Question 39

Which of the following lists all of the model stages that are available in the MLflow Model Registry?

A. Development, Staging, Production
B. None, Staging, Production
C. Staging, Production, Archived
D. None, Staging, Production, Archived
E. Development, Staging, Production, Archived
Show Answer
Correct Answer: D
Explanation:
MLflow Model Registry defines four model stages: None (default upon registration), Staging (for validation/testing), Production (for live serving), and Archived (for deprecated models). Therefore the correct list is None, Staging, Production, Archived.

Question 51

A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column. Which of the following code blocks accomplishes this task?

A. spark.read.format("delta").load(path).drop("star_rating")
B. spark.read.format("delta").table(path).drop("star_rating")
C. Delta tables cannot be modified
D. spark.read.table(path).drop("star_rating")
E. spark.sql("SELECT * EXCEPT star_rating FROM path")
Show Answer
Correct Answer: A
Explanation:
The table is stored at a filesystem path, so it must be read with spark.read.format("delta").load(path). Dropping a column is a DataFrame operation, which drop("star_rating") performs. Options B and D expect a table name, not a path; E is invalid SQL for a path; C is false.

Question 54

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark. Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?

A. client.list_run_infos(exp_id)
B. spark.read.format("delta").load(exp_id)
C. There is no way to programmatically return run-level results from an MLflow Experiment.
D. mlflow.search_runs(exp_id)
E. spark.read.format("mlflow-experiment").load(exp_id)
Show Answer
Correct Answer: E
Explanation:
MLflow experiments can be queried as a Spark DataFrame using the Databricks-provided mlflow-experiment data source. The call spark.read.format("mlflow-experiment").load(exp_id) loads run-level results (metrics, parameters, tags, etc.) for the specified experiment ID into a Spark DataFrame. Other options either return Python objects, are invalid formats, or are incorrect.

Question 83

A machine learning engineer is migrating a machine learning pipeline to use Databricks Machine Learning. They have programmatically identified the best run from an MLflow Experiment and stored its URI in the model_uri variable and its Run ID in the run_id variable. They have also determined that the model was logged with the name "model". Now, the machine learning engineer wants to register that model in the MLflow Model Registry with the name "best_model". Which of the following lines of code can they use to register the model to the MLflow Model Registry?

A. mlflow.register_model(model_uri, "best_model")
B. mlflow.register_model(run_id, "best_model")
C. mlflow.register_model(f"runs:/{run_id}/best_model", "model")
D. mlflow.register_model(model_uri, "model")
E. mlflow.register_model(f"runs:/{run_id}/model")
Show Answer
Correct Answer: A
Explanation:
To register a model in the MLflow Model Registry, mlflow.register_model expects a model URI (e.g., a runs:/ URI or another valid model_uri) and the desired registered model name. The engineer already has the correct model_uri for the logged model and wants to register it as "best_model", which is exactly what option A does. The run_id alone is insufficient, and the other options either misuse the run_id, the model path, or the registered model name.

Question 68

Which of the following statements describes streaming with Spark as a model deployment strategy?

A. The inference of batch processed records as soon as a trigger is hit
B. The inference of all types of records in real-time
C. The inference of batch processed records as soon as a Spark job is run
D. The inference of incrementally processed records as soon as a trigger is hit
E. The inference of incrementally processed records as soon as a Spark job is run
Show Answer
Correct Answer: D
Explanation:
Spark Structured Streaming processes data incrementally (often as micro-batches or continuous processing) and executes computation, including model inference, whenever a configured trigger fires. This matches streaming deployment behavior: new data is inferred as it arrives based on triggers, not as one-off batch jobs.

Question 10

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a run with run ID run_id for the experiment with experiment ID exp_id, the data scientist wants to programmatically return the logged metrics for run_id. They have an active MLflow Client client and an active Spark session spark. Which lines of code can be used to return the logged metrics for run_id?

A. client.get_run(exp_id.run_id).data.metrics
B.
C. client.get_run(run_id).data.metrics
D. spark.read.format("mlflow-run").load(run_id)
Show Answer
Correct Answer: C
Explanation:
In MLflow, the MLflowClient.get_run(run_id) method returns a Run object corresponding to the given run ID. The logged metrics are accessible via the Run object's data.metrics attribute. Therefore, client.get_run(run_id).data.metrics correctly returns all metrics logged for that run. The other options use invalid methods or APIs not intended for retrieving metrics.

Question 9

Which tool can be used to automatically start a testing Job when a new version of an MLflow Model Registry model is registered?

A. MLflow Model Registry UI
B. MLflow Client API
C. MLflow Model Registry Webhooks
D. MLflow REST API
Show Answer
Correct Answer: C
Explanation:
MLflow Model Registry Webhooks are designed for event-driven automation. They can be configured to trigger external actions, such as starting a testing job, whenever specific events occur in the registry (e.g., when a new model version is registered). The UI is manual, and the Client and REST APIs support programmatic interactions but do not provide automatic event triggers.

$63 $19

Get all 82 questions with detailed answers and explanations

Machine Learning Professional — Frequently Asked Questions

What is the Databricks Machine Learning Professional exam?

The Databricks Machine Learning Professional exam — Databricks Certified Machine Learning Professional — is a professional IT certification exam offered by Databricks.

How many practice questions are included?

This study guide contains 82 practice questions, each with an expert-verified correct answer and a detailed explanation. Questions cover all exam domains and objectives.

Is there a free sample available?

Yes! We provide a free sample of 15 practice questions from the Machine Learning Professional exam right on this page. Scroll up to preview them and evaluate the quality of our materials before purchasing.

When was this Machine Learning Professional study guide last updated?

This study guide was last updated on 2026-02-18. We regularly refresh our materials to reflect the latest exam content and objectives so you're always studying current material.

What file formats do I receive?

After purchase you receive two files: an interactive HTML file with show/hide answer toggles (ideal for studying on screen) and a PDF file (ideal for printing or offline study). Both work on any device — desktop, tablet, or phone.

How much does the Machine Learning Professional study guide cost?

The Databricks Machine Learning Professional study guide costs $19 (discounted from $63). This is a one-time payment with no subscriptions or hidden fees.

How do I get my files after payment?

After successful payment via Stripe, you are immediately redirected to a download page with links to your HTML and PDF files. We also send the download links to your email address as a backup, so you'll always have access.

Why choose CheapestExamDumps over other providers?

CheapestExamDumps offers the lowest price at $19 per exam — competitors charge $50-$300 for similar content. All study materials are expert-verified, updated monthly, and include a free 15-question preview with no signup required. You get instant access to both HTML and PDF formats after payment.