Free Sample — 15 Practice Questions
Preview 15 of 82 questions from the Machine Learning Professional exam.
Try before you buy — purchase the full study guide for all 82 questions with answers and explanations.
Question 65
A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model.
Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?
A. Staging, Production, Archived
B. Production
C. None, Staging, Production, Archived
D. Staging, Production
E. None, Staging, Production
Show Answer
Correct Answer: D
Explanation:
With MLflow Model Serving enabled, only model versions in the Staging and Production stages are automatically deployed and can be queried. Versions in None or Archived stages are not served.
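Once serving is enabled, each servable stage gets its own endpoint. The sketch below shows one way to query the Staging or Production version over REST; the URL pattern `https://<instance>/model/<name>/<stage>/invocations` and the `dataframe_records` payload shape are assumptions based on the legacy Databricks Model Serving docs and MLflow 2.x scoring format, so adjust for your workspace and MLflow version.

```python
def query_served_model(instance, token, model_name, stage, records):
    """Query a served model version by stage ("Staging" or "Production").

    URL pattern and payload shape are assumptions; check your workspace's
    Model Serving documentation before relying on them.
    """
    import requests  # lazy import so the module loads without requests installed

    url = f"https://{instance}/model/{model_name}/{stage}/invocations"
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json={"dataframe_records": records},
    )
    resp.raise_for_status()
    return resp.json()
```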
Question 16
A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on the training set logged with the model, but a few of the feature values in the column spend have since been updated and are present in the customer-level Spark DataFrame spark_df. The customer_id column is the primary key of both spark_df and the training set used when training and logging the model.
Which code block can be used to compute predictions for the training set while overwriting its old spend values with the new spend values from spark_df?
A. fs.score_batch(model_uri, spark_df)
B. fs.score_model(model_uri, spark_df)
C. df = fs.get_updated_feature(spark_df, model=uri) fs.score_batch(model_uri, df)
D. df = fs.get_updated_features(spark_df) fs.score_batch(model_uri, df)
Show Answer
Correct Answer: A
Explanation:
FeatureStoreClient.score_batch uses the feature lookups logged with the model and joins them using the primary key. If the input DataFrame already contains a feature column (such as spend), those provided values override the stored feature values, enabling batch inference on the training set with updated spend values. The other options reference methods that do not exist or are unnecessary.
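As a sketch of the winning option (assuming a Databricks environment with the databricks-feature-store package available), the override behavior means the whole task really is the one call from option A:

```python
def score_training_set(model_uri, spark_df):
    """Batch inference with the feature lookups logged alongside the model.

    Because spark_df already contains an (updated) `spend` column, those
    values take precedence over the stored feature values during the
    primary-key join on `customer_id`.
    """
    from databricks.feature_store import FeatureStoreClient  # Databricks-only

    fs = FeatureStoreClient()
    return fs.score_batch(model_uri, spark_df)
```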
Question 21
A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client.
Which code block can they use to accomplish the task?
Show Answer
Correct Answer: D
Explanation:
In MLflow, moving a registered model version between stages is done with `MlflowClient.transition_model_version_stage()`. This method explicitly transitions a given `model_name` and `version` to a target stage such as `Production`. Other options either use incorrect APIs or unnecessary/invalid parameters.
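A minimal sketch of the call, assuming an active `MlflowClient` is passed in (as the question's `client` variable):

```python
def promote_to_production(client, model_name, version):
    """Move a registered model version to the Production stage.

    `client` is an mlflow.tracking.MlflowClient instance; this is the
    same transition_model_version_stage call named in the explanation.
    """
    client.transition_model_version_stage(
        name=model_name,
        version=version,
        stage="Production",
    )
```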
Question 66
Which of the following is a benefit of logging a model signature with an MLflow model?
A. The model will have a unique identifier in the MLflow experiment
B. The schema of input data can be validated when serving models
C. The model can be deployed using real-time serving tools
D. The model will be secured by the user that developed it
E. The schema of input data will be converted to match the signature
Show Answer
Correct Answer: B
Explanation:
Logging a model signature in MLflow records the expected schema of the model’s inputs and outputs. During model serving, MLflow can use this signature to validate incoming data against the defined schema, helping catch mismatches early and ensuring the model receives data in the correct format.
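A short sketch of how a signature is attached at logging time, assuming mlflow and scikit-learn are installed; `infer_signature` derives the input/output schema from example data:

```python
def log_with_signature(sk_model, X_train):
    """Log a scikit-learn model together with an inferred signature.

    The signature records the expected input/output schema, which serving
    can later use to validate incoming requests (option B).
    """
    import mlflow
    from mlflow.models.signature import infer_signature

    signature = infer_signature(X_train, sk_model.predict(X_train))
    return mlflow.sklearn.log_model(sk_model, "model", signature=signature)
```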
Question 82
A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client.
Which of the following code blocks can they use to accomplish the task?
Show Answer
Correct Answer: C
Explanation:
In MLflow, promoting a registered model version to another stage is done with MlflowClient.transition_model_version_stage(), which requires only the model name, version number, and the target stage. There is no need to specify the source stage. The code in option C correctly calls this method with name="model", version=model_version, and stage="Production", which moves the model version from Staging (or any current stage) to Production.
Question 26
Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?
A. All of these reasons
B. JS is not normalized or smoothed
C. None of these reasons
D. JS is more robust when working with large datasets
E. JS does not require any manual threshold or cutoff determinations
Show Answer
Correct Answer: C
Explanation:
None of the listed statements is a valid reason to prefer Jensen–Shannon (JS) distance over the Kolmogorov–Smirnov (KS) test. JS is in fact normalized and smoothed (so B is false), it is not inherently more robust than KS on large datasets (D is false), and although JS yields a bounded score, practitioners still must choose thresholds to flag drift (E is false). Therefore, none of the given options correctly explains a reason to use JS over KS.
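To make the comparison concrete, here is a stdlib-only sketch of both statistics: a JS distance using log base 2 (so it is bounded in [0, 1], the "normalized and smoothed" property that makes B false) and a two-sample KS statistic. Note that each returns a score you still have to threshold yourself, which is why E is false as well.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    With log base 2 the result is bounded in [0, 1]; the mixture m
    smooths away zero-probability bins.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability terms
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return math.sqrt((kl(p, m) + kl(q, m)) / 2)

def ks_statistic(x, y):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    all_vals = sorted(set(xs + ys))

    def ecdf(s, v):
        return sum(1 for e in s if e <= v) / len(s)

    return max(abs(ecdf(xs, v) - ecdf(ys, v)) for v in all_vals)
```

Identical distributions score 0 under both measures, and fully disjoint ones score 1, but deciding how much drift is "too much" remains a manual cutoff in either case.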
Question 52
A data scientist has developed a model to predict ice cream sales using the expected temperature and expected number of hours of sun in the day. However, the expected temperature is dropping beneath the range of the input variable on which the model was trained.
Which of the following types of drift is present in the above scenario?
A. Label drift
B. None of these
C. Concept drift
D. Prediction drift
E. Feature drift
Show Answer
Correct Answer: E
Explanation:
The issue described is a change in the distribution of an input feature: the expected temperature is moving outside the range seen during training. This is a classic case of feature (data/covariate) drift, where the input feature values shift, even if the relationship between inputs and target has not necessarily changed.
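A crude but illustrative drift check for the temperature scenario: measure what fraction of incoming feature values fall outside the range seen at training time (the function name and signature are made up for this sketch).

```python
def out_of_range_fraction(values, train_min, train_max):
    """Fraction of new feature values outside the training range.

    A high fraction for a column like expected temperature is a simple
    feature-drift signal: the inputs have shifted, regardless of whether
    the input-target relationship changed.
    """
    outside = sum(1 for v in values if v < train_min or v > train_max)
    return outside / len(values)
```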
Question 38
Which of the following MLflow Model Registry use cases requires the use of an HTTP Webhook?
A. Starting a testing job when a new model is registered
B. Updating data in a source table for a Databricks SQL dashboard when a model version transitions to the Production stage
C. Sending an email alert when an automated testing Job fails
D. None of these use cases require the use of an HTTP Webhook
E. Sending a message to a Slack channel when a model version transitions stages
Show Answer
Correct Answer: E
Explanation:
MLflow Model Registry HTTP webhooks are designed to notify or trigger external systems when registry events occur, such as model version creation or stage transitions. Sending a message to a Slack channel on a stage transition requires calling an external service (Slack) via HTTP, which is exactly what webhooks provide. The other options can be handled internally within Databricks jobs or alerting mechanisms without requiring an MLflow registry webhook.
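A sketch of registering such an HTTP webhook, assuming the databricks-registry-webhooks Python package; the class and event names are taken from that package's documentation, and the Slack incoming-webhook URL is a placeholder:

```python
def create_slack_webhook(model_name, slack_webhook_url):
    """Create an HTTP webhook that posts to Slack on stage transitions.

    Assumes the databricks-registry-webhooks package is installed and the
    caller is authenticated to a Databricks workspace.
    """
    from databricks_registry_webhooks import HttpUrlSpec, RegistryWebhooksClient

    http_spec = HttpUrlSpec(url=slack_webhook_url)
    return RegistryWebhooksClient().create_webhook(
        model_name=model_name,
        events=["MODEL_VERSION_TRANSITIONED_STAGE"],
        http_url_spec=http_spec,
        description="Notify Slack when a model version transitions stages",
    )
```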
Question 39
Which of the following lists all of the model stages that are available in the MLflow Model Registry?
A. Development, Staging, Production
B. None, Staging, Production
C. Staging, Production, Archived
D. None, Staging, Production, Archived
E. Development, Staging, Production, Archived
Show Answer
Correct Answer: D
Explanation:
MLflow Model Registry defines four model stages: None (default upon registration), Staging (for validation/testing), Production (for live serving), and Archived (for deprecated models). Therefore the correct list is None, Staging, Production, Archived.
Question 51
A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column.
Which of the following code blocks accomplishes this task?
A. spark.read.format("delta").load(path).drop("star_rating")
B. spark.read.format("delta").table(path).drop("star_rating")
C. Delta tables cannot be modified
D. spark.read.table(path).drop("star_rating")
E. spark.sql("SELECT * EXCEPT star_rating FROM path")
Show Answer
Correct Answer: A
Explanation:
The table is stored at a filesystem path, so it must be read with spark.read.format("delta").load(path). Dropping a column is a DataFrame operation, which drop("star_rating") performs. Options B and D expect a table name, not a path; E is invalid SQL for a path; C is false.
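To actually remove the column from the stored table (not just the in-memory DataFrame), the result must be written back. A sketch, assuming an active Spark session with Delta Lake configured; `overwriteSchema` is needed because the schema shrinks:

```python
def drop_column_from_delta(spark, path, column):
    """Read the Delta table at `path`, drop `column`, and overwrite in place.

    The read/drop line is the answer above; the write is the extra step
    needed to persist the narrower schema back to `path`.
    """
    df = spark.read.format("delta").load(path).drop(column)
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")  # allow the column to disappear
        .save(path)
    )
```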
Question 54
A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark.
Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?
A. client.list_run_infos(exp_id)
B. spark.read.format("delta").load(exp_id)
C. There is no way to programmatically return run-level results from an MLflow Experiment.
D. mlflow.search_runs(exp_id)
E. spark.read.format("mlflow-experiment").load(exp_id)
Show Answer
Correct Answer: E
Explanation:
MLflow experiments can be queried as a Spark DataFrame using the Databricks-provided mlflow-experiment data source. The call spark.read.format("mlflow-experiment").load(exp_id) loads run-level results (metrics, parameters, tags, etc.) for the specified experiment ID into a Spark DataFrame. Other options either return Python objects, are invalid formats, or are incorrect.
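A brief sketch, assuming a Databricks workspace (the "mlflow-experiment" data source is Databricks-only) and an active Spark session; the status filter is just an example of working with the resulting DataFrame:

```python
def experiment_runs_df(spark, exp_id):
    """Load run-level results for an experiment into a Spark DataFrame.

    Each row is one run, with columns for metrics, params, tags, etc.
    """
    return (
        spark.read.format("mlflow-experiment")
        .load(exp_id)
        .filter("status = 'FINISHED'")  # example: keep only completed runs
    )
```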
Question 83
A machine learning engineer is migrating a machine learning pipeline to use Databricks Machine Learning. They have programmatically identified the best run from an MLflow Experiment and stored its URI in the model_uri variable and its Run ID in the run_id variable. They have also determined that the model was logged with the name "model". Now, the machine learning engineer wants to register that model in the MLflow Model Registry with the name "best_model".
Which of the following lines of code can they use to register the model to the MLflow Model Registry?
A. mlflow.register_model(model_uri, "best_model")
B. mlflow.register_model(run_id, "best_model")
C. mlflow.register_model(f"runs:/{run_id}/best_model", "model")
D. mlflow.register_model(model_uri, "model")
E. mlflow.register_model(f"runs:/{run_id}/model")
Show Answer
Correct Answer: A
Explanation:
To register a model in the MLflow Model Registry, mlflow.register_model expects a model URI (e.g., a runs:/ URI or another valid model_uri) and the desired registered model name. The engineer already has the correct model_uri for the logged model and wants to register it as "best_model", which is exactly what option A does. The run_id alone is insufficient, and the other options either misuse the run_id, the model path, or the registered model name.
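A sketch of the call, assuming mlflow is installed; note how the runs:/ URI embeds the artifact path "model" that the model was logged under, which is exactly what the model_uri variable already holds:

```python
def register_best_model(run_id):
    """Register the model logged under artifact path "model" as "best_model".

    Building the URI from run_id is equivalent to using the precomputed
    model_uri variable from the question.
    """
    import mlflow

    model_uri = f"runs:/{run_id}/model"  # "model" = artifact path at logging time
    return mlflow.register_model(model_uri, "best_model")
```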
Question 68
Which of the following statements describes streaming with Spark as a model deployment strategy?
A. The inference of batch processed records as soon as a trigger is hit
B. The inference of all types of records in real-time
C. The inference of batch processed records as soon as a Spark job is run
D. The inference of incrementally processed records as soon as a trigger is hit
E. The inference of incrementally processed records as soon as a Spark job is run
Show Answer
Correct Answer: D
Explanation:
Spark Structured Streaming processes data incrementally (often as micro-batches or continuous processing) and executes computation, including model inference, whenever a configured trigger fires. This matches streaming deployment behavior: new data is inferred as it arrives based on triggers, not as one-off batch jobs.
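A sketch of this deployment pattern, assuming Delta source and sink paths and a Spark UDF wrapping the model (e.g. from mlflow.pyfunc.spark_udf); each time the trigger fires, only newly arrived records are read and scored:

```python
def stream_score(spark, predict_udf, feature_cols, source, sink, checkpoint):
    """Incremental model inference with Structured Streaming.

    `predict_udf` is a Spark UDF for the model; the trigger interval
    below is illustrative.
    """
    from pyspark.sql import functions as F  # lazy import

    df = spark.readStream.format("delta").load(source)
    scored = df.withColumn(
        "prediction", predict_udf(*[F.col(c) for c in feature_cols])
    )
    return (
        scored.writeStream.format("delta")
        .option("checkpointLocation", checkpoint)
        .trigger(processingTime="1 minute")  # inference runs when the trigger fires
        .start(sink)
    )
```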
Question 10
A data scientist is utilizing MLflow to track their machine learning experiments. After completing a run with run ID run_id for the experiment with experiment ID exp_id, the data scientist wants to programmatically return the logged metrics for run_id. They have an active MLflow Client client and an active Spark session spark.
Which lines of code can be used to return the logged metrics for run_id?
A. client.get_run(exp_id.run_id).data.metrics
B.
C. client.get_run(run_id).data.metrics
D. spark.read.format("mlflow-run").load(run_id)
Show Answer
Correct Answer: C
Explanation:
In MLflow, the MLflowClient.get_run(run_id) method returns a Run object corresponding to the given run ID. The logged metrics are accessible via the Run object's data.metrics attribute. Therefore, client.get_run(run_id).data.metrics correctly returns all metrics logged for that run. The other options use invalid methods or APIs not intended for retrieving metrics.
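As a sketch (assuming mlflow is installed and a tracking server is configured), the metrics come back as a plain dict keyed by metric name:

```python
def run_metrics(run_id):
    """Return the dict of logged metrics for a run, e.g. {"rmse": 0.42}.

    The value shown in the docstring is illustrative, not real output.
    """
    from mlflow.tracking import MlflowClient

    return MlflowClient().get_run(run_id).data.metrics
```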
Question 9
Which tool can be used to automatically start a testing Job when a new version of an MLflow Model Registry model is registered?
A. MLflow Model Registry UI
B. MLflow Client API
C. MLflow Model Registry Webhooks
D. MLflow REST API
Show Answer
Correct Answer: C
Explanation:
MLflow Model Registry Webhooks are designed for event-driven automation. They can be configured to trigger external actions, such as starting a testing job, whenever specific events occur in the registry (e.g., when a new model version is registered). The UI is manual, and the Client and REST APIs support programmatic interactions but do not provide automatic event triggers.
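A sketch of wiring up such a Job-triggering webhook, assuming the databricks-registry-webhooks package; class names and the event name are taken from its documentation, and the job ID, workspace URL, and token are placeholders:

```python
def create_testing_job_webhook(model_name, job_id, workspace_url, access_token):
    """Create a Job webhook that launches a testing Job on registration.

    Fires on MODEL_VERSION_CREATED, i.e. whenever a new version of the
    model is registered.
    """
    from databricks_registry_webhooks import JobSpec, RegistryWebhooksClient

    job_spec = JobSpec(
        job_id=job_id,
        workspace_url=workspace_url,
        access_token=access_token,
    )
    return RegistryWebhooksClient().create_webhook(
        model_name=model_name,
        events=["MODEL_VERSION_CREATED"],
        job_spec=job_spec,
        description="Run the testing Job when a new model version is registered",
    )
```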