Google Cloud Exam Syllabus

Professional Machine Learning Engineer syllabus, skills measured, and exam topics

A Professional Machine Learning Engineer builds, evaluates, productionizes, and optimizes AI solutions by using Google Cloud capabilities and knowledge of conventional ML approaches. The ML Engineer handles large, complex datasets and creates repeatable, reusable code.

Skills measured by domain

Use the weighting table to decide where to spend the most study time.

Domain Weight
Section 1: Architecting low-code AI solutions 13%
Section 2: Collaborating within and across teams to manage data and models 14%
Section 3: Scaling prototypes into ML models 18%
Section 4: Serving and scaling models 20%
Section 5: Automating and orchestrating ML pipelines 22%
Section 6: Monitoring AI solutions 13%

Detailed outline

Scan each section as a working study checklist instead of one long wall of text.

Section 1: Architecting low-code AI solutions (~13% of the exam)

  • 1.1 Developing ML models by using BigQuery ML. Considerations include:
    • Building the appropriate BigQuery ML model (e.g., linear and binary classification, regression, time-series, matrix factorization, boosted trees, autoencoders) based on the business problem
    • Feature engineering or selection by using BigQuery ML
    • Generating predictions by using BigQuery ML
  • 1.2 Building AI solutions by using ML APIs or foundation models. Considerations include:
    • Building applications by using ML APIs from Model Garden
    • Building applications by using industry-specific APIs (e.g., Document AI API, Retail API)
    • Implementing retrieval augmented generation (RAG) applications by using Vertex AI Agent Builder
  • 1.3 Training models by using AutoML. Considerations include:
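
BigQuery ML trains models directly from SQL. A minimal sketch of the kind of `CREATE MODEL` statement 1.1 covers, composed as a Python string for illustration; the dataset, table, and column names are hypothetical, and in practice you would execute the SQL with the google-cloud-bigquery client:

```python
# Sketch: composing a BigQuery ML training statement for a binary
# classification problem. Names are placeholders, not a real dataset.

def create_model_sql(model_name: str, label_col: str, source_table: str) -> str:
    """Build a BigQuery ML logistic regression (binary classification) statement."""
    return f"""
CREATE OR REPLACE MODEL `{model_name}`
OPTIONS (
  model_type = 'LOGISTIC_REG',        -- binary classification in BigQuery ML
  input_label_cols = ['{label_col}']
) AS
SELECT * FROM `{source_table}`;
""".strip()

sql = create_model_sql("mydataset.churn_model", "churned", "mydataset.customers")
print(sql.splitlines()[0])  # CREATE OR REPLACE MODEL `mydataset.churn_model`
```

Swapping `model_type` (e.g., to `LINEAR_REG` or `BOOSTED_TREE_CLASSIFIER`) is how the same pattern covers the other model families listed above.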

Section 2: Collaborating within and across teams to manage data and models (~14% of the exam)

  • 2.1 Exploring and preprocessing organization-wide data (e.g., Cloud Storage, BigQuery, Spanner, Cloud SQL, Apache Spark, Apache Hadoop). Considerations include:
    • Organizing different types of data (e.g., tabular, text, speech, images, videos) for efficient training
    • Managing datasets in Vertex AI
    • Data preprocessing (e.g., Dataflow, TensorFlow Extended [TFX], BigQuery)
    • Creating and consolidating features in Vertex AI Feature Store
    • Privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI])
    • Ingesting different data sources (e.g., text documents) into Vertex AI for inference
  • 2.2 Model prototyping using Jupyter notebooks. Considerations include:
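
The PII handling in 2.1 is worth rehearsing concretely. A minimal sketch of a scrubbing step you might apply before loading free text into a shared dataset; real pipelines would use Cloud DLP, and these regexes are illustrative, not exhaustive:

```python
# Sketch: redact obvious PII (emails, US-style phone numbers) from free text.
# Toy patterns for study purposes only -- production systems should use
# Cloud DLP or an equivalent vetted service.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def redact_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(redact_pii("Contact jane.doe@example.com or 555-123-4567"))
# Contact [EMAIL] or [PHONE]
```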

Section 3: Scaling prototypes into ML models (~18% of the exam)

  • 3.1 Building models. Considerations include:
    • Choosing ML framework and model architecture
    • Modeling techniques given interpretability requirements
  • 3.2 Training models. Considerations include:
    • Organizing training data (e.g., tabular, text, speech, images, videos) on Google Cloud (e.g., Cloud Storage, BigQuery)
    • Ingestion of various file types (e.g., CSV, JSON, images, Hadoop, databases) into training
    • Training using different SDKs (e.g., Vertex AI custom training, Kubeflow on Google Kubernetes Engine, AutoML, tabular workflows)
    • Using distributed training to organize reliable pipelines
    • Hyperparameter tuning
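
For the hyperparameter tuning item, it helps to know the idea behind services like Vertex AI hyperparameter tuning jobs. A minimal random-search sketch; the objective function is a stand-in for a real train-and-validate cycle, and the search space is illustrative:

```python
# Sketch: random search over a hyperparameter space. A managed tuning
# service does the same loop at scale, often with smarter sampling
# (e.g., Bayesian optimization) instead of pure random draws.
import random

def objective(lr: float, batch_size: int) -> float:
    # Hypothetical validation loss; lower is better.
    return (lr - 0.01) ** 2 + 0.0001 * abs(batch_size - 64)

def random_search(trials: int, seed: int = 0):
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(trials):
        params = {
            "lr": 10 ** rng.uniform(-4, -1),           # log-uniform learning rate
            "batch_size": rng.choice([16, 32, 64, 128]),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best, loss = random_search(trials=50)
print(best["batch_size"] in (16, 32, 64, 128))  # True
```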

Section 4: Serving and scaling models (~20% of the exam)

  • 4.1 Serving models. Considerations include:
    • Batch and online inference (e.g., Vertex AI, Dataflow, BigQuery ML, Dataproc)
    • Using different frameworks (e.g., PyTorch, XGBoost) to serve models
    • Organizing a model registry
    • A/B testing different versions of a model
  • 4.2 Scaling online model serving. Considerations include:
    • Vertex AI Feature Store
    • Vertex AI public and private endpoints
    • Choosing appropriate hardware (e.g., CPU, GPU, TPU, edge)
    • Scaling the serving backend based on the throughput (e.g., Vertex AI Prediction, containerized serving)
    • Tuning ML models for training and serving in production (e.g., simplification techniques)
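
The A/B testing item in 4.1 is usually implemented as a traffic split. A minimal sketch, similar in spirit to percentage-based traffic splits on a Vertex AI endpoint; hashing the user ID keeps each user pinned to one variant across requests:

```python
# Sketch: deterministic traffic splitting between two model versions.
# The fraction routed to "B" and the user-ID scheme are illustrative.
import hashlib

def assign_variant(user_id: str, b_fraction: float = 0.1) -> str:
    """Route a stable b_fraction of users to model 'B', the rest to 'A'."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64  # uniform in [0, 1)
    return "B" if bucket < b_fraction else "A"

# The assignment is stable: the same user always sees the same variant.
print(assign_variant("user-42") == assign_variant("user-42"))  # True
```

Because the split is a pure function of the user ID, no per-user state has to be stored, which matters when the serving backend autoscales.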

Section 5: Automating and orchestrating ML pipelines (~22% of the exam)

  • 5.1 Developing end-to-end ML pipelines. Considerations include:
    • Data and model validation
    • Ensuring consistent data preprocessing between training and serving
    • Hosting third-party pipelines on Google Cloud (e.g., MLflow)
    • Identifying components, parameters, triggers, and compute needs (e.g., Cloud Build, Cloud Run)
    • Orchestration framework (e.g., Kubeflow Pipelines, Vertex AI Pipelines, Cloud Composer)
    • Hybrid or multicloud strategies
    • System design with TFX components or Kubeflow DSL (e.g., Dataflow)
  • 5.2 Automating model retraining. Considerations include:
    • Determining an appropriate retraining policy
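
A retraining policy (5.2) usually boils down to a trigger condition evaluated on monitoring signals. A minimal sketch under assumed thresholds; real pipelines would emit a trigger (e.g., a Pub/Sub message that starts a Vertex AI Pipelines run) instead of returning a boolean:

```python
# Sketch: decide whether to retrain based on accuracy degradation or data
# drift. The thresholds are illustrative defaults, not recommendations.

def should_retrain(live_accuracy: float,
                   baseline_accuracy: float,
                   drift_score: float,
                   max_accuracy_drop: float = 0.05,
                   max_drift: float = 0.2) -> bool:
    """Retrain if live quality fell too far below baseline, or inputs drifted."""
    degraded = (baseline_accuracy - live_accuracy) > max_accuracy_drop
    drifted = drift_score > max_drift
    return degraded or drifted

print(should_retrain(live_accuracy=0.88, baseline_accuracy=0.95,
                     drift_score=0.05))  # True (accuracy dropped 0.07 > 0.05)
```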

Section 6: Monitoring AI solutions (~13% of the exam)

  • 6.1 Identifying risks to AI solutions. Considerations include:
    • Building secure AI systems by protecting against unintentional exploitation of data or models (e.g., hacking)
    • Aligning with Google’s Responsible AI practices (e.g., monitoring for bias)
    • Assessing AI solution readiness (e.g., fairness, bias)
    • Model explainability on Vertex AI (e.g., Vertex AI Prediction)
  • 6.2 Monitoring, testing, and troubleshooting AI solutions. Considerations include:
    • Establishing continuous evaluation metrics (e.g., Vertex AI Model Monitoring, Explainable AI)
    • Monitoring for training-serving skew
    • Monitoring for feature attribution drift
    • Monitoring model performance against baselines, simpler models, and across the time dimension
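
For the skew and drift items in 6.2, one common statistic to know is the population stability index (PSI), which compares a feature's binned distribution at training time against what the model sees in serving. A minimal sketch; the bins and the common 0.2 alert rule of thumb are illustrative:

```python
# Sketch: population stability index between two binned distributions.
# Each input is a list of bin fractions summing to ~1.0; larger PSI means
# a bigger shift between training and serving data.
import math

def psi(train_fracs, serve_fracs, eps=1e-6):
    """PSI = sum over bins of (serve - train) * ln(serve / train)."""
    total = 0.0
    for p, q in zip(train_fracs, serve_fracs):
        p, q = max(p, eps), max(q, eps)  # guard against empty bins
        total += (q - p) * math.log(q / p)
    return total

identical = psi([0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25])
shifted = psi([0.25, 0.25, 0.25, 0.25], [0.10, 0.20, 0.30, 0.40])
print(identical < 1e-9 < shifted)  # True: no shift scores ~0, real shift scores > 0
```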