Databricks Exam Syllabus

Generative AI Engineer Associate syllabus, skills measured, and exam topics

The purpose of this exam guide is to give you an overview of the exam and what it covers so you can determine your exam readiness. This document is updated whenever an exam changes (and notes when those changes take effect) so that you can prepare accordingly.

Exam details

Quick facts pulled from the official source for faster scanning.

Number of items: 45 multiple-choice or multiple-selection questions
Time limit: 90 minutes
Registration fee: $200
Delivery method: Online proctored
Test aids: None allowed
Prerequisites: None required; course attendance and six months of relevant hands-on experience are recommended
Validity: 2 years
Recertification: Required every two years to maintain certified status

What to know before you study

The sections below list the skills measured in each exam domain; use them to frame your study plan.

Section 1: Design Applications

  • Design a prompt that elicits a specifically formatted response
  • Select model tasks to accomplish a given business requirement
  • Select chain components for a desired model input and output
  • Translate business use case goals into a description of the desired inputs and outputs for the AI pipeline
  • Define and order tools that gather knowledge or take actions for multi-stage reasoning
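The first objective above, designing a prompt that elicits a specifically formatted response, can be illustrated with a minimal sketch. The JSON schema and field names (`sentiment`, `product`, `summary`) are hypothetical examples, not prescribed by the exam guide:

```python
def build_extraction_prompt(ticket_text: str) -> str:
    """Build a prompt that constrains the model to a fixed JSON shape.

    The schema below is illustrative; any stable, machine-parseable
    format serves the same purpose.
    """
    return (
        "You are a support-ticket triage assistant.\n"
        "Respond ONLY with a JSON object using exactly these keys:\n"
        '{"sentiment": "positive|neutral|negative", '
        '"product": "<string>", "summary": "<one sentence>"}\n\n'
        f"Ticket: {ticket_text}"
    )

prompt = build_extraction_prompt("The new dashboard keeps crashing on load.")
```

Spelling out both the keys and the allowed values narrows the response space, which makes downstream parsing far more reliable than free-form instructions.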

Section 2: Data Preparation

  • Apply a chunking strategy for a given document structure and model constraints
  • Filter extraneous content in source documents that degrades quality of a RAG application
  • Choose the appropriate Python package to extract document content from provided source data and format
  • Define operations and sequence to write given chunked text into Delta Lake tables in Unity Catalog
  • Identify needed source documents that provide necessary knowledge and quality for a given RAG application
  • Identify prompt/response pairs that align with a given model task
  • Use tools and metrics to evaluate retrieval performance
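One common baseline for the chunking objective above is fixed-size chunking with overlap. This is a sketch only; the right chunk size and overlap depend on the document structure and the embedding model's context limit, and the values below are illustrative:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Overlap preserves context that would otherwise be cut at chunk
    boundaries; structure-aware splitting (by heading or paragraph)
    is often preferable for well-structured documents.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = chunk_text("x" * 1200, chunk_size=500, overlap=50)
```

Each chunk would then be embedded and written to a Delta Lake table in Unity Catalog to back a vector index.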

Section 3: Application Development

  • Create tools needed to extract data for a given data retrieval need
  • Select LangChain or similar tools for use in a Generative AI application
  • Identify how prompt formats can change model outputs and results
  • Qualitatively assess responses to identify common issues such as quality and safety
  • Select chunking strategy based on model & retrieval evaluation
  • Augment a prompt with additional context from a user's input based on key fields, terms, and intents
  • Create a prompt that adjusts an LLM's response from a baseline to a desired output
  • Implement LLM guardrails to prevent negative outcomes
  • Write metaprompts that minimize hallucinations or leaking private data
  • Build agent prompt templates exposing available functions
  • Select the best LLM based on the attributes of the application to be developed
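The prompt-augmentation and hallucination-mitigation objectives above can be combined in one template. The section layout (Context / User intent / Question) and the closing instruction are an illustrative convention, not a prescribed Databricks format:

```python
def augment_prompt(question: str, retrieved_docs: list[str], user_intent: str) -> str:
    """Prepend retrieved context and a detected intent to the user question.

    The final instruction acts as a lightweight metaprompt that
    discourages the model from answering beyond the supplied context.
    """
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        f"Context:\n{context}\n\n"
        f"User intent: {user_intent}\n\n"
        f"Question: {question}\n"
        "Answer using ONLY the context above. If the context is "
        "insufficient, say so instead of guessing."
    )

aug = augment_prompt(
    "How do I reset my token?",
    ["Tokens expire after 90 days.", "Reset tokens via the admin console."],
    "account_management",
)
```

Grounding the answer in retrieved context and explicitly permitting "I don't know" are two of the simplest levers for reducing hallucinated responses.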

Detailed outline

Scan each section as a working study checklist instead of one long wall of text.

Section 4: Assembling and Deploying Applications

  • Code a chain using a pyfunc model with pre- and post-processing
  • Control access to resources from model serving endpoints
  • Code a simple chain according to requirements
  • Code a simple chain using LangChain
  • Choose the basic elements needed to create a RAG application: model flavor, embedding model, retriever, dependencies, input examples, and model signature
  • Register the model to Unity Catalog using MLflow
  • Sequence the steps needed to deploy an endpoint for a basic RAG application
  • Create and query a Vector Search index
  • Identify how to serve an LLM application that leverages Foundation Model APIs
  • Identify resources needed to serve features for a RAG application
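The pre- and post-processing objective above can be sketched as a plain chain class. In a real deployment this logic would typically live inside an `mlflow.pyfunc.PythonModel` subclass so it can be registered to Unity Catalog and served from an endpoint; here a stand-in `fake_llm` replaces the actual model call to keep the sketch self-contained:

```python
class SimpleChain:
    """A toy chain: pre-process input, call a model, post-process output."""

    def __init__(self, llm):
        self.llm = llm  # any callable taking a prompt string

    def preprocess(self, raw_input: str) -> str:
        # Normalize whitespace before the text reaches the model.
        return " ".join(raw_input.split())

    def postprocess(self, raw_output: str) -> str:
        # Trim the response and guarantee terminal punctuation.
        out = raw_output.strip()
        return out if out.endswith(".") else out + "."

    def predict(self, raw_input: str) -> str:
        return self.postprocess(self.llm(self.preprocess(raw_input)))

def fake_llm(prompt: str) -> str:
    # Placeholder for a Foundation Model API or served-model call.
    return f"Echo: {prompt}"

chain = SimpleChain(fake_llm)
result = chain.predict("  what   is   RAG ")
```

Wrapping the same `predict` logic in a pyfunc model is what lets MLflow capture the signature, dependencies, and input examples needed for serving.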

Section 5: Governance

  • Use masking techniques as guard rails to meet a performance objective
  • Select guardrail techniques to protect against malicious user inputs to a Gen AI application
  • Recommend an alternative for problematic text mitigation in a data source feeding a RAG application
  • Use legal/licensing requirements for data sources to avoid legal risk
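The masking objective above can be illustrated with a simple regex pass over inputs. This is only one layer of a guardrail stack; production systems usually combine masking with input validation and policy filters, and the redaction token below is an arbitrary choice:

```python
import re

# A deliberately simple email pattern; real PII detection needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def mask_pii(text: str) -> str:
    """Mask email addresses before text reaches the model or its logs."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

masked = mask_pii("Contact jane.doe@example.com for access.")
```

Applying the mask both on ingestion (so PII never enters the vector index) and at inference time (so it never appears in prompts or inference logs) covers the two main leakage paths.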

Section 6: Evaluation and Monitoring

  • Select an LLM (size and architecture) based on a set of quantitative evaluation metrics
  • Select key metrics to monitor for a specific LLM deployment scenario
  • Evaluate model performance in a RAG application using MLflow
  • Use inference logging to assess deployed RAG application performance
  • Use Databricks features to control LLM costs for RAG applications
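As a concrete example of a retrieval-quality metric for the evaluation objectives above, recall@k measures what fraction of the known-relevant documents appear in the top-k retrieved results. In practice you might compute this through an MLflow evaluation run rather than by hand; this standalone sketch just shows the arithmetic:

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int = 5) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)

# One relevant doc ("d1") of two appears in the top 3 retrieved results.
score = recall_at_k(["d3", "d1", "d7"], {"d1", "d2"}, k=3)
```

Tracking a metric like this across chunking strategies or embedding models is what turns the "select a chunking strategy based on evaluation" objective into a measurable comparison.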

Sample Questions

These questions are similar to actual question items and give you a general sense of how questions are asked on this exam. They include exam objectives as they are stated on the exam guide and give you a sample question that aligns to the objective. The exam guide lists all of the objectives that could be covered on an exam. The best way to prepare for a certification exam is to review the exam outline in the exam guide.