Salesforce Certified Data Architect Exam Guide

I’ve finally slain the dragon that is the Salesforce Certified Data Architect exam. For context, I had taken the exam twice before and failed with the same score both times! I don’t believe in monsters, but I considered this test my “boogeyman”:

  • The first time I sat for the exam at TrailheadDX 2018, I was new to the platform, and there was still a lot to learn. 
  • My second attempt was nearly 18 months later. I spent a lot of time creating and studying flashcards and trying to understand the nitty-gritty details of what I thought the exam was attempting to validate. Needless to say, getting the same score again really hurt. 

A particular highlight for me was the vast improvement in the Data Modeling portion of the exam. 

  • 2018 – 41%
  • 2019 – 50%
  • 2022 – 93%

My Exam Experience

  • The exam was much of what I expected: a lot of overlap between sections. To no one’s surprise, the exam was very close to the exam guide, so pay close attention to the objectives and understand them.
  • I was caught off-guard by some of the exam’s Heroku/Heroku Postgres questions. 
  • I wasn’t as prepared for questions specifically related to field metadata and data classification. I handled these well, but I wasn’t expecting to see them. 
  • There were quite a few questions on Big Objects: when to use them, alternatives to using them, etc., especially in the context of archival strategies. I didn’t do as well with these as I would have liked, and I “blame” the mindset of “get it out of the org and report on it elsewhere” – I don’t think any implementation I’ve been a part of chose custom Big Objects over data warehousing or replication. Of course, Salesforce exams expect you not to have a myopic view of small business vs. enterprise, and that’s my shortcoming to overcome…
  • Be prepared for questions on sharing and visibility.
  • There were more integration questions than I would have expected.
  • This exam was the first time I went back through every question to make sure I agreed with the answer; I changed a few of them. The wording of some of the questions seemed particularly tricky!

Getting Started and How to Prepare:

Large Data Volumes or bust. I would argue that the entirety of the exam focuses on Salesforce solutions at LDV scale and how to manage through loading/extracting, integration, and security (including sharing/visibility); if you can solve for LDV, lesser volumes should be simple. 


Platform App Builder: If you haven’t taken Platform App Builder, I highly suggest doing that first. One of the strategies I employed for this exam (and PAB) was studying “Data Modeling and Management” from App Builder along with “Data Modeling/Database Design”, “Data Migration”, and “Salesforce Data Management” from Data Architect. Of the four sections, I did exceptionally well on three.

Be aware: The Trailmix for Data Architect reflects the old “Data Architecture and Management Designer” exam, which means there are topics (looking at you, Heroku…) missing from it. It is a great reference, but incomplete.

Focus on Force: Get the study guide and the practice exams. I had moved away from using the study guides more recently (bad taste after failing “Data Architecture and Management Designer” the second time), but the study guide would have saved my tail with Heroku. The practice exams are great for getting accustomed to the exam question structure and for working through your thought process while selecting the appropriate answers. 


As I write this up, I feel like I just had a massive dump of adrenaline and cortisol. It has been three years since I last attempted this exam; I felt better prepared this time (…I mean, it has been three years…), and I attribute that to the time dedicated to better understanding platform capability. Coming from a developer background (especially non-Salesforce development), that was time well spent. Don’t rush it, don’t shortcut it, take the time to understand the platform, and then go for it.

Data modeling/Database Design: 25%

  • Compare and contrast various techniques and considerations for designing a data model for the Customer 360 platform. (e.g. objects, fields & relationships, object features).
  • Given a scenario, recommend approaches and techniques to design a scalable data model that obeys the current security and sharing model.
  • Compare and contrast various techniques, approaches and considerations for capturing and managing business and technical metadata (e.g. business dictionary, data lineage, taxonomy, data classification).
  • Compare and contrast the different reasons for implementing Big Objects vs Standard/Custom objects within a production instance, alongside the unique pros and cons of utilizing Big Objects in a Salesforce data model.
  • Given a customer scenario, recommend approaches and techniques to avoid data skew (record locking, sharing calculation issues, and excessive child to parent relationships).
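
To make the data skew objective a bit more concrete, here is a minimal Python sketch of how you might spot skewed parents in an export of child records. It assumes the commonly cited guideline of keeping fewer than 10,000 child records under a single parent; the function name, the sample IDs, and the idea of running this against an exported list of parent IDs are all my own illustration, not anything from the exam guide.

```python
from collections import Counter

# Assumption: the commonly cited guideline of staying under 10,000
# child records per parent record.
SKEW_THRESHOLD = 10_000


def find_skewed_parents(child_parent_ids, threshold=SKEW_THRESHOLD):
    """Return {parent_id: child_count} for parents with more than `threshold` children."""
    counts = Counter(child_parent_ids)
    return {pid: n for pid, n in counts.items() if n > threshold}


if __name__ == "__main__":
    # Hypothetical extract: one AccountId value per exported Contact row.
    contact_account_ids = ["001A"] * 12_000 + ["001B"] * 300 + ["001C"] * 10_000
    print(find_skewed_parents(contact_account_ids))
```

In this made-up extract only the parent with 12,000 children is flagged; the one sitting exactly at the threshold is not. The same counting approach would apply to ownership skew if you fed it owner IDs instead of parent IDs.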

Master Data Management: 5%

  • Compare and contrast the various techniques, approaches and considerations for implementing Master Data Management Solutions (e.g. MDM implementation styles, harmonizing & consolidating data from multiple sources, establishing data survivorship rules, thresholds & weights, leveraging external reference data for enrichment, Canonical modeling techniques, hierarchy management.)
  • Given a customer scenario, recommend and use techniques for establishing a “golden record” or “system of truth” for the customer domain in a Single Org.
  • Given a customer scenario, recommend approaches and techniques for consolidating data attributes from multiple sources. Discuss criteria and methodology for picking the winning attributes.
  • Given a customer scenario, recommend appropriate approaches and techniques to capture and maintain customer reference & metadata to preserve traceability and establish a common context for business rules 
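
The “picking the winning attributes” idea above can be sketched in a few lines of Python. This is only one possible survivorship rule (highest source weight wins, most recent update breaks ties, blanks never win); the source names, weights, and record layout are hypothetical and chosen purely for illustration.

```python
from datetime import date

# Hypothetical trust weights per source system (higher wins).
SOURCE_WEIGHTS = {"ERP": 3, "CRM": 2, "WebForm": 1}


def build_golden_record(records, weights=SOURCE_WEIGHTS):
    """Pick a winning value per attribute.

    Rule (an assumption for this sketch): highest source weight wins,
    ties are broken by the most recent update, and blank values never win.
    """
    golden = {}
    best_rank = {}
    for rec in records:
        rank = (weights.get(rec["source"], 0), rec["updated"])
        for field, value in rec["values"].items():
            if value in (None, ""):
                continue  # skip blanks so they cannot overwrite real data
            if field not in best_rank or rank > best_rank[field]:
                golden[field] = value
                best_rank[field] = rank
    return golden


if __name__ == "__main__":
    records = [
        {"source": "CRM", "updated": date(2022, 3, 1),
         "values": {"email": "a@new.example", "phone": "555-0100"}},
        {"source": "ERP", "updated": date(2021, 1, 15),
         "values": {"email": "a@old.example", "phone": ""}},
        {"source": "WebForm", "updated": date(2022, 5, 1),
         "values": {"phone": "555-0199", "title": "CTO"}},
    ]
    print(build_golden_record(records))
```

Note that with these weights the older ERP email beats the newer CRM one, which is exactly the kind of trade-off (trust vs. recency) the exam scenarios ask you to reason about.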

Salesforce Data Management: 25%

  • Given a customer scenario, recommend appropriate combination of Salesforce license types to effectively leverage standard and custom objects to meet business needs.
  • Given a customer scenario, recommend techniques to ensure data is persisted in a consistent manner. 
  • Given a scenario with multiple systems of interaction, describe techniques to represent a single view of the customer on the Salesforce platform.
  • Given a customer scenario, recommend a design to effectively consolidate and/or leverage data from multiple Salesforce instances.

Data Governance: 10%

  • Given a customer scenario, recommend an approach for designing a GDPR compliant data model. Discuss the various options to identify, classify and protect personal and sensitive information. 
  • Compare and contrast various approaches and considerations for designing and implementing an enterprise data governance program. 

Large Data Volume considerations: 20%

  • Given a customer scenario, design a data model that scales considering large data volume and solution performance.
  • Given a customer scenario, recommend a data archiving and purging plan that is optimal for customer’s data storage management needs.
  • Given a customer scenario, decide when to use virtualised data and describe virtualised data options.

Data Migration: 15%

  • Given a customer scenario, recommend appropriate techniques and methods for ensuring high data quality at load time. 
  • Compare and contrast various techniques for improving performance when migrating large data volumes into Salesforce.
  • Compare and contrast various techniques and considerations for exporting data from Salesforce.
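
One load-performance technique that tends to come up when studying for this section is ordering child records by their parent before splitting them into batches, so that rows sharing a parent mostly land in the same batch and are less likely to contend for the same parent lock across parallel batches. Below is a rough Python sketch of that idea; the function name, the `AccountId` key, and the default batch size of 10,000 are assumptions for illustration, and it does not call any Salesforce API.

```python
from itertools import islice


def batches_grouped_by_parent(rows, parent_key, batch_size=10_000):
    """Yield lists of rows, sorted by parent so siblings stay adjacent.

    Children of one parent can still straddle a batch boundary; this
    only reduces, not eliminates, cross-batch contention.
    """
    ordered = sorted(rows, key=lambda r: r[parent_key])
    it = iter(ordered)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical child rows keyed by their parent account.
    rows = [{"AccountId": x} for x in ["B", "A", "B", "A", "C"]]
    for batch in batches_grouped_by_parent(rows, "AccountId", batch_size=2):
        print([r["AccountId"] for r in batch])
```

With the tiny batch size used in the demo, the five rows come out as three batches with each parent's children kept together.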

Chris Mattison

Hi, I'm Chris: husband, father, fitness fanatic, geek, and coffee addict with close to 20 years of IT experience. I am a Salesforce Technical Architect on the #journeyToCTA.
