Tips, Tricks and Topics to study for the DP-700 Fabric Data Engineer Associate

I took the beta exam for the DP-700 Fabric Data Engineer Associate certification last week, and while I don’t know the results yet, I figured I would share my thoughts on studying for the exam, and a few notes on the topics included on the questions I received.

If you want to schedule the exam, here is the link: Exam DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric (beta) – Certifications | Microsoft Learn

Resources

In my preparation for the exam, I trawled the limited but great resources out there from community members who had already taken the test:

Andy Cutler compiled an excellent list of Docs/Learn references based on the official ‘Skills Measured’ guide: DP-700 Microsoft Fabric Data Engineering Associate – Skills Measured Resources

Kevin Chant put together a good list of resources and links: Checklist for the DP-700 beta exam for the new Microsoft Fabric Data Engineering certification – Kevin Chant

Sam Debruyn wrote an article similar to this one, with perspectives worth reading: I just took the new Fabric DP-700 Data Engineering Exam: here’s what you should know – Sam Debruyn

Ali Stoops has some good reflections as well: Microsoft Fabric Data Engineer Associate (DP-700) Beta – My Experience & Tips

Official Microsoft Collections:

And then I reviewed a bunch of Microsoft Learn articles (both based on Andy’s list above and by browsing through tens, if not hundreds, of pages). The reason: Microsoft Learn is accessible to you during the exam, so familiarity with the articles, and with the search bar, is invaluable.

In no particular order, below are some Learn articles that came in handy for me, and which I was able to look up again during the exam, through the embedded Microsoft Learn tab in the exam software:

Topics worth spending time practicing

While the study guide provides a decent overall picture of the topics in scope for the exam, I thought I’d give slightly more granular insight. Below are some of the categories of questions and topics I encountered:

Comparing and choosing the right tool for the job:

  • Evaluating which Batch Data Ingestion tool is best suited given the context (Often comparing Dataflows, Notebooks, Pipelines)
  • Evaluating which Streaming Data Ingestion tool is the best fit given the context (Often mentioning Spark Structured Streaming, Eventstreams and even Streaming Dataflows).
  • Evaluating Transformation tools for Batch Data (Pipelines, Notebooks, Dataflows, Stored Procedures)
  • Evaluating Transformation tools for Streaming Data (Notebooks, Event Processing in Eventstreams), including very specific details about how to optimize and output certain aggregations (see the Structured Streaming sketch after this list).
  • Evaluating the best place to monitor specific workloads with the least effort (e.g. the queryinsights views for Fabric Warehouse, the Monitoring Hub for Notebook runs, and more).
  • Evaluating when to use Deployment Pipelines in Fabric vs. Azure Pipelines
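
To make the streaming-transformation topic more concrete, here is a minimal sketch of a windowed aggregation with Spark Structured Streaming in a Fabric notebook. The table names, column names, and checkpoint path are hypothetical placeholders of my own, not something taken from the exam or the docs:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Stream events from a (hypothetical) Delta table in the Lakehouse.
events = spark.readStream.table("raw_device_events")

# Tumbling 5-minute average per device; the watermark bounds the state
# and allows late-arriving events to be dropped after 10 minutes.
aggregated = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "device_id")
    .agg(F.avg("temperature").alias("avg_temperature"))
)

# Append finalized windows to a Delta table; a checkpoint location is required.
query = (
    aggregated.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "Files/checkpoints/device_agg")  # hypothetical path
    .toTable("device_temperature_5min")
)
```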

Configure a Fabric Item to do a certain job:

  • How to set up Metadata-Driven Ingestion with Data Pipelines (as in, which activities you need to include, how to use and configure control tables, and how to use parameters in activities)
  • How to set up Dynamic Data Masking on Fabric Data Warehouses (knowing the keywords to apply data masking, and knowing which masking methods to use to accommodate specific masking/governance requirements).
  • How to set up row-level security in Fabric Data Warehouse.
  • How to set up Primary Keys in Data Warehouse.
  • How to configure Incremental Refresh in Pipelines, Notebooks, and Dataflows (a notebook-based sketch follows this list).
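
As an example of the notebook option for incremental refresh, here is a minimal sketch of a watermark-plus-merge pattern against Lakehouse Delta tables. The table names (sales, sales_staging), key column (sale_id), and watermark column (modified_at) are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# 1. Read the current high-water mark from the target table
#    (initial-load handling for an empty table is omitted here).
last_watermark = (
    spark.table("sales")
    .agg(F.max("modified_at").alias("wm"))
    .collect()[0]["wm"]
)

# 2. Pick up only the rows that changed since the last load.
changes = (
    spark.table("sales_staging")
    .filter(F.col("modified_at") > F.lit(last_watermark))
)

# 3. Upsert the changed rows into the target Delta table on the business key.
target = DeltaTable.forName(spark, "sales")
(
    target.alias("t")
    .merge(changes.alias("s"), "t.sale_id = s.sale_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```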

Syntax Review-style questions, asking you to fill in the blanks in code or identify which option performs the intended operation:

  • Review KQL statements (knowing the right keywords and the order of functions for creating tables, creating aggregations, filtering, displaying columns, and creating sorting/window functions).
  • Fill in the blanks when creating DAGs in Notebooks (knowing the right keywords and how to create dependencies; see the sketch after this list).
  • Distinguish between different PySpark functions, and be able to choose the right one for the desired output.
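
For the DAG questions, it helps to have actually built one. Below is a minimal sketch using notebookutils.notebook.runMultiple in a Fabric notebook; the notebook names, parameters, and timeout values are hypothetical, and the exact DAG options are best double-checked against the notebookutils documentation:

```python
# notebookutils is available by default in the Fabric Spark runtime.
DAG = {
    "activities": [
        {
            "name": "Ingest_Sales",           # unique activity name
            "path": "Ingest_Sales",           # notebook to run (hypothetical)
            "timeoutPerCellInSeconds": 120,
            "args": {"load_date": "2024-12-01"},
        },
        {
            "name": "Transform_Sales",
            "path": "Transform_Sales",
            "timeoutPerCellInSeconds": 120,
            "dependencies": ["Ingest_Sales"],      # runs only after Ingest_Sales succeeds
        },
        {
            "name": "Load_Gold",
            "path": "Load_Gold",
            "timeoutPerCellInSeconds": 120,
            "retry": 1,
            "dependencies": ["Transform_Sales"],
        },
    ],
    "timeoutInSeconds": 3600,   # overall timeout for the whole DAG
    "concurrency": 2,           # max notebooks running in parallel
}

notebookutils.notebook.runMultiple(DAG)
```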

Platform-related questions:

  • How to set up Git Integration, both to Azure DevOps and GitHub, including which pieces of information from DevOps and GitHub you need during setup, and which permissions you need in those tools (not just which permissions are needed in Fabric).
  • Shortcut Caching, and what triggers data to be loaded from cache vs from the source.
  • Data Discovery / Promotion: How to apply Endorsements and Certifications, who can do it, and what their internal hierarchy is.
  • Domains, and what can be delegated to domain administrators.

Access / Security questions:

  • Intricate scenarios about granting full/partial/limited access to Items and Workspaces.
    • What Workspace Roles grant which types of access.
    • How and when to instead share Fabric Items directly.
    • How and when to share data directly.
    • What permissions are required to create/run Deployment Pipelines.
    • What permissions are required to modify Workspace Settings and set up Git Integration.
  • Where and how to configure settings for allowing/preventing Workspace Creation and Fabric Item creation.

Happy Certification Hunting!

Also check out these other blogs:

Bulk Write-Back w. Translytical Task Flows in Microsoft Fabric / Power BI: Writing a single value back to multiple records at the same time


Fabric Quick Tips – Pushing transformation upstream with Self Service Views and Tables in Visual Queries for Lakehouses/Warehouses/SQL DB


Organizing your Microsoft Fabric Data Platform: Tags and Task Flows

