Alpha v2: Added plugins, Synapse, AI-driven policy explanations and policy evaluation API
Nov 30, 2023
PACE is picking up some speed! Here’s a summary of some of the notable new features of the past weeks:
We're adding generative AI to PACE through plugins. Please read along.
We've included a new evaluate command to validate your policies upfront and see a preview of how the policy would affect different users.
We added Azure Synapse as a data platform.
New options for building policies: global transforms and retention filters.
Lots of general improvements: increased test coverage, Renovatebot in the repo, better sequencing for the container startup, improved validation.
Let's explain the notable changes and features in more detail.
🤵 AI-driven policy generation and explanation (very alpha)
Generative AI! For real! Please don't look away in fear of marketing mumbo jumbo - PACE is about solving real problems, not a costume play. And going from "we would like to implement policy X" to actually implementing the policy in a data platform is a very real problem.
Which is why we opened our tinker shed and crafted the ability to embed ChatGPT (or really the LLM behind it) into PACE, which then acts as a smart policy assistant.
Add your API key, provide PACE with a description of the policy you would like to implement, and it generates the corresponding policy YAML. PACE can even generate sensible test data based on the table headers to help you evaluate the policy if you combine it with the new evaluate command. Please note PACE AI is designed as a human-in-the-loop system, so always check its output!
As an example, each of the following instructions returns a valid data policy:
For the group administrators, do not filter the data. For all other users, only show records with emails ending with google.com.
For the group administrators, replace the username with a fixed value of "omitted". For the group analytics, only show records for age older than 18. For all other users, use a regex pattern to only show the domain of the email field. For all users only include records that are no older than 30 days.
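To make the second instruction concrete, here is a minimal Python sketch of one possible reading of it, applied to a few sample rows. This is not PACE's generated YAML or its implementation: the field names (username, email, age, created_at), the function apply_policy, and the "first matching group wins" ordering are all assumptions for illustration.

```python
import re
from datetime import datetime, timedelta

# Hypothetical field names; the instruction itself does not define a schema.
def apply_policy(records, groups, now):
    """Return the rows a principal with the given groups would see."""
    cutoff = now - timedelta(days=30)
    visible = []
    for row in records:
        # Rule for all users: drop records older than 30 days.
        if row["created_at"] < cutoff:
            continue
        row = dict(row)  # copy so the sample data stays untouched
        if "administrators" in groups:
            # Replace the username with a fixed value.
            row["username"] = "omitted"
        elif "analytics" in groups:
            # Only show records for age older than 18.
            if row["age"] <= 18:
                continue
        else:
            # Everyone else: keep only the domain part of the email.
            row["email"] = re.sub(r"^.*@", "", row["email"])
        visible.append(row)
    return visible

now = datetime(2023, 11, 30)
sample = [
    {"username": "alice", "email": "alice@example.com", "age": 34,
     "created_at": datetime(2023, 11, 20)},
    {"username": "bob", "email": "bob@google.com", "age": 17,
     "created_at": datetime(2023, 11, 25)},
    {"username": "carol", "email": "carol@example.org", "age": 52,
     "created_at": datetime(2023, 9, 1)},
]

for groups in ({"administrators"}, {"analytics"}, {"marketing"}):
    print(sorted(groups), apply_policy(sample, groups, now))
```

Writing the rules out like this is also a handy way to check generated output by hand, in line with the human-in-the-loop advice above.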
Give it a whirl with our Data Policy Generator tutorial.
🔎 Evaluate how policies affect your data
We've added an evaluate API and CLI command to do a quick dry run of how your policies play out. Feed it a data sample, and PACE evaluates your policy and shows for each principal how it will affect the sample data. See the technical documentation for an example.
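The idea of a per-principal dry run can be sketched in a few lines of Python. This is only a conceptual illustration, not the evaluate API or CLI itself (see the technical documentation for those): the POLICY structure, the evaluate function, and the sample rows are hypothetical, and the rules are borrowed from the first example instruction in the previous section.

```python
# Hypothetical stand-in for a policy: each principal maps to a row filter.
# "others" plays the role of the fallback for all remaining users.
POLICY = {
    "administrators": lambda row: True,
    "others": lambda row: row["email"].endswith("google.com"),
}

def evaluate(policy, sample):
    """Dry run: return, per principal, the rows that principal would see."""
    return {
        principal: [row for row in sample if keep(row)]
        for principal, keep in policy.items()
    }

SAMPLE = [
    {"id": 1, "email": "ann@google.com"},
    {"id": 2, "email": "ben@example.com"},
]

for principal, rows in evaluate(POLICY, SAMPLE).items():
    print(principal, [row["id"] for row in rows])
# administrators [1, 2]
# others [1]
```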
🔌 PACE Plugins
As part of extending PACE's architecture, PACE now features plugins: small extendable pieces of (your own) logic to interface with PACE. Plugins run on top of the core of PACE and offer a structured and convenient way to add satellite logic (larger than a UDF for data, but smaller or too custom for a full API abstraction). Run a custom IAM or Authentication provider? Write a plugin. Have your own taste of LLM already running locally? Plug it in. Over time, we'll extend the plugin library and we can work with you to write a custom plugin. The data policy generator uses this plugin mechanism.
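As a rough mental model of a plugin mechanism, consider a registry that maps plugin names to pieces of custom logic. The Python sketch below is generic and hypothetical: register, invoke, and the policy_generator stub are invented for illustration and do not reflect PACE's actual plugin interface, which this post does not describe.

```python
from typing import Callable, Dict

# Hypothetical registry: plugin name -> callable that takes a payload dict.
_PLUGINS: Dict[str, Callable[[dict], dict]] = {}

def register(name):
    """Decorator that adds a function to the registry under the given name."""
    def decorator(func):
        _PLUGINS[name] = func
        return func
    return decorator

def invoke(name, payload):
    """Look up a plugin by name and run it on the payload."""
    if name not in _PLUGINS:
        raise KeyError(f"no plugin registered under {name!r}")
    return _PLUGINS[name](payload)

@register("policy_generator")
def generate_policy(payload):
    # A real plugin might call an LLM here; this stub only echoes the request.
    return {"instructions": payload["instructions"], "status": "stub"}

print(invoke("policy_generator", {"instructions": "mask all emails"}))
```

The point of such a design is that the core only needs to know the plugin's name and payload shape, so custom logic can be swapped in without touching the core.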
🧰 Build better policies with global transforms and retention filters
Bricks are to Lego what transforms and filters are to PACE: take them, combine them and layer them into the policy you want to apply. Our newest bricks include global transforms and retention filters.
With the new Global Transforms you can define reusable transformations of data. For instance, when all emails everywhere should be masked, unless someone is in customer service. Global transforms are currently tag-based, which means they leverage the existing metadata coming from your catalog or data platform. Docs.
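The email example can be sketched as follows. This is an illustrative Python model of a tag-based rule, not PACE's configuration format: the GLOBAL_TRANSFORMS and COLUMN_TAGS structures, the group name customer_service, and the masking function are assumptions.

```python
# Hypothetical rule per tag: groups in "exempt" see the raw value,
# everyone else gets the masked value.
GLOBAL_TRANSFORMS = {
    "email": {"exempt": {"customer_service"}, "mask": lambda value: "****"},
}

# Hypothetical column tags, as they might arrive from a catalog.
COLUMN_TAGS = {"contact": {"email"}, "city": set()}

def apply_global_transforms(row, groups):
    """Apply every tag-level rule that matches a column of the row."""
    result = {}
    for column, value in row.items():
        for tag in COLUMN_TAGS.get(column, set()):
            rule = GLOBAL_TRANSFORMS.get(tag)
            if rule and not (groups & rule["exempt"]):
                value = rule["mask"](value)
        result[column] = value
    return result

row = {"contact": "ann@example.com", "city": "Utrecht"}
print(apply_global_transforms(row, {"marketing"}))
print(apply_global_transforms(row, {"customer_service"}))
```

Because the rule hangs off the tag rather than a specific table, any column tagged as an email would pick it up without a per-table policy change.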
Retention filters allow you to specify a time window. This is handy, for instance, when you need temporary access to data to explore a (sensitive) set, when sharing data internally or externally, or to improve your compliance posture (e.g. when you have competing retention policies, like 90 days on customer data for web analytics but 7 years for tax obligations). Read more.
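The competing-retention example can be modeled as a time window per group. The Python sketch below is illustrative only: the group names, the RETENTION_DAYS mapping, the created_at field, and the approximation of 7 years as 7 * 365 days are assumptions, not PACE's retention filter syntax.

```python
from datetime import datetime, timedelta

# Hypothetical retention windows per group, in days (7 years approximated).
RETENTION_DAYS = {"web_analytics": 90, "tax": 7 * 365}

def within_retention(rows, group, now, timestamp_field="created_at"):
    """Keep only the rows that fall inside the group's retention window."""
    cutoff = now - timedelta(days=RETENTION_DAYS[group])
    return [row for row in rows if row[timestamp_field] >= cutoff]

now = datetime(2023, 11, 30)
rows = [
    {"id": 1, "created_at": datetime(2023, 11, 1)},
    {"id": 2, "created_at": datetime(2022, 1, 1)},
    {"id": 3, "created_at": datetime(2010, 1, 1)},
]

for group in RETENTION_DAYS:
    print(group, [row["id"] for row in within_retention(rows, group, now)])
# web_analytics [1]
# tax [1, 2]
```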