GoogleCloudPlatform/data-mesh-demo

Data Mesh on Google Cloud

Overview

This repo is a reference implementation of Data Mesh on Google Cloud. See this architecture guide for details.

The infrastructure folder contains a number of Terraform scripts that create several Google Cloud projects representing typical data mesh participants: data producers, data consumers, and central services. The scripts are stored in separate folders and are meant to be executed sequentially to illustrate the typical progression of building a data mesh.

Setup

Prerequisites

In order to run the scripts in this repo make sure you have installed:

Terraform (instructions)

If you are using Cloud Shell, all these utilities are preinstalled in your environment.

You need to create Google Cloud's Application Default Credentials using

gcloud auth application-default login

Permissions of the principal running the Terraform scripts

Typically, a person with the Owner or Editor basic role will execute the Terraform scripts to set up the infrastructure of the Data Mesh. These scripts can also be run using a service account with sufficient privileges. It is important that the principal running the scripts has the Service Account Token Creator role granted at the Domain/Org level rather than at the project level. Without it, the scripts will fail with a 403 error while creating tag templates.

To start the data mesh infrastructure build process:

cd infrastructure

Providing Terraform variables

Several steps below mention that you will need to provide Terraform variables. There are multiple ways to do this. The simplest approach is to create a short script that sets the variables you need before you start setting up the infrastructure, and which you can re-run if you need to restart your session. Typically this script will contain several statements like:

export TF_VAR_project_prefix='my-data-mesh-demo'

and can be applied to your current shell session using

source set-my-variables.sh

Note that the TF_VAR_ prefix tells Terraform that the rest of the environment variable name is the name of a Terraform variable.
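Because Terraform only sees variables that are exported into the environment, it can help to check for them before running a step. The following is a minimal sketch, not part of this repo: require_tf_vars is a hypothetical helper name, and it assumes the printenv utility is available (it is on most Linux and macOS systems).

```shell
#!/usr/bin/env bash
# Sketch only: require_tf_vars is a hypothetical helper, not a script from this repo.
# It checks that each named Terraform variable is exported as a non-empty
# TF_VAR_<name> environment variable, which is how Terraform picks it up.
require_tf_vars() {
  local missing=0 name var
  for name in "$@"; do
    var="TF_VAR_${name}"
    # printenv only reports exported variables, which is what Terraform can see.
    if [ -z "$(printenv "$var")" ]; then
      echo "Missing environment variable: ${var}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tf_vars project_prefix billing_id && terraform apply` would run the apply only when both variables are exported.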

Projects

This demo uses four projects. They can be existing projects, or you can create new ones with the provided scripts.

  • Domain base data project. It contains base BigQuery tables and materialized views.
  • Domain product project. It contains BigQuery authorized views and a Pub/Sub topic that constitute consumption interfaces exposed to data consumers (see the Consumer project, described below) as data products by this domain.
  • Central catalog project. It is the project where the data mesh-specific Data Catalog tag templates are defined.
  • Consumer project. Data from the data product is consumed from this project.

To create new projects

All new projects will be created in the same organization folder. In a real organization most likely there will be multiple folders mapping to different domains in the organization, but for simplicity in managing the newly created projects we are creating them in the same folder.

Set up the following Terraform variables:

  • project_prefix - used if creating the projects as part of the setup
  • billing_id - all projects created will be using this billing id
  • folder_id - all projects will be created in this folder
  • org_domain - all users within this domain will be given access to search data products in Data Catalog

Create a new folder and capture its folder id.

Ensure that the account used to run Terraform scripts has the following roles assigned at the folder level:

  • Folder Admin
  • Service Account Token Creator

Run these commands:

cd initial-project-setup
terraform init
terraform apply
cd ..

Now that the projects are created, their project ids can be stored in environment variables by running

source get-project-ids.sh

Using existing projects

If you pre-created the four projects we mentioned earlier and would like to use them, make sure the following Terraform variables are set:

  • consumer_project_id
  • domain_data_project_id
  • domain_product_project_id
  • central_catalog_project_id

Additionally, you need to create a service account in the Central Catalog project, give it at least the Data Catalog Admin role, and set the Terraform variable central_catalog_admin_sa to its email address. The account used to run the Terraform scripts in the central catalog project needs the Service Account Token Creator role on this service account.
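As an illustration, a variables script for the existing-projects case might look like the sketch below. All project ids and the service account name (catalog-admin) are hypothetical placeholders; only the variable names come from this README. The email address follows the standard NAME@PROJECT_ID.iam.gserviceaccount.com format for user-managed service accounts.

```shell
#!/usr/bin/env bash
# Sketch only: every value below is a placeholder, replace it with your own.
export TF_VAR_consumer_project_id='my-consumer-project'
export TF_VAR_domain_data_project_id='my-domain-data-project'
export TF_VAR_domain_product_project_id='my-domain-product-project'
export TF_VAR_central_catalog_project_id='my-central-catalog-project'

# Email of the service account created in the Central Catalog project.
# "catalog-admin" is an assumed account name, not one defined by this repo.
export TF_VAR_central_catalog_admin_sa="catalog-admin@${TF_VAR_central_catalog_project_id}.iam.gserviceaccount.com"
```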

Creating base project infrastructure

These steps can be done in any order. They simulate typical steps taken by the participants in the data mesh.

Domain data (created by data producers)

cd domain
terraform init
terraform apply
cd ..

Central catalog (created by the central services team)

cd central-catalog
terraform init
terraform apply
cd ..

Consumer project (created by a data mesh consumer)

cd consumer
terraform init
terraform apply
cd ..

Once all the projects are populated, store additional resource ids in environment variables:

source get-base-project-details.sh
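The three base steps above follow the same init/apply pattern, so they can also be driven by a small loop. The sketch below is an assumption-laden illustration, not a script from this repo: apply_stage and apply_base_stages are hypothetical helper names, the TERRAFORM override exists only to allow a dry run, and the functions must be called from the infrastructure folder. It uses Terraform's -chdir global option instead of cd.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, to be run from the infrastructure folder.
# Set TERRAFORM to another command (e.g. a stub) to dry-run the sequence.

# Run "init" and then "apply" in a single stage folder.
apply_stage() {
  local stage_dir="$1"
  "${TERRAFORM:-terraform}" -chdir="$stage_dir" init &&
    "${TERRAFORM:-terraform}" -chdir="$stage_dir" apply
}

# Apply the three base stages, stopping at the first failure.
apply_base_stages() {
  local stage
  for stage in domain central-catalog consumer; do
    apply_stage "$stage" || return 1
  done
}
```

Calling `apply_base_stages` and then sourcing get-base-project-details.sh would mirror the manual steps; each apply still prompts for confirmation as usual.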

Enabling the data mesh

The next steps show how to enable the data mesh in the organization. These steps are kept in the incremental-changes folder:

cd incremental-changes

Allow producers to tag data products and underlying data resources

The central team defined a set of tag templates that data producers use to tag data product-related resources ("product interfaces"). Allowing data producers to use the tag templates requires granting the Tag Template User role to the service account(s) used to tag these resources.

cd allow-producer-to-tag
terraform init
terraform apply
cd ..

Make product searchable

To make data interfaces searchable in Data Catalog, two things need to happen:

  • Resource metadata needs to become visible to anybody in the organization
  • Resources need to be tagged with a set of predefined tag templates

The get-catalog-ids.sh script sets several Terraform variables with the Data Catalog ids of the resources to be tagged using the data mesh-related tag templates.

You will need to set up this Terraform variable:

  • org_domain - all users within this domain will be given access to search data products in Data Catalog, e.g., example.com.

cd make-products-searchable
source get-catalog-ids.sh
terraform init
terraform apply
cd ..

Give the consumer access to a product interface

Once the consumer has discovered a product and assessed its suitability, the product owner needs to grant access to the resources which represent the interface.

cd give-access-to-event-v1
terraform init
terraform apply
cd ..

Single script to create all infrastructure

The build-all.sh script executes all the steps above. It assumes that you are creating the projects from scratch and that all the Terraform variables are provided in a bootstrap.sh script. Here's an example of such a script:

export TF_VAR_project_prefix='my-org-mesh'
export TF_VAR_org_id='<numeric org id>'
export TF_VAR_billing_id='<billing id>'
export TF_VAR_folder_id='<numeric folder id>'
export TF_VAR_org_domain='example.com'

Using Data Mesh concepts

Searching for products in Data Catalog

Data Catalog searches are performed within a scope: organization, folder, or project. Several scripts mentioned below use the organization as the scope. You need to define the TF_VAR_org_id environment variable before you can run them. You can find the id of the organization you belong to by running

gcloud organizations list

To search for data products using Data Catalog APIs, execute this script from the root of this repo:

bin/search-products.sh <search terms>

Search terms must conform to the Data Catalog search syntax. A simple search term which works for this repo is events. It will find several data products defined in the domain project that contain the term "events" in their names or descriptions.

You can "undo" the data product registration in the data mesh catalog by running this command:

terraform -chdir=infrastructure/incremental-changes/make-products-searchable destroy

A subsequent search of the catalog will return no results.

To make the products searchable again use:

terraform -chdir=infrastructure/incremental-changes/make-products-searchable apply

Also, compare this to searching the data catalog across the whole organization:

bin/search-all-org-assets.sh <search terms>

Depending on your organization and the data access permissions granted to you, you will likely see a large number of Data Catalog entries, most of which are not ready to be shared by the product owners with possible consumers.

Consuming product data

Once the data product is discovered and access to it is granted to a consumer (a service account or a user group), the product can be consumed.

Authorized views

In this case, access to the authorized BigQuery dataset has been granted to a Service Account in the consumer project.

The bin folder has several scripts to test how access works:

  • bin/populate-base-table.sh creates several records in the base table
  • bin/query-product.sh uses a service account in the consumer project to query the table. You should see results similar to these:
+---------------+
|    src_ip     |
+---------------+
| 670.345.234.1 |
| 671.345.234.1 |
| 672.345.234.1 |
| 673.345.234.1 |
| 674.345.234.1 |
+---------------+
  • bin/query-product-with-column-level-ac.sh is similar to the previous script. It shows that column level access control can be enforced in the data mesh at the base table level. This query will fail with the Access Denied... error.

When you run any of the query-product*.sh scripts, your gcloud authorization will be set to use the service account in the consumer project. This validates that the consumer's service account has the intended access to query the product data.

Use the bin/remove-consumer-sa-key.sh script to delete the service account key created to run these scripts and to make the originally active account active again.

Consuming data streams

The streaming-product folder contains instructions on how to create, discover, and consume data which is exposed as a data stream.

Cleanup

You can run terraform destroy in all the subfolders of infrastructure to revert the changes to the projects. There is a convenience destroy-all.sh script that will do it for you.

If you have created new projects for this demo you can just delete these projects.
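If you prefer to tear things down by hand, one reasonable approach is to destroy the folders in the reverse of the order in which they were applied, so that later steps are removed before the resources they depend on. The sketch below illustrates that idea only; it is not the content of destroy-all.sh. destroy_stages is a hypothetical helper, the TERRAFORM override exists only for dry runs, and folder names are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical helper, not the repo's destroy-all.sh.
# Takes stage folders in the order they were applied and destroys them
# in reverse order. Folder names must not contain whitespace.
destroy_stages() {
  local reversed="" stage
  # Prepend each argument so the list ends up reversed.
  for stage in "$@"; do
    reversed="$stage $reversed"
  done
  # Intentionally unquoted: split the reversed list back into folder names.
  for stage in $reversed; do
    "${TERRAFORM:-terraform}" -chdir="$stage" destroy || return 1
  done
}
```

Run from the infrastructure folder, a call such as `destroy_stages domain central-catalog consumer incremental-changes/allow-producer-to-tag` would destroy the tagging step first and the domain folder last.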
