Policy decision layer: production-ready Rego policy definitions v1

As a user of the SDP, I want to be able to manage my authorization policies in a fairly simple, maintainable and flexible way.

## Current state

Currently we offer the UserInfoFetcher as well as OPA authorizers for a few products, but we do not have any guidance on how to actually write policies.

## Expected outcome
Outcomes can be RegoRule templates that we recommend users take as a starting point for their own rules. ~~It could also be a framework or library of RegoRules that we ship with the platform.~~ We should also have a demo that showcases this. While working on this, we should also learn more about how to write sensible rules for the products, and find out what common policy definitions might look like.

## Step 1: Spikes, gather knowledge - plain OPA (no k8s)

We do not yet know enough about the products and their authorization models. We first want to spike some policies for each product to get a better understanding of how they work, and then see what we can abstract away. For now, we are starting with **HDFS** and **Trino**. We also want a demo scenario that we can use as a reference when thinking about authorization and what we need to model.

What should the Rego data structures look like? We want to go in with few preconceptions and think about what works best for each product. For example, for Trino we found it useful to let the user specify a data structure similar to [file-based access control](https://trino.io/docs/current/security/file-system-access-control.html). The policies should support assigning access to _individual users_ and _groups_. Users can model their organization in groups.
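To make the idea of a rule list with user and group matching concrete, here is a minimal sketch in Python rather than Rego. It assumes a first-match-wins list of catalog rules with optional regex fields, loosely modeled on Trino's file-based access control. The names `CatalogRule`, `catalog_access` and `RULES` are hypothetical and do not come from the actual Stackable rules.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CatalogRule:
    # Hypothetical rule shape: omitted regex fields match everything.
    allow: str                     # "all", "read-only" or "none"
    user: Optional[str] = None     # regex on the user name
    group: Optional[str] = None    # regex on any of the user's groups
    catalog: Optional[str] = None  # regex on the catalog name


def _matches(pattern: Optional[str], value: str) -> bool:
    return pattern is None or re.fullmatch(pattern, value) is not None


def catalog_access(rules: List[CatalogRule], user: str,
                   groups: List[str], catalog: str) -> str:
    """Return the access level of the first matching rule, else "none"."""
    for rule in rules:
        group_ok = rule.group is None or any(
            re.fullmatch(rule.group, g) for g in groups
        )
        if _matches(rule.user, user) and group_ok and _matches(rule.catalog, catalog):
            return rule.allow
    return "none"  # default deny when no rule matches


# Example rule set: one individual user, one group, one catch-all deny.
RULES = [
    CatalogRule(user="admin", allow="all"),
    CatalogRule(group="analysts", catalog="lakehouse", allow="read-only"),
    CatalogRule(catalog="system", allow="none"),
]
```

With this shape, access for an individual is a rule with a `user` field, access for a department is a rule with a `group` field, and rule order decides which one wins.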

```[tasklist]
### Tasks
- [ ] https://github.com/stackabletech/opa-operator/pull/522
- [ ] https://github.com/stackabletech/issues/issues/500
- [ ] https://github.com/stackabletech/issues/issues/523
```

Questions that we should answer for each product:
- Policy definition structure:
	- How can I assign permissions to individual users or to a department, and how can I model my organizational structure in policy?
	- Can we simplify or harmonize different authz models between products, to make it easier for users to write policies that span multiple products?
- How can the policies support multi-tenancy? I.e. some rules are shared between multiple product installations, and some are specific. How to avoid duplication?
- How to support custom attributes (ABAC model)?
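One possible answer to the multi-tenancy and ABAC questions above, sketched in Python as a thought experiment rather than a proposal: keep a shared rule list, let each installation prepend its own rules, and let a rule optionally require subject attributes. All names here (`effective_rules`, `first_match`, the `trino-eu` installation, the `region` attribute) are hypothetical.

```python
from typing import Any, Dict, List

Rule = Dict[str, Any]
Subject = Dict[str, Any]


def effective_rules(shared: List[Rule],
                    per_installation: Dict[str, List[Rule]],
                    installation: str) -> List[Rule]:
    """Installation-specific rules come first so they win under first-match."""
    return list(per_installation.get(installation, [])) + list(shared)


def first_match(rules: List[Rule], subject: Subject, resource: str) -> bool:
    """Return the decision of the first applicable rule, deny by default."""
    for rule in rules:
        if rule["resource"] != resource:
            continue
        group = rule.get("group")
        if group is not None and group not in subject.get("groups", []):
            continue
        # ABAC part: every required attribute must match exactly.
        required = rule.get("attributes", {})
        attrs = subject.get("attributes", {})
        if all(attrs.get(k) == v for k, v in required.items()):
            return rule["allow"]
    return False


# Shared by all installations: analysts may read the sales data.
SHARED = [{"resource": "sales", "group": "analysts", "allow": True}]

# Only the (hypothetical) EU installation restricts sales to EU subjects.
PER_INSTALLATION = {
    "trino-eu": [
        {"resource": "sales", "attributes": {"region": "eu"}, "allow": True},
        {"resource": "sales", "allow": False},
    ],
}

alice = {"groups": ["analysts"], "attributes": {"region": "us"}}
bob = {"groups": [], "attributes": {"region": "eu"}}
```

In this sketch the shared rules are written once, and an installation only lists what differs, which is one way to avoid duplication. Whether prepend-and-first-match is the right merge strategy is exactly the kind of question the spikes should answer.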

For each product there is an OPA authorizer and we know the input that we get from the authorizer. Policy definitions should simply be tested in pure Rego.

UserInfoFetcher - we do not want to use the UIF yet. We can simply mock the UIF API.
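The mocking idea can be illustrated with a small Python sketch: the policy takes the user-info lookup as a parameter, so a test can pass in a fake instead of calling the real service. In Rego the same effect is usually achieved by overriding `input` or `data` with the `with` keyword in tests. The response shape used here (`groups`, `customAttributes`) and the names `is_allowed` and `fake_user_info` are assumptions for illustration, not the confirmed UIF API.

```python
from typing import Any, Callable, Dict

UserInfo = Dict[str, Any]


def is_allowed(username: str, required_group: str,
               fetch_user_info: Callable[[str], UserInfo]) -> bool:
    """Allow when the looked-up user is a member of the required group."""
    info = fetch_user_info(username)
    return required_group in info.get("groups", [])


def fake_user_info(username: str) -> UserInfo:
    """Stand-in for the UserInfoFetcher: a fixed in-memory directory."""
    directory = {
        "alice": {"groups": ["analysts"], "customAttributes": {"region": "eu"}},
        "bob": {"groups": ["engineers"], "customAttributes": {}},
    }
    # Unknown users get an empty record instead of an error.
    return directory.get(username, {"groups": [], "customAttributes": {}})
```

Because the lookup is injected, the policy logic can be tested without an identity provider, and the fake can later be swapped for the real UIF call without touching the rules.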


```[tasklist]
### Tasks
- [ ] https://github.com/stackabletech/issues/issues/524
- [x] Look at all the RegoRules. What is similar, what can be common? Can group/role definitions be abstracted away and shared? Are the authz models between products compatible? --> We moved all of this into the abstraction layer
- [x] Document this knowledge in our [internal knowledge base](https://app.nuclino.com/Stackable/Engineering/Authorization-Mechanisms-across-platform-products-210a0804-4f2c-44de-afa8-5be65553967d). This should form the basis for an abstraction
- [x] ~~ADR~~ -> not needed here, only with the abstraction layer
```

### Intermediate Acceptance criteria

- RegoRules for Trino and HDFS that are usable in a realistic user scenario are written and committed somewhere
- There is knowledge base documentation about how the authorization models of Trino, Druid and Kafka work, including a document that covers commonalities and potential abstractions: [here](https://app.nuclino.com/Stackable/Engineering/Authorization-Mechanisms-across-platform-products-210a0804-4f2c-44de-afa8-5be65553967d) :heavy_check_mark:

## Step 2: Build a demo to showcase the rules (and other context: Kerberos, OpenID, UserInfoFetcher)


```[tasklist]
### Tasks
- [x] Update the demo to use the latest Trino rules
- [x] Implement some nice demo Trino rules
- [x] Add a Job to move the TPC-DS data into HDFS
- [x] Add some views on top of the HDFS data in Trino to make the 'ugly' parts of TPC-DS nicer (also shows how to deal with 'legacy' datasets like that)
- [x] Add nice Superset dashboards
- [x] stretch goal: Superset authorization (without OPA)
- [x] Add a Spark Job (https://github.com/stackabletech/issues/issues/530)
- [x] remove temporary 'hack' folder again: https://github.com/stackabletech/opa-operator/tree/main/hack - but add a similar diagram to the documentation of the demo -> https://github.com/stackabletech/opa-operator/pull/560
```

## Step 3: Deployment on the customer side

For now, since we only have two rule sets and no abstraction layer, we want to keep the rules as something users can deploy by themselves, and not automate the deployment. We can come back to automated deployment once we build an abstraction layer.

However, the rules are still good starting points for customers, so we should publish them for users to build on.
We want to keep the source of truth in the kuttl tests, and link to them from the documentation. There should be some explanatory documentation around the rules as well.

```[tasklist]
### Tasks
- [ ] https://github.com/stackabletech/trino-operator/issues/580
- [ ] https://github.com/stackabletech/hdfs-operator/issues/516
- [ ] https://github.com/stackabletech/opa-operator/issues/422
- [ ] https://github.com/stackabletech/opa-operator/issues/558
- [ ] https://github.com/stackabletech/opa-operator/pull/557
- [ ] https://github.com/stackabletech/demos/pull/86
- [ ] https://github.com/stackabletech/opa-operator/issues/617
```

## Follow-up work

- Abstraction layer: https://github.com/stackabletech/issues/issues/439
- Question: What is our recommended process for developing policies? Versioning, reviewing and testing policies?

```[tasklist]
### Related tasks but out of scope for now
- [ ] https://github.com/stackabletech/issues/issues/497
- [ ] Write a RegoRule set for Kafka
- [ ] https://github.com/stackabletech/hbase-operator/issues/488
```

