6 comments

  • cjonas 23 minutes ago
    We just create mini data "ponds" on the fly by copying tenant-isolated gold-tier data to Parquet in S3. The user/agent queries are executed with DuckDB. We run this process when the user starts a session and generate an STS token scoped to their tenant bucket path. It's extremely simple and works well (at least with our data volumes).
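
    A minimal sketch of what "scoped to their tenant bucket path" could look like, assuming the scoping is done with an IAM session policy. The function name, bucket name, and prefix layout are hypothetical and not from the comment above; only the policy document format is standard.

```python
import json


def tenant_session_policy(bucket: str, tenant_prefix: str) -> str:
    """Build an IAM session policy that limits S3 reads to one tenant prefix.

    Hypothetical helper: bucket and prefix layout are assumptions.
    """
    prefix = tenant_prefix.strip("/")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow reading objects only under this tenant's prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                # Allow listing the bucket, restricted to the same prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }
    return json.dumps(policy)
```

    The resulting string would typically be passed as the `Policy` parameter of an STS AssumeRole call, so the temporary credentials can do no more than both the role and this policy allow. Those credentials are then what the query engine uses to read the tenant's Parquet files.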
    • boundlessdreamz 5 minutes ago
      How do you copy all the relevant data? Doesn't this create unnecessary load on your source DB?
    • Waterluvian 12 minutes ago
      Is that why it’s called DuckDb? Because data ponds?
  • zie 1 hour ago
    We do the same thing: every employee can access our main financial/back office SQL database, but we just use PostgreSQL with row-level security[0]. We never bothered to complicate it like the post does.

    0: https://www.postgresql.org/docs/18/ddl-rowsecurity.html

    • staticassertion 8 minutes ago
      I'd be so uncomfortable with this. It sounds like you're placing the full burden of access on a single boundary. I mean, maybe there's more to it that you haven't spoken about here, but "everything rests on this one postgres feature" is an unacceptably unsafe state to me.
      • weird-eye-issue 3 minutes ago
        It's not like RLS is just some random feature they are misusing. It's specifically for security and is absolutely reliable. Maybe you should do a bit more research before making comments like this.
    • orf 1 hour ago
      Back office, employee access is a completely different problem to what is described in the post.

      How do you enforce tenant isolation with that method, or prevent unbounded table reads?

      • tossandthrow 1 hour ago
        They likely don't need tenant isolation, and unbounded table reads can be mitigated using timeouts.

        We do something similar for our backoffice - just with the difference that it is Claude that has full freedom to write queries.
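
        A rough sketch of the timeout idea, for illustration only. It uses Python's built-in `sqlite3` as a stand-in so it runs without a server; in PostgreSQL the usual equivalent would be the `statement_timeout` setting. The helper name `run_with_timeout` is hypothetical.

```python
import sqlite3
import time


def run_with_timeout(conn: sqlite3.Connection, sql: str, timeout_s: float):
    """Run a query and abort it if it runs longer than timeout_s seconds."""
    deadline = time.monotonic() + timeout_s

    def check_deadline() -> int:
        # A non-zero return value tells SQLite to interrupt the running query.
        return 1 if time.monotonic() > deadline else 0

    # Call check_deadline every 1000 SQLite virtual machine instructions.
    conn.set_progress_handler(check_deadline, 1000)
    try:
        return conn.execute(sql).fetchall()
    finally:
        # Remove the handler so later queries on this connection are unaffected.
        conn.set_progress_handler(None, 0)
```

        A query that exceeds the limit is interrupted and surfaces as `sqlite3.OperationalError`, which the caller can report back to the user instead of letting the scan run unbounded.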

      • weird-eye-issue 6 minutes ago
        RLS...
  • jelder 50 minutes ago
    We did this with MotherDuck, and without introducing a new language. Every tenant has their own isolated storage and compute, so it’s trivial to grant internal users access to specific tenants as needed. DuckDB’s SQL dialect is mostly just Postgres’ with some nice ergonomic additions and a host of extra functionality.
    • raw_anon_1111 46 minutes ago
      This is explicitly not the problem they are trying to solve. In a single-tenant database you, by definition, don’t have to worry about multi-tenancy.
      • DangitBobby 33 minutes ago
        I guess the question then becomes, what problem does a multi-tenancy setup solve that an isolated database setup doesn't? Are they really not solving the same problem from a user perspective, or is it only from their own engineering perspective? And how do those decisions ultimately impact the product they can surface to users?
        • raw_anon_1111 11 minutes ago
          Off the top of my head, managing 100 different database instances takes a lot more work from the business standpoint than managing 1 database with 100 users.

          The article also mentioned that they isolate by project_id. That implies one customer (assume a business) can isolate permissions more granularly.

  • senorrib 1 hour ago
    Reasons 1-3 could very well be done with ClickHouse policies (RLS) and good data warehouse design. In fact, that’s more secure than a compiler adding a WHERE to a query run by an almighty user.

    Reason 4 is probably an improvement, but could probably be done with CH functions.

    The problem with custom DSLs like this is that they trade off a massive ecosystem for very little benefit.

    • efromvt 53 minutes ago
      As long as you don't deviate too much from ANSI, I think the 'light SQL DSL' approach has a lot of pros when you control the UX. (So UIs, in particular, are fantastic for this approach, which is what they seem to be targeting with queries and dashboards.) It's more of a product experience; tables are a terrible product surface to manage.

      Agreed with the ecosystem cons getting much heavier as you move outside the product surface area.

  • elnatro 31 minutes ago
    New to ClickHouse here. Would you think this kind of database has a niche when compared to a usual RDBMS like MySQL or PostgreSQL?
  • baalimago 23 minutes ago
    The evolution of this is to use agents, and have users "chat with the data".
    • mattaitken 14 minutes ago
      Yes, you can actually do this already because we expose a REST API and TypeScript SDK functions to execute the queries.