Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to properly UNNEST a JSON Array? #2933

Open
christophediprima opened this issue Jul 24, 2024 · 4 comments
Open

How to properly UNNEST a JSON Array? #2933

christophediprima opened this issue Jul 24, 2024 · 4 comments
Assignees
Labels
Project: UI Issues related to creating the UI

Comments

@christophediprima
Copy link

Hi there,

What is the proper way to UNNEST a JSON array? I have following JSON structure saved in a raw column of my table :

{
    "post_id": "1",
    "mentioned_users": [
        {
            "user_id": "B"
        },
        {
            "user_id": "C"
        }
    ]
}

I would like to convert it in a table like like so:

post_id mentioned_user_id
1 B
1 C

I have something like this that works:

MODEL (
  name mentions.full_model,
  kind FULL,
  cron '@daily',
  grain ('post_id', 'mentioned_user_id')
);

SELECT  
  raw.post_id AS `post_id`,
  mentioned_user.user_id as `mentioned_user_id`
FROM posts.seed_model AS raw, 
  UNNEST(CAST(JSON_EXTRACT(raw, '$.mentioned_users') AS ARRAY<STRUCT<user_id STRING>>)) AS mentioned_user

But the lineage is not working properly for the mentioned_user_id column. When I select it the column greys out and a new mentioned_user columns appear with the lineage on the UNNEST line. Is this the best I can get?

@tobymao
Copy link
Contributor

tobymao commented Jul 24, 2024

what dialect is this?

@christophediprima
Copy link
Author

Idealy BigQuery as we would like this to be portable and eventualy use Apache Beam to handle our workflows. But we may end up using DuckDb dialect if it makes no sense or it is not possible to create those models from Beam.

@tobymao
Copy link
Contributor

tobymao commented Jul 27, 2024

so i was able to get column level lineage to mostly work. in this case, i think you do json_extract(raw.raw, ...) it will help the compiler find the right reference. mentioned_user column is basically the unnest node.

@vchan there's some issue here with visualization of the lineage with unnest

@christophediprima
Copy link
Author

christophediprima commented Aug 12, 2024

Thanks for your reply and sorry for my late answer (I was vacationing ^^).

I have been inspecting the /api/lineage calls the UI sends to the server and the responses look great. So yeah looks like the issue is on the visualization side.

@tobymao tobymao added the Project: UI Issues related to creating the UI label Aug 14, 2024
@mykalmax mykalmax self-assigned this Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Project: UI Issues related to creating the UI
Projects
None yet
Development

No branches or pull requests

3 participants