Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compilation error when calling stage_external_sources from a prehook in a model config #324

Closed
1 of 3 tasks
hugoerolex opened this issue Nov 21, 2024 · 3 comments
Closed
1 of 3 tasks
Labels
bug Something isn't working triage

Comments

@hugoerolex
Copy link

Describe the bug

When calling the stage_external_sources macro from a prehook in a model config, a compilation error is raised when running a dbt build command.
Note that executing the macro through the dbt run-operation command works fine.

Steps to reproduce

Create a model and write the following minimal configuration :

{{ config(
    materialized="view",
    pre_hook="{{ dbt_external_tables.stage_external_sources() }}"
) }}

Expected results

External tables defined in my source.yml are created if not exist / refreshed if exist

Actual results

Compilation error

Screenshots and log output

Server logs summary :

10:05:21 Running dbt...
10:05:22 1 of 26 START external source schema.external_table
10:05:22 1 of 26 (1) create schema if not exists                             database.sch...  
10:05:22 Encountered an error:
Compilation Error in model XXX (path/to/model/XXX.sql)
  'None' has no attribute 'keys'

Server logs details :

11:08:40 Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197fdbd450>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197fdf4c90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197fb61010>]}
11:08:40 Running dbt...
11:08:40 running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': '/usr/src/dbt-server-shared/.dbt', 'debug': 'False', 'version_check': 'True', 'log_path': '/usr/src/dbt-server-shared/working_dir/28f6fca5-dba7-4bbf-ba06-034e56bea7ee', 'fail_fast': 'False', 'warn_error': 'None', 'use_colors': 'True', 'use_experimental_parser': 'False', 'empty': 'None', 'quiet': 'False', 'no_print': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'introspect': 'True', 'invocation_command': 'dbt -A dbt_worker.app worker --loglevel=DEBUG --concurrency=2 --max-memory-per-child=500000', 'target_path': 'None', 'log_format': 'json', 'send_anonymous_usage_stats': 'True'}
11:08:40 Discovered Exposures:
[]
11:08:40 Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '97aa710e-3d9b-447b-9163-cc643e72e51d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d1977557d10>]}
11:08:40 Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '97aa710e-3d9b-447b-9163-cc643e72e51d', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197f9d7a10>]}
11:08:40 Registered adapter: snowflake=1.9.0-post10+a16654bed4d67d1a922bf87e579a285c35bd55ac
11:08:40 checksum: 007e558dd5cbf66720904e41fb9ea366a52729d1df24383e6687de4655d0a2a1, vars: {}, profile: user, target: , version: 2024.11.20+a99d1be
11:08:40 Observability Metric: msgpack_manifest_bytes=2442424.0
11:08:40 Partial parsing enabled: 0 files deleted, 0 files added, 1 files changed.
11:08:40 Partial parsing: updated file: project://models/path/to/model/XXX.sql
11:08:40 1 of 26 START external source schema.external_table
11:08:40 1 of 26 (1) create schema if not exists                             database.sch...  
11:08:40 Encountered an error:
Compilation Error in model XXX (path/to/model/XXX.sql)
  'None' has no attribute 'keys'
11:08:40 Resource report: {"command_name": "parse", "command_wall_clock_time": 0.28523365, "process_user_time": 26.76897, "process_kernel_time": 4.135137, "process_mem_max_rss": "332524", "process_out_blocks": "338448", "command_success": false, "process_in_blocks": "0"}
11:08:40 Observability Metric: command_success=0.0
11:08:40 Observability Metric: command_wall_clock_time=0.28523364663124084
11:08:40 Observability Metric: process_user_time=26.768970489501953
11:08:40 Observability Metric: process_kernel_time=4.13513708114624
11:08:40 Observability Metric: process_mem_max_rss=332524.0
11:08:40 Command `cli parse` failed at 11:08:40.876994 after 0.29 seconds
11:08:40 Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d19812e8d10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197f9aa4d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7d197fa8c350>]}

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/codegen
    version: 0.12.1
  - package: dbt-labs/dbt_utils
    version: 1.1.1  
  - package: dbt-labs/dbt_project_evaluator
    version: 0.12.1
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]
  - package: dbt-labs/dbt_external_tables
    version: 0.10.0

Which database are you using dbt with?

  • redshift
  • snowflake
  • other (specify: ____________)

The output of dbt --version:
dbt cloud versionless

The operating system you're using:
dbt cloud running in Chrome on Windows

@hugoerolex hugoerolex added bug Something isn't working triage labels Nov 21, 2024
@jeremyyeo
Copy link
Collaborator

jeremyyeo commented Nov 22, 2024

Hey Hugo,

Just doing a quick sense check...

# packages.yml
packages:
  - package: dbt-labs/dbt_external_tables
    version: 0.10.0

# dbt_project.yml
name: my_dbt_project
profile: all
config-version: 2
version: 1.0

models:
  my_dbt_project:
    +materialized: table

# models/sources.yml
version: 2
sources:
  - name: raw
    database: db
    schema: dbt_jyeo
    loader: S3
    tables:
      - name: ext_people
        external:
          location: "@dbt_test.dbt_external_tables_integration_tests_snowflake.dbt_external_tables_testing/csv"
          file_format: "(type = csv skip_header = 1)"
        columns: &people-cols
          - name: id
            data_type: int
          - name: first_name
            data_type: varchar(64)
          - name: last_name
            data_type: varchar(64)
          - name: email
            data_type: varchar(64)
-- models/new_tbl.sql
{{
    config(
        pre_hook = "{{ dbt_external_tables.stage_external_sources() }}"
    )
}}

select 1 id

First things first, delete the external table if it exist so we're starting from scratch:

-- run in snowflake
drop table db.dbt_jyeo.ext_people if exists;
select * from db.dbt_jyeo.ext_people;
-- Object 'DB.DBT_JYEO.EXT_PEOPLE' does not exist or not authorized.

Now, we build:

$ dbt --debug run
01:45:09  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1069d08d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106dcaa10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106dcb450>]}
01:45:09  Running with dbt=1.9.0-b4
01:45:09  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'fail_fast': 'False', 'debug': 'True', 'log_path': '/Users/jeremy/git/dbt-basic/logs', 'profiles_dir': '/Users/jeremy/.dbt', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'empty': 'False', 'quiet': 'False', 'no_print': 'None', 'log_format': 'default', 'static_parser': 'True', 'introspect': 'True', 'invocation_command': 'dbt --debug run', 'target_path': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'send_anonymous_usage_stats': 'True'}
01:45:10  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x127497290>]}
01:45:10  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10657b350>]}
01:45:10  Registered adapter: snowflake=1.9.0-b1
01:45:10  checksum: 0d3acddb5afbce98f9888b13248063cae3d09e6e329263e3ffe4b2a6aa072c8d, vars: {}, profile: , target: , version: 1.9.0b4
01:45:10  Partial parsing enabled: 0 files deleted, 0 files added, 1 files changed.
01:45:10  Partial parsing: updated file: my_dbt_project://models/sources.yml
01:45:10  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x15ab5c9d0>]}
01:45:10  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:45:10  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:45:10  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x15ae8ce90>]}
01:45:10  Found 1 model, 1 source, 518 macros
01:45:10  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x15ab5cc10>]}
01:45:10  
01:45:10  Concurrency: 1 threads (target='sf')
01:45:10  
01:45:10  Acquiring new snowflake connection 'master'
01:45:10  Acquiring new snowflake connection 'list_db'
01:45:10  Using snowflake connection "list_db"
01:45:10  On list_db: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db"} */
show terse schemas in database db
    limit 10000
01:45:10  Opening a new connection, currently in state init
01:45:13  SQL status: SUCCESS 2 in 2.376 seconds
01:45:13  Re-using an available connection from the pool (formerly list_db, now list_db_dbt_jyeo)
01:45:13  Using snowflake connection "list_db_dbt_jyeo"
01:45:13  On list_db_dbt_jyeo: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db_dbt_jyeo"} */
show objects in db.dbt_jyeo limit 10000;
01:45:13  SQL status: SUCCESS 2 in 0.396 seconds
01:45:13  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x15ae54410>]}
01:45:13  Began running node model.my_dbt_project.new_tbl
01:45:13  1 of 1 START sql table model dbt_jyeo.new_tbl .................................. [RUN]
01:45:13  Re-using an available connection from the pool (formerly list_db_dbt_jyeo, now model.my_dbt_project.new_tbl)
01:45:13  Began compiling node model.my_dbt_project.new_tbl
01:45:13  Writing injected SQL for node "model.my_dbt_project.new_tbl"
01:45:13  Began executing node model.my_dbt_project.new_tbl
01:45:13  1 of 1 START external source dbt_jyeo.ext_people
01:45:13  Using snowflake connection "model.my_dbt_project.new_tbl"
01:45:13  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
show terse schemas like 'dbt_jyeo' in database db limit 1;
01:45:13  SQL status: SUCCESS 1 in 0.350 seconds
01:45:13  1 of 1 (1) select 'Schema dbt_jyeo exists' from dual;
01:45:13  Using snowflake connection "model.my_dbt_project.new_tbl"
01:45:13  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
select 'Schema dbt_jyeo exists' from dual;
01:45:14  SQL status: SUCCESS 1 in 0.328 seconds
01:45:14  1 of 1 (1) SUCCESS 1
01:45:14  1 of 1 (2) create or replace external table db.dbt_jyeo.ext_people(                id int a...  
01:45:14  Using snowflake connection "model.my_dbt_project.new_tbl"
01:45:14  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create or replace external table db.dbt_jyeo.ext_people(
                id int as ((case when is_null_value(value:c1) or lower(value:c1) = 'null' then null else value:c1 end)::int),
                first_name varchar(64) as ((case when is_null_value(value:c2) or lower(value:c2) = 'null' then null else value:c2 end)::varchar(64)),
                last_name varchar(64) as ((case when is_null_value(value:c3) or lower(value:c3) = 'null' then null else value:c3 end)::varchar(64)),
                email varchar(64) as ((case when is_null_value(value:c4) or lower(value:c4) = 'null' then null else value:c4 end)::varchar(64))
        )
    location = @dbt_test.dbt_external_tables_integration_tests_snowflake.dbt_external_tables_testing/csv 
    
    
    
    
    
    file_format = (type = csv skip_header = 1)
    
;
01:45:15  SQL status: SUCCESS 1 in 1.620 seconds
01:45:15  1 of 1 (2) SUCCESS 1
01:45:15  Writing runtime sql for node "model.my_dbt_project.new_tbl"
01:45:15  Using snowflake connection "model.my_dbt_project.new_tbl"
01:45:15  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create or replace transient table db.dbt_jyeo.new_tbl
         as
        (

select 1 id
        );
01:45:17  SQL status: SUCCESS 1 in 1.295 seconds
01:45:17  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': 'a5d38a67-5d4d-402d-9940-4efdfeb38682', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x15b4543d0>]}
01:45:17  1 of 1 OK created sql table model dbt_jyeo.new_tbl ............................. [SUCCESS 1 in 3.74s]
01:45:17  Finished running node model.my_dbt_project.new_tbl
01:45:17  Connection 'master' was properly closed.
01:45:17  Connection 'model.my_dbt_project.new_tbl' was left open.
01:45:17  On model.my_dbt_project.new_tbl: Close
01:45:17  
01:45:17  Finished running 1 table model in 0 hours 0 minutes and 7.18 seconds (7.18s).
01:45:17  Command end result
01:45:17  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:45:17  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:45:17  Wrote artifact RunExecutionResult to /Users/jeremy/git/dbt-basic/target/run_results.json
01:45:17  
01:45:17  Completed successfully
01:45:17  
01:45:17  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
01:45:17  Resource report: {"command_name": "run", "command_success": true, "command_wall_clock_time": 8.062279, "process_user_time": 2.621293, "process_kernel_time": 1.596588, "process_mem_max_rss": "203014144", "process_in_blocks": "0", "process_out_blocks": "0"}
01:45:17  Command `dbt run` succeeded at 14:45:17.822465 after 8.06 seconds
01:45:17  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106ddba90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x102511c10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10244ba10>]}
01:45:17  Flushing usage events
01:45:19  An error was encountered while trying to flush usage events

Do a subsequent run to make sure that this works even when the external table was created from the above build:

$ dbt --debug run
01:45:58  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x104f98b90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x104fde7d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x104fe6690>]}
01:45:58  Running with dbt=1.9.0-b4
01:45:58  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'version_check': 'True', 'profiles_dir': '/Users/jeremy/.dbt', 'log_path': '/Users/jeremy/git/dbt-basic/logs', 'debug': 'True', 'fail_fast': 'False', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'introspect': 'True', 'invocation_command': 'dbt --debug run', 'target_path': 'None', 'log_format': 'default', 'send_anonymous_usage_stats': 'True'}
01:45:58  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106fa4cd0>]}
01:45:58  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10495b250>]}
01:45:58  Registered adapter: snowflake=1.9.0-b1
01:45:58  checksum: 0d3acddb5afbce98f9888b13248063cae3d09e6e329263e3ffe4b2a6aa072c8d, vars: {}, profile: , target: , version: 1.9.0b4
01:45:58  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
01:45:58  Partial parsing enabled, no changes found, skipping parsing
01:45:58  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12bc35210>]}
01:45:58  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:45:58  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:45:58  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12bd39a10>]}
01:45:58  Found 1 model, 1 source, 518 macros
01:45:58  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106fb6d10>]}
01:45:58  
01:45:58  Concurrency: 1 threads (target='sf')
01:45:58  
01:45:58  Acquiring new snowflake connection 'master'
01:45:58  Acquiring new snowflake connection 'list_db'
01:45:58  Using snowflake connection "list_db"
01:45:58  On list_db: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db"} */
show terse schemas in database db
    limit 10000
01:45:58  Opening a new connection, currently in state init
01:46:00  SQL status: SUCCESS 2 in 2.049 seconds
01:46:00  Re-using an available connection from the pool (formerly list_db, now list_db_dbt_jyeo)
01:46:00  Using snowflake connection "list_db_dbt_jyeo"
01:46:00  On list_db_dbt_jyeo: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db_dbt_jyeo"} */
show objects in db.dbt_jyeo limit 10000;
01:46:01  SQL status: SUCCESS 3 in 0.372 seconds
01:46:01  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12b8bfe50>]}
01:46:01  Began running node model.my_dbt_project.new_tbl
01:46:01  1 of 1 START sql table model dbt_jyeo.new_tbl .................................. [RUN]
01:46:01  Re-using an available connection from the pool (formerly list_db_dbt_jyeo, now model.my_dbt_project.new_tbl)
01:46:01  Began compiling node model.my_dbt_project.new_tbl
01:46:01  Writing injected SQL for node "model.my_dbt_project.new_tbl"
01:46:01  Began executing node model.my_dbt_project.new_tbl
01:46:01  1 of 1 START external source dbt_jyeo.ext_people
01:46:01  1 of 1 (1) alter external table db.dbt_jyeo.ext_people refresh;        commit;
01:46:01  Using snowflake connection "model.my_dbt_project.new_tbl"
01:46:01  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
BEGIN
01:46:01  SQL status: SUCCESS 1 in 0.404 seconds
01:46:01  Using snowflake connection "model.my_dbt_project.new_tbl"
01:46:01  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
alter external table db.dbt_jyeo.ext_people refresh;
01:46:02  SQL status: SUCCESS 0 in 0.645 seconds
01:46:02  Using snowflake connection "model.my_dbt_project.new_tbl"
01:46:02  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
COMMIT
01:46:02  SQL status: SUCCESS 1 in 0.410 seconds
01:46:02  1 of 1 (1) SUCCESS 0
01:46:02  Writing runtime sql for node "model.my_dbt_project.new_tbl"
01:46:02  Using snowflake connection "model.my_dbt_project.new_tbl"
01:46:02  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create or replace transient table db.dbt_jyeo.new_tbl
         as
        (

select 1 id
        );
01:46:03  SQL status: SUCCESS 1 in 1.020 seconds
01:46:03  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '0c780a08-b282-43f1-a66f-e81c71e661a3', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10313d150>]}
01:46:03  1 of 1 OK created sql table model dbt_jyeo.new_tbl ............................. [SUCCESS 1 in 2.58s]
01:46:03  Finished running node model.my_dbt_project.new_tbl
01:46:03  Connection 'master' was properly closed.
01:46:03  Connection 'model.my_dbt_project.new_tbl' was left open.
01:46:03  On model.my_dbt_project.new_tbl: Close
01:46:04  
01:46:04  Finished running 1 table model in 0 hours 0 minutes and 5.69 seconds (5.69s).
01:46:04  Command end result
01:46:04  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:46:04  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:46:04  Wrote artifact RunExecutionResult to /Users/jeremy/git/dbt-basic/target/run_results.json
01:46:04  
01:46:04  Completed successfully
01:46:04  
01:46:04  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
01:46:04  Resource report: {"command_name": "run", "command_success": true, "command_wall_clock_time": 6.4479094, "process_user_time": 1.765473, "process_kernel_time": 2.248902, "process_mem_max_rss": "202440704", "process_in_blocks": "0", "process_out_blocks": "0"}
01:46:04  Command `dbt run` succeeded at 14:46:04.568165 after 6.45 seconds
01:46:04  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12b896990>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10120b0d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10120b9d0>]}
01:46:04  Flushing usage events
01:46:05  An error was encountered while trying to flush usage events

No apparent errors attempting to use the dbt_external_tables.stage_external_sources() macro in a pre-hook.

Second pass at this - it looks like hugos logs were erroring around the creation of the schema where we are writing our external table to - i.e. the schema does not yet exist - while my runs above where to a schema that already exist. Let's try with a schema that does not yet exist:

# models/sources.yml
version: 2
sources:
  - name: raw
    database: db
    schema: some_new_schema
    loader: S3
    tables:
      - name: ext_people
        external: &people
          location: "@dbt_test.dbt_external_tables_integration_tests_snowflake.dbt_external_tables_testing/csv"
          file_format: "(type = csv skip_header = 1)"
        columns: &people-cols
          - name: id
            data_type: int
          - name: first_name
            data_type: varchar(64)
          - name: last_name
            data_type: varchar(64)
          - name: email
            data_type: varchar(64)

Notice the use of some_new_schema here.

$ dbt --debug run
01:50:36  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1069c08d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106bd4f10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106bf7750>]}
01:50:36  Running with dbt=1.9.0-b4
01:50:36  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'fail_fast': 'False', 'debug': 'True', 'log_path': '/Users/jeremy/git/dbt-basic/logs', 'version_check': 'True', 'profiles_dir': '/Users/jeremy/.dbt', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'False', 'log_format': 'default', 'invocation_command': 'dbt --debug run', 'introspect': 'True', 'static_parser': 'True', 'target_path': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'send_anonymous_usage_stats': 'True'}
01:50:37  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x105ed0a90>]}
01:50:37  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10656b350>]}
01:50:37  Registered adapter: snowflake=1.9.0-b1
01:50:37  checksum: 0d3acddb5afbce98f9888b13248063cae3d09e6e329263e3ffe4b2a6aa072c8d, vars: {}, profile: , target: , version: 1.9.0b4
01:50:37  Partial parsing enabled: 0 files deleted, 0 files added, 1 files changed.
01:50:37  Partial parsing: updated file: my_dbt_project://models/sources.yml
01:50:37  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x14022ead0>]}
01:50:37  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:50:37  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:50:37  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1407241d0>]}
01:50:37  Found 1 model, 1 source, 518 macros
01:50:37  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1402ea110>]}
01:50:37  
01:50:37  Concurrency: 1 threads (target='sf')
01:50:37  
01:50:37  Acquiring new snowflake connection 'master'
01:50:37  Acquiring new snowflake connection 'list_db'
01:50:37  Using snowflake connection "list_db"
01:50:37  On list_db: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db"} */
show terse schemas in database db
    limit 10000
01:50:37  Opening a new connection, currently in state init
01:50:39  SQL status: SUCCESS 2 in 2.399 seconds
01:50:39  Re-using an available connection from the pool (formerly list_db, now list_db_dbt_jyeo)
01:50:39  Using snowflake connection "list_db_dbt_jyeo"
01:50:39  On list_db_dbt_jyeo: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "connection_name": "list_db_dbt_jyeo"} */
show objects in db.dbt_jyeo limit 10000;
01:50:40  SQL status: SUCCESS 3 in 0.370 seconds
01:50:40  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x140a4dcd0>]}
01:50:40  Began running node model.my_dbt_project.new_tbl
01:50:40  1 of 1 START sql table model dbt_jyeo.new_tbl .................................. [RUN]
01:50:40  Re-using an available connection from the pool (formerly list_db_dbt_jyeo, now model.my_dbt_project.new_tbl)
01:50:40  Began compiling node model.my_dbt_project.new_tbl
01:50:40  Writing injected SQL for node "model.my_dbt_project.new_tbl"
01:50:40  Began executing node model.my_dbt_project.new_tbl
01:50:40  1 of 1 START external source some_new_schema.ext_people
01:50:40  On "model.my_dbt_project.new_tbl": cache miss for schema "db.some_new_schema", this is inefficient
01:50:40  Using snowflake connection "model.my_dbt_project.new_tbl"
01:50:40  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
show objects in db.some_new_schema limit 10000;
01:50:40  Snowflake adapter: Snowflake query id: 01b8880e-0906-d2be-000d-378341b81996
01:50:40  Snowflake adapter: Snowflake error: 002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed.
01:50:40  Snowflake adapter: Error running SQL: macro list_relations_without_caching
01:50:40  Snowflake adapter: Rolling back transaction.
01:50:40  While listing relations in database=db, schema=some_new_schema, found: 
01:50:40  Using snowflake connection "model.my_dbt_project.new_tbl"
01:50:40  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
show terse schemas like 'some_new_schema' in database db limit 1;
01:50:41  SQL status: SUCCESS 0 in 0.420 seconds
01:50:41  1 of 1 (1) create schema if not exists                             db.some_new_schema      ...  
01:50:41  Using snowflake connection "model.my_dbt_project.new_tbl"
01:50:41  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create schema if not exists 
            
                db.some_new_schema
            
        ;
01:50:41  SQL status: SUCCESS 1 in 0.384 seconds
01:50:41  1 of 1 (1) SUCCESS 1
01:50:41  1 of 1 (2) create or replace external table db.some_new_schema.ext_people(                i...  
01:50:41  Using snowflake connection "model.my_dbt_project.new_tbl"
01:50:41  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create or replace external table db.some_new_schema.ext_people(
                id int as ((case when is_null_value(value:c1) or lower(value:c1) = 'null' then null else value:c1 end)::int),
                first_name varchar(64) as ((case when is_null_value(value:c2) or lower(value:c2) = 'null' then null else value:c2 end)::varchar(64)),
                last_name varchar(64) as ((case when is_null_value(value:c3) or lower(value:c3) = 'null' then null else value:c3 end)::varchar(64)),
                email varchar(64) as ((case when is_null_value(value:c4) or lower(value:c4) = 'null' then null else value:c4 end)::varchar(64))
        )
    location = @dbt_test.dbt_external_tables_integration_tests_snowflake.dbt_external_tables_testing/csv 
    
    
    
    
    
    file_format = (type = csv skip_header = 1)
    
;
01:50:43  SQL status: SUCCESS 1 in 1.704 seconds
01:50:43  1 of 1 (2) SUCCESS 1
01:50:43  Writing runtime sql for node "model.my_dbt_project.new_tbl"
01:50:43  Using snowflake connection "model.my_dbt_project.new_tbl"
01:50:43  On model.my_dbt_project.new_tbl: /* {"app": "dbt", "dbt_version": "1.9.0b4", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.new_tbl"} */
create or replace transient table db.dbt_jyeo.new_tbl
         as
        (

select 1 id
        );
01:50:44  SQL status: SUCCESS 1 in 1.233 seconds
01:50:44  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '9137a38c-8209-4fea-ad4c-9c26517c6153', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x140d3f110>]}
01:50:44  1 of 1 OK created sql table model dbt_jyeo.new_tbl ............................. [SUCCESS 1 in 4.34s]
01:50:44  Finished running node model.my_dbt_project.new_tbl
01:50:44  Connection 'master' was properly closed.
01:50:44  Connection 'model.my_dbt_project.new_tbl' was left open.
01:50:44  On model.my_dbt_project.new_tbl: Close
01:50:45  
01:50:45  Finished running 1 table model in 0 hours 0 minutes and 7.79 seconds (7.79s).
01:50:45  Command end result
01:50:45  Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
01:50:45  Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
01:50:45  Wrote artifact RunExecutionResult to /Users/jeremy/git/dbt-basic/target/run_results.json
01:50:45  
01:50:45  Completed successfully
01:50:45  
01:50:45  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
01:50:45  Resource report: {"command_name": "run", "command_success": true, "command_wall_clock_time": 8.64233, "process_user_time": 2.656455, "process_kernel_time": 1.653997, "process_mem_max_rss": "207814656", "process_in_blocks": "0", "process_out_blocks": "0"}
01:50:45  Command `dbt run` succeeded at 14:50:45.285703 after 8.64 seconds
01:50:45  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106c27ed0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106c273d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x102ee1c10>]}
01:50:45  Flushing usage events
01:50:46  An error was encountered while trying to flush usage events

Seems okay.

@jeremyyeo
Copy link
Collaborator

jeremyyeo commented Nov 22, 2024

Initial take: there is actually a legitimate compilation error with hugos model XXX.sql - to truly isolate - make sure the project is clean of all customizations (i.e. remove all models, macros, overrides, all other pacakges, etc) and basically just have a single toy model + single source like the example above demonstrates.

@hugoerolex
Copy link
Author

Hi @jeremyyeo sorry for the time lost, we isolated with the support that it was a bug of partial parsing. The issue isn't reproducing when cleaning the target folder. Sorry for the noise

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

2 participants