Local Hadoop
HDFS and the database (the Hive/Impala warehouse) are two separate entities; here is how you can transfer data between them:
- create a table in the database (using Impala or Hive)
create table <tablename> (id string, name string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile
IMPORTANT: your table name should be all lowercase
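For example, a minimal sketch of the create step run non-interactively through impala-shell, assuming a hypothetical database mydb, table users, and an Impala daemon at impala-host:21000 (all placeholder names):
# create the table from the command line; the DDL is the same as above
impala-shell -i impala-host:21000 -d mydb -q "
create table users (id string, name string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile"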
- load data into the table you just created by copying your data file into the table's warehouse directory (note that copyFromLocal reads from the local filesystem; see the example below)
hdfs dfs -copyFromLocal <local_path_of_your_data_file> /encrypted/warehouse/<database_name>.db/<tablename>
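A concrete sketch of the copy step, assuming a hypothetical local file /tmp/users.csv and the mydb.users table from the example above; if the source file already lives on HDFS, hdfs dfs -cp works instead of -copyFromLocal:
# copy the data file into the table's warehouse directory, then verify it arrived
hdfs dfs -copyFromLocal /tmp/users.csv /encrypted/warehouse/mydb.db/users
hdfs dfs -ls /encrypted/warehouse/mydb.db/users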
- if you created your table using Hive, you need to run the following in Impala before the table becomes visible there:
invalidate metadata <tablename>
compute stats <tablename>
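For instance, with impala-shell run non-interactively (the host and the mydb.users names are placeholders carried over from the sketches above):
# refresh Impala's view of the Hive metastore, then gather table statistics
impala-shell -i impala-host:21000 -d mydb -q "invalidate metadata users"
impala-shell -i impala-host:21000 -d mydb -q "compute stats users"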
- to get the data back out, concatenate all of the table's part-files and convert the delimiter (tables written with the default row format use the \001 control character as field separator); note that the shell redirection writes the result to the local filesystem:
hadoop fs -cat /encrypted/warehouse/<database_name>.db/<tablename>/* | tr '\001' '\t' > <local_output_path>
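If the converted file should land on HDFS instead of the local disk, one option (a sketch; paths and names are placeholders) is to pipe the stream into hdfs dfs -put -, which reads from stdin:
# concatenate the part-files, swap \001 for tabs, and write straight back to HDFS
hadoop fs -cat /encrypted/warehouse/mydb.db/users/* | tr '\001' '\t' | hdfs dfs -put - /path/on/hdfs/users.tsv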