[Shanghai Campus] Hive DDL Data Definition: Partitioned Tables

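A partition of a partitioned table corresponds to a separate directory on HDFS, and that directory holds all of the partition's data files. In Hive, partitioning simply means splitting data into directories: a large dataset is divided into smaller ones according to business needs. When a query's WHERE clause selects specific partitions, Hive reads only those directories, which can make the query much more efficient.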

1. Basic operations on single-level partitioned tables

1) Syntax for creating a partitioned table
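
The statement for this step is missing from the post. A minimal sketch follows, reusing the table name `test` and the partition column `month` from the statements below; the column names (`deptno`, `dname`, `loc`) and the tab delimiter are assumptions about the layout of `dept.txt`, not taken from the source.

```sql
-- Hypothetical columns and delimiter; only `test` and `month` come from the post.
create table test(
    deptno int,
    dname  string,
    loc    string
)
-- The partition column is declared here, not in the column list above.
partitioned by (month string)
row format delimited fields terminated by '\t';
```

The partition column does not exist inside the data files. Hive derives its value from the partition directory name.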

2) Loading data into the partitioned table

load data local inpath '/opt/package/hive/txt/dept.txt' into table test partition(month='20180907');
load data local inpath '/opt/package/hive/txt/dept.txt' into table test partition(month='20180908');
load data local inpath '/opt/package/hive/txt/dept.txt' into table test partition(month='20180909');
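
Since each partition is a directory, one way to confirm where a load ended up is to describe a single partition and look at its Location field. A small sketch, assuming the first load above succeeded:

```sql
-- Shows metadata for one partition, including its HDFS location,
-- which should end in a directory named month=20180907.
describe formatted test partition(month='20180907');
```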

3) Querying data in the partitioned table

Single-partition query

select * from test where month='20180907';

Multi-partition union query
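
The query for this step is missing from the post. One way to read several partitions in a single statement is to union per-partition queries; filtering the partition column with `in` is an alternative. A sketch, assuming the three partitions loaded earlier exist:

```sql
-- Union of one query per partition.
select * from test where month='20180907'
union all
select * from test where month='20180908'
union all
select * from test where month='20180909';

-- Alternative: a single query that filters on several partition values.
select * from test where month in ('20180907', '20180908', '20180909');
```

`union all` is used here because it keeps every row without a deduplication step.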

Note:

4) Adding partitions

Adding a single partition

alter table test add partition(month='20180911');

Adding multiple partitions
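
The statement for this step is missing from the post. A sketch of adding two partitions in one statement; the partition values are hypothetical. In `add`, the partition specs are separated by spaces.

```sql
-- Hypothetical partition values; the specs are separated by a space.
alter table test add partition(month='20180912') partition(month='20180913');
```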

5) Dropping partitions

Dropping a single partition

alter table test drop partition(month='20180913');

Dropping multiple partitions
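
The statement for this step is missing from the post. A sketch of dropping two partitions at once; the partition values are hypothetical. In `drop`, the partition specs are separated by commas, and `if exists` avoids an error when a partition is not present.

```sql
-- Hypothetical partition values; the specs are separated by a comma.
alter table test drop if exists partition(month='20180911'), partition(month='20180912');
```

For a managed table, dropping a partition also deletes that partition's data directory.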

6) Listing the partitions of a partitioned table

show partitions test;

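7) Viewing the structure of a partitioned table

desc formatted test;

Besides showing that the table is a managed table, the output also lists the partition information.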

2. Multi-level partitioned tables

Two-level partitioned table

Creating a two-level partitioned table
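
The statement for this step is missing from the post. A minimal sketch follows, with `month` and `day` as the two partition columns used by the statements below. The column names and the tab delimiter are assumptions. Because the post reuses the table name `test`, this also assumes the single-level `test` table from section 1 has been dropped first.

```sql
-- Assumes no table named `test` exists yet (drop the single-level one first).
-- Hypothetical columns and delimiter; `month` and `day` come from the post.
create table test(
    deptno int,
    dname  string,
    loc    string
)
-- Two partition columns produce nested directories: month=.../day=...
partitioned by (month string, day string)
row format delimited fields terminated by '\t';
```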

Loading data into the two-level partitioned table

load data local inpath '/opt/package/hive/txt/dept.txt' into table test partition(month='201809',day='01');

Querying data in the two-level partitioned table

select * from test where month='201809' and day='01';

不二晨 · 不二晨

Nice
