Posted on 2022-04-14 06:46:12

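Notes and questions on the multiple storage formats supported by Oushu

Reference document (the description of the create table statement): http://www.oushu.com/docs/ch/SQL.html?highlight=append#create-table

While studying recently, I learned that Oushu supports several storage formats, as shown in the figure above. After reading the documentation closely, my summary is as follows: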

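AO storage format - The full name of this format is append-only (it keeps the early gp (Greenplum) name, with some improvements). The figure above says AO is a row storage format. From the documentation I learned that with appendonly=true you can choose one of three orientations: row (default), parquet, or orc. As I understand it, row is row storage, parquet is column storage, and orc is hybrid row-column storage. Question 1: does "AO storage format" here refer specifically to appendonly=true, orientation=row? If not, can we say that AO supports both row and column storage? Question 2 (related to the previous one): when describing parquet support, the documentation says that for compression, row supports zlib while parquet supports snappy and gzip, which does not fully match the figure above either. Please clarify. Question 3: in databases, the default row storage usually supports indexes, so why does the figure say AO does not support index? Is "index" here the same as what we usually mean by an index?
ORC storage format - Apache ORC is an excellent columnar storage engine that can organize its own indexes according to data type; see the article for a detailed introduction. As I understand it, Oushu's ORC storage format has been optimized and improved while remaining compatible with the standard ORC format. Could you explain what "row storage" means in ORC's hybrid row-column storage? Also, ORC supports self-indexing of data, so what does "does not support index" mean here?
MAGMA storage format - The highlight, of course, is MAGMA. This format supports hybrid row-column storage, the latest executor, updates and indexes, and automatic selection of the compression algorithm, which is really impressive. One question here as well: how does the automatic selection of the compression algorithm work? Is it based on data type?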

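The above are my notes and questions on the multiple storage formats supported by the Oushu database. I would appreciate any guidance from the experts, orz. Also, could recommended best-use scenarios be added to the figure above?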

OushuDB


阿福Chris 2022-04-14 09:35:24 replied:
huor 2022-04-14 09:28:17
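1. (1) The AO format refers specifically to appendonly=true, orientation=row. (2) The compression algorithms each format supports are shown below; the documentation may need updating. (3) "index" here means an index in the usual sense. AO data is stored on HDFS, which is not well suited to supporting indexes.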
```
postgres=# create table t1 (id int) with (appendonly=true, orientation=row, compresstype=zlib);
CREATE TABLE
postgres=# create table t2 (id int) with (appendonly=true, orientation=row, compresstype=snappy);
CREATE TABLE
postgres=# create table t3 (id int) with (appendonly=true, orientation=row, compresstype=gzip);
ERROR:  non-parquet table doesn't support compress type: 'gzip'
postgres=# create table t4 (id int) with (appendonly=true, orientation=row, compresstype=none);
CREATE TABLE


postgres=# create table t5 (id int) with (appendonly=true, orientation=parquet, compresstype=zlib);
ERROR:  parquet table doesn't support compress type: 'zlib'
postgres=# create table t6 (id int) with (appendonly=true, orientation=parquet, compresstype=snappy);
CREATE TABLE
postgres=# create table t7 (id int) with (appendonly=true, orientation=parquet, compresstype=gzip);
CREATE TABLE
postgres=# create table t8 (id int) with (appendonly=true, orientation=parquet, compresstype=none);
CREATE TABLE

postgres=# create table t9 (id int) with (appendonly=true, orientation=orc, compresstype=zlib);
CREATE TABLE
postgres=# create table t10 (id int) with (appendonly=true, orientation=orc, compresstype=snappy);
CREATE TABLE
postgres=# create table t11 (id int) with (appendonly=true, orientation=orc, compresstype=gzip);
ERROR:  non-parquet table doesn't support compress type: 'gzip'
postgres=# create table t12 (id int) with (appendonly=true, orientation=orc, compresstype=none);
CREATE TABLE
```
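The outcomes in the session above can be condensed into a small lookup table. The following is a minimal Python sketch that records only what the transcript shows; `SUPPORTED_COMPRESSION` and `is_supported` are hypothetical names used for illustration, not part of OushuDB, and other versions may behave differently.

```python
# Compression types accepted for each orientation, as observed in the psql
# session above (appendonly=true in every case). Illustrative only.
SUPPORTED_COMPRESSION = {
    "row": {"zlib", "snappy", "none"},
    "parquet": {"snappy", "gzip", "none"},
    "orc": {"zlib", "snappy", "none"},
}


def is_supported(orientation: str, compresstype: str) -> bool:
    """Return True if the transcript shows CREATE TABLE succeeding."""
    return compresstype in SUPPORTED_COMPRESSION.get(orientation, set())


if __name__ == "__main__":
    # Print the full orientation x compresstype matrix.
    for orientation in ("row", "parquet", "orc"):
        for compresstype in ("zlib", "snappy", "gzip", "none"):
            status = "ok" if is_supported(orientation, compresstype) else "error"
            print(f"{orientation:8s} {compresstype:7s} {status}")
```

Read this way, gzip is accepted only for parquet and zlib only for row and orc, while snappy and none are accepted for all three orientations.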
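2. In the ORC storage format, the table is first split into row groups, and within each row group the data is stored column by column. Each such row group is called a stripe. Broadly speaking, ORC's self-indexing consists of two parts: indexes and statistics. The indexes include the row group index, the bloom filter index, and others. The statistics are described below.
An ORC file keeps statistics at three levels: file level, stripe level, and row group level. All of them can be used with Search ARGuments (predicate pushdown conditions) to decide whether some data can be skipped. Statistics at every level include the number of values and whether null values are present, plus some type-specific statistics for different data types.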

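(1) file level
File-level statistics are recorded at the end of the ORC file and cover the columns of the entire file. They are mainly used for query optimization, and can also answer some simple aggregate queries such as max, min, and sum.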

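(2) stripe level
An ORC file stores stripe-level statistics for every column, and the ORC reader uses them to decide which stripes need to be read for a given query. For example, if a stripe has max(a)=10 and min(a)=3 for column a, then when the where condition is a > 10 or a < 3, none of the records in that stripe are read during query execution.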
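The stripe-level check can be sketched in a few lines of Python. This is only an illustration of the min/max reasoning in the example, not ORC reader code: `ColumnStats` and the two helper functions are hypothetical names, and null handling is left out.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Min/max statistics for one column in one stripe (hypothetical type)."""
    minimum: int
    maximum: int


def can_skip_greater_than(stats: ColumnStats, bound: int) -> bool:
    """`a > bound` cannot match when the stripe maximum is <= bound."""
    return stats.maximum <= bound


def can_skip_less_than(stats: ColumnStats, bound: int) -> bool:
    """`a < bound` cannot match when the stripe minimum is >= bound."""
    return stats.minimum >= bound


if __name__ == "__main__":
    stripe = ColumnStats(minimum=3, maximum=10)
    print(can_skip_greater_than(stripe, 10))  # where a > 10 -> True (skip)
    print(can_skip_less_than(stripe, 3))      # where a < 3  -> True (skip)
    print(can_skip_greater_than(stripe, 5))   # where a > 5  -> False (must read)
```

The check is conservative: a result of False only means the stripe may contain matching rows, so it still has to be read and filtered.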

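(3) row group level
To further avoid reading unnecessary data, a column's index is logically divided into index groups of a given size (10000 by default, configurable through a parameter). Statistics are collected for each group of 10000 records. The Hive query engine passes the constraints in the where clause to the ORC reader, which uses the group-level statistics to filter out unnecessary data. If this value is set too small, more statistics have to be stored, so users need to choose a reasonable value based on the characteristics of their data.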
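As a rough illustration of group-level statistics, the Python sketch below splits one column into fixed-size groups, records count, null flag, min and max per group, and then selects the groups whose value range overlaps a range predicate. The function names and the dictionary layout are assumptions made for this example and do not reflect the actual ORC index layout.

```python
from typing import List, Optional


def build_row_group_stats(
    values: List[Optional[int]], stride: int = 10000
) -> List[dict]:
    """Collect count, null flag, min and max for every `stride` rows."""
    groups = []
    for start in range(0, len(values), stride):
        chunk = values[start:start + stride]
        non_null = [v for v in chunk if v is not None]
        groups.append({
            "start": start,
            "count": len(chunk),
            "has_null": len(non_null) < len(chunk),
            "min": min(non_null) if non_null else None,
            "max": max(non_null) if non_null else None,
        })
    return groups


def groups_to_read(groups: List[dict], low: int, high: int) -> List[int]:
    """Return start offsets of groups that may hold values in [low, high]."""
    selected = []
    for g in groups:
        if g["min"] is None:
            continue  # every value is null, so a range predicate cannot match
        if g["max"] < low or g["min"] > high:
            continue  # the group's value range does not overlap the predicate
        selected.append(g["start"])
    return selected


if __name__ == "__main__":
    data = list(range(100))                         # 100 sorted values
    stats = build_row_group_stats(data, stride=25)  # small stride for the demo
    print(groups_to_read(stats, 30, 40))            # -> [25]
```

A smaller stride gives finer-grained skipping but produces more statistics entries, which is the trade-off mentioned in the text.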
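3. Strictly speaking, compression inside Magma has two parts: encoding and compression. "Automatic selection" means that it chooses RLE or another encoding according to each column's type. In addition, the data of every column is compressed with LZ4.
Thank you for the reply, teacher. Things are much clearer now~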
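For readers unfamiliar with RLE (run-length encoding), here is a generic Python sketch of the idea. The reply does not describe how Magma implements its encodings, so this is textbook RLE under that assumption, with hypothetical function names, and it is not Magma's actual code. The LZ4 step is omitted because it needs a third-party library.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def rle_encode(values: List[T]) -> List[Tuple[T, int]]:
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    runs: List[Tuple[T, int]] = []
    for value in values:
        if runs and runs[-1][0] == value:
            # Same value as the previous run: extend it by one.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Tuple[T, int]]) -> List[T]:
    """Expand (value, run_length) pairs back into the original sequence."""
    out: List[T] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    column = [7, 7, 7, 7, 2, 2, 9]
    encoded = rle_encode(column)
    print(encoded)                        # -> [(7, 4), (2, 2), (9, 1)]
    print(rle_decode(encoded) == column)  # -> True
```

RLE pays off on columns with long runs of repeated values; on columns with few repeats it can be larger than the input, which is one reason an engine might pick a different encoding per column.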
