OushuDB学习笔记
=========1.启动 HDFS:start-dfs.sh ==============
Starting namenodes on [localhost]
localhost: WARNING: HADOOP_NAMENODE_OPTS has been replaced by HDFS_NAMENODE_OPTS. Using value of HADOOP_NAMENODE_OPTS.
Starting datanodes
localhost: WARNING: HADOOP_DATANODE_OPTS has been replaced by HDFS_DATANODE_OPTS. Using value of HADOOP_DATANODE_OPTS.
Starting secondary namenodes [localhost.localdomain]
localhost.localdomain: WARNING: HADOOP_SECONDARYNAMENODE_OPTS has been replaced by HDFS_SECONDARYNAMENODE_OPTS. Using value of HADOOP_SECONDARYNAMENODE_OPTS.
[gpadmin@localhost ~]$
========2.停止 HDFS===================
stop-dfs.sh
=======3.启动 OushuDB 集群==============
hawq start cluster -a --with_magma #(含 Magma 服务)
也可分别启动 master 和 segment
hawq start master #启动 master
master hawq start segment #启动 segment
=======4.停止或重启 OushuDB 集群==============
hawq stop cluster -a --with_magma #(含 Magma 服务)
hawq restart cluster #重启 OushuDB 集群
hawq stop cluster #停止 OushuDB 集群
也可以分别操作 master 和 segment
hawq restart master #重启本机的 OushuDB master
hawq restartt segment #重启本机的 OushuDB segment
hawq stop master #停止本机 OushuDB master
hawq stop segment #停止本机 OushuDB segment
hdfs dfsadmin -report
Configured Capacity: 83698397184 (77.95 GB)
Present Capacity: 61156798464 (56.96 GB)
DFS Remaining: 61156761600 (56.96 GB)
DFS Used: 36864 (36 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Live datanodes (1):
Name: 127.0.0.1:9866 (localhost)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 83698397184 (77.95 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 22541598720 (20.99 GB)
DFS Remaining: 61156761600 (56.96 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Dec 23 03:57:44 CST 2022
Last Block Report: Fri Dec 23 03:56:20 CST 2022
Num of Blocks: 0
============OushuDB Commands===================
The most commonly used OushuDB "commands" are:
start Start OushuDB service.
stop Stop OushuDB service.
init Init OushuDB service.
restart Restart OushuDB service.
activate Activate OushuDB standby master as master.
version Show OushuDB version information.
config Set OushuDB GUC values.
state Show OushuDB cluster status.
filespace Create OushuDB filespaces.
extract Extract table's metadata into a YAML formatted file.
load Load data into OushuDB.
scp Copies files between multiple hosts at once.
SSH Provides SSH access to multiple hosts at once.
ssh-exkeys Exchanges SSH public keys between hosts.
check Verifies and validates OushuDB settings.
checkperf Verifies the baseline hardware performance of hosts.
register Register parquet files generated by other system into the corrsponding table in OushuDB
reload Reload GUC values without restarting OushuDB cluster.
backup Backup OushuDB metadata and user data.
restore Restore OushuDB metadata and user data.
============登录数据库==============
psql -d postgres
\q #登出
============创建数据库==============
create database testdb;
create database dbnew template postgres;(用 postgres 模板创建库)
============连接数据库==============
\c testdb
============创建表===================
create table foo(id int,name varchar);
============显示表===================
\d
============往表里插入数据==============
insert into foo values(1,'hawq'),(2,'hdfs');
============查询表里的数据==============
select * from foo;
============删除表里的数据==============
delete from foo;
============删除表===================
drop database testdb;
============查看查询执行情况============
\timing on
SQL 语句
\timing off # 关闭时间输出
=============使用 explain 语句可以显示出查询计划======
test=# explain select count(*) from foo;
QUERY PLAN
Aggregate (cost=1.07..1.08 rows=1 width=8)
-> Gather Motion 1:1 (slice1; segments: 1) (cost=1.03..1.06 rows=1 width=8)
-> Aggregate (cost=1.03..1.04 rows=1 width=8)
-> Orc Table Scan on foo (cost=0.00..1.02 rows=2 width=0)
New executor mode: ON
New interconnect type: UDP
(6 rows)
Time: 12.158 ms
===============\l 查看数据库清单================
test=# \l
List of databases
Name | Owner | Encoding | Access privileges
-----------+---------+----------+-------------------
postgres | gpadmin | UTF8 |
template0 | gpadmin | UTF8 |
template1 | gpadmin | UTF8 |
test | gpadmin | UTF8 |
(4 rows)
================指定 ORC 格式建表===============
create table test_orc(i int) with (appendonly=true, orientation=orc); # 在
postgres 数据库中创建 ORC 格式表
================\d 显示当前库的表清单================
\d
List of relations
Schema | Name | Type | Owner | Storage
--------+----------+-------+---------+---------
public | foo | table | gpadmin | orc
public | test_orc | table | gpadmin | orc
(2 rows)
===============\dn 查看 schema 模式清单=============
\dn
List of schemas
Name | Owner
--------------------+---------
hawq_toolkit | gpadmin
information_schema | gpadmin
pg_aoseg | gpadmin
pg_bitmapindex | gpadmin
pg_catalog | gpadmin
pg_catalog_magma | gpadmin
pg_toast | gpadmin
public | gpadmin
(8 rows)
#默念 schema 为 public
===============检查集群可用性=================
test=# select * from gp_segment_configuration;
registration_order | role | status | port | hostname | address | description | vcid
--------------------+------+--------+-------+-----------------------+-----------------------+-----------------------+------
0 | m | u | 5432 | localhost.localdomain | localhost.localdomain | master |
1 | p | u | 40000 | 127.0.0.1 | 127.0.0.1 | belongs to vc_default | 9801
(2 rows)
Time: 15.679 ms
============检查集群节点状态历史=========================
test=# select * from gp_configuration_history;
time | registration_order | hostname | description | vcid
-------------------------------+--------------------+-----------+-----------------------+------
2022-03-25 19:28:10.76859+08 | 1 | 127.0.0.1 | belongs to vc_default | 9801
2022-04-11 22:15:26.362114+08 | 1 | 127.0.0.1 | belongs to vc_default | 9801
2022-12-19 06:06:58.556739+08 | 1 | 127.0.0.1 | belongs to vc_default | 9801
2022-12-23 04:04:28.915953+08 | 1 | 127.0.0.1 | belongs to vc_default | 9801
(4 rows)
Time: 5.613 ms
=================检查资源队列定义=========================
test=# select * from pg_resqueue;
rsqname | parentoid | activestats | memorylimit | corelimit | resovercommit | allocpolicy | vsegresourcequota | nvsegupperlimit | nvseglow
erlimit | nvsegupperlimitperseg | nvseglowerlimitperseg | creationtime | updatetime | status | vcid
-----------------------+-----------+-------------+-------------+-----------+---------------+-------------+-------------------+-----------------+---------
--------+-----------------------+-----------------------+--------------+------------+--------+------
vc_default.pg_root | 0 | -1 | 100% | 100% | 1 | even | | 0 |
0 | 0 | 0 | | | branch | 9801
vc_default.pg_default | 9800 | 20 | 50% | 50% | 2 | even | mem:256mb | 0 |
0 | 0 | 0 | | | | 9801
(2 rows)
Time: 48.551 ms
=================检查资源队列状态==========================
test=# select * from pg_resqueue_status;
rsqname | segmem | segcore | segsize | segsizemax | inusemem | inusecore | rsqholders | rsqwaiters | paused
-----------------------+--------+----------+---------+------------+----------+-----------+------------+------------+--------
vc_default.pg_root | 256 | 0.062500 | 256 | 256 | 0 | 0.000000 | 0 | 0 | F
vc_default.pg_default | 256 | 0.062500 | 128 | 256 | 0 | 0.000000 | 0 | 0 | F
(2 rows)
Time: 12.932 ms
===============常见状态监控命令:命令行检测 hawq state=======================
[gpadmin@localhost ~]$ hawq state
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- HAWQ instance status summary
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:------------------------------------------------------
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Master instance = Active
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- No Standby master defined
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total segment instance count from config file = 1
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:------------------------------------------------------
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Current HAWQ acl type = standalone
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:------------------------------------------------------
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Segment Status
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:------------------------------------------------------
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total segments count from catalog = 1
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total segment valid (at master) = 1
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total segment failures (at master) = 0
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total number of postmaster.pid files missing = 0
20221224:11:06:15:016345 hawq_state:localhost:gpadmin-[INFO]:-- Total number of postmaster.pid files found = 1
==================SQL 检测======================
est=# select * from gp_segment_configuration;
registration_order | role | status | port | hostname | address | description | vcid
--------------------+------+--------+-------+-----------------------+-----------------------+-----------------------+------
0 | m | u | 5432 | localhost.localdomain | localhost.localdomain | master |
1 | p | u | 40000 | 127.0.0.1 | 127.0.0.1 | belongs to vc_default | 9801
(2 rows)
==================数据库连接数检查==================
test=# select count(1) from pg_stat_activity;
count
1
(1 row)
test=# select count(1) from pg_stat_activity where current_query<>'' and procpid <> pg_backend_pid();
count
0
(1 row)
test=#
==================数据库备份================
1.通过 gpfdist 外部表导入数据
2.通过 gpfdist 外部表导出数据
3.hdfs 外部表导入数据
4.hdfs 外部表导出数据
5.使用 COPY 命令导入导出数据(COPY TO 表内容至文件,COPY FROM 文件至表内容)
6.使用 pg_dump 和 pg_restore
查看数据库大小:
test=# SELECT sodddatsize FROM hawq_toolkit.hawq_size_of_database WHERE sodddatname=' mydb';
sodddatsize
(0 rows)
=================COPY===============
COPY (SELECT * FROM country) TO 'list_countires.copy';
COPY customers to stdout(delimiter '|') #拷贝表至控制台
COPY customers TO '/database/data/test_data.copy' (DELIMITTER ' '); #拷贝表至文件
COPY customers FROM 'database/data/test_data.popy' (DELIMITER ' '); #从文件拷贝至表
COPY (select * from customers where name like 'A%') TO '/datga/test_data.copy'; #导出查询结果至文件
COPY customers TO PROGRAM 'gzip > /data/test_data.copy.gz';#导出时压缩
COPY customers(col1,col2) TO '/data/test_data.copy' DELIMITER ' ';导出部分列
=================pg_dump&pg_restore==================
适合非常小量的数据备份和恢复
pg_dump -Ft -f mydb.tar mydb
pg_dump -Fc -Z3 -f mydb.dump mydb
pg_restore -d new_db mydb.dump
=================gpfdist 外部表导入=============
hawq load 命令行工具,是对 gpfdist 外部表的一个封装
=================JDBC 连接 OushuDB============
1.加载驱动程序:
Class.forName("com.postgresql.Driver");
2.获得数据库连接:
DriverManager.getConnection("jdbc:postgresql://localhost:5432/postgres","ChangLei","Password");
3.创建 StatementlPreparedStatement 对象:
conn.createStatement();
conn.prepareStatement(sql);
完整示例:
public static Connection getConn(){
Connection conn null;
try {
Class. forName ("org. postgresql. Driver");
String url = "jdbc: postgresql://localhost:5432/postgres";
try{
conn = DriverManager. getConnection (url, "ChangLei", "");
}
catch (SQLException e){
e. printstackTrace ();
}
}
catch (ClassNotFoundException e){
e. printstackTrace ();
}
return conn;
}
======================打印输出数据=======================
public static void main(String[] args){
Connection conn-getConn();
string sql-"deletefrom department where d_id-101";
try {
Statement stmt = conn.createstatement();
//准备 Statement
int rs = stmt.executeUpdate();
//执行 sq1 操作,获取受影响行数
catch (SQLException e){
e.printStackTrace();
}
if(rs > 0){
return true;
//如果受影响行数超过 1 行,则认为操作成功
}
return false;
//默认返回失败,只有受影响函数大于 0 时才返回 true
/实际环境中并不一定必须要受影响行数大于 0 才算执行成功,需根据实际情况判断,这里的方法只是例子/
}