01.介绍
a.说明
数据库是 PostgreSQL 中存储数据的容器
一个 PostgreSQL 实例可以包含多个数据库
每个数据库相互独立,有自己的表、视图、函数等对象
不同数据库之间不能直接查询(除非使用 dblink 或 FDW)
-----------------------------------------------------------------------------------------------------
系统数据库:
postgres:默认数据库,用于管理和连接
template0:原始模板数据库,不可修改
template1:用户模板数据库,创建新数据库时的模板
-----------------------------------------------------------------------------------------------------
数据库的物理存储:
每个数据库在 $PGDATA/base 目录下有一个子目录
子目录名为数据库的 OID
查看数据库 OID:SELECT oid, datname FROM pg_database;
b.命名规范
数据库名必须以字母或下划线开头
只能包含字母、数字、下划线
长度不超过 63 个字符
不能使用 PostgreSQL 保留字(如 SELECT、TABLE 等)
建议使用小写字母和下划线
-----------------------------------------------------------------------------------------------------
命名示例:
mydb # 推荐
my_database # 推荐
app_db_2024 # 推荐
MyDatabase # 不推荐(大小写敏感)
123db # 错误(不能以数字开头)
c.系统数据库
postgres:
默认的管理数据库
用于连接和执行管理命令
不建议在此数据库中创建业务表
-----------------------------------------------------------------------------------------------------
template0:
原始的模板数据库
不可修改,用于恢复 template1
创建数据库时可以指定使用 template0
-----------------------------------------------------------------------------------------------------
template1:
默认的模板数据库
创建新数据库时会复制 template1 的内容
可以在 template1 中创建表、函数等,新数据库会自动包含这些对象
02.命令
a.创建数据库
基本创建:
CREATE DATABASE mydb;
-----------------------------------------------------------------------------------------------------
指定所有者:
CREATE DATABASE mydb OWNER myuser;
-----------------------------------------------------------------------------------------------------
指定模板:
CREATE DATABASE mydb TEMPLATE template0;
-----------------------------------------------------------------------------------------------------
指定字符集和排序规则:
CREATE DATABASE mydb
ENCODING 'UTF8'
LC_COLLATE 'zh_CN.UTF-8'
LC_CTYPE 'zh_CN.UTF-8'
TEMPLATE template0;
-----------------------------------------------------------------------------------------------------
指定连接限制:
CREATE DATABASE mydb CONNECTION LIMIT 100;
-----------------------------------------------------------------------------------------------------
完整示例:
CREATE DATABASE mydb
OWNER myuser
ENCODING 'UTF8'
LC_COLLATE 'en_US.UTF-8'
LC_CTYPE 'en_US.UTF-8'
TABLESPACE pg_default
CONNECTION LIMIT -1
TEMPLATE template0;
b.查看数据库
列出所有数据库:
\l # psql 命令
\l+ # 显示详细信息
SELECT datname FROM pg_database; # SQL 查询
-----------------------------------------------------------------------------------------------------
查看当前数据库:
SELECT current_database();
\c # psql 命令
-----------------------------------------------------------------------------------------------------
查看数据库详细信息:
SELECT * FROM pg_database WHERE datname = 'mydb';
-----------------------------------------------------------------------------------------------------
查看数据库大小:
SELECT pg_database_size('mydb');
SELECT pg_size_pretty(pg_database_size('mydb'));
-----------------------------------------------------------------------------------------------------
查看所有数据库大小:
SELECT datname, pg_size_pretty(pg_database_size(datname))
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
c.切换数据库
使用 psql 切换:
\c mydb # 切换到 mydb
\c mydb myuser # 以 myuser 身份切换到 mydb
\c mydb myuser localhost 5432 # 指定主机和端口
-----------------------------------------------------------------------------------------------------
使用连接字符串:
psql -h localhost -U myuser -d mydb
psql postgresql://myuser:password@localhost:5432/mydb
d.删除数据库
删除数据库:
DROP DATABASE mydb;
DROP DATABASE IF EXISTS mydb; # 如果存在则删除
-----------------------------------------------------------------------------------------------------
强制删除(断开所有连接):
# PostgreSQL 13+
DROP DATABASE mydb WITH (FORCE);
-----------------------------------------------------------------------------------------------------
手动断开连接后删除:
# 查看连接
SELECT pid, usename, application_name, client_addr
FROM pg_stat_activity
WHERE datname = 'mydb';
# 终止连接
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'mydb' AND pid <> pg_backend_pid();
# 删除数据库
DROP DATABASE mydb;
e.修改数据库
重命名数据库:
ALTER DATABASE mydb RENAME TO newdb;
-----------------------------------------------------------------------------------------------------
修改所有者:
ALTER DATABASE mydb OWNER TO newuser;
-----------------------------------------------------------------------------------------------------
修改连接限制:
ALTER DATABASE mydb CONNECTION LIMIT 50;
-----------------------------------------------------------------------------------------------------
修改配置参数:
ALTER DATABASE mydb SET timezone TO 'Asia/Shanghai';
ALTER DATABASE mydb SET work_mem TO '16MB';
-----------------------------------------------------------------------------------------------------
重置配置参数:
ALTER DATABASE mydb RESET timezone;
ALTER DATABASE mydb RESET ALL;
03.权限管理
a.角色与用户
在 PostgreSQL 中,用户和角色是统一的概念
角色可以拥有登录权限(用户)或不拥有登录权限(组)
-----------------------------------------------------------------------------------------------------
创建角色:
CREATE ROLE myrole;
CREATE ROLE myuser LOGIN PASSWORD 'password';
CREATE USER myuser PASSWORD 'password'; # 等同于 CREATE ROLE ... LOGIN
-----------------------------------------------------------------------------------------------------
创建超级用户:
CREATE USER admin PASSWORD 'password' SUPERUSER;
-----------------------------------------------------------------------------------------------------
创建角色并授予权限:
CREATE ROLE readonly;
CREATE ROLE readwrite;
CREATE USER app_user PASSWORD 'password' IN ROLE readwrite;
-----------------------------------------------------------------------------------------------------
查看角色:
\du # psql 命令
SELECT rolname FROM pg_roles; # SQL 查询
b.权限授予
授予数据库连接权限:
GRANT CONNECT ON DATABASE mydb TO myuser;
-----------------------------------------------------------------------------------------------------
授予模式使用权限:
GRANT USAGE ON SCHEMA public TO myuser;
-----------------------------------------------------------------------------------------------------
授予表权限:
GRANT SELECT ON TABLE mytable TO myuser;
GRANT INSERT, UPDATE, DELETE ON TABLE mytable TO myuser;
GRANT ALL PRIVILEGES ON TABLE mytable TO myuser;
-----------------------------------------------------------------------------------------------------
授予所有表权限:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO myuser;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO myuser;
-----------------------------------------------------------------------------------------------------
授予未来创建的表权限:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO myuser;
-----------------------------------------------------------------------------------------------------
授予序列权限:
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO myuser;
-----------------------------------------------------------------------------------------------------
授予函数执行权限:
GRANT EXECUTE ON FUNCTION myfunction TO myuser;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO myuser;
c.权限回收
回收数据库连接权限:
REVOKE CONNECT ON DATABASE mydb FROM myuser;
-----------------------------------------------------------------------------------------------------
回收表权限:
REVOKE SELECT ON TABLE mytable FROM myuser;
REVOKE ALL PRIVILEGES ON TABLE mytable FROM myuser;
-----------------------------------------------------------------------------------------------------
回收所有表权限:
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM myuser;
-----------------------------------------------------------------------------------------------------
查看权限:
\dp mytable # psql 命令
\z mytable # 同上
SELECT grantee, privilege_type
FROM information_schema.table_privileges
WHERE table_name = 'mytable';
2.2 模式
01.介绍
a.概念
模式(Schema)是数据库中的命名空间
用于组织和管理数据库对象(表、视图、函数等)
一个数据库可以包含多个模式
不同模式中可以有同名的对象
-----------------------------------------------------------------------------------------------------
默认模式:
public:默认的公共模式,所有用户都可以访问
pg_catalog:系统目录模式,存储系统表和视图
information_schema:标准信息模式,提供数据库元数据视图
-----------------------------------------------------------------------------------------------------
模式的作用:
组织数据库对象,避免命名冲突
实现多租户应用(每个租户一个模式)
权限隔离,不同用户访问不同模式
逻辑分组,按业务模块划分模式
b.搜索路径
搜索路径(search_path)决定了查找对象的顺序
默认搜索路径:$user, public
-----------------------------------------------------------------------------------------------------
查看搜索路径:
SHOW search_path;
SELECT current_schemas(true);
-----------------------------------------------------------------------------------------------------
设置搜索路径:
SET search_path TO myschema, public;
SET search_path TO "$user", public, myschema;
-----------------------------------------------------------------------------------------------------
永久设置搜索路径:
ALTER DATABASE mydb SET search_path TO myschema, public;
ALTER ROLE myuser SET search_path TO myschema, public;
-----------------------------------------------------------------------------------------------------
搜索路径的影响:
创建对象时,如果不指定模式,会在搜索路径的第一个模式中创建
查询对象时,按搜索路径顺序查找
c.命名空间
完全限定名:schema_name.object_name
示例:public.users、myschema.orders
-----------------------------------------------------------------------------------------------------
使用完全限定名的好处:
避免歧义,明确指定对象所在的模式
不受搜索路径影响
提高查询性能(减少查找时间)
-----------------------------------------------------------------------------------------------------
示例:
SELECT * FROM public.users;
SELECT * FROM myschema.orders;
INSERT INTO myschema.products VALUES (1, 'Product A');
02.命令
a.创建模式
基本创建:
CREATE SCHEMA myschema;
-----------------------------------------------------------------------------------------------------
指定所有者:
CREATE SCHEMA myschema AUTHORIZATION myuser;
-----------------------------------------------------------------------------------------------------
创建模式并创建对象:
CREATE SCHEMA myschema
CREATE TABLE users (id int, name text)
CREATE VIEW user_view AS SELECT * FROM users;
-----------------------------------------------------------------------------------------------------
如果不存在则创建:
CREATE SCHEMA IF NOT EXISTS myschema;
b.查看模式
列出所有模式:
\dn # psql 命令
\dn+ # 显示详细信息
SELECT schema_name FROM information_schema.schemata;
-----------------------------------------------------------------------------------------------------
查看当前模式:
SELECT current_schema();
-----------------------------------------------------------------------------------------------------
查看模式中的对象:
\dt myschema.* # 查看模式中的表
\dv myschema.* # 查看模式中的视图
\df myschema.* # 查看模式中的函数
-----------------------------------------------------------------------------------------------------
查看模式大小:
SELECT schema_name,
pg_size_pretty(sum(pg_total_relation_size(quote_ident(schemaname)||'.'||quote_ident(tablename)))::bigint)
FROM pg_tables
WHERE schemaname = 'myschema'
GROUP BY schema_name;
c.删除模式
删除空模式:
DROP SCHEMA myschema;
DROP SCHEMA IF EXISTS myschema;
-----------------------------------------------------------------------------------------------------
删除模式及其所有对象:
DROP SCHEMA myschema CASCADE;
-----------------------------------------------------------------------------------------------------
注意事项:
CASCADE 会删除模式中的所有对象,包括表、视图、函数等
如果其他模式的对象依赖于该模式的对象,也会被删除
删除前务必确认,避免误删重要数据
d.修改模式
重命名模式:
ALTER SCHEMA myschema RENAME TO newschema;
-----------------------------------------------------------------------------------------------------
修改所有者:
ALTER SCHEMA myschema OWNER TO newuser;
-----------------------------------------------------------------------------------------------------
授予模式权限:
GRANT USAGE ON SCHEMA myschema TO myuser;
GRANT CREATE ON SCHEMA myschema TO myuser;
GRANT ALL ON SCHEMA myschema TO myuser;
-----------------------------------------------------------------------------------------------------
回收模式权限:
REVOKE USAGE ON SCHEMA myschema FROM myuser;
REVOKE CREATE ON SCHEMA myschema FROM myuser;
2.3 表
01.介绍
a.表类型
普通表(Regular Table):
最常用的表类型
数据持久化存储在磁盘上
支持所有 PostgreSQL 特性
-----------------------------------------------------------------------------------------------------
临时表(Temporary Table):
会话级别的表,会话结束后自动删除
只对创建它的会话可见
不记录 WAL 日志,性能更高
-----------------------------------------------------------------------------------------------------
不记录日志表(Unlogged Table):
不记录 WAL 日志,写入性能更高
崩溃后数据会丢失
适用于临时数据或可重建的数据
-----------------------------------------------------------------------------------------------------
分区表(Partitioned Table):
将大表分割成多个小表
提高查询和维护性能
支持范围分区、列表分区、哈希分区
-----------------------------------------------------------------------------------------------------
外部表(Foreign Table):
通过 FDW 访问外部数据源
数据不存储在本地
可以查询其他数据库或文件
b.命名规范
表名必须以字母或下划线开头
只能包含字母、数字、下划线
长度不超过 63 个字符
建议使用小写字母和下划线
使用复数形式(如 users、orders)
-----------------------------------------------------------------------------------------------------
命名示例:
users # 推荐
user_orders # 推荐
order_items # 推荐
UserOrders # 不推荐(大小写敏感)
123_table # 错误(不能以数字开头)
c.约束类型
主键约束(PRIMARY KEY):
唯一标识表中的每一行
不允许 NULL 值
一个表只能有一个主键
-----------------------------------------------------------------------------------------------------
外键约束(FOREIGN KEY):
建立表之间的关系
引用另一个表的主键或唯一键
支持级联更新和删除
-----------------------------------------------------------------------------------------------------
唯一约束(UNIQUE):
确保列或列组合的值唯一
允许 NULL 值(NULL 不等于 NULL)
一个表可以有多个唯一约束
-----------------------------------------------------------------------------------------------------
检查约束(CHECK):
确保列值满足指定条件
可以引用多个列
-----------------------------------------------------------------------------------------------------
非空约束(NOT NULL):
确保列值不为 NULL
-----------------------------------------------------------------------------------------------------
默认值(DEFAULT):
为列指定默认值
插入数据时如果不指定该列,使用默认值
02.表操作
a.创建表
基本创建:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
email VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
完整示例:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total_amount DECIMAL(10, 2) CHECK (total_amount >= 0),
status VARCHAR(20) DEFAULT 'pending',
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
-----------------------------------------------------------------------------------------------------
创建临时表:
CREATE TEMP TABLE temp_users (
id INT,
name TEXT
);
-----------------------------------------------------------------------------------------------------
创建不记录日志表:
CREATE UNLOGGED TABLE logs (
log_id SERIAL PRIMARY KEY,
message TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
从查询结果创建表:
CREATE TABLE users_backup AS SELECT * FROM users;
CREATE TABLE active_users AS SELECT * FROM users WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
如果不存在则创建:
CREATE TABLE IF NOT EXISTS users (id INT, name TEXT);
b.修改表结构
添加列:
ALTER TABLE users ADD COLUMN age INT;
ALTER TABLE users ADD COLUMN phone VARCHAR(20) DEFAULT '';
-----------------------------------------------------------------------------------------------------
删除列:
ALTER TABLE users DROP COLUMN age;
ALTER TABLE users DROP COLUMN IF EXISTS age;
-----------------------------------------------------------------------------------------------------
修改列类型:
ALTER TABLE users ALTER COLUMN age TYPE BIGINT;
ALTER TABLE users ALTER COLUMN email TYPE VARCHAR(200);
-----------------------------------------------------------------------------------------------------
修改列名:
ALTER TABLE users RENAME COLUMN username TO user_name;
-----------------------------------------------------------------------------------------------------
设置默认值:
ALTER TABLE users ALTER COLUMN status SET DEFAULT 'active';
-----------------------------------------------------------------------------------------------------
删除默认值:
ALTER TABLE users ALTER COLUMN status DROP DEFAULT;
-----------------------------------------------------------------------------------------------------
设置非空约束:
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
-----------------------------------------------------------------------------------------------------
删除非空约束:
ALTER TABLE users ALTER COLUMN email DROP NOT NULL;
-----------------------------------------------------------------------------------------------------
添加约束:
ALTER TABLE users ADD CONSTRAINT users_email_unique UNIQUE (email);
ALTER TABLE users ADD CONSTRAINT users_age_check CHECK (age >= 0);
ALTER TABLE orders ADD CONSTRAINT orders_user_fk
FOREIGN KEY (user_id) REFERENCES users(id);
-----------------------------------------------------------------------------------------------------
删除约束:
ALTER TABLE users DROP CONSTRAINT users_email_unique;
ALTER TABLE users DROP CONSTRAINT IF EXISTS users_age_check;
-----------------------------------------------------------------------------------------------------
重命名表:
ALTER TABLE users RENAME TO app_users;
-----------------------------------------------------------------------------------------------------
修改表所有者:
ALTER TABLE users OWNER TO newuser;
-----------------------------------------------------------------------------------------------------
修改表空间:
ALTER TABLE users SET TABLESPACE fastspace;
c.删除表
删除表:
DROP TABLE users;
DROP TABLE IF EXISTS users;
-----------------------------------------------------------------------------------------------------
删除表及其依赖对象:
DROP TABLE users CASCADE;
-----------------------------------------------------------------------------------------------------
清空表数据:
TRUNCATE TABLE users;
TRUNCATE TABLE users RESTART IDENTITY; # 重置自增序列
TRUNCATE TABLE users CASCADE; # 级联清空相关表
d.查看表结构
查看表结构:
\d users # psql 命令
\d+ users # 显示详细信息
-----------------------------------------------------------------------------------------------------
查看表定义:
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'users';
-----------------------------------------------------------------------------------------------------
查看表约束:
\d+ users # psql 命令
SELECT constraint_name, constraint_type
FROM information_schema.table_constraints
WHERE table_name = 'users';
-----------------------------------------------------------------------------------------------------
查看表大小:
SELECT pg_size_pretty(pg_total_relation_size('users'));
SELECT pg_size_pretty(pg_relation_size('users')); # 表本身大小
SELECT pg_size_pretty(pg_indexes_size('users')); # 索引大小
-----------------------------------------------------------------------------------------------------
查看表的行数:
SELECT COUNT(*) FROM users;
SELECT reltuples::bigint FROM pg_class WHERE relname = 'users'; # 估算值
03.表设计
a.分区表
范围分区:
CREATE TABLE orders (
order_id SERIAL,
order_date DATE NOT NULL,
amount DECIMAL(10, 2)
) PARTITION BY RANGE (order_date);
CREATE TABLE orders_2023 PARTITION OF orders
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE orders_2024 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
-----------------------------------------------------------------------------------------------------
列表分区:
CREATE TABLE users (
user_id SERIAL,
country VARCHAR(50),
name TEXT
) PARTITION BY LIST (country);
CREATE TABLE users_cn PARTITION OF users
FOR VALUES IN ('China', 'CN');
CREATE TABLE users_us PARTITION OF users
FOR VALUES IN ('USA', 'US');
-----------------------------------------------------------------------------------------------------
哈希分区:
CREATE TABLE logs (
log_id SERIAL,
message TEXT,
created_at TIMESTAMP
) PARTITION BY HASH (log_id);
CREATE TABLE logs_p0 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE logs_p1 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
b.继承表
创建父表:
CREATE TABLE cities (
name TEXT,
population REAL,
elevation INT
);
-----------------------------------------------------------------------------------------------------
创建子表:
CREATE TABLE capitals (
state CHAR(2)
) INHERITS (cities);
-----------------------------------------------------------------------------------------------------
查询父表(包含子表数据):
SELECT * FROM cities;
-----------------------------------------------------------------------------------------------------
只查询父表:
SELECT * FROM ONLY cities;
c.临时表
创建会话级临时表:
CREATE TEMP TABLE temp_data (
id INT,
value TEXT
);
-----------------------------------------------------------------------------------------------------
创建事务级临时表:
CREATE TEMP TABLE temp_data (
id INT,
value TEXT
) ON COMMIT DROP; # 事务结束后删除
CREATE TEMP TABLE temp_data (
id INT,
value TEXT
) ON COMMIT DELETE ROWS; # 事务结束后删除数据,保留表结构
d.外部表
安装 FDW 扩展:
CREATE EXTENSION postgres_fdw;
-----------------------------------------------------------------------------------------------------
创建外部服务器:
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '192.168.1.100', port '5432', dbname 'remotedb');
-----------------------------------------------------------------------------------------------------
创建用户映射:
CREATE USER MAPPING FOR myuser
SERVER foreign_server
OPTIONS (user 'remoteuser', password 'password');
-----------------------------------------------------------------------------------------------------
创建外部表:
CREATE FOREIGN TABLE foreign_users (
id INT,
name TEXT,
email TEXT
) SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'users');
-----------------------------------------------------------------------------------------------------
查询外部表:
SELECT * FROM foreign_users;
2.4 数据类型
01.数值类型
a.整数类型
SMALLINT:2 字节,范围 -32768 到 32767
INTEGER(INT):4 字节,范围 -2147483648 到 2147483647
BIGINT:8 字节,范围 -9223372036854775808 到 9223372036854775807
-----------------------------------------------------------------------------------------------------
自增类型:
SMALLSERIAL:自增 SMALLINT
SERIAL:自增 INTEGER
BIGSERIAL:自增 BIGINT
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
quantity INT,
stock BIGINT
);
b.浮点类型
REAL:4 字节,6 位十进制精度
DOUBLE PRECISION:8 字节,15 位十进制精度
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE measurements (
temperature REAL,
pressure DOUBLE PRECISION
);
-----------------------------------------------------------------------------------------------------
注意:
浮点数存在精度问题,不适合存储货币
对于货币,应使用 NUMERIC 或 DECIMAL
c.任意精度类型
NUMERIC(precision, scale):可变长度,精确数值
DECIMAL(precision, scale):等同于 NUMERIC
-----------------------------------------------------------------------------------------------------
参数说明:
precision:总位数(包括小数点前后)
scale:小数点后的位数
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE financial (
price NUMERIC(10, 2), -- 最多 10 位,小数点后 2 位
amount DECIMAL(15, 4) -- 最多 15 位,小数点后 4 位
);
-----------------------------------------------------------------------------------------------------
使用场景:
货币金额:NUMERIC(10, 2)
百分比:NUMERIC(5, 2)
科学计算:NUMERIC(20, 10)
02.字符类型
a.CHAR
固定长度字符串
不足长度用空格填充
最大长度 10485760 字节
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE codes (
country_code CHAR(2), -- 固定 2 字符
zip_code CHAR(6) -- 固定 6 字符
);
-----------------------------------------------------------------------------------------------------
使用场景:
固定长度的代码(如国家代码、邮编)
性能要求高的场景(避免长度计算)
b.VARCHAR
可变长度字符串
需要指定最大长度
实际存储长度根据内容决定
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE users (
username VARCHAR(50),
email VARCHAR(100),
address VARCHAR(255)
);
-----------------------------------------------------------------------------------------------------
使用场景:
用户名、邮箱等有长度限制的字段
需要限制输入长度的场景
c.TEXT
可变长度字符串
无长度限制(理论上限 1GB)
性能与 VARCHAR 相同
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE articles (
title TEXT,
content TEXT,
description TEXT
);
-----------------------------------------------------------------------------------------------------
使用场景:
文章内容、评论等长文本
不确定长度的字符串
-----------------------------------------------------------------------------------------------------
VARCHAR vs TEXT:
VARCHAR 需要指定长度,TEXT 不需要
性能上没有区别
建议:已知长度限制用 VARCHAR,否则用 TEXT
03.日期时间类型
a.DATE
日期类型,只存储年月日
格式:YYYY-MM-DD
范围:4713 BC 到 5874897 AD
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE events (
event_date DATE,
birth_date DATE
);
-----------------------------------------------------------------------------------------------------
常用函数:
CURRENT_DATE:当前日期
DATE '2024-01-01':日期字面量
date_column + INTERVAL '1 day':日期运算
b.TIME
时间类型,只存储时分秒
格式:HH:MI:SS
支持时区:TIME WITH TIME ZONE
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE schedules (
start_time TIME,
end_time TIME WITH TIME ZONE
);
-----------------------------------------------------------------------------------------------------
常用函数:
CURRENT_TIME:当前时间
TIME '14:30:00':时间字面量
c.TIMESTAMP
日期时间类型,存储年月日时分秒
格式:YYYY-MM-DD HH:MI:SS
支持时区:TIMESTAMP WITH TIME ZONE(推荐)
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE logs (
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
常用函数:
CURRENT_TIMESTAMP:当前时间戳
NOW():当前时间戳(等同于 CURRENT_TIMESTAMP)
TIMESTAMP '2024-01-01 14:30:00':时间戳字面量
-----------------------------------------------------------------------------------------------------
TIMESTAMP vs TIMESTAMPTZ:
TIMESTAMP:不存储时区信息
TIMESTAMPTZ:存储时区信息(推荐使用)
d.INTERVAL
时间间隔类型
用于表示一段时间
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE tasks (
duration INTERVAL,
deadline TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
使用示例:
INTERVAL '1 day'
INTERVAL '2 hours 30 minutes'
INTERVAL '1 year 2 months'
-----------------------------------------------------------------------------------------------------
时间运算:
SELECT CURRENT_TIMESTAMP + INTERVAL '1 day';
SELECT CURRENT_DATE - INTERVAL '1 week';
SELECT age(TIMESTAMP '2024-01-01', TIMESTAMP '2023-01-01');
04.特殊类型
a.JSON/JSONB
JSON:文本格式存储,保留原始格式
JSONB:二进制格式存储,支持索引,性能更好(推荐)
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
attributes JSON,
metadata JSONB
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO products (name, attributes, metadata) VALUES
('Product A', '{"color": "red", "size": "M"}', '{"tags": ["new", "sale"]}');
-----------------------------------------------------------------------------------------------------
查询操作:
SELECT metadata->>'tags' FROM products; -- 提取字段
SELECT * FROM products WHERE metadata @> '{"tags": ["new"]}'; -- 包含查询
SELECT * FROM products WHERE metadata ? 'tags'; -- 键存在查询
-----------------------------------------------------------------------------------------------------
JSONB 索引:
CREATE INDEX idx_metadata ON products USING GIN (metadata);
b.数组
PostgreSQL 支持任意数据类型的数组
可以是一维或多维数组
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
tags TEXT[],
scores INT[]
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO posts (tags, scores) VALUES
(ARRAY['postgresql', 'database'], ARRAY[90, 85, 95]);
INSERT INTO posts (tags, scores) VALUES
('{"tag1", "tag2"}', '{80, 90}');
-----------------------------------------------------------------------------------------------------
查询操作:
SELECT tags[1] FROM posts; -- 访问元素(从 1 开始)
SELECT * FROM posts WHERE 'postgresql' = ANY(tags); -- 包含查询
SELECT * FROM posts WHERE tags @> ARRAY['postgresql']; -- 包含查询
SELECT array_length(tags, 1) FROM posts; -- 数组长度
-----------------------------------------------------------------------------------------------------
数组索引:
CREATE INDEX idx_tags ON posts USING GIN (tags);
c.UUID
通用唯一标识符
128 位,通常表示为 32 个十六进制数字
-----------------------------------------------------------------------------------------------------
示例:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE sessions (
session_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
生成 UUID:
SELECT uuid_generate_v4();
SELECT gen_random_uuid(); -- PostgreSQL 13+
d.枚举
用户自定义的枚举类型
限制列值为预定义的集合
-----------------------------------------------------------------------------------------------------
创建枚举类型:
CREATE TYPE status_enum AS ENUM ('pending', 'active', 'inactive', 'deleted');
CREATE TYPE priority_enum AS ENUM ('low', 'medium', 'high', 'urgent');
-----------------------------------------------------------------------------------------------------
使用枚举:
CREATE TABLE tasks (
id SERIAL PRIMARY KEY,
title TEXT,
status status_enum DEFAULT 'pending',
priority priority_enum DEFAULT 'medium'
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO tasks (title, status, priority) VALUES
('Task 1', 'active', 'high');
-----------------------------------------------------------------------------------------------------
修改枚举:
ALTER TYPE status_enum ADD VALUE 'archived';
ALTER TYPE status_enum RENAME VALUE 'deleted' TO 'removed';
e.几何类型
用于存储二维几何数据
需要 PostGIS 扩展支持更复杂的地理信息
-----------------------------------------------------------------------------------------------------
几何类型:
POINT:点 (x, y)
LINE:直线
LSEG:线段
BOX:矩形
PATH:路径
POLYGON:多边形
CIRCLE:圆
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE locations (
id SERIAL PRIMARY KEY,
name TEXT,
position POINT,
area POLYGON
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO locations (name, position, area) VALUES
('Location A', POINT(10, 20), POLYGON('((0,0),(0,10),(10,10),(10,0))'));
-----------------------------------------------------------------------------------------------------
查询操作:
SELECT position[0], position[1] FROM locations; -- 提取坐标
SELECT * FROM locations WHERE position <-> POINT(5, 5) < 10; -- 距离查询
f.网络地址类型
用于存储网络地址
-----------------------------------------------------------------------------------------------------
网络地址类型:
INET:IPv4 或 IPv6 地址
CIDR:网络地址
MACADDR:MAC 地址
MACADDR8:EUI-64 MAC 地址
-----------------------------------------------------------------------------------------------------
示例:
CREATE TABLE servers (
id SERIAL PRIMARY KEY,
name TEXT,
ip_address INET,
network CIDR,
mac_address MACADDR
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO servers (name, ip_address, network, mac_address) VALUES
('Server 1', '192.168.1.100', '192.168.1.0/24', '08:00:2b:01:02:03');
-----------------------------------------------------------------------------------------------------
查询操作:
SELECT * FROM servers WHERE ip_address << INET '192.168.1.0/24'; -- 包含查询
SELECT * FROM servers WHERE ip_address && INET '192.168.1.0/24'; -- 重叠查询
2.5 索引
01.索引类型
a.B-tree索引
默认索引类型
适用于大多数场景
支持等值查询、范围查询、排序
-----------------------------------------------------------------------------------------------------
创建 B-tree 索引:
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_name ON users(name);
-----------------------------------------------------------------------------------------------------
适用场景:
等值查询:WHERE email = '[email protected]'
范围查询:WHERE age BETWEEN 18 AND 30
排序:ORDER BY created_at
模式匹配:WHERE name LIKE 'John%'(前缀匹配)
-----------------------------------------------------------------------------------------------------
不适用场景:
后缀匹配:WHERE name LIKE '%John'
全文检索
b.Hash索引
基于哈希表的索引
只支持等值查询
不支持范围查询和排序
-----------------------------------------------------------------------------------------------------
创建 Hash 索引:
CREATE INDEX idx_users_id_hash ON users USING HASH (id);
-----------------------------------------------------------------------------------------------------
适用场景:
等值查询:WHERE id = 123
-----------------------------------------------------------------------------------------------------
注意:
PostgreSQL 10 之前 Hash 索引不支持 WAL 日志
现在已经支持,但实际使用较少
大多数情况下 B-tree 性能更好
c.GiST索引
通用搜索树(Generalized Search Tree)
支持多种数据类型和操作符
适用于几何数据、全文检索、范围类型
-----------------------------------------------------------------------------------------------------
创建 GiST 索引:
CREATE INDEX idx_locations_position ON locations USING GIST (position);
CREATE INDEX idx_ranges ON reservations USING GIST (time_range);
-----------------------------------------------------------------------------------------------------
适用场景:
几何数据查询
范围类型查询
全文检索(配合 tsvector)
最近邻查询
d.GIN索引
倒排索引(Generalized Inverted Index)
适用于包含多个值的列
如数组、JSONB、全文检索
-----------------------------------------------------------------------------------------------------
创建 GIN 索引:
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
CREATE INDEX idx_documents_content ON documents USING GIN (to_tsvector('english', content));
-----------------------------------------------------------------------------------------------------
适用场景:
数组包含查询:WHERE tags @> ARRAY['postgresql']
JSONB 查询:WHERE metadata @> '{"status": "active"}'
全文检索:WHERE to_tsvector('english', content) @@ to_tsquery('postgresql')
-----------------------------------------------------------------------------------------------------
GIN vs GiST:
GIN:查询速度快,但创建和更新慢,占用空间大
GiST:创建和更新快,但查询速度慢,占用空间小
e.BRIN索引
块范围索引(Block Range Index)
适用于大表且数据有序
索引非常小,但查询性能取决于数据分布
-----------------------------------------------------------------------------------------------------
创建 BRIN 索引:
CREATE INDEX idx_logs_created_at ON logs USING BRIN (created_at);
CREATE INDEX idx_orders_order_date ON orders USING BRIN (order_date);
-----------------------------------------------------------------------------------------------------
适用场景:
时间序列数据
日志表
数据按某列自然排序的大表
-----------------------------------------------------------------------------------------------------
优点:
索引非常小(几 KB 到几 MB)
创建和维护成本低
-----------------------------------------------------------------------------------------------------
缺点:
查询性能取决于数据分布
不适合随机分布的数据
f.SP-GiST索引
空间分区 GiST 索引(Space-Partitioned GiST)
适用于非平衡数据结构
如四叉树、k-d 树
-----------------------------------------------------------------------------------------------------
创建 SP-GiST 索引:
CREATE INDEX idx_points ON points USING SPGIST (location);
-----------------------------------------------------------------------------------------------------
适用场景:
电话号码
IP 地址
几何数据
文本前缀搜索
02.索引操作
a.创建索引
基本创建:
CREATE INDEX idx_users_email ON users(email);
-----------------------------------------------------------------------------------------------------
创建唯一索引:
CREATE UNIQUE INDEX idx_users_username ON users(username);
-----------------------------------------------------------------------------------------------------
创建多列索引:
CREATE INDEX idx_users_name_age ON users(name, age);
-----------------------------------------------------------------------------------------------------
创建部分索引:
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
创建表达式索引:
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
CREATE INDEX idx_users_full_name ON users((first_name || ' ' || last_name));
-----------------------------------------------------------------------------------------------------
并发创建索引:
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-----------------------------------------------------------------------------------------------------
如果不存在则创建:
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);
b.查看索引
查看表的索引:
\di users # psql 命令
\d users # 查看表结构(包含索引)
-----------------------------------------------------------------------------------------------------
查询索引信息:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';
-----------------------------------------------------------------------------------------------------
查看索引大小:
SELECT pg_size_pretty(pg_relation_size('idx_users_email'));
-----------------------------------------------------------------------------------------------------
查看索引使用情况:
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE tablename = 'users';
-----------------------------------------------------------------------------------------------------
查找未使用的索引:
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexname NOT LIKE '%_pkey';
c.删除索引
删除索引:
DROP INDEX idx_users_email;
DROP INDEX IF EXISTS idx_users_email;
-----------------------------------------------------------------------------------------------------
并发删除索引:
DROP INDEX CONCURRENTLY idx_users_email;
-----------------------------------------------------------------------------------------------------
删除表的所有索引:
DROP INDEX idx1, idx2, idx3;
d.重建索引
重建单个索引:
REINDEX INDEX idx_users_email;
-----------------------------------------------------------------------------------------------------
重建表的所有索引:
REINDEX TABLE users;
-----------------------------------------------------------------------------------------------------
重建数据库的所有索引:
REINDEX DATABASE mydb;
-----------------------------------------------------------------------------------------------------
并发重建索引:
REINDEX INDEX CONCURRENTLY idx_users_email;
-----------------------------------------------------------------------------------------------------
重建索引的场景:
索引损坏
索引膨胀
性能下降
03.索引优化
a.部分索引
只为满足条件的行创建索引
减少索引大小,提高性能
-----------------------------------------------------------------------------------------------------
示例:
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
CREATE INDEX idx_recent_orders ON orders(order_date) WHERE order_date > '2024-01-01';
-----------------------------------------------------------------------------------------------------
查询时必须包含相同条件:
SELECT * FROM users WHERE email = '[email protected]' AND status = 'active';
b.表达式索引
对表达式的结果创建索引
适用于经常使用表达式的查询
-----------------------------------------------------------------------------------------------------
示例:
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
CREATE INDEX idx_orders_year ON orders(EXTRACT(YEAR FROM order_date));
CREATE INDEX idx_users_age ON users((CURRENT_DATE - birth_date));
-----------------------------------------------------------------------------------------------------
查询时必须使用相同表达式:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';
c.多列索引
为多个列创建单个索引
适用于经常一起查询的列
-----------------------------------------------------------------------------------------------------
示例:
CREATE INDEX idx_users_name_age ON users(name, age);
CREATE INDEX idx_orders_user_date ON orders(user_id, order_date);
-----------------------------------------------------------------------------------------------------
列的顺序很重要:
最常用于过滤的列放在前面
基数高的列放在前面
-----------------------------------------------------------------------------------------------------
索引可以用于:
WHERE name = 'John' AND age = 30
WHERE name = 'John'
ORDER BY name, age
-----------------------------------------------------------------------------------------------------
索引不能用于:
WHERE age = 30(跳过了第一列)
d.覆盖索引
索引包含查询所需的所有列
查询可以只扫描索引,不需要访问表
-----------------------------------------------------------------------------------------------------
创建覆盖索引:
CREATE INDEX idx_users_email_name ON users(email) INCLUDE (name);
-----------------------------------------------------------------------------------------------------
查询示例:
SELECT name FROM users WHERE email = '[email protected]';
-- 只需扫描索引,不需要访问表
e.索引选择策略
为经常出现在 WHERE 子句的列创建索引
为经常用于 JOIN 的列创建索引
为经常用于 ORDER BY 的列创建索引
不要为小表创建索引
不要为低基数列创建索引(如性别、状态等)
-----------------------------------------------------------------------------------------------------
监控索引使用情况:
定期检查索引的使用频率
删除未使用的索引
重建膨胀的索引
-----------------------------------------------------------------------------------------------------
索引的代价:
占用磁盘空间
降低写入性能(INSERT、UPDATE、DELETE)
需要维护成本
2.6 视图
01.普通视图
a.创建视图
基本创建:
CREATE VIEW active_users AS
SELECT id, username, email
FROM users
WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
创建或替换视图:
CREATE OR REPLACE VIEW active_users AS
SELECT id, username, email, created_at
FROM users
WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
创建带列名的视图:
CREATE VIEW user_summary (user_id, user_name, user_email) AS
SELECT id, username, email
FROM users;
-----------------------------------------------------------------------------------------------------
创建复杂视图:
CREATE VIEW order_summary AS
SELECT
u.id AS user_id,
u.username,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.username;
b.修改视图
替换视图:
CREATE OR REPLACE VIEW active_users AS
SELECT id, username, email, phone
FROM users
WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
重命名视图:
ALTER VIEW active_users RENAME TO current_users;
-----------------------------------------------------------------------------------------------------
修改视图所有者:
ALTER VIEW active_users OWNER TO newuser;
-----------------------------------------------------------------------------------------------------
修改视图模式:
ALTER VIEW active_users SET SCHEMA newschema;
c.删除视图
删除视图:
DROP VIEW active_users;
DROP VIEW IF EXISTS active_users;
-----------------------------------------------------------------------------------------------------
删除视图及其依赖对象:
DROP VIEW active_users CASCADE;
-----------------------------------------------------------------------------------------------------
删除多个视图:
DROP VIEW view1, view2, view3;
d.查询视图
查询视图与查询表相同:
SELECT * FROM active_users;
SELECT * FROM active_users WHERE username LIKE 'John%';
-----------------------------------------------------------------------------------------------------
查看视图定义:
\d+ active_users # psql 命令
SELECT definition FROM pg_views WHERE viewname = 'active_users';
-----------------------------------------------------------------------------------------------------
查看所有视图:
\dv # psql 命令
SELECT viewname FROM pg_views WHERE schemaname = 'public';
02.物化视图
a.创建物化视图
基本创建:
CREATE MATERIALIZED VIEW user_stats AS
SELECT
DATE_TRUNC('day', created_at) AS date,
COUNT(*) AS user_count
FROM users
GROUP BY DATE_TRUNC('day', created_at);
-----------------------------------------------------------------------------------------------------
创建带索引的物化视图:
CREATE MATERIALIZED VIEW user_stats AS
SELECT
DATE_TRUNC('day', created_at) AS date,
COUNT(*) AS user_count
FROM users
GROUP BY DATE_TRUNC('day', created_at);
CREATE UNIQUE INDEX idx_user_stats_date ON user_stats(date);
-----------------------------------------------------------------------------------------------------
创建不存储数据的物化视图:
CREATE MATERIALIZED VIEW user_stats AS
SELECT
DATE_TRUNC('day', created_at) AS date,
COUNT(*) AS user_count
FROM users
GROUP BY DATE_TRUNC('day', created_at)
WITH NO DATA;
b.刷新物化视图
完全刷新:
REFRESH MATERIALIZED VIEW user_stats;
-----------------------------------------------------------------------------------------------------
并发刷新(不阻塞查询):
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;
-- 需要在物化视图上创建唯一索引
-----------------------------------------------------------------------------------------------------
刷新策略:
定时刷新:使用 cron 或 pg_cron 扩展
触发器刷新:在基表更新时触发刷新
手动刷新:根据业务需求手动刷新
c.使用场景
复杂聚合查询:
避免每次查询都执行复杂的聚合计算
预先计算结果,提高查询性能
-----------------------------------------------------------------------------------------------------
报表查询:
生成日报、周报、月报
避免实时计算,减轻数据库压力
-----------------------------------------------------------------------------------------------------
数据仓库:
ETL 过程中的中间结果
数据汇总和分析
-----------------------------------------------------------------------------------------------------
缓存查询结果:
对于变化不频繁的数据
可以使用物化视图缓存查询结果
d.物化视图 vs 普通视图
普通视图:
不存储数据,每次查询都执行底层查询
数据始终是最新的
适合简单查询和实时数据
-----------------------------------------------------------------------------------------------------
物化视图:
存储查询结果,查询速度快
数据可能不是最新的,需要手动刷新
适合复杂查询和对实时性要求不高的场景
-----------------------------------------------------------------------------------------------------
选择建议:
实时性要求高:使用普通视图
查询复杂且数据变化不频繁:使用物化视图
需要频繁刷新:考虑使用触发器或定时任务
3 PostgreSQL进阶
3.1 SQL语句
01.DQL查询
a.基础查询
查询所有列:
SELECT * FROM users;
-----------------------------------------------------------------------------------------------------
查询指定列:
SELECT id, username, email FROM users;
-----------------------------------------------------------------------------------------------------
查询去重:
SELECT DISTINCT country FROM users;
-----------------------------------------------------------------------------------------------------
查询限制行数:
SELECT * FROM users LIMIT 10;
SELECT * FROM users LIMIT 10 OFFSET 20; -- 跳过前 20 行
-----------------------------------------------------------------------------------------------------
查询别名:
SELECT id AS user_id, username AS name FROM users;
SELECT u.id, u.username FROM users AS u;
b.条件查询
WHERE 子句:
SELECT * FROM users WHERE age > 18;
SELECT * FROM users WHERE status = 'active';
SELECT * FROM users WHERE email LIKE '%@gmail.com';
-----------------------------------------------------------------------------------------------------
比较运算符:
=, !=, <>, >, <, >=, <=
SELECT * FROM products WHERE price >= 100;
-----------------------------------------------------------------------------------------------------
逻辑运算符:
AND, OR, NOT
SELECT * FROM users WHERE age > 18 AND status = 'active';
SELECT * FROM users WHERE country = 'US' OR country = 'UK';
SELECT * FROM users WHERE NOT status = 'deleted';
-----------------------------------------------------------------------------------------------------
范围查询:
BETWEEN ... AND ...
SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-----------------------------------------------------------------------------------------------------
IN 查询:
SELECT * FROM users WHERE country IN ('US', 'UK', 'CA');
SELECT * FROM users WHERE id IN (1, 2, 3, 4, 5);
-----------------------------------------------------------------------------------------------------
NULL 查询:
SELECT * FROM users WHERE phone IS NULL;
SELECT * FROM users WHERE phone IS NOT NULL;
-----------------------------------------------------------------------------------------------------
模糊查询:
LIKE, ILIKE(不区分大小写)
SELECT * FROM users WHERE username LIKE 'John%'; -- 前缀匹配
SELECT * FROM users WHERE username LIKE '%Smith'; -- 后缀匹配
SELECT * FROM users WHERE username LIKE '%admin%'; -- 包含匹配
SELECT * FROM users WHERE username ILIKE 'JOHN%'; -- 不区分大小写
-----------------------------------------------------------------------------------------------------
正则表达式:
~ 区分大小写,~* 不区分大小写
SELECT * FROM users WHERE username ~ '^[A-Z]';
SELECT * FROM users WHERE email ~* '@gmail\.com$';
c.聚合查询
聚合函数:
COUNT(*):计数
SUM(column):求和
AVG(column):平均值
MAX(column):最大值
MIN(column):最小值
-----------------------------------------------------------------------------------------------------
示例:
SELECT COUNT(*) FROM users;
SELECT COUNT(DISTINCT country) FROM users;
SELECT SUM(total_amount) FROM orders;
SELECT AVG(age) FROM users;
SELECT MAX(created_at), MIN(created_at) FROM users;
d.分组查询
GROUP BY:
SELECT country, COUNT(*) FROM users GROUP BY country;
SELECT status, AVG(age) FROM users GROUP BY status;
-----------------------------------------------------------------------------------------------------
HAVING:
SELECT country, COUNT(*) as user_count
FROM users
GROUP BY country
HAVING COUNT(*) > 100;
-----------------------------------------------------------------------------------------------------
多列分组:
SELECT country, status, COUNT(*)
FROM users
GROUP BY country, status;
e.排序与分页
ORDER BY:
SELECT * FROM users ORDER BY created_at DESC;
SELECT * FROM users ORDER BY age ASC, username DESC;
-----------------------------------------------------------------------------------------------------
分页:
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 0; -- 第 1 页
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 10; -- 第 2 页
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 20; -- 第 3 页
f.子查询
WHERE 子查询:
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders WHERE total_amount > 1000);
-----------------------------------------------------------------------------------------------------
FROM 子查询:
SELECT * FROM (SELECT * FROM users WHERE age > 18) AS adults;
-----------------------------------------------------------------------------------------------------
EXISTS 子查询:
SELECT * FROM users u WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
-----------------------------------------------------------------------------------------------------
标量子查询:
SELECT username, (SELECT COUNT(*) FROM orders WHERE user_id = users.id) AS order_count
FROM users;
g.连接查询
INNER JOIN:
SELECT u.username, o.order_id, o.total_amount
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
-----------------------------------------------------------------------------------------------------
LEFT JOIN:
SELECT u.username, o.order_id
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-----------------------------------------------------------------------------------------------------
RIGHT JOIN:
SELECT u.username, o.order_id
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;
-----------------------------------------------------------------------------------------------------
FULL OUTER JOIN:
SELECT u.username, o.order_id
FROM users u
FULL OUTER JOIN orders o ON u.id = o.user_id;
-----------------------------------------------------------------------------------------------------
CROSS JOIN:
SELECT u.username, p.product_name
FROM users u
CROSS JOIN products p;
-----------------------------------------------------------------------------------------------------
SELF JOIN:
SELECT e1.name AS employee, e2.name AS manager
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.id;
h.窗口函数
ROW_NUMBER():
SELECT username, age, ROW_NUMBER() OVER (ORDER BY age DESC) AS rank
FROM users;
-----------------------------------------------------------------------------------------------------
RANK() 和 DENSE_RANK():
SELECT username, score,
RANK() OVER (ORDER BY score DESC) AS rank,
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank
FROM users;
-----------------------------------------------------------------------------------------------------
PARTITION BY:
SELECT username, country, age,
ROW_NUMBER() OVER (PARTITION BY country ORDER BY age DESC) AS rank_in_country
FROM users;
-----------------------------------------------------------------------------------------------------
聚合窗口函数:
SELECT username, salary,
SUM(salary) OVER (ORDER BY id) AS running_total,
AVG(salary) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM employees;
-----------------------------------------------------------------------------------------------------
LEAD() 和 LAG():
SELECT order_date,
total_amount,
LAG(total_amount) OVER (ORDER BY order_date) AS prev_amount,
LEAD(total_amount) OVER (ORDER BY order_date) AS next_amount
FROM orders;
i.CTE公用表表达式
基本 CTE:
WITH active_users AS (
SELECT * FROM users WHERE status = 'active'
)
SELECT * FROM active_users WHERE age > 18;
-----------------------------------------------------------------------------------------------------
多个 CTE:
WITH
active_users AS (
SELECT * FROM users WHERE status = 'active'
),
user_orders AS (
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id
)
SELECT u.username, uo.order_count
FROM active_users u
LEFT JOIN user_orders uo ON u.id = uo.user_id;
-----------------------------------------------------------------------------------------------------
递归 CTE:
WITH RECURSIVE subordinates AS (
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, s.level + 1
FROM employees e
INNER JOIN subordinates s ON e.manager_id = s.id
)
SELECT * FROM subordinates;
02.DML操作
a.INSERT插入
插入单行:
INSERT INTO users (username, email, age) VALUES ('john', '[email protected]', 25);
-----------------------------------------------------------------------------------------------------
插入多行:
INSERT INTO users (username, email, age) VALUES
('alice', '[email protected]', 30),
('bob', '[email protected]', 28),
('charlie', '[email protected]', 35);
-----------------------------------------------------------------------------------------------------
从查询结果插入:
INSERT INTO users_backup SELECT * FROM users WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
插入并返回:
INSERT INTO users (username, email) VALUES ('david', '[email protected]')
RETURNING id, username, created_at;
b.UPDATE更新
更新单行:
UPDATE users SET email = '[email protected]' WHERE id = 1;
-----------------------------------------------------------------------------------------------------
更新多列:
UPDATE users SET email = '[email protected]', age = 26 WHERE id = 1;
-----------------------------------------------------------------------------------------------------
更新多行:
UPDATE users SET status = 'inactive' WHERE last_login < '2023-01-01';
-----------------------------------------------------------------------------------------------------
使用子查询更新:
UPDATE orders SET status = 'shipped'
WHERE user_id IN (SELECT id FROM users WHERE country = 'US');
-----------------------------------------------------------------------------------------------------
更新并返回:
UPDATE users SET status = 'active' WHERE id = 1
RETURNING id, username, status, updated_at;
c.DELETE删除
删除单行:
DELETE FROM users WHERE id = 1;
-----------------------------------------------------------------------------------------------------
删除多行:
DELETE FROM users WHERE status = 'deleted' AND created_at < '2020-01-01';
-----------------------------------------------------------------------------------------------------
使用子查询删除:
DELETE FROM orders WHERE user_id IN (SELECT id FROM users WHERE status = 'deleted');
-----------------------------------------------------------------------------------------------------
删除并返回:
DELETE FROM users WHERE id = 1 RETURNING *;
d.UPSERT(INSERT ON CONFLICT)
插入或更新:
INSERT INTO users (id, username, email)
VALUES (1, 'john', '[email protected]')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email;
-----------------------------------------------------------------------------------------------------
插入或忽略:
INSERT INTO users (username, email)
VALUES ('john', '[email protected]')
ON CONFLICT (username) DO NOTHING;
-----------------------------------------------------------------------------------------------------
条件更新:
INSERT INTO products (id, name, price)
VALUES (1, 'Product A', 100)
ON CONFLICT (id) DO UPDATE
SET price = EXCLUDED.price
WHERE products.price < EXCLUDED.price;
e.RETURNING子句
INSERT RETURNING:
INSERT INTO users (username, email) VALUES ('john', '[email protected]')
RETURNING id, username, created_at;
-----------------------------------------------------------------------------------------------------
UPDATE RETURNING:
UPDATE users SET status = 'active' WHERE id = 1
RETURNING id, username, status;
-----------------------------------------------------------------------------------------------------
DELETE RETURNING:
DELETE FROM users WHERE id = 1 RETURNING *;
03.DDL定义
a.CREATE创建
创建表:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
email VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-----------------------------------------------------------------------------------------------------
创建索引:
CREATE INDEX idx_users_email ON users(email);
-----------------------------------------------------------------------------------------------------
创建视图:
CREATE VIEW active_users AS SELECT * FROM users WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
创建函数:
CREATE FUNCTION get_user_count() RETURNS INTEGER AS $$
BEGIN
RETURN (SELECT COUNT(*) FROM users);
END;
$$ LANGUAGE plpgsql;
b.ALTER修改
修改表:
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
ALTER TABLE users DROP COLUMN phone;
ALTER TABLE users ALTER COLUMN email TYPE VARCHAR(200);
ALTER TABLE users RENAME COLUMN username TO user_name;
-----------------------------------------------------------------------------------------------------
修改索引:
ALTER INDEX idx_users_email RENAME TO idx_users_email_new;
-----------------------------------------------------------------------------------------------------
修改视图:
CREATE OR REPLACE VIEW active_users AS
SELECT id, username, email FROM users WHERE status = 'active';
c.DROP删除
删除表:
DROP TABLE users;
DROP TABLE IF EXISTS users CASCADE;
-----------------------------------------------------------------------------------------------------
删除索引:
DROP INDEX idx_users_email;
-----------------------------------------------------------------------------------------------------
删除视图:
DROP VIEW active_users;
d.TRUNCATE清空
清空表:
TRUNCATE TABLE users;
TRUNCATE TABLE users RESTART IDENTITY;
TRUNCATE TABLE users CASCADE;
04.DCL控制
a.GRANT授权
授予表权限:
GRANT SELECT ON users TO myuser;
GRANT INSERT, UPDATE, DELETE ON users TO myuser;
GRANT ALL PRIVILEGES ON users TO myuser;
-----------------------------------------------------------------------------------------------------
授予所有表权限:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO myuser;
-----------------------------------------------------------------------------------------------------
授予数据库权限:
GRANT CONNECT ON DATABASE mydb TO myuser;
-----------------------------------------------------------------------------------------------------
授予模式权限:
GRANT USAGE ON SCHEMA public TO myuser;
b.REVOKE撤销
撤销表权限:
REVOKE SELECT ON users FROM myuser;
REVOKE ALL PRIVILEGES ON users FROM myuser;
-----------------------------------------------------------------------------------------------------
撤销所有表权限:
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM myuser;
c.角色管理
创建角色:
CREATE ROLE readonly;
CREATE ROLE readwrite;
CREATE USER myuser PASSWORD 'password' IN ROLE readwrite;
-----------------------------------------------------------------------------------------------------
授予角色:
GRANT readonly TO myuser;
GRANT readwrite TO myuser;
-----------------------------------------------------------------------------------------------------
撤销角色:
REVOKE readonly FROM myuser;
05.TCL事务
a.BEGIN开启事务
开启事务:
BEGIN;
BEGIN TRANSACTION;
START TRANSACTION;
b.COMMIT提交
提交事务:
COMMIT;
COMMIT TRANSACTION;
c.ROLLBACK回滚
回滚事务:
ROLLBACK;
ROLLBACK TRANSACTION;
-----------------------------------------------------------------------------------------------------
回滚到保存点:
ROLLBACK TO SAVEPOINT sp1;
d.SAVEPOINT保存点
创建保存点:
BEGIN;
INSERT INTO users (username) VALUES ('user1');
SAVEPOINT sp1;
INSERT INTO users (username) VALUES ('user2');
SAVEPOINT sp2;
INSERT INTO users (username) VALUES ('user3');
ROLLBACK TO SAVEPOINT sp2;
COMMIT;
-----------------------------------------------------------------------------------------------------
释放保存点:
RELEASE SAVEPOINT sp1;
3.2 函数
01.内置函数
a.字符串函数
LENGTH():返回字符串长度
SELECT LENGTH('Hello'); -- 5
-----------------------------------------------------------------------------------------------------
CONCAT():连接字符串
SELECT CONCAT('Hello', ' ', 'World'); -- Hello World
SELECT 'Hello' || ' ' || 'World'; -- Hello World
-----------------------------------------------------------------------------------------------------
UPPER() / LOWER():大小写转换
SELECT UPPER('hello'); -- HELLO
SELECT LOWER('HELLO'); -- hello
-----------------------------------------------------------------------------------------------------
TRIM() / LTRIM() / RTRIM():去除空格
SELECT TRIM(' hello '); -- hello
SELECT LTRIM(' hello '); -- 'hello '
SELECT RTRIM(' hello '); -- ' hello'
-----------------------------------------------------------------------------------------------------
SUBSTRING():提取子字符串
SELECT SUBSTRING('Hello World', 1, 5); -- Hello
SELECT SUBSTRING('Hello World' FROM 7); -- World
-----------------------------------------------------------------------------------------------------
REPLACE():替换字符串
SELECT REPLACE('Hello World', 'World', 'PostgreSQL'); -- Hello PostgreSQL
-----------------------------------------------------------------------------------------------------
SPLIT_PART():分割字符串
SELECT SPLIT_PART('a,b,c', ',', 2); -- b
-----------------------------------------------------------------------------------------------------
POSITION():查找子字符串位置
SELECT POSITION('World' IN 'Hello World'); -- 7
-----------------------------------------------------------------------------------------------------
LEFT() / RIGHT():提取左右子字符串
SELECT LEFT('Hello World', 5); -- Hello
SELECT RIGHT('Hello World', 5); -- World
b.数值函数
ABS():绝对值
SELECT ABS(-10); -- 10
-----------------------------------------------------------------------------------------------------
ROUND():四舍五入
SELECT ROUND(3.14159, 2); -- 3.14
-----------------------------------------------------------------------------------------------------
CEIL() / FLOOR():向上/向下取整
SELECT CEIL(3.14); -- 4
SELECT FLOOR(3.14); -- 3
-----------------------------------------------------------------------------------------------------
MOD():取模
SELECT MOD(10, 3); -- 1
-----------------------------------------------------------------------------------------------------
POWER():幂运算
SELECT POWER(2, 3); -- 8
-----------------------------------------------------------------------------------------------------
SQRT():平方根
SELECT SQRT(16); -- 4
-----------------------------------------------------------------------------------------------------
RANDOM():随机数
SELECT RANDOM(); -- 0 到 1 之间的随机数
SELECT FLOOR(RANDOM() * 100); -- 0 到 99 之间的随机整数
c.日期时间函数
CURRENT_DATE / CURRENT_TIME / CURRENT_TIMESTAMP:当前日期时间
SELECT CURRENT_DATE; -- 2024-01-01
SELECT CURRENT_TIME; -- 14:30:00
SELECT CURRENT_TIMESTAMP; -- 2024-01-01 14:30:00
SELECT NOW(); -- 等同于 CURRENT_TIMESTAMP
-----------------------------------------------------------------------------------------------------
EXTRACT():提取日期部分
SELECT EXTRACT(YEAR FROM TIMESTAMP '2024-01-01 14:30:00'); -- 2024
SELECT EXTRACT(MONTH FROM TIMESTAMP '2024-01-01 14:30:00'); -- 1
SELECT EXTRACT(DAY FROM TIMESTAMP '2024-01-01 14:30:00'); -- 1
SELECT EXTRACT(HOUR FROM TIMESTAMP '2024-01-01 14:30:00'); -- 14
-----------------------------------------------------------------------------------------------------
DATE_TRUNC():截断日期
SELECT DATE_TRUNC('day', TIMESTAMP '2024-01-01 14:30:00'); -- 2024-01-01 00:00:00
SELECT DATE_TRUNC('month', TIMESTAMP '2024-01-15 14:30:00'); -- 2024-01-01 00:00:00
SELECT DATE_TRUNC('year', TIMESTAMP '2024-06-15 14:30:00'); -- 2024-01-01 00:00:00
-----------------------------------------------------------------------------------------------------
AGE():计算时间差
SELECT AGE(TIMESTAMP '2024-01-01', TIMESTAMP '2023-01-01'); -- 1 year
SELECT AGE(TIMESTAMP '2024-01-01'); -- 从当前时间到指定时间的差
-----------------------------------------------------------------------------------------------------
TO_CHAR():格式化日期
SELECT TO_CHAR(NOW(), 'YYYY-MM-DD HH24:MI:SS');
SELECT TO_CHAR(NOW(), 'Day, DD Month YYYY');
-----------------------------------------------------------------------------------------------------
TO_DATE() / TO_TIMESTAMP():字符串转日期
SELECT TO_DATE('2024-01-01', 'YYYY-MM-DD');
SELECT TO_TIMESTAMP('2024-01-01 14:30:00', 'YYYY-MM-DD HH24:MI:SS');
d.聚合函数
COUNT():计数
SELECT COUNT(*) FROM users;
SELECT COUNT(DISTINCT country) FROM users;
-----------------------------------------------------------------------------------------------------
SUM():求和
SELECT SUM(total_amount) FROM orders;
-----------------------------------------------------------------------------------------------------
AVG():平均值
SELECT AVG(age) FROM users;
-----------------------------------------------------------------------------------------------------
MAX() / MIN():最大值/最小值
SELECT MAX(created_at) FROM users;
SELECT MIN(created_at) FROM users;
-----------------------------------------------------------------------------------------------------
STRING_AGG():字符串聚合
SELECT STRING_AGG(username, ', ') FROM users;
SELECT STRING_AGG(username, ', ' ORDER BY username) FROM users;
-----------------------------------------------------------------------------------------------------
ARRAY_AGG():数组聚合
SELECT ARRAY_AGG(username) FROM users;
SELECT ARRAY_AGG(username ORDER BY username) FROM users;
e.JSON函数
JSON 提取:
SELECT '{"name": "John", "age": 30}'::json -> 'name'; -- "John"
SELECT '{"name": "John", "age": 30}'::json ->> 'name'; -- John
-----------------------------------------------------------------------------------------------------
JSONB 提取:
SELECT '{"name": "John", "age": 30}'::jsonb -> 'name';
SELECT '{"name": "John", "age": 30}'::jsonb ->> 'name';
-----------------------------------------------------------------------------------------------------
JSON 路径提取:
SELECT '{"user": {"name": "John"}}'::jsonb #> '{user,name}';
SELECT '{"user": {"name": "John"}}'::jsonb #>> '{user,name}';
-----------------------------------------------------------------------------------------------------
JSON 包含:
SELECT '{"name": "John", "age": 30}'::jsonb @> '{"name": "John"}'; -- true
-----------------------------------------------------------------------------------------------------
JSON 键存在:
SELECT '{"name": "John", "age": 30}'::jsonb ? 'name'; -- true
-----------------------------------------------------------------------------------------------------
JSONB_BUILD_OBJECT():构建 JSON 对象
SELECT JSONB_BUILD_OBJECT('name', 'John', 'age', 30);
-----------------------------------------------------------------------------------------------------
JSONB_AGG():聚合为 JSON 数组
SELECT JSONB_AGG(username) FROM users;
02.自定义函数
a.创建函数
基本函数:
CREATE FUNCTION add_numbers(a INTEGER, b INTEGER) RETURNS INTEGER AS $$
BEGIN
RETURN a + b;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
调用函数:
SELECT add_numbers(10, 20); -- 30
-----------------------------------------------------------------------------------------------------
返回表的函数:
CREATE FUNCTION get_active_users() RETURNS TABLE(id INT, username TEXT) AS $$
BEGIN
RETURN QUERY SELECT id, username FROM users WHERE status = 'active';
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
调用返回表的函数:
SELECT * FROM get_active_users();
-----------------------------------------------------------------------------------------------------
带默认参数的函数:
CREATE FUNCTION greet(name TEXT DEFAULT 'World') RETURNS TEXT AS $$
BEGIN
RETURN 'Hello, ' || name || '!';
END;
$$ LANGUAGE plpgsql;
SELECT greet(); -- Hello, World!
SELECT greet('John'); -- Hello, John!
b.函数语言
PL/pgSQL:
PostgreSQL 的过程语言,类似 SQL
支持变量、控制结构、异常处理
-----------------------------------------------------------------------------------------------------
SQL:
纯 SQL 函数,性能更好
CREATE FUNCTION get_user_count() RETURNS BIGINT AS $$
SELECT COUNT(*) FROM users;
$$ LANGUAGE sql;
-----------------------------------------------------------------------------------------------------
PL/Python:
使用 Python 编写函数
CREATE EXTENSION plpythonu;
CREATE FUNCTION pymax(a INTEGER, b INTEGER) RETURNS INTEGER AS $$
return max(a, b)
$$ LANGUAGE plpythonu;
-----------------------------------------------------------------------------------------------------
PL/Perl:
使用 Perl 编写函数
CREATE EXTENSION plperl;
c.控制结构
IF 语句:
CREATE FUNCTION check_age(age INTEGER) RETURNS TEXT AS $$
BEGIN
IF age < 18 THEN
RETURN 'Minor';
ELSIF age < 65 THEN
RETURN 'Adult';
ELSE
RETURN 'Senior';
END IF;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
CASE 语句:
CREATE FUNCTION get_grade(score INTEGER) RETURNS CHAR AS $$
BEGIN
RETURN CASE
WHEN score >= 90 THEN 'A'
WHEN score >= 80 THEN 'B'
WHEN score >= 70 THEN 'C'
WHEN score >= 60 THEN 'D'
ELSE 'F'
END;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
LOOP 循环:
CREATE FUNCTION sum_to_n(n INTEGER) RETURNS INTEGER AS $$
DECLARE
i INTEGER := 1;
total INTEGER := 0;
BEGIN
LOOP
EXIT WHEN i > n;
total := total + i;
i := i + 1;
END LOOP;
RETURN total;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
WHILE 循环:
CREATE FUNCTION factorial(n INTEGER) RETURNS INTEGER AS $$
DECLARE
result INTEGER := 1;
i INTEGER := 1;
BEGIN
WHILE i <= n LOOP
result := result * i;
i := i + 1;
END LOOP;
RETURN result;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
FOR 循环:
CREATE FUNCTION print_numbers(n INTEGER) RETURNS VOID AS $$
DECLARE
i INTEGER;
BEGIN
FOR i IN 1..n LOOP
RAISE NOTICE 'Number: %', i;
END LOOP;
END;
$$ LANGUAGE plpgsql;
d.异常处理
捕获异常:
CREATE FUNCTION safe_divide(a NUMERIC, b NUMERIC) RETURNS NUMERIC AS $$
BEGIN
RETURN a / b;
EXCEPTION
WHEN division_by_zero THEN
RAISE NOTICE 'Division by zero!';
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
抛出异常:
CREATE FUNCTION check_positive(n INTEGER) RETURNS VOID AS $$
BEGIN
IF n <= 0 THEN
RAISE EXCEPTION 'Number must be positive';
END IF;
END;
$$ LANGUAGE plpgsql;
e.删除函数
删除函数:
DROP FUNCTION add_numbers(INTEGER, INTEGER);
DROP FUNCTION IF EXISTS add_numbers(INTEGER, INTEGER);
-----------------------------------------------------------------------------------------------------
替换函数:
CREATE OR REPLACE FUNCTION add_numbers(a INTEGER, b INTEGER) RETURNS INTEGER AS $$
BEGIN
RETURN a + b;
END;
$$ LANGUAGE plpgsql;
3.3 触发器
01.触发器基础
a.触发器类型
行级触发器(FOR EACH ROW):
每影响一行就触发一次
可以访问 NEW 和 OLD 变量
-----------------------------------------------------------------------------------------------------
语句级触发器(FOR EACH STATEMENT):
每个 SQL 语句触发一次
不能访问 NEW 和 OLD 变量
-----------------------------------------------------------------------------------------------------
触发时机:
BEFORE:在操作执行前触发
AFTER:在操作执行后触发
INSTEAD OF:替代操作(仅用于视图)
-----------------------------------------------------------------------------------------------------
触发事件:
INSERT:插入时触发
UPDATE:更新时触发
DELETE:删除时触发
TRUNCATE:清空表时触发(仅语句级)
b.创建触发器
基本触发器:
CREATE TRIGGER update_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_modified_column();
-----------------------------------------------------------------------------------------------------
触发器函数:
CREATE FUNCTION update_modified_column() RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
完整示例:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE FUNCTION update_timestamp() RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_update_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_timestamp();
c.NEW和OLD变量
NEW:新行数据(INSERT 和 UPDATE 可用)
OLD:旧行数据(UPDATE 和 DELETE 可用)
-----------------------------------------------------------------------------------------------------
示例:
CREATE FUNCTION log_user_changes() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO user_audit (user_id, action, new_data)
VALUES (NEW.id, 'INSERT', row_to_json(NEW));
ELSIF TG_OP = 'UPDATE' THEN
INSERT INTO user_audit (user_id, action, old_data, new_data)
VALUES (NEW.id, 'UPDATE', row_to_json(OLD), row_to_json(NEW));
ELSIF TG_OP = 'DELETE' THEN
INSERT INTO user_audit (user_id, action, old_data)
VALUES (OLD.id, 'DELETE', row_to_json(OLD));
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
02.触发器应用
a.审计日志
创建审计表:
CREATE TABLE audit_log (
id SERIAL PRIMARY KEY,
table_name TEXT,
operation TEXT,
user_name TEXT,
old_data JSONB,
new_data JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-----------------------------------------------------------------------------------------------------
创建审计触发器:
CREATE FUNCTION audit_trigger() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO audit_log (table_name, operation, user_name, new_data)
VALUES (TG_TABLE_NAME, TG_OP, current_user, row_to_json(NEW));
ELSIF TG_OP = 'UPDATE' THEN
INSERT INTO audit_log (table_name, operation, user_name, old_data, new_data)
VALUES (TG_TABLE_NAME, TG_OP, current_user, row_to_json(OLD), row_to_json(NEW));
ELSIF TG_OP = 'DELETE' THEN
INSERT INTO audit_log (table_name, operation, user_name, old_data)
VALUES (TG_TABLE_NAME, TG_OP, current_user, row_to_json(OLD));
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_audit
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW
EXECUTE FUNCTION audit_trigger();
b.数据验证
验证触发器:
CREATE FUNCTION validate_email() RETURNS TRIGGER AS $$
BEGIN
IF NEW.email !~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$' THEN
RAISE EXCEPTION 'Invalid email format: %', NEW.email;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_validate_email
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION validate_email();
c.自动更新
级联更新:
CREATE FUNCTION update_order_total() RETURNS TRIGGER AS $$
BEGIN
UPDATE orders
SET total_amount = (
SELECT SUM(quantity * price)
FROM order_items
WHERE order_id = NEW.order_id
)
WHERE id = NEW.order_id;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER order_items_update_total
AFTER INSERT OR UPDATE OR DELETE ON order_items
FOR EACH ROW
EXECUTE FUNCTION update_order_total();
d.防止操作
防止删除:
CREATE FUNCTION prevent_delete() RETURNS TRIGGER AS $$
BEGIN
RAISE EXCEPTION 'Deletion is not allowed on this table';
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_prevent_delete
BEFORE DELETE ON users
FOR EACH ROW
EXECUTE FUNCTION prevent_delete();
-----------------------------------------------------------------------------------------------------
条件防止:
CREATE FUNCTION prevent_admin_delete() RETURNS TRIGGER AS $$
BEGIN
IF OLD.role = 'admin' THEN
RAISE EXCEPTION 'Cannot delete admin users';
END IF;
RETURN OLD;
END;
$$ LANGUAGE plpgsql;
03.触发器管理
a.查看触发器
查看表的触发器:
\dS users # psql 命令
-----------------------------------------------------------------------------------------------------
查询触发器:
SELECT trigger_name, event_manipulation, event_object_table
FROM information_schema.triggers
WHERE event_object_table = 'users';
b.禁用触发器
禁用单个触发器:
ALTER TABLE users DISABLE TRIGGER users_update_timestamp;
-----------------------------------------------------------------------------------------------------
启用触发器:
ALTER TABLE users ENABLE TRIGGER users_update_timestamp;
-----------------------------------------------------------------------------------------------------
禁用所有触发器:
ALTER TABLE users DISABLE TRIGGER ALL;
-----------------------------------------------------------------------------------------------------
启用所有触发器:
ALTER TABLE users ENABLE TRIGGER ALL;
c.删除触发器
删除触发器:
DROP TRIGGER users_update_timestamp ON users;
DROP TRIGGER IF EXISTS users_update_timestamp ON users;
3.4 事务与锁
01.事务基础
a.ACID特性
原子性(Atomicity):
事务中的所有操作要么全部成功,要么全部失败
不会出现部分成功的情况
-----------------------------------------------------------------------------------------------------
一致性(Consistency):
事务执行前后,数据库从一个一致状态转换到另一个一致状态
满足所有约束和规则
-----------------------------------------------------------------------------------------------------
隔离性(Isolation):
多个事务并发执行时,相互之间不会干扰
每个事务都感觉像是独占数据库
-----------------------------------------------------------------------------------------------------
持久性(Durability):
事务一旦提交,其结果就是永久性的
即使系统崩溃,数据也不会丢失
b.事务操作
开始事务:
BEGIN;
BEGIN TRANSACTION;
START TRANSACTION;
-----------------------------------------------------------------------------------------------------
提交事务:
COMMIT;
COMMIT TRANSACTION;
-----------------------------------------------------------------------------------------------------
回滚事务:
ROLLBACK;
ROLLBACK TRANSACTION;
-----------------------------------------------------------------------------------------------------
保存点:
SAVEPOINT sp1;
ROLLBACK TO SAVEPOINT sp1;
RELEASE SAVEPOINT sp1;
-----------------------------------------------------------------------------------------------------
示例:
BEGIN;
INSERT INTO users (username) VALUES ('user1');
SAVEPOINT sp1;
INSERT INTO users (username) VALUES ('user2');
ROLLBACK TO SAVEPOINT sp1; -- 回滚到 sp1,user2 不会插入
COMMIT; -- 提交,user1 插入成功
c.隔离级别
READ UNCOMMITTED:
最低隔离级别(PostgreSQL 实际上等同于 READ COMMITTED)
可以读取未提交的数据(脏读)
-----------------------------------------------------------------------------------------------------
READ COMMITTED(默认):
只能读取已提交的数据
避免脏读,但可能出现不可重复读和幻读
-----------------------------------------------------------------------------------------------------
REPEATABLE READ:
事务期间多次读取同一数据,结果相同
避免脏读和不可重复读,但可能出现幻读
-----------------------------------------------------------------------------------------------------
SERIALIZABLE:
最高隔离级别
完全隔离,避免所有并发问题
性能最低
-----------------------------------------------------------------------------------------------------
设置隔离级别:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-----------------------------------------------------------------------------------------------------
查看隔离级别:
SHOW transaction_isolation;
SELECT current_setting('transaction_isolation');
02.锁机制
a.锁类型
表级锁:
ACCESS SHARE:SELECT 获取
ROW SHARE:SELECT FOR UPDATE/SHARE 获取
ROW EXCLUSIVE:INSERT、UPDATE、DELETE 获取
SHARE UPDATE EXCLUSIVE:VACUUM、CREATE INDEX CONCURRENTLY 获取
SHARE:CREATE INDEX 获取
SHARE ROW EXCLUSIVE:较少使用
EXCLUSIVE:较少使用
ACCESS EXCLUSIVE:DROP TABLE、TRUNCATE、VACUUM FULL 获取
-----------------------------------------------------------------------------------------------------
行级锁:
FOR UPDATE:排他锁,阻止其他事务读取和修改
FOR NO KEY UPDATE:排他锁,但不阻止外键检查
FOR SHARE:共享锁,阻止其他事务修改
FOR KEY SHARE:共享锁,但不阻止 UPDATE(非键列)
-----------------------------------------------------------------------------------------------------
页级锁:
PostgreSQL 内部使用,用户无法直接控制
-----------------------------------------------------------------------------------------------------
咨询锁(Advisory Lock):
应用层自定义的锁
不与表或行关联
b.显式锁定
表级锁定:
LOCK TABLE users IN ACCESS SHARE MODE;
LOCK TABLE users IN ROW EXCLUSIVE MODE;
LOCK TABLE users IN ACCESS EXCLUSIVE MODE;
-----------------------------------------------------------------------------------------------------
行级锁定:
SELECT * FROM users WHERE id = 1 FOR UPDATE;
SELECT * FROM users WHERE id = 1 FOR SHARE;
SELECT * FROM users WHERE id = 1 FOR NO KEY UPDATE;
SELECT * FROM users WHERE id = 1 FOR KEY SHARE;
-----------------------------------------------------------------------------------------------------
NOWAIT 选项:
SELECT * FROM users WHERE id = 1 FOR UPDATE NOWAIT;
-- 如果无法立即获取锁,立即返回错误
-----------------------------------------------------------------------------------------------------
SKIP LOCKED 选项:
SELECT * FROM users WHERE status = 'pending' FOR UPDATE SKIP LOCKED LIMIT 10;
-- 跳过已锁定的行,处理未锁定的行
c.死锁
死锁的产生:
事务 A 锁定资源 1,等待资源 2
事务 B 锁定资源 2,等待资源 1
形成循环等待
-----------------------------------------------------------------------------------------------------
死锁检测:
PostgreSQL 自动检测死锁
默认超时时间:1 秒(deadlock_timeout)
检测到死锁后,回滚其中一个事务
-----------------------------------------------------------------------------------------------------
避免死锁:
按相同顺序访问资源
保持事务简短
使用较低的隔离级别
使用 NOWAIT 或 SKIP LOCKED
-----------------------------------------------------------------------------------------------------
死锁示例:
-- 事务 A
BEGIN;
UPDATE users SET age = 30 WHERE id = 1;
-- 等待...
UPDATE users SET age = 25 WHERE id = 2;
COMMIT;
-- 事务 B
BEGIN;
UPDATE users SET age = 28 WHERE id = 2;
-- 等待...
UPDATE users SET age = 32 WHERE id = 1;
COMMIT;
-- 其中一个事务会被回滚
d.锁监控
查看当前锁:
SELECT * FROM pg_locks;
-----------------------------------------------------------------------------------------------------
查看阻塞的查询:
SELECT
blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_statement,
blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
-----------------------------------------------------------------------------------------------------
终止阻塞的进程:
SELECT pg_terminate_backend(pid);
03.并发控制
a.MVCC
多版本并发控制:
PostgreSQL 使用 MVCC 实现并发控制
每个事务看到的是数据的一个快照
读操作不阻塞写操作,写操作不阻塞读操作
-----------------------------------------------------------------------------------------------------
版本管理:
每行数据都有隐藏列:xmin(插入事务 ID)、xmax(删除事务 ID)
UPDATE 操作实际上是 INSERT 新版本 + 标记旧版本为删除
旧版本数据称为死元组(Dead Tuple)
-----------------------------------------------------------------------------------------------------
可见性判断:
根据事务快照判断数据版本是否可见
如果行的 xmin 在快照之前提交,且 xmax 未设置或在快照之后,则可见
b.乐观锁
概念:
假设冲突很少发生
读取时不加锁,更新时检查版本
-----------------------------------------------------------------------------------------------------
实现方式:
使用版本号或时间戳
UPDATE users SET age = 30, version = version + 1
WHERE id = 1 AND version = 5;
-- 如果 version 不匹配,说明数据已被其他事务修改
-----------------------------------------------------------------------------------------------------
优点:
并发性能高
不会产生死锁
-----------------------------------------------------------------------------------------------------
缺点:
冲突时需要重试
不适合高冲突场景
c.悲观锁
概念:
假设冲突经常发生
读取时就加锁,防止其他事务修改
-----------------------------------------------------------------------------------------------------
实现方式:
SELECT * FROM users WHERE id = 1 FOR UPDATE;
-- 其他事务无法修改该行,直到当前事务提交或回滚
-----------------------------------------------------------------------------------------------------
优点:
避免冲突
适合高冲突场景
-----------------------------------------------------------------------------------------------------
缺点:
并发性能低
可能产生死锁
04.事务最佳实践
a.事务设计
保持事务简短:
减少锁定时间
降低死锁风险
提高并发性能
-----------------------------------------------------------------------------------------------------
避免长事务:
长事务会阻塞 VACUUM
导致表膨胀
影响性能
-----------------------------------------------------------------------------------------------------
合理使用隔离级别:
根据业务需求选择合适的隔离级别
不要盲目使用最高隔离级别
b.错误处理
捕获异常:
BEGIN;
-- 执行操作
EXCEPTION WHEN others THEN
ROLLBACK;
RAISE;
END;
-----------------------------------------------------------------------------------------------------
重试机制:
对于死锁或序列化失败,实现重试逻辑
设置最大重试次数
使用指数退避策略
c.性能优化
批量操作:
使用批量插入代替单条插入
INSERT INTO users (username) VALUES ('user1'), ('user2'), ('user3');
-----------------------------------------------------------------------------------------------------
减少锁竞争:
避免热点行
使用分区表
使用队列表
3.5 性能优化
01.查询优化
a.EXPLAIN分析
基本用法:
EXPLAIN SELECT * FROM users WHERE age > 18;
-----------------------------------------------------------------------------------------------------
详细分析:
EXPLAIN ANALYZE SELECT * FROM users WHERE age > 18;
-- 实际执行查询并显示执行时间
-----------------------------------------------------------------------------------------------------
输出格式:
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE age > 18;
EXPLAIN (FORMAT YAML) SELECT * FROM users WHERE age > 18;
EXPLAIN (FORMAT XML) SELECT * FROM users WHERE age > 18;
-----------------------------------------------------------------------------------------------------
详细选项:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM users WHERE age > 18;
-----------------------------------------------------------------------------------------------------
执行计划解读:
Seq Scan:全表扫描(性能差)
Index Scan:索引扫描(性能好)
Index Only Scan:只扫描索引(性能最好)
Bitmap Index Scan:位图索引扫描
Nested Loop:嵌套循环连接
Hash Join:哈希连接
Merge Join:归并连接
b.索引优化
创建合适的索引:
为 WHERE 子句中的列创建索引
为 JOIN 列创建索引
为 ORDER BY 列创建索引
-----------------------------------------------------------------------------------------------------
使用部分索引:
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
-----------------------------------------------------------------------------------------------------
使用表达式索引:
CREATE INDEX idx_lower_email ON users(LOWER(email));
-----------------------------------------------------------------------------------------------------
使用覆盖索引:
CREATE INDEX idx_users_email_name ON users(email) INCLUDE (name);
-----------------------------------------------------------------------------------------------------
删除未使用的索引:
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexname NOT LIKE '%_pkey';
c.查询重写
避免 SELECT *:
SELECT id, username, email FROM users; -- 好
SELECT * FROM users; -- 差
-----------------------------------------------------------------------------------------------------
使用 EXISTS 代替 IN:
SELECT * FROM users WHERE EXISTS (SELECT 1 FROM orders WHERE user_id = users.id); -- 好
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders); -- 差
-----------------------------------------------------------------------------------------------------
使用 JOIN 代替子查询:
SELECT u.*, o.order_count
FROM users u
LEFT JOIN (SELECT user_id, COUNT(*) as order_count FROM orders GROUP BY user_id) o
ON u.id = o.user_id; -- 好
SELECT *, (SELECT COUNT(*) FROM orders WHERE user_id = users.id) as order_count
FROM users; -- 差
-----------------------------------------------------------------------------------------------------
避免函数包裹索引列:
SELECT * FROM users WHERE LOWER(email) = '[email protected]'; -- 差,无法使用索引
SELECT * FROM users WHERE email = '[email protected]'; -- 好,可以使用索引
-- 或创建表达式索引
CREATE INDEX idx_lower_email ON users(LOWER(email));
d.分页优化
传统分页(性能差):
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 1000000;
-- OFFSET 很大时性能很差
-----------------------------------------------------------------------------------------------------
游标分页(性能好):
SELECT * FROM users WHERE id > 1000000 ORDER BY id LIMIT 10;
-- 使用上一页的最后一个 ID 作为起点
-----------------------------------------------------------------------------------------------------
使用 FETCH:
SELECT * FROM users ORDER BY id OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
02.配置优化
a.内存配置
shared_buffers:
建议值:系统内存的 25%
默认值:128MB(太小)
配置:shared_buffers = 4GB
-----------------------------------------------------------------------------------------------------
effective_cache_size:
建议值:系统内存的 50-75%
用于查询计划,不实际分配内存
配置:effective_cache_size = 12GB
-----------------------------------------------------------------------------------------------------
work_mem:
用于排序、哈希、连接操作
建议值:根据并发连接数调整
配置:work_mem = 16MB
-----------------------------------------------------------------------------------------------------
maintenance_work_mem:
用于维护操作(VACUUM、CREATE INDEX)
建议值:256MB - 1GB
配置:maintenance_work_mem = 512MB
b.连接配置
max_connections:
最大连接数
默认值:100
建议:根据实际需求调整,不要设置过大
配置:max_connections = 200
-----------------------------------------------------------------------------------------------------
使用连接池:
PgBouncer、Pgpool-II
减少连接开销
提高并发性能
c.WAL配置
wal_level:
minimal:最小 WAL(不支持复制)
replica:支持物理复制(默认)
logical:支持逻辑复制
-----------------------------------------------------------------------------------------------------
wal_buffers:
WAL 缓冲区大小
默认值:shared_buffers 的 1/32
建议值:16MB
-----------------------------------------------------------------------------------------------------
checkpoint_timeout:
检查点超时时间
默认值:5 分钟
建议值:10-30 分钟
-----------------------------------------------------------------------------------------------------
checkpoint_completion_target:
检查点完成目标
默认值:0.5
建议值:0.9
d.查询配置
random_page_cost:
随机页访问成本
默认值:4.0
SSD 建议值:1.1
HDD 建议值:4.0
-----------------------------------------------------------------------------------------------------
effective_io_concurrency:
I/O 并发度
默认值:1
SSD 建议值:200
HDD 建议值:2
-----------------------------------------------------------------------------------------------------
default_statistics_target:
统计信息目标
默认值:100
建议值:100-1000(根据表大小)
03.维护优化
a.VACUUM
自动 VACUUM:
autovacuum = on(默认开启)
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
-----------------------------------------------------------------------------------------------------
手动 VACUUM:
VACUUM users; # 清理死元组
VACUUM ANALYZE users; # 清理 + 更新统计信息
VACUUM FULL users; # 完全清理(锁表)
VACUUM VERBOSE users; # 显示详细信息
-----------------------------------------------------------------------------------------------------
VACUUM 策略:
定期执行 VACUUM
对于频繁更新的表,增加 VACUUM 频率
避免在高峰期执行 VACUUM FULL
b.ANALYZE
更新统计信息:
ANALYZE users;
ANALYZE; # 分析所有表
-----------------------------------------------------------------------------------------------------
自动 ANALYZE:
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.1
-----------------------------------------------------------------------------------------------------
查看统计信息:
SELECT * FROM pg_stats WHERE tablename = 'users';
c.REINDEX
重建索引:
REINDEX INDEX idx_users_email;
REINDEX TABLE users;
REINDEX DATABASE mydb;
-----------------------------------------------------------------------------------------------------
并发重建:
REINDEX INDEX CONCURRENTLY idx_users_email;
-----------------------------------------------------------------------------------------------------
重建时机:
索引膨胀
索引损坏
性能下降
d.表膨胀
查看表膨胀:
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) AS bloat
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-----------------------------------------------------------------------------------------------------
减少膨胀:
定期 VACUUM
调整 autovacuum 参数
使用 VACUUM FULL(锁表)
使用 pg_repack 扩展(不锁表)
04.监控优化
a.性能监控
pg_stat_statements:
CREATE EXTENSION pg_stat_statements;
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
-----------------------------------------------------------------------------------------------------
慢查询日志:
log_min_duration_statement = 1000 # 记录超过 1 秒的查询
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
-----------------------------------------------------------------------------------------------------
查看活动查询:
SELECT pid, usename, application_name, state, query
FROM pg_stat_activity
WHERE state = 'active';
b.资源监控
查看数据库大小:
SELECT pg_database_size('mydb');
SELECT pg_size_pretty(pg_database_size('mydb'));
-----------------------------------------------------------------------------------------------------
查看表大小:
SELECT pg_size_pretty(pg_total_relation_size('users'));
-----------------------------------------------------------------------------------------------------
查看索引大小:
SELECT pg_size_pretty(pg_indexes_size('users'));
-----------------------------------------------------------------------------------------------------
查看缓存命中率:
SELECT
sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;
c.连接监控
查看连接数:
SELECT COUNT(*) FROM pg_stat_activity;
-----------------------------------------------------------------------------------------------------
查看每个数据库的连接数:
SELECT datname, COUNT(*) FROM pg_stat_activity GROUP BY datname;
-----------------------------------------------------------------------------------------------------
查看空闲连接:
SELECT COUNT(*) FROM pg_stat_activity WHERE state = 'idle';
4 PostgreSQL高级
4.1 分区表
01.分区表介绍
a.概念
分区表:
将一个大表按照某种规则分割成多个小表
每个小表称为分区(Partition)
逻辑上是一个表,物理上是多个表
-----------------------------------------------------------------------------------------------------
分区键:
用于确定数据存储在哪个分区的列
可以是单列或多列
-----------------------------------------------------------------------------------------------------
优势:
提高查询性能(分区裁剪)
简化数据维护(删除整个分区)
提高并发性能(减少锁竞争)
便于数据归档
b.分区类型
范围分区(Range Partitioning):
按照值的范围分区
适用于时间序列数据、数值范围
-----------------------------------------------------------------------------------------------------
列表分区(List Partitioning):
按照离散值分区
适用于枚举类型、地区代码
-----------------------------------------------------------------------------------------------------
哈希分区(Hash Partitioning):
按照哈希值分区
适用于均匀分布数据
-----------------------------------------------------------------------------------------------------
组合分区:
多级分区,如先按时间范围分区,再按哈希分区
02.范围分区
a.创建范围分区表
创建父表:
CREATE TABLE orders (
id SERIAL,
user_id INTEGER,
total_amount DECIMAL(10, 2),
created_at TIMESTAMP NOT NULL,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-----------------------------------------------------------------------------------------------------
创建分区:
CREATE TABLE orders_2023_q1 PARTITION OF orders
FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
CREATE TABLE orders_2023_q2 PARTITION OF orders
FOR VALUES FROM ('2023-04-01') TO ('2023-07-01');
CREATE TABLE orders_2023_q3 PARTITION OF orders
FOR VALUES FROM ('2023-07-01') TO ('2023-10-01');
CREATE TABLE orders_2023_q4 PARTITION OF orders
FOR VALUES FROM ('2023-10-01') TO ('2024-01-01');
-----------------------------------------------------------------------------------------------------
创建默认分区:
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
-- 不满足任何分区条件的数据存储在默认分区
b.分区操作
插入数据:
INSERT INTO orders (user_id, total_amount, created_at)
VALUES (1, 100.00, '2023-05-15');
-- 自动路由到 orders_2023_q2 分区
-----------------------------------------------------------------------------------------------------
查询数据:
SELECT * FROM orders WHERE created_at >= '2023-04-01' AND created_at < '2023-07-01';
-- 自动进行分区裁剪,只扫描 orders_2023_q2 分区
-----------------------------------------------------------------------------------------------------
更新数据:
UPDATE orders SET total_amount = 150.00 WHERE id = 1;
-- 如果更新分区键,可能导致行移动
-----------------------------------------------------------------------------------------------------
删除数据:
DELETE FROM orders WHERE created_at < '2023-01-01';
-- 只扫描相关分区
c.分区维护
添加分区:
CREATE TABLE orders_2024_q1 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
-----------------------------------------------------------------------------------------------------
删除分区:
DROP TABLE orders_2023_q1;
-- 直接删除分区表,数据也会被删除
-----------------------------------------------------------------------------------------------------
分离分区:
ALTER TABLE orders DETACH PARTITION orders_2023_q1;
-- 分区变成独立的表,数据保留
-----------------------------------------------------------------------------------------------------
附加分区:
ALTER TABLE orders ATTACH PARTITION orders_2023_q1
FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
-- 将独立表附加为分区
03.列表分区
a.创建列表分区表
创建父表:
CREATE TABLE users (
id SERIAL,
username VARCHAR(50),
country VARCHAR(2) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (id, country)
) PARTITION BY LIST (country);
-----------------------------------------------------------------------------------------------------
创建分区:
CREATE TABLE users_cn PARTITION OF users
FOR VALUES IN ('CN');
CREATE TABLE users_us PARTITION OF users
FOR VALUES IN ('US');
CREATE TABLE users_eu PARTITION OF users
FOR VALUES IN ('UK', 'DE', 'FR', 'IT', 'ES');
CREATE TABLE users_other PARTITION OF users DEFAULT;
b.多列列表分区
创建父表:
CREATE TABLE sales (
id SERIAL,
product_id INTEGER,
region VARCHAR(10),
country VARCHAR(2),
amount DECIMAL(10, 2),
PRIMARY KEY (id, region, country)
) PARTITION BY LIST (region, country);
-----------------------------------------------------------------------------------------------------
创建分区:
CREATE TABLE sales_asia_cn PARTITION OF sales
FOR VALUES IN (('ASIA', 'CN'));
CREATE TABLE sales_asia_jp PARTITION OF sales
FOR VALUES IN (('ASIA', 'JP'));
CREATE TABLE sales_europe PARTITION OF sales
FOR VALUES IN (('EU', 'UK'), ('EU', 'DE'), ('EU', 'FR'));
04.哈希分区
a.创建哈希分区表
创建父表:
CREATE TABLE logs (
id SERIAL,
user_id INTEGER NOT NULL,
action VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (id, user_id)
) PARTITION BY HASH (user_id);
-----------------------------------------------------------------------------------------------------
创建分区:
CREATE TABLE logs_p0 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE logs_p1 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE logs_p2 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE logs_p3 PARTITION OF logs
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
-----------------------------------------------------------------------------------------------------
说明:
MODULUS:分区总数
REMAINDER:分区编号(0 到 MODULUS-1)
数据根据 HASH(user_id) % MODULUS 的结果分配到对应分区
05.组合分区
a.多级分区
创建父表(按时间范围分区):
CREATE TABLE events (
id SERIAL,
user_id INTEGER NOT NULL,
event_type VARCHAR(50),
created_at TIMESTAMP NOT NULL,
PRIMARY KEY (id, created_at, user_id)
) PARTITION BY RANGE (created_at);
-----------------------------------------------------------------------------------------------------
创建一级分区(按时间):
CREATE TABLE events_2023 PARTITION OF events
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01')
PARTITION BY HASH (user_id);
CREATE TABLE events_2024 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')
PARTITION BY HASH (user_id);
-----------------------------------------------------------------------------------------------------
创建二级分区(按哈希):
CREATE TABLE events_2023_p0 PARTITION OF events_2023
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_2023_p1 PARTITION OF events_2023
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_2023_p2 PARTITION OF events_2023
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_2023_p3 PARTITION OF events_2023
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
06.分区表最佳实践
a.设计原则
选择合适的分区键:
根据查询模式选择分区键
分区键应该出现在大多数查询的 WHERE 子句中
-----------------------------------------------------------------------------------------------------
合理的分区数量:
分区数量不宜过多(建议不超过几千个)
每个分区应该有足够的数据量
-----------------------------------------------------------------------------------------------------
分区大小:
单个分区建议在 10GB - 100GB
根据硬件和查询模式调整
b.索引策略
在父表上创建索引:
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- 自动在所有分区上创建相同的索引
-----------------------------------------------------------------------------------------------------
在分区上创建独立索引:
CREATE INDEX idx_orders_2023_q1_user_id ON orders_2023_q1(user_id);
-- 只在特定分区上创建索引
c.查询优化
启用分区裁剪:
SET enable_partition_pruning = on; -- 默认开启
-----------------------------------------------------------------------------------------------------
查看分区裁剪:
EXPLAIN SELECT * FROM orders WHERE created_at >= '2023-04-01' AND created_at < '2023-07-01';
-- 查看执行计划,确认只扫描相关分区
d.维护策略
定期创建新分区:
提前创建未来的分区
避免数据插入到默认分区
-----------------------------------------------------------------------------------------------------
定期归档旧分区:
分离旧分区
导出数据到归档存储
删除分区表
-----------------------------------------------------------------------------------------------------
监控分区大小:
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE tablename LIKE 'orders_%'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
4.2 扩展
01.扩展管理
a.扩展基础
查看可用扩展:
SELECT * FROM pg_available_extensions;
-----------------------------------------------------------------------------------------------------
查看已安装扩展:
SELECT * FROM pg_extension;
\dx # psql 命令
-----------------------------------------------------------------------------------------------------
安装扩展:
CREATE EXTENSION extension_name;
CREATE EXTENSION IF NOT EXISTS extension_name;
-----------------------------------------------------------------------------------------------------
删除扩展:
DROP EXTENSION extension_name;
DROP EXTENSION IF EXISTS extension_name CASCADE;
-----------------------------------------------------------------------------------------------------
更新扩展:
ALTER EXTENSION extension_name UPDATE;
ALTER EXTENSION extension_name UPDATE TO '1.5';
02.常用扩展
a.pg_stat_statements
功能:
跟踪所有 SQL 语句的执行统计信息
分析慢查询、热点查询
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION pg_stat_statements;
-----------------------------------------------------------------------------------------------------
配置(postgresql.conf):
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
-----------------------------------------------------------------------------------------------------
使用:
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-----------------------------------------------------------------------------------------------------
重置统计:
SELECT pg_stat_statements_reset();
b.pgcrypto
功能:
提供加密函数
支持哈希、对称加密、非对称加密
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION pgcrypto;
-----------------------------------------------------------------------------------------------------
哈希函数:
SELECT digest('password', 'sha256');
SELECT encode(digest('password', 'sha256'), 'hex');
-----------------------------------------------------------------------------------------------------
密码加密:
SELECT crypt('password', gen_salt('bf'));
-- 使用 Blowfish 算法加密密码
-----------------------------------------------------------------------------------------------------
密码验证:
SELECT crypt('password', stored_hash) = stored_hash;
-----------------------------------------------------------------------------------------------------
UUID 生成:
SELECT gen_random_uuid();
c.uuid-ossp
功能:
生成 UUID
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION "uuid-ossp";
-----------------------------------------------------------------------------------------------------
使用:
SELECT uuid_generate_v1(); # 基于时间和 MAC 地址
SELECT uuid_generate_v4(); # 随机 UUID
d.hstore
功能:
键值对存储
类似 JSON,但性能更好
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION hstore;
-----------------------------------------------------------------------------------------------------
创建表:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
attributes hstore
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO products (name, attributes) VALUES
('Product 1', 'color => "red", size => "L", weight => "1.5kg"');
-----------------------------------------------------------------------------------------------------
查询数据:
SELECT name, attributes -> 'color' AS color FROM products;
SELECT * FROM products WHERE attributes @> 'color => red';
-----------------------------------------------------------------------------------------------------
更新数据:
UPDATE products SET attributes = attributes || 'price => "99.99"' WHERE id = 1;
e.PostGIS
功能:
地理空间数据支持
GIS 功能
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION postgis;
-----------------------------------------------------------------------------------------------------
创建表:
CREATE TABLE locations (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
geom GEOMETRY(Point, 4326)
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO locations (name, geom) VALUES
('Beijing', ST_GeomFromText('POINT(116.4074 39.9042)', 4326));
-----------------------------------------------------------------------------------------------------
查询距离:
SELECT name, ST_Distance(
geom,
ST_GeomFromText('POINT(116.4074 39.9042)', 4326)
) AS distance
FROM locations
ORDER BY distance;
f.pg_trgm
功能:
三元组相似度搜索
模糊匹配
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION pg_trgm;
-----------------------------------------------------------------------------------------------------
相似度查询:
SELECT similarity('hello', 'hallo');
SELECT * FROM users WHERE username % 'jhon'; -- 模糊匹配 'john'
-----------------------------------------------------------------------------------------------------
创建 GIN 索引:
CREATE INDEX idx_users_username_trgm ON users USING gin(username gin_trgm_ops);
g.tablefunc
功能:
表函数,如交叉表(crosstab)
-----------------------------------------------------------------------------------------------------
安装:
CREATE EXTENSION tablefunc;
-----------------------------------------------------------------------------------------------------
交叉表:
SELECT * FROM crosstab(
'SELECT year, month, revenue FROM sales ORDER BY 1, 2',
'SELECT DISTINCT month FROM sales ORDER BY 1'
) AS ct(year INT, jan NUMERIC, feb NUMERIC, mar NUMERIC);
03.自定义扩展
a.创建扩展
创建扩展文件:
-- myextension.control
comment = 'My custom extension'
default_version = '1.0'
relocatable = true
-----------------------------------------------------------------------------------------------------
创建 SQL 文件:
-- myextension--1.0.sql
CREATE FUNCTION my_function() RETURNS TEXT AS $$
BEGIN
RETURN 'Hello from my extension';
END;
$$ LANGUAGE plpgsql;
-----------------------------------------------------------------------------------------------------
安装扩展:
将文件放到 PostgreSQL 的 extension 目录
CREATE EXTENSION myextension;
4.3 全文检索
01.全文检索基础
a.概念
全文检索:
在文本中搜索关键词
支持词干提取、停用词过滤、相关性排序
-----------------------------------------------------------------------------------------------------
tsvector:
文本搜索向量
存储经过处理的文本(分词、词干提取)
-----------------------------------------------------------------------------------------------------
tsquery:
文本搜索查询
表示搜索条件
-----------------------------------------------------------------------------------------------------
文本搜索配置:
定义语言、词典、停用词
默认配置:english
b.基本用法
创建 tsvector:
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
-- 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
-----------------------------------------------------------------------------------------------------
创建 tsquery:
SELECT to_tsquery('english', 'quick & fox');
SELECT to_tsquery('english', 'quick | fox');
SELECT to_tsquery('english', 'quick & !lazy');
-----------------------------------------------------------------------------------------------------
匹配查询:
SELECT to_tsvector('english', 'The quick brown fox') @@ to_tsquery('english', 'quick & fox');
-- true
02.全文检索实践
a.创建全文检索表
创建表:
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title VARCHAR(200),
content TEXT,
tsv tsvector
);
-----------------------------------------------------------------------------------------------------
生成 tsvector:
UPDATE articles SET tsv =
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(content, '')), 'B');
-----------------------------------------------------------------------------------------------------
自动更新 tsvector(触发器):
CREATE FUNCTION articles_tsv_trigger() RETURNS TRIGGER AS $$
BEGIN
NEW.tsv :=
setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.content, '')), 'B');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER articles_tsv_update
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW
EXECUTE FUNCTION articles_tsv_trigger();
b.创建索引
GIN 索引(推荐):
CREATE INDEX idx_articles_tsv ON articles USING gin(tsv);
-----------------------------------------------------------------------------------------------------
GiST 索引:
CREATE INDEX idx_articles_tsv ON articles USING gist(tsv);
c.搜索查询
基本搜索:
SELECT * FROM articles WHERE tsv @@ to_tsquery('english', 'postgresql');
-----------------------------------------------------------------------------------------------------
多关键词搜索:
SELECT * FROM articles WHERE tsv @@ to_tsquery('english', 'postgresql & database');
SELECT * FROM articles WHERE tsv @@ to_tsquery('english', 'postgresql | mysql');
-----------------------------------------------------------------------------------------------------
短语搜索:
SELECT * FROM articles WHERE tsv @@ phraseto_tsquery('english', 'postgresql database');
-----------------------------------------------------------------------------------------------------
相关性排序:
SELECT
id,
title,
ts_rank(tsv, to_tsquery('english', 'postgresql')) AS rank
FROM articles
WHERE tsv @@ to_tsquery('english', 'postgresql')
ORDER BY rank DESC;
-----------------------------------------------------------------------------------------------------
高亮显示:
SELECT
id,
title,
ts_headline('english', content, to_tsquery('english', 'postgresql')) AS snippet
FROM articles
WHERE tsv @@ to_tsquery('english', 'postgresql');
03.中文全文检索
a.zhparser扩展
安装 zhparser:
# 需要先安装 scws 分词库
# 然后编译安装 zhparser 扩展
CREATE EXTENSION zhparser;
-----------------------------------------------------------------------------------------------------
创建中文配置:
CREATE TEXT SEARCH CONFIGURATION chinese_zh (PARSER = zhparser);
ALTER TEXT SEARCH CONFIGURATION chinese_zh ADD MAPPING FOR n,v,a,i,e,l WITH simple;
-----------------------------------------------------------------------------------------------------
使用中文搜索:
SELECT to_tsvector('chinese_zh', '我爱PostgreSQL数据库');
SELECT to_tsquery('chinese_zh', 'PostgreSQL & 数据库');
-----------------------------------------------------------------------------------------------------
创建中文搜索表:
CREATE TABLE articles_cn (
id SERIAL PRIMARY KEY,
title VARCHAR(200),
content TEXT,
tsv tsvector
);
CREATE FUNCTION articles_cn_tsv_trigger() RETURNS TRIGGER AS $$
BEGIN
NEW.tsv :=
setweight(to_tsvector('chinese_zh', coalesce(NEW.title, '')), 'A') ||
setweight(to_tsvector('chinese_zh', coalesce(NEW.content, '')), 'B');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER articles_cn_tsv_update
BEFORE INSERT OR UPDATE ON articles_cn
FOR EACH ROW
EXECUTE FUNCTION articles_cn_tsv_trigger();
CREATE INDEX idx_articles_cn_tsv ON articles_cn USING gin(tsv);
4.4 JSON支持
01.JSON类型
a.JSON vs JSONB
JSON:
存储原始文本
保留空格、键顺序
插入快,查询慢
-----------------------------------------------------------------------------------------------------
JSONB(推荐):
存储二进制格式
不保留空格、键顺序(自动去重)
插入慢,查询快
支持索引
-----------------------------------------------------------------------------------------------------
选择建议:
大多数情况使用 JSONB
需要保留原始格式时使用 JSON
b.创建JSON表
创建表:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50),
profile JSONB,
settings JSON
);
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO users (username, profile) VALUES
('user1', '{"age": 25, "city": "Beijing", "hobbies": ["reading", "coding"]}');
INSERT INTO users (username, profile) VALUES
('user2', '{"age": 30, "city": "Shanghai", "hobbies": ["music", "travel"]}');
02.JSON操作
a.提取数据
提取 JSON 字段(返回 JSON):
SELECT profile -> 'age' FROM users; -- 返回 JSON: 25
SELECT profile -> 'hobbies' FROM users; -- 返回 JSON: ["reading", "coding"]
-----------------------------------------------------------------------------------------------------
提取 JSON 字段(返回文本):
SELECT profile ->> 'age' FROM users; -- 返回文本: 25
SELECT profile ->> 'city' FROM users; -- 返回文本: Beijing
-----------------------------------------------------------------------------------------------------
提取嵌套字段:
SELECT profile -> 'address' -> 'city' FROM users;
SELECT profile #> '{address,city}' FROM users; -- 路径提取
SELECT profile #>> '{address,city}' FROM users; -- 路径提取(返回文本)
-----------------------------------------------------------------------------------------------------
提取数组元素:
SELECT profile -> 'hobbies' -> 0 FROM users; -- 第一个元素
SELECT profile -> 'hobbies' -> 1 FROM users; -- 第二个元素
b.查询数据
包含查询:
SELECT * FROM users WHERE profile @> '{"city": "Beijing"}';
SELECT * FROM users WHERE profile @> '{"hobbies": ["reading"]}';
-----------------------------------------------------------------------------------------------------
被包含查询:
SELECT * FROM users WHERE '{"age": 25}' <@ profile;
-----------------------------------------------------------------------------------------------------
键存在查询:
SELECT * FROM users WHERE profile ? 'age'; -- 是否存在 age 键
SELECT * FROM users WHERE profile ?| array['age', 'city']; -- 是否存在任一键
SELECT * FROM users WHERE profile ?& array['age', 'city']; -- 是否存在所有键
-----------------------------------------------------------------------------------------------------
路径查询:
SELECT * FROM users WHERE profile @? '$.age ? (@ > 25)';
SELECT * FROM users WHERE profile @@ '$.age > 25';
c.修改数据
更新整个 JSON:
UPDATE users SET profile = '{"age": 26, "city": "Beijing"}' WHERE id = 1;
-----------------------------------------------------------------------------------------------------
更新 JSON 字段:
UPDATE users SET profile = jsonb_set(profile, '{age}', '26') WHERE id = 1;
UPDATE users SET profile = jsonb_set(profile, '{address,city}', '"Beijing"') WHERE id = 1;
-----------------------------------------------------------------------------------------------------
删除 JSON 字段:
UPDATE users SET profile = profile - 'age' WHERE id = 1;
UPDATE users SET profile = profile - 'hobbies' WHERE id = 1;
-----------------------------------------------------------------------------------------------------
删除嵌套字段:
UPDATE users SET profile = profile #- '{address,city}' WHERE id = 1;
-----------------------------------------------------------------------------------------------------
合并 JSON:
UPDATE users SET profile = profile || '{"email": "[email protected]"}' WHERE id = 1;
03.JSON函数
a.构建JSON
JSONB_BUILD_OBJECT:
SELECT JSONB_BUILD_OBJECT('name', 'John', 'age', 30);
-- {"name": "John", "age": 30}
-----------------------------------------------------------------------------------------------------
JSONB_BUILD_ARRAY:
SELECT JSONB_BUILD_ARRAY('a', 'b', 'c');
-- ["a", "b", "c"]
-----------------------------------------------------------------------------------------------------
ROW_TO_JSON:
SELECT ROW_TO_JSON(users) FROM users;
-----------------------------------------------------------------------------------------------------
JSON_AGG:
SELECT JSON_AGG(username) FROM users;
-- ["user1", "user2"]
-----------------------------------------------------------------------------------------------------
JSONB_OBJECT_AGG:
SELECT JSONB_OBJECT_AGG(username, profile) FROM users;
-- {"user1": {...}, "user2": {...}}
b.解析JSON
JSONB_EACH:
SELECT * FROM JSONB_EACH('{"a": 1, "b": 2}'::jsonb);
-- key | value
-- a | 1
-- b | 2
-----------------------------------------------------------------------------------------------------
JSONB_EACH_TEXT:
SELECT * FROM JSONB_EACH_TEXT('{"a": 1, "b": 2}'::jsonb);
-----------------------------------------------------------------------------------------------------
JSONB_ARRAY_ELEMENTS:
SELECT * FROM JSONB_ARRAY_ELEMENTS('["a", "b", "c"]'::jsonb);
-- value
-- "a"
-- "b"
-- "c"
-----------------------------------------------------------------------------------------------------
JSONB_ARRAY_ELEMENTS_TEXT:
SELECT * FROM JSONB_ARRAY_ELEMENTS_TEXT('["a", "b", "c"]'::jsonb);
-----------------------------------------------------------------------------------------------------
JSONB_TO_RECORD:
SELECT * FROM JSONB_TO_RECORD('{"name": "John", "age": 30}'::jsonb)
AS x(name TEXT, age INTEGER);
c.JSON工具函数
JSONB_TYPEOF:
SELECT JSONB_TYPEOF('{"a": 1}'::jsonb); -- object
SELECT JSONB_TYPEOF('[1, 2, 3]'::jsonb); -- array
SELECT JSONB_TYPEOF('123'::jsonb); -- number
-----------------------------------------------------------------------------------------------------
JSONB_ARRAY_LENGTH:
SELECT JSONB_ARRAY_LENGTH('["a", "b", "c"]'::jsonb); -- 3
-----------------------------------------------------------------------------------------------------
JSONB_PRETTY:
SELECT JSONB_PRETTY('{"name":"John","age":30}'::jsonb);
-- {
-- "name": "John",
-- "age": 30
-- }
-----------------------------------------------------------------------------------------------------
JSONB_STRIP_NULLS:
SELECT JSONB_STRIP_NULLS('{"a": 1, "b": null, "c": 3}'::jsonb);
-- {"a": 1, "c": 3}
04.JSON索引
a.GIN索引
默认 GIN 索引:
CREATE INDEX idx_users_profile ON users USING gin(profile);
-- 支持 @>、?、?|、?& 操作符
-----------------------------------------------------------------------------------------------------
路径 GIN 索引:
CREATE INDEX idx_users_profile_path ON users USING gin(profile jsonb_path_ops);
-- 只支持 @> 操作符,但性能更好,索引更小
b.表达式索引
为特定字段创建索引:
CREATE INDEX idx_users_profile_age ON users((profile->>'age'));
CREATE INDEX idx_users_profile_city ON users((profile->>'city'));
-----------------------------------------------------------------------------------------------------
查询使用索引:
SELECT * FROM users WHERE profile->>'age' = '25';
SELECT * FROM users WHERE profile->>'city' = 'Beijing';
05.JSON最佳实践
a.设计原则
合理使用 JSON:
适合存储半结构化数据
不适合频繁更新的字段
不适合需要复杂查询的数据
-----------------------------------------------------------------------------------------------------
JSON vs 关系表:
关系表:结构固定、需要 JOIN、强类型
JSON:结构灵活、嵌套数据、弱类型
-----------------------------------------------------------------------------------------------------
混合使用:
核心字段使用关系列
扩展字段使用 JSON
b.性能优化
使用 JSONB 而非 JSON:
JSONB 支持索引,查询性能更好
-----------------------------------------------------------------------------------------------------
创建合适的索引:
根据查询模式选择索引类型
为常用字段创建表达式索引
-----------------------------------------------------------------------------------------------------
避免过大的 JSON:
单个 JSON 不宜超过几 MB
考虑拆分为多个字段或关系表
c.查询优化
使用索引支持的操作符:
@>、?、?|、?& 可以使用 GIN 索引
->、->> 需要表达式索引
-----------------------------------------------------------------------------------------------------
避免全表扫描:
为常用查询创建索引
使用 EXPLAIN 分析查询计划
4.5 外部数据包装器
01.FDW介绍
a.概念
外部数据包装器(Foreign Data Wrapper,FDW):
访问外部数据源的接口
将外部数据映射为 PostgreSQL 表
支持查询、插入、更新、删除(取决于 FDW 实现)
-----------------------------------------------------------------------------------------------------
应用场景:
跨数据库查询
数据迁移
数据联邦
异构数据集成
b.常用FDW
postgres_fdw:
连接其他 PostgreSQL 数据库
官方扩展,功能完善
-----------------------------------------------------------------------------------------------------
mysql_fdw:
连接 MySQL 数据库
-----------------------------------------------------------------------------------------------------
oracle_fdw:
连接 Oracle 数据库
-----------------------------------------------------------------------------------------------------
file_fdw:
读取文件(CSV、文本)
-----------------------------------------------------------------------------------------------------
mongo_fdw:
连接 MongoDB
02.postgres_fdw
a.安装配置
安装扩展:
CREATE EXTENSION postgres_fdw;
-----------------------------------------------------------------------------------------------------
创建外部服务器:
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'remote_host', port '5432', dbname 'remote_db');
-----------------------------------------------------------------------------------------------------
创建用户映射:
CREATE USER MAPPING FOR local_user
SERVER foreign_server
OPTIONS (user 'remote_user', password 'remote_password');
-----------------------------------------------------------------------------------------------------
导入外部表:
IMPORT FOREIGN SCHEMA public
FROM SERVER foreign_server
INTO local_schema;
-----------------------------------------------------------------------------------------------------
手动创建外部表:
CREATE FOREIGN TABLE remote_users (
id INTEGER,
username VARCHAR(50),
email VARCHAR(100)
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'users');
b.使用外部表
查询外部表:
SELECT * FROM remote_users;
SELECT * FROM remote_users WHERE id > 100;
-----------------------------------------------------------------------------------------------------
JOIN 本地表和外部表:
SELECT
l.id,
l.order_id,
r.username
FROM local_orders l
JOIN remote_users r ON l.user_id = r.id;
-----------------------------------------------------------------------------------------------------
插入数据:
INSERT INTO remote_users (username, email) VALUES ('user1', '[email protected]');
-----------------------------------------------------------------------------------------------------
更新数据:
UPDATE remote_users SET email = '[email protected]' WHERE id = 1;
-----------------------------------------------------------------------------------------------------
删除数据:
DELETE FROM remote_users WHERE id = 1;
c.性能优化
推送查询条件:
postgres_fdw 会自动将 WHERE 条件推送到远程服务器
减少网络传输
-----------------------------------------------------------------------------------------------------
推送聚合:
postgres_fdw 支持推送聚合函数(COUNT、SUM、AVG 等)
-----------------------------------------------------------------------------------------------------
批量操作:
使用 batch_size 选项控制批量大小
ALTER SERVER foreign_server OPTIONS (ADD batch_size '100');
-----------------------------------------------------------------------------------------------------
并行查询:
使用 async_capable 选项启用异步查询
ALTER SERVER foreign_server OPTIONS (ADD async_capable 'true');
03.file_fdw
a.安装配置
安装扩展:
CREATE EXTENSION file_fdw;
-----------------------------------------------------------------------------------------------------
创建外部服务器:
CREATE SERVER file_server
FOREIGN DATA WRAPPER file_fdw;
-----------------------------------------------------------------------------------------------------
创建外部表:
CREATE FOREIGN TABLE csv_data (
id INTEGER,
name VARCHAR(50),
age INTEGER
)
SERVER file_server
OPTIONS (filename '/path/to/data.csv', format 'csv', header 'true');
b.使用外部表
查询 CSV 文件:
SELECT * FROM csv_data;
SELECT * FROM csv_data WHERE age > 25;
-----------------------------------------------------------------------------------------------------
导入数据:
INSERT INTO local_table SELECT * FROM csv_data;
-----------------------------------------------------------------------------------------------------
注意事项:
file_fdw 只支持读取,不支持写入
文件路径必须是服务器上的绝对路径
需要 PostgreSQL 进程有读取权限
04.FDW管理
a.查看FDW
查看已安装的 FDW:
SELECT * FROM pg_foreign_data_wrapper;
-----------------------------------------------------------------------------------------------------
查看外部服务器:
SELECT * FROM pg_foreign_server;
-----------------------------------------------------------------------------------------------------
查看外部表:
SELECT * FROM pg_foreign_table;
-----------------------------------------------------------------------------------------------------
查看用户映射:
SELECT * FROM pg_user_mappings;
b.修改FDW
修改服务器选项:
ALTER SERVER foreign_server OPTIONS (SET host 'new_host');
ALTER SERVER foreign_server OPTIONS (ADD port '5433');
ALTER SERVER foreign_server OPTIONS (DROP port);
-----------------------------------------------------------------------------------------------------
修改用户映射:
ALTER USER MAPPING FOR local_user SERVER foreign_server
OPTIONS (SET password 'new_password');
-----------------------------------------------------------------------------------------------------
修改外部表:
ALTER FOREIGN TABLE remote_users OPTIONS (SET table_name 'new_users');
c.删除FDW
删除外部表:
DROP FOREIGN TABLE remote_users;
-----------------------------------------------------------------------------------------------------
删除用户映射:
DROP USER MAPPING FOR local_user SERVER foreign_server;
-----------------------------------------------------------------------------------------------------
删除服务器:
DROP SERVER foreign_server CASCADE;
-----------------------------------------------------------------------------------------------------
删除扩展:
DROP EXTENSION postgres_fdw CASCADE;
05.FDW最佳实践
a.安全性
使用密码文件:
避免在 SQL 中硬编码密码
使用 .pgpass 文件存储密码
-----------------------------------------------------------------------------------------------------
限制权限:
只授予必要的权限
使用只读用户访问外部数据
-----------------------------------------------------------------------------------------------------
加密连接:
使用 SSL 连接远程数据库
ALTER SERVER foreign_server OPTIONS (ADD sslmode 'require');
b.性能优化
减少网络传输:
只查询需要的列
使用 WHERE 条件过滤数据
-----------------------------------------------------------------------------------------------------
使用本地缓存:
将常用数据缓存到本地表
定期同步
-----------------------------------------------------------------------------------------------------
批量操作:
使用批量插入代替单条插入
调整 batch_size 参数
c.监控维护
监控查询性能:
使用 EXPLAIN 分析查询计划
检查是否正确推送条件
-----------------------------------------------------------------------------------------------------
监控连接:
检查外部服务器连接状态
设置连接超时
-----------------------------------------------------------------------------------------------------
定期测试:
定期测试外部表连接
监控远程数据库状态
01.认证安全
a.密码策略
强制密码复杂度:
安装 passwordcheck 扩展
CREATE EXTENSION passwordcheck;
-----------------------------------------------------------------------------------------------------
密码加密:
使用 SCRAM-SHA-256 认证
配置 postgresql.conf:
password_encryption = scram-sha-256
配置 pg_hba.conf:
host all all 0.0.0.0/0 scram-sha-256
b.访问控制
限制访问IP:
配置 pg_hba.conf:
host all all 192.168.1.0/24 scram-sha-256
host all all 10.0.0.0/8 reject
-----------------------------------------------------------------------------------------------------
禁用超级用户远程登录:
host all postgres 0.0.0.0/0 reject
host all postgres 127.0.0.1/32 scram-sha-256
02.权限管理
a.最小权限原则
创建只读用户:
CREATE USER readonly_user WITH PASSWORD 'password';
GRANT CONNECT ON DATABASE mydb TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-----------------------------------------------------------------------------------------------------
创建应用用户:
CREATE USER app_user WITH PASSWORD 'password';
GRANT CONNECT ON DATABASE mydb TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
b.行级安全
启用行级安全:
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
-----------------------------------------------------------------------------------------------------
创建策略:
CREATE POLICY user_isolation ON users
FOR ALL
TO app_user
USING (user_id = current_user_id());
03.数据加密
a.传输加密
配置 SSL:
配置 postgresql.conf:
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
配置 pg_hba.conf:
hostssl all all 0.0.0.0/0 scram-sha-256
b.列级加密
使用 pgcrypto 扩展:
CREATE EXTENSION pgcrypto;
加密存储:
INSERT INTO users (username, password) VALUES
('user1', crypt('password', gen_salt('bf')));
验证密码:
SELECT * FROM users WHERE username = 'user1' AND password = crypt('input_password', password);
04.审计日志
a.配置日志
启用日志:
配置 postgresql.conf:
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_statement = 'all'
log_duration = on
log_min_duration_statement = 1000
5.5 常见问题
01.性能问题
a.慢查询
问题:
查询执行时间过长
-----------------------------------------------------------------------------------------------------
排查:
使用 EXPLAIN ANALYZE 分析查询计划
检查是否使用索引
检查统计信息是否过期
-----------------------------------------------------------------------------------------------------
解决:
创建合适的索引
优化查询语句
更新统计信息:ANALYZE
b.连接数过多
问题:
FATAL: sorry, too many clients already
-----------------------------------------------------------------------------------------------------
排查:
SELECT COUNT(*) FROM pg_stat_activity;
SELECT state, COUNT(*) FROM pg_stat_activity GROUP BY state;
-----------------------------------------------------------------------------------------------------
解决:
增加 max_connections
使用连接池
关闭空闲连接
c.表膨胀
问题:
表和索引占用空间过大
查询性能下降
-----------------------------------------------------------------------------------------------------
排查:
SELECT pg_size_pretty(pg_total_relation_size('users'));
SELECT schemaname, tablename, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
-----------------------------------------------------------------------------------------------------
解决:
执行 VACUUM
调整 autovacuum 参数
使用 VACUUM FULL(锁表)
02.复制问题
a.复制延迟
问题:
从库数据落后主库
-----------------------------------------------------------------------------------------------------
排查:
主库:SELECT pg_current_wal_lsn();
从库:SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
-----------------------------------------------------------------------------------------------------
解决:
优化网络带宽
调整 wal_sender_timeout
增加从库硬件资源
b.复制中断
问题:
从库停止复制
-----------------------------------------------------------------------------------------------------
排查:
SELECT * FROM pg_stat_replication;
查看从库日志
-----------------------------------------------------------------------------------------------------
解决:
检查网络连接
检查 WAL 文件是否被清理
重新建立复制
03.锁问题
a.死锁
问题:
ERROR: deadlock detected
-----------------------------------------------------------------------------------------------------
排查:
查看日志中的死锁信息
-----------------------------------------------------------------------------------------------------
解决:
按相同顺序访问资源
保持事务简短
使用较低的隔离级别
b.锁等待
问题:
查询长时间等待
-----------------------------------------------------------------------------------------------------
排查:
SELECT * FROM pg_locks WHERE NOT granted;
SELECT * FROM pg_stat_activity WHERE state = 'active' AND wait_event IS NOT NULL;
-----------------------------------------------------------------------------------------------------
解决:
终止阻塞的查询:SELECT pg_terminate_backend(pid);
优化查询,减少锁定时间
04.数据问题
a.数据损坏
问题:
ERROR: invalid page header
-----------------------------------------------------------------------------------------------------
排查:
检查硬件故障
检查文件系统
-----------------------------------------------------------------------------------------------------
解决:
从备份恢复
使用 pg_resetwal(谨慎使用)
b.数据不一致
问题:
主从数据不一致
-----------------------------------------------------------------------------------------------------
排查:
比较主从数据
检查复制状态
-----------------------------------------------------------------------------------------------------
解决:
重新建立从库
使用 pg_rewind 同步
05.运维问题
a.磁盘空间不足
问题:
ERROR: could not extend file
-----------------------------------------------------------------------------------------------------
排查:
df -h
SELECT pg_size_pretty(pg_database_size('mydb'));
-----------------------------------------------------------------------------------------------------
解决:
清理日志文件
删除旧的 WAL 文件
执行 VACUUM FULL
扩展磁盘空间
b.内存不足
问题:
Out of memory
-----------------------------------------------------------------------------------------------------
排查:
free -h
SELECT * FROM pg_stat_activity;
-----------------------------------------------------------------------------------------------------
解决:
调整 shared_buffers
调整 work_mem
优化查询
增加物理内存