
When I started my career as a developer, my first job was as a DBA (database administrator). Back then, before AWS RDS, Azure, Google Cloud and other cloud services appeared, there were two types of DBAs:
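- Infrastructure DBAs set up the database, configured storage, and took care of backups and replication.
- Application DBAs were responsible for schema design: tables, indexes, constraints, and SQL tuning. They also implemented ETL processes and data migrations.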
Application DBAs were usually part of the development team. They had deep knowledge of a specific domain, so they usually worked on only one or two projects. Infrastructure DBAs were usually part of an IT team and could work on several projects at once.
I am an application DBA
I never felt any urge to fiddle with backups or tune storage (I'm sure it's fun!). To this day I like to say that I am a DBA who knows how to develop applications, not a developer who knows databases.
In this article I share some tricks about database development that I picked up over my career.
Contents:
- Update only what needs updating
- Disable constraints and indexes during bulk loads
- Use UNLOGGED tables for intermediate data
- Implement complete processes using WITH and RETURNING
- Avoid indexes on columns with low selectivity
- Use partial indexes
- Always load sorted data
- Index columns with high correlation using BRIN
- Make indexes "invisible"
- Don't schedule long-running processes at round hours
- Conclusion
Update only what needs updating
The UPDATE operation consumes a lot of resources. The best way to speed it up is to update only what needs updating.
Here is an example of a query that normalizes an email column:
db=# UPDATE users SET email = lower(email);
UPDATE 1010000
Time: 1583.935 ms (00:01.584)
Looks innocent, right? The query updated the email addresses of 1,010,000 users. But did all of those rows really need updating?
db=# UPDATE users SET email = lower(email)
db-# WHERE email != lower(email);
UPDATE 10000
Time: 299.470 ms
Only 10,000 rows needed updating. By reducing the amount of data processed, execution time dropped from 1.5 s to under 300 ms. Updating fewer rows also saves database maintenance work later on.

Figure: update only what needs updating.
Large updates of this kind are very common in data migration scripts. The next time you write such a script, make sure it updates only what needs updating.
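Before running a large update, it can help to measure how many rows actually need changing. The following is a minimal sketch against the users table from the example above. It assumes PostgreSQL 9.4 or later for the FILTER clause.

```sql
-- Count how many rows the normalization would really change.
-- Rows with a NULL email are not counted: NULL <> NULL is not true.
SELECT
    count(*) AS total_rows,
    count(*) FILTER (WHERE email <> lower(email)) AS rows_to_update
FROM users;
```

If rows_to_update is a small fraction of total_rows, adding the WHERE clause to the UPDATE is likely to pay off.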
Disable constraints and indexes during bulk loads
Constraints are an important part of relational databases: they keep the data consistent and reliable. But everything has its price, and most often you pay it when loading or updating a large number of rows.
Let's define a small store schema:
DROP TABLE IF EXISTS product CASCADE;
CREATE TABLE product (
id serial PRIMARY KEY,
name TEXT NOT NULL,
price INT NOT NULL
);
INSERT INTO product (name, price)
SELECT random()::text, (random() * 1000)::int
FROM generate_series(0, 10000);
DROP TABLE IF EXISTS customer CASCADE;
CREATE TABLE customer (
id serial PRIMARY KEY,
name TEXT NOT NULL
);
INSERT INTO customer (name)
SELECT random()::text
FROM generate_series(0, 100000);
DROP TABLE IF EXISTS sale;
CREATE TABLE sale (
id serial PRIMARY KEY,
created timestamptz NOT NULL,
product_id int NOT NULL,
customer_id int NOT NULL
);
The schema defines different kinds of constraints, such as NOT NULL and unique constraints.
To set a baseline, let's start by adding foreign keys to the sale table, and then load data into it:
db=# ALTER TABLE sale ADD CONSTRAINT sale_product_fk
db-# FOREIGN KEY (product_id) REFERENCES product(id);
ALTER TABLE
Time: 18.413 ms
db=# ALTER TABLE sale ADD CONSTRAINT sale_customer_fk
db-# FOREIGN KEY (customer_id) REFERENCES customer(id);
ALTER TABLE
Time: 5.464 ms
db=# CREATE INDEX sale_created_ix ON sale(created);
CREATE INDEX
Time: 12.605 ms
db=# INSERT INTO SALE (created, product_id, customer_id)
db-# SELECT
db-# now() - interval '1 hour' * random() * 1000,
db-# (random() * 10000)::int + 1,
db-# (random() * 100000)::int + 1
db-# FROM generate_series(1, 1000000);
INSERT 0 1000000
Time: 15410.234 ms (00:15.410)
After defining the constraints and indexes, loading a million rows into the table took about 15.4 s.
Now let's load the data into the table first, and only then add the constraints and indexes:
db=# INSERT INTO SALE (created, product_id, customer_id)
db-# SELECT
db-# now() - interval '1 hour' * random() * 1000,
db-# (random() * 10000)::int + 1,
db-# (random() * 100000)::int + 1
db-# FROM generate_series(1, 1000000);
INSERT 0 1000000
Time: 2277.824 ms (00:02.278)
db=# ALTER TABLE sale ADD CONSTRAINT sale_product_fk
db-# FOREIGN KEY (product_id) REFERENCES product(id);
ALTER TABLE
Time: 169.193 ms
db=# ALTER TABLE sale ADD CONSTRAINT sale_customer_fk
db-# FOREIGN KEY (customer_id) REFERENCES customer(id);
ALTER TABLE
Time: 185.633 ms
db=# CREATE INDEX sale_created_ix ON sale(created);
CREATE INDEX
Time: 484.244 ms
The load was much faster: 2.27 s instead of 15.4 s. Creating the indexes and constraints after the data was loaded took noticeably longer, but the whole process was still much faster: 3.1 s instead of 15.4 s.
Unfortunately, PostgreSQL does not offer the same option for indexes: you can only drop and re-create them. Other databases, such as Oracle, let you disable and enable indexes without re-creating them.
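For a table that already exists, one possible approach is to drop the index and foreign keys, load the data, and re-create them, all in one transaction. This is only a sketch using the names from the example above. It assumes you can afford the exclusive locks these statements hold on sale until the transaction ends.

```sql
BEGIN;

-- Drop what slows the load down.
DROP INDEX IF EXISTS sale_created_ix;
ALTER TABLE sale DROP CONSTRAINT IF EXISTS sale_product_fk;
ALTER TABLE sale DROP CONSTRAINT IF EXISTS sale_customer_fk;

-- Bulk load without per-row foreign key checks or index maintenance.
INSERT INTO sale (created, product_id, customer_id)
SELECT
    now() - interval '1 hour' * random() * 1000,
    (random() * 10000)::int + 1,
    (random() * 100000)::int + 1
FROM generate_series(1, 1000000);

-- Re-create the constraints and the index.
-- All rows in the table are validated at this point.
ALTER TABLE sale ADD CONSTRAINT sale_product_fk
    FOREIGN KEY (product_id) REFERENCES product(id);
ALTER TABLE sale ADD CONSTRAINT sale_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customer(id);
CREATE INDEX sale_created_ix ON sale(created);

COMMIT;
```

Because DDL is transactional in PostgreSQL, a failure anywhere in the block rolls everything back, including the dropped constraints and index.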
Use UNLOGGED tables for intermediate data
When you modify data in PostgreSQL, the changes are written to the write-ahead log (WAL). The WAL is used to maintain integrity, to quickly rebuild indexes during recovery, and to support replication.
Writing to the WAL is often necessary, but there are situations where you can opt out of it to make things faster. Staging tables are one example.
Intermediate tables are disposable tables that hold temporary data used to implement some process. For example, it is very common in ETL processes to load data from CSV files into staging tables, clean the data, and then load it into the target table. In this scenario the staging table is disposable and is not used in backups or replicas.

Figure: UNLOGGED table.
Staging tables that do not need to be restored after a failure and are not needed on replicas can be defined as UNLOGGED:
CREATE UNLOGGED TABLE staging_table ( /* table definition */ );
泚æïŒã䜿çš
UNLOGGEDããåã«ããã¹ãŠã®åœ±é¿ãå®å
šã«çè§£ããŠããããšã確èªããŠãã ããã
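Two of those implications are that an unlogged table is emptied after a crash or unclean shutdown, and that its contents are not replicated to standby servers. The sketch below shows one way to check how a table is persisted and how to make it durable later. The table staging_sale and its columns are hypothetical, and SET LOGGED assumes PostgreSQL 9.5 or later.

```sql
-- Hypothetical staging table for a CSV load.
CREATE UNLOGGED TABLE staging_sale (
    created     timestamptz,
    product_id  int,
    customer_id int
);

-- relpersistence is 'u' for unlogged, 'p' for permanent, 't' for temporary.
SELECT relname, relpersistence
FROM pg_class
WHERE relname = 'staging_sale';

-- If the data turns out to be worth keeping, make the table durable.
-- This rewrites the table and writes its contents to the WAL.
ALTER TABLE staging_sale SET LOGGED;
```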
Implement complete processes using WITH and RETURNING
Say you have a users table, and you found out that it contains duplicates:
db=# SELECT u.id, u.email, o.id as order_id
FROM orders o JOIN users u ON o.user_id = u.id;
id | email | order_id
----+-------------------+----------
1 | foo@bar.baz | 1
1 | foo@bar.baz | 2
2 | me@hakibenita.com | 3
3 | ME@hakibenita.com | 4
3 | ME@hakibenita.com | 5
The user haki benita registered twice, with the emails ME@hakibenita.com and me@hakibenita.com. Because we did not normalize email addresses when inserting them into the table, we now have to deal with duplicates.
What we need to do:
- Identify duplicate addresses by their lowercase form and link the duplicate users to each other.
- Update orders so that they reference only one of the duplicates.
- Remove the duplicates from the table.
We can link the duplicate users with a staging table:
db=# CREATE UNLOGGED TABLE duplicate_users AS
db-# SELECT
db-# lower(email) AS normalized_email,
db-# min(id) AS convert_to_user,
db-# array_remove(ARRAY_AGG(id), min(id)) as convert_from_users
db-# FROM
db-# users
db-# GROUP BY
db-# normalized_email
db-# HAVING
db-# count(*) > 1;
CREATE TABLE
db=# SELECT * FROM duplicate_users;
normalized_email | convert_to_user | convert_from_users
-------------------+-----------------+--------------------
me@hakibenita.com | 2 | {3}
The staging table holds the mapping between duplicates. If a user with the same normalized email address appears more than once, we pick the smallest user ID and collapse all the duplicates into it. The other users are kept in an array column, and all references to them will be updated.
Using the staging table, we update references to the duplicates in the orders table:
db=# UPDATE
db-# orders o
db-# SET
db-# user_id = du.convert_to_user
db-# FROM
db-# duplicate_users du
db-# WHERE
db-# o.user_id = ANY(du.convert_from_users);
UPDATE 2
Now we can safely delete the duplicates from users:
db=# DELETE FROM
db-# users
db-# WHERE
db-# id IN (
db(# SELECT unnest(convert_from_users)
db(# FROM duplicate_users
db(# );
DELETE 1
Note that we used the unnest function to expand the array, turning each element into a row.
Result:
db=# SELECT u.id, u.email, o.id as order_id
db-# FROM orders o JOIN users u ON o.user_id = u.id;
id | email | order_id
----+-------------------+----------
1 | foo@bar.baz | 1
1 | foo@bar.baz | 2
2 | me@hakibenita.com | 3
2 | me@hakibenita.com | 4
2 | me@hakibenita.com | 5
All instances of user 3 (ME@hakibenita.com) were converted to user 2 (me@hakibenita.com).
We can also verify that the duplicates were deleted from the users table:
db=# SELECT * FROM users;
id | email
----+-------------------
1 | foo@bar.baz
2 | me@hakibenita.com
Now we can get rid of the staging table:
db=# DROP TABLE duplicate_users;
DROP TABLE
倧äžå€«ã§ãããæéãããããããŠæé€ãå¿ èŠã§ãïŒããè¯ãæ¹æ³ã¯ãããŸããïŒ
äžè¬åãããããŒãã«åŒïŒCTEïŒ
ã§ã¯ãäžè¬çãªããŒãã«åŒãåŒãšããŠç¥ãããŠããã
WITHæã
ã¯ãåäžã®SQLåŒã§å
šäœã®æé ãå®è¡ããããšãã§ããŸãïŒ
WITH duplicate_users AS (
SELECT
min(id) AS convert_to_user,
array_remove(ARRAY_AGG(id), min(id)) as convert_from_users
FROM
users
GROUP BY
lower(email)
HAVING
count(*) > 1
),
update_orders_of_duplicate_users AS (
UPDATE
orders o
SET
user_id = du.convert_to_user
FROM
duplicate_users du
WHERE
o.user_id = ANY(du.convert_from_users)
)
DELETE FROM
users
WHERE
id IN (
SELECT
unnest(convert_from_users)
FROM
duplicate_users
);
Instead of creating a staging table, we created a common table expression and reused it.
Returning results from a CTE
One advantage of executing DML inside a WITH clause is that you can return data from it using the RETURNING keyword. Suppose we need a report of the number of rows updated and deleted:
WITH duplicate_users AS (
SELECT
min(id) AS convert_to_user,
array_remove(ARRAY_AGG(id), min(id)) as convert_from_users
FROM
users
GROUP BY
lower(email)
HAVING
count(*) > 1
),
update_orders_of_duplicate_users AS (
UPDATE
orders o
SET
user_id = du.convert_to_user
FROM
duplicate_users du
WHERE
o.user_id = ANY(du.convert_from_users)
RETURNING o.id
),
delete_duplicate_user AS (
DELETE FROM
users
WHERE
id IN (
SELECT unnest(convert_from_users)
FROM duplicate_users
)
RETURNING id
)
SELECT
(SELECT count(*) FROM update_orders_of_duplicate_users) AS orders_updated,
(SELECT count(*) FROM delete_duplicate_user) AS users_deleted
;
Result:
orders_updated | users_deleted
----------------+---------------
2 | 1
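The benefit of this approach is that the whole process runs as a single command, so there is no need to manage a transaction or worry about cleaning up the staging table if the process fails.
Warning: a reader on Reddit pointed out a potentially unpredictable behavior when executing DML in common table expressions:
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable.
This means you cannot rely on the order in which independent sub-statements are executed. When there is a dependency between them, as in the example above, you can rely on a sub-statement being executed before the statement that depends on it.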
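One way to make such a dependency explicit is to pass rows from one sub-statement to the next through RETURNING. All sub-statements see the same snapshot of the tables, so RETURNING is how one of them learns what another changed. The following is a sketch that assumes a hypothetical users_archive table with id and email columns.

```sql
-- Move one user into an archive table in a single statement.
-- The INSERT reads the rows produced by the DELETE, so it depends on it.
WITH deleted_users AS (
    DELETE FROM users
    WHERE id = 3
    RETURNING id, email
)
INSERT INTO users_archive (id, email)
SELECT id, email
FROM deleted_users;
```

Querying users directly inside the same statement would still show the row with id 3, because the deletion is not visible to other parts of the statement.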
Avoid indexes on columns with low selectivity
Say you have a sign-up process in which users log in with an email address. To activate the account, they have to verify their email. The table looks like this:
db=# CREATE TABLE users (
db-# id serial,
db-# username text,
db-# activated boolean
db-#);
CREATE TABLE
Most users are good citizens: they sign up with a valid email and activate their account right away. Let's populate the table with user data, assuming 90% of the users are activated:
db=# INSERT INTO users (username, activated)
db-# SELECT
db-# md5(random()::text) AS username,
db-# random() < 0.9 AS activated
db-# FROM
db-# generate_series(1, 1000000);
INSERT 0 1000000
db=# SELECT activated, count(*) FROM users GROUP BY activated;
activated | count
-----------+--------
f | 102567
t | 897433
db=# VACUUM ANALYZE users;
VACUUM
To query for activated and unactivated users, you might be tempted to create an index on the activated column:
db=# CREATE INDEX users_activated_ix ON users(activated);
CREATE INDEX
When we query for unactivated users, the database uses the index:
db=# EXPLAIN SELECT * FROM users WHERE NOT activated;
QUERY PLAN
--------------------------------------------------------------------------------------
Bitmap Heap Scan on users (cost=1923.32..11282.99 rows=102567 width=38)
Filter: (NOT activated)
-> Bitmap Index Scan on users_activated_ix (cost=0.00..1897.68 rows=102567 width=0)
Index Cond: (activated = false)
The database estimated that the filter would return 102,567 rows, roughly 10% of the table. This matches the data we loaded, so the database has a good picture of the data.
However, when we query for activated users, we find that the database decides not to use the index:
db=# EXPLAIN SELECT * FROM users WHERE activated;
QUERY PLAN
---------------------------------------------------------------
Seq Scan on users (cost=0.00..18334.00 rows=897433 width=38)
Filter: activated
Many developers get confused when the database does not use an index. One way to explain why is to ask: if you had to read the entire table, would you use the index?
Probably not. Why would you? Reading from disk is expensive, so you want to read as little as possible. For example, if the table is 10 MB and the index is 1 MB, reading the entire table means reading 10 MB from disk. Reading it through the index means reading 11 MB. That is wasteful.
Now let's look at the statistics PostgreSQL gathers on the table:
db=# SELECT attname, n_distinct, most_common_vals, most_common_freqs
db-# FROM pg_stats
db-# WHERE tablename = 'users' AND attname='activated';
------------------+------------------------
attname | activated
n_distinct | 2
most_common_vals | {t,f}
most_common_freqs | {0.89743334,0.10256667}
When PostgreSQL analyzed the table, it found that the activated column has two distinct values. The value t in most_common_vals corresponds to the frequency 0.89743334 in most_common_freqs, and the value f corresponds to the frequency 0.10256667. After analyzing the table, the database concluded that 89.74% of the rows are activated users and the remaining 10.26% are unactivated.
Based on these statistics, PostgreSQL decided it is better to scan the whole table when it expects 90% of the rows to satisfy the condition. The threshold at which the database switches between using and not using an index depends on many factors, and there is no simple rule of thumb.
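To compare the planner's estimates with the real distribution, you can compute the share of each value directly. This is a small sketch against the users table from the example. It scans the whole table, so it is meant for investigation rather than routine use.

```sql
-- Actual share of each value of the activated column.
SELECT
    activated,
    count(*) AS row_count,
    round(100.0 * count(*) / sum(count(*)) OVER (), 2) AS pct_of_table
FROM users
GROUP BY activated
ORDER BY activated;
```

If the percentages differ a lot from most_common_freqs in pg_stats, the statistics may be stale and running ANALYZE on the table could help.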

Figure: indexes on columns with low and high selectivity.
Use partial indexes
In the previous section we created an index on a boolean column in which about 90% of the values are true (activated users).
When we queried for activated users, the database did not use the index. When we queried for unactivated users, it did.
If the database is not going to use the index to filter activated users, why index them in the first place?
Before answering that question, let's look at how much the full index on the activated column weighs:
db=# \di+ users_activated_ix
Schema | Name | Type | Owner | Table | Size
--------+--------------------+-------+-------+-------+------
public | users_activated_ix | index | haki | users | 21 MB
The index weighs 21 MB. For reference, the users table is 65 MB, so the index is roughly 32% of the size of the table. And yet we know that about 90% of the index is unlikely ever to be used.
In PostgreSQL you can create an index on only part of a table. This is called a partial index:
db=# CREATE INDEX users_unactivated_partial_ix ON users(id)
db-# WHERE not activated;
CREATE INDEX
We use a WHERE clause to restrict the rows covered by the index. Let's check that it works:
db=# EXPLAIN SELECT * FROM users WHERE not activated;
QUERY PLAN
------------------------------------------------------------------------------------------------
Index Scan using users_unactivated_partial_ix on users (cost=0.29..3493.60 rows=102567 width=38)
Great: the database was smart enough to figure out that the boolean expression used in the query can be served by the partial index.
This approach has another benefit:
db=# \di+ users_unactivated_partial_ix
List of relations
Schema | Name | Type | Owner | Table | Size
--------+------------------------------+-------+-------+-------+---------
public | users_unactivated_partial_ix | index | haki | users | 2216 kB
The full index on the column weighs 21 MB, and the partial index only 2.2 MB. That is about 10%, which matches the share of unactivated users in the table.
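A partial index is only considered when the query's condition implies the index's WHERE clause. The sketch below, using the names from the example, shows one query that qualifies and one that does not, followed by a size comparison of the two indexes. The id < 1000 filter is an arbitrary value chosen for illustration.

```sql
-- Eligible for the partial index: the query condition matches its predicate.
EXPLAIN SELECT * FROM users WHERE NOT activated AND id < 1000;

-- Not eligible: activated rows are not stored in the partial index at all.
EXPLAIN SELECT * FROM users WHERE activated AND id < 1000;

-- Compare the on-disk size of the full and the partial index.
SELECT
    pg_size_pretty(pg_relation_size('users_activated_ix')) AS full_index,
    pg_size_pretty(pg_relation_size('users_unactivated_partial_ix')) AS partial_index;
```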
Always load sorted data
This is one of the comments I make most often in code reviews. The advice is less intuitive than the others, and it can have a big impact on performance.
Say you have a large table of sales:
db=# CREATE TABLE sale_fact (id serial, username text, sold_at date);
CREATE TABLE
Every night, during an ETL process, you load data into the table:
db=# INSERT INTO sale_fact (username, sold_at)
db-# SELECT
db-# md5(random()::text) AS username,
db-# '2020-01-01'::date + (interval '1 day') * round(random() * 365 * 2) AS sold_at
db-# FROM
db-# generate_series(1, 100000);
INSERT 0 100000
db=# VACUUM ANALYZE sale_fact;
VACUUM
To simulate the load, we use random data. We inserted 100K rows with random names and sale dates ranging from January 1, 2020 to two years ahead.
Most of the time, the table is used for summary sales reports. Usually the reports filter by date to look at sales in a specific period. To speed up range scans, let's create an index on sold_at:
db=# CREATE INDEX sale_fact_sold_at_ix ON sale_fact(sold_at);
CREATE INDEX
Let's look at the execution plan of a query that fetches all sales made in July 2020:
db=# EXPLAIN (ANALYZE)
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
-----------------------------------------------------------------------------------------------
Bitmap Heap Scan on sale_fact (cost=108.30..1107.69 rows=4293 width=41)
Recheck Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Heap Blocks: exact=927
-> Bitmap Index Scan on sale_fact_sold_at_ix (cost=0.00..107.22 rows=4293 width=0)
Index Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Planning Time: 0.191 ms
Execution Time: 5.906 ms
After running the query several times to warm up the cache, execution time settled at around 6 ms.
Bitmap Scan
Looking at the execution plan, we can see that the database used a bitmap scan. It works in two stages:
- Bitmap Index Scan: the database walks the entire sale_fact_sold_at_ix index and finds all the table pages that contain relevant rows.
- Bitmap Heap Scan: the database reads the pages that contain relevant rows and finds the rows that satisfy the condition.
A page can contain many rows. The first step uses the index to find pages. The second step checks the rows inside those pages, which is why the plan includes the Recheck Cond operation.
At this point, many DBAs and developers would call it a day and move on to the next query. But there is a way to make this query better.
Index Scan
Let's make a small change to the way we load the data:
db=# TRUNCATE sale_fact;
TRUNCATE TABLE
db=# INSERT INTO sale_fact (username, sold_at)
db-# SELECT
db-# md5(random()::text) AS username,
db-# '2020-01-01'::date + (interval '1 day') * round(random() * 365 * 2) AS sold_at
db-# FROM
db-# generate_series(1, 100000)
db-# ORDER BY sold_at;
INSERT 0 100000
db=# VACUUM ANALYZE sale_fact;
VACUUM
This time we loaded the data sorted by sold_at.
The execution plan for the same query now looks like this:
db=# EXPLAIN (ANALYZE)
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
---------------------------------------------------------------------------------------------
Index Scan using sale_fact_sold_at_ix on sale_fact (cost=0.29..184.73 rows=4272 width=41)
Index Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Planning Time: 0.145 ms
Execution Time: 2.294 ms
After several runs, execution time settled at 2.3 ms, a consistent saving of about 60%.
We can also see that this time the database did not use a bitmap scan and applied a regular index scan instead. Why?
Correlation
When the database analyzes a table, it collects all sorts of statistics. One of them is correlation:
Statistical correlation between physical row ordering and logical ordering of the column values. When the value is near -1 or +1, an index scan on the column will be estimated to be cheaper than when it is near zero, due to reduction of random access to the disk.
As the official documentation explains, correlation measures how "sorted" the values of a specific column are on disk.

Figure: correlation = 1.
When the correlation is 1 or close to it, rows are stored on disk in roughly the same order as the column's values. This is actually very common. For example, auto-incrementing IDs tend to have a correlation close to 1. Date and timestamp columns that record when rows were created also tend to have a correlation close to 1.
When the correlation is -1, the pages are ordered in reverse relative to the column.

Figure: correlation close to 0.
When the correlation is close to 0, the values in the column have little or no relationship to the order of the table's pages.
Let's go back to sale_fact. When we loaded data into the table without sorting it first, the correlations were:
db=# SELECT tablename, attname, correlation
db-# FROM pg_stats
db-# WHERE tablename = 'sale_fact';
tablename | attname | correlation
-----------+----------+--------------
sale_fact | id | 1
sale_fact | username | -0.005344716
sale_fact | sold_at | -0.011389783
The auto-generated id column has a correlation of 1. The correlation of sold_at is very low: consecutive values are scattered across the whole table.
When we loaded sorted data into the table, the database calculated these correlations:
tablename | attname | correlation
-----------+----------+----------------
sale_fact | id | 1
sale_fact | username | -0.00041992788
sale_fact | sold_at | 1
Now the correlation of sold_at is also 1.
So why did the database use a bitmap scan when the correlation was low, and an index scan when it was high?
- When the correlation was 1, the database estimated that rows in the requested range are likely to be in consecutive pages. In that case, an index scan is likely to read only a few pages.
- When the correlation was close to 0, the database estimated that rows in the requested range are likely to be scattered across the whole table. In that case, it makes sense to use a bitmap scan to map the pages that contain the required rows, and only then fetch them and apply the condition.
The next time you load data into a table, think about how the data will be queried, and sort it so that indexes used for range scans can benefit.
The CLUSTER command
Another way to sort a table on disk by a specific index is to use the CLUSTER command.
For example:
db=# TRUNCATE sale_fact;
TRUNCATE TABLE
-- Insert rows without sorting
db=# INSERT INTO sale_fact (username, sold_at)
db-# SELECT
db-# md5(random()::text) AS username,
db-# '2020-01-01'::date + (interval '1 day') * round(random() * 365 * 2) AS sold_at
db-# FROM
db-# generate_series(1, 100000);
INSERT 0 100000
db=# ANALYZE sale_fact;
ANALYZE
db=# SELECT tablename, attname, correlation
db-# FROM pg_stats
db-# WHERE tablename = 'sale_fact';
tablename | attname | correlation
-----------+-----------+----------------
sale_fact | sold_at | -5.9702674e-05
sale_fact | id | 1
sale_fact | username | 0.010033822
Because the data was loaded into the table in random order, the correlation of sold_at is close to zero.
To rearrange the table by sold_at, we use the CLUSTER command to sort the table on disk according to the index sale_fact_sold_at_ix:
db=# CLUSTER sale_fact USING sale_fact_sold_at_ix;
CLUSTER
db=# ANALYZE sale_fact;
ANALYZE
db=# SELECT tablename, attname, correlation
db-# FROM pg_stats
db-# WHERE tablename = 'sale_fact';
tablename | attname | correlation
-----------+----------+--------------
sale_fact | sold_at | 1
sale_fact | id | -0.002239401
sale_fact | username | 0.013389298
After clustering the table, the correlation of sold_at is 1.

Figure: the CLUSTER command.
Things to note:
- Clustering a table by one column can affect the correlation of other columns. For example, look at the correlation of id after clustering by sold_at.
- CLUSTER is a heavy, blocking operation, so don't run it on a live table.
For these two reasons, it is better to insert data that is already sorted and not rely on CLUSTER.
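CLUSTER is also a one-time operation: rows inserted or updated afterwards are not kept in index order. If you do rely on it, a sketch like the one below can show which tables were clustered and repeat the operation. It uses the pg_index catalog and the sale_fact table from the example.

```sql
-- List tables that were clustered, and the index each one was clustered on.
SELECT
    indrelid::regclass   AS table_name,
    indexrelid::regclass AS index_name
FROM pg_index
WHERE indisclustered;

-- Re-cluster using the index remembered from the previous CLUSTER.
-- This takes the same exclusive lock on the table as the first run.
CLUSTER sale_fact;
```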
Index columns with high correlation using BRIN
When it comes to indexes, most developers think of B-Tree. But PostgreSQL offers other index types, such as BRIN:
BRIN is designed for handling very large tables in which certain columns have some natural correlation with their physical location within the table
BRIN stands for Block Range Index. According to the documentation, BRIN works best on columns with high correlation. As we saw in the previous sections, auto-incrementing IDs and timestamps naturally correlate with the physical structure of the table, so BRIN is a good fit for them.
Under certain conditions, BRIN can offer better "value for money" than an equivalent B-Tree index, in terms of size and performance.

Figure: BRIN.
A BRIN index stores the range of values found within a number of adjacent pages in the table. Say we have the following values in a column, each in a separate page:
1, 2, 3, 4, 5, 6, 7, 8, 9
BRIN works on ranges of adjacent pages. If we set the range to three adjacent pages, the index splits the table into these ranges:
[1,2,3], [4,5,6], [7,8,9]
For each range, BRIN keeps the minimum and maximum value:
[1–3], [4–6], [7–9]
Let's use this index to search for the value 5:
- [1–3]: definitely not here.
- [4–6]: might be here.
- [7–9]: definitely not here.
Using BRIN, we narrowed the search down to blocks 4–6.
Let's look at another example. This time the values in the column have a correlation close to zero, meaning they are not sorted:
[2,9,5], [1,4,7], [3,8,6]
Indexing three adjacent blocks gives these ranges:
[2–9], [1–7], [3–8]
Let's search for the value 5:
- [2–9]: might be here.
- [1–7]: might be here.
- [3–8]: might be here.
In this case the index does not narrow the search at all, so it is useless.
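The min/max bookkeeping above can be imitated with plain SQL. This is only an illustration of the idea, not how BRIN is implemented internally. The pages CTE and its page_no and val columns are made up for the example and hold the unsorted values, one per page.

```sql
WITH pages (page_no, val) AS (
    VALUES (0, 2), (1, 9), (2, 5),
           (3, 1), (4, 4), (5, 7),
           (6, 3), (7, 8), (8, 6)
)
SELECT
    page_no / 3 AS range_no,   -- integer division: three adjacent pages per range
    min(val)    AS range_min,
    max(val)    AS range_max,
    5 BETWEEN min(val) AND max(val) AS may_contain_5
FROM pages
GROUP BY page_no / 3
ORDER BY range_no;
```

With these values every range reports may_contain_5 as true. Replacing the values with 1 through 9 in page order leaves only the middle range as a candidate.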
Understanding pages_per_range
The number of adjacent pages is determined by the pages_per_range parameter. The number of pages per range affects both the size and the accuracy of the BRIN index:
- A large pages_per_range produces a smaller and less accurate index.
- A small pages_per_range produces a bigger and more accurate index.
The default pages_per_range is 128.

Figure: BRIN with a lower pages_per_range.
To illustrate, let's create a BRIN index on ranges of two pages and search for the value 5:
- [1–2]: definitely not here.
- [3–4]: definitely not here.
- [5–6]: might be here.
- [7–8]: definitely not here.
- [9]: definitely not here.
With ranges of two pages we can limit the search to blocks 5 and 6. With ranges of three pages, the index limited the search to blocks 4, 5 and 6.
Another difference between the two indexes is that with ranges of three pages we only had to store three ranges, while with two pages per range we already have five ranges, so the index is bigger.
Creating a BRIN index
Using the sale_fact table, let's create a BRIN index on the sold_at column:
db=# CREATE INDEX sale_fact_sold_at_bix ON sale_fact
db-# USING BRIN(sold_at) WITH (pages_per_range = 128);
CREATE INDEX
The default is pages_per_range = 128.
Now let's query for a range of sale dates:
db=# EXPLAIN (ANALYZE)
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
--------------------------------------------------------------------------------------------
Bitmap Heap Scan on sale_fact (cost=13.11..1135.61 rows=4319 width=41)
Recheck Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Rows Removed by Index Recheck: 23130
Heap Blocks: lossy=256
-> Bitmap Index Scan on sale_fact_sold_at_bix (cost=0.00..12.03 rows=12500 width=0)
Index Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Execution Time: 8.877 ms
The database used the BRIN index to fetch the date range, but that's not the interesting part...
Optimizing pages_per_range
According to the execution plan, the database removed 23,130 rows from the pages it found using the index. This may indicate that the range we specified for the index is too large for this query. Let's create an index with half the number of pages per range:
db=# CREATE INDEX sale_fact_sold_at_bix64 ON sale_fact
db-# USING BRIN(sold_at) WITH (pages_per_range = 64);
CREATE INDEX
db=# EXPLAIN (ANALYZE)
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
---------------------------------------------------------------------------------------------
Bitmap Heap Scan on sale_fact (cost=13.10..1048.10 rows=4319 width=41)
Recheck Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Rows Removed by Index Recheck: 9434
Heap Blocks: lossy=128
-> Bitmap Index Scan on sale_fact_sold_at_bix64 (cost=0.00..12.02 rows=6667 width=0)
Index Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
Execution Time: 5.491 ms
With 64 pages per range, the database removed fewer of the rows it found using the index: 9,434. This means it had to do less I/O, and the query ran slightly faster: about 5.5 ms instead of about 8.9 ms.
Let's test the index with different values of pages_per_range:
| pages_per_range | Rows Removed by Index Recheck |
| 128 | 23130 |
| 64 | 9434 |
| 8 | 874 |
| 4 | 446 |
| 2 | 446 |
As pages_per_range decreases, the index becomes more accurate and removes fewer rows from the pages it finds.
Note that we optimized for one very specific query. That is fine for demonstration purposes, but in real life it is better to use a value that meets the needs of most queries.
Evaluating index size
Another big advantage of BRIN is its size. In the previous sections we created a B-Tree index on the sold_at column, and its size was 2,224 kB. The BRIN index with pages_per_range=128 is only 48 kB, about 46 times smaller.
Schema | Name | Type | Owner | Table | Size
--------+-----------------------+-------+-------+-----------+-------
public | sale_fact_sold_at_bix | index | haki | sale_fact | 48 kB
public | sale_fact_sold_at_ix | index | haki | sale_fact | 2224 kB
The size of a BRIN index is also affected by pages_per_range. For example, a BRIN index with pages_per_range=2 weighs 56 kB, only slightly more than 48 kB.
Make indexes "invisible"
PostgreSQL has a nice feature called transactional DDL. After years of working with Oracle, I got used to DDL commands such as CREATE, DROP and ALTER ending the transaction. In PostgreSQL, however, you can run DDL commands inside a transaction, and the changes take effect only when the transaction is committed.
I recently discovered that transactional DDL can be used to make an index invisible. This is useful when you want to see what an execution plan looks like without some index.
For example, in the sale_fact table we created an index on sold_at. The execution plan for fetching sales made in July looks like this:
db=# EXPLAIN
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
--------------------------------------------------------------------------------------------
Index Scan using sale_fact_sold_at_ix on sale_fact (cost=0.42..182.80 rows=4319 width=41)
Index Cond: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
To see what the plan would look like if the index sale_fact_sold_at_ix did not exist, we can drop the index inside a transaction and immediately roll back:
db=# BEGIN;
BEGIN
db=# DROP INDEX sale_fact_sold_at_ix;
DROP INDEX
db=# EXPLAIN
db-# SELECT *
db-# FROM sale_fact
db-# WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';
QUERY PLAN
---------------------------------------------------------------------------------
Seq Scan on sale_fact (cost=0.00..2435.00 rows=4319 width=41)
Filter: ((sold_at >= '2020-07-01'::date) AND (sold_at <= '2020-07-31'::date))
db=# ROLLBACK;
ROLLBACK
First, we start a transaction with BEGIN. Then we drop the index and generate an execution plan. Notice that the plan now uses a full table scan, as if the index did not exist. At this point the transaction is still in progress, so the index has not been permanently dropped yet. To finish the transaction without dropping the index, we roll it back with ROLLBACK.
Let's check that the index still exists:
db=# \di+ sale_fact_sold_at_ix
List of relations
Schema | Name | Type | Owner | Table | Size
--------+----------------------+-------+-------+-----------+---------
public | sale_fact_sold_at_ix | index | haki | sale_fact | 2224 kB
Other databases that don't support transactional DDL may offer other ways to achieve the same goal. For example, Oracle lets you mark an index as invisible, and the optimizer will then ignore it.
Warning: dropping an index inside a transaction locks out concurrent SELECT, INSERT, UPDATE and DELETE on the table for as long as the transaction is active. Use with caution in test environments, and avoid it in production.
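If you try this on a shared environment anyway, one possible safeguard is to set a lock timeout for the transaction, so that the DROP INDEX gives up instead of queueing behind other sessions. This is a sketch, and the two-second value is an arbitrary assumption. Once the lock is acquired, other sessions are still blocked until the ROLLBACK.

```sql
BEGIN;

-- Applies only to this transaction: fail if the lock is not granted in time.
SET LOCAL lock_timeout = '2s';

DROP INDEX sale_fact_sold_at_ix;

EXPLAIN
SELECT *
FROM sale_fact
WHERE sold_at BETWEEN '2020-07-01' AND '2020-07-31';

ROLLBACK;
```

Keeping the transaction as short as possible limits the time other queries on sale_fact have to wait.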
Don't schedule long-running processes at round hours
Investors know that strange things can happen when a stock price reaches a nice round value such as $10, $100 or $1,000. Here is what has been written about it:
[...] an asset's price can behave unpredictably around round values such as $50 or $100 per share. Many inexperienced traders prefer to buy or sell assets when the price hits a whole number, because they believe the stock is fairly valued at such levels.
From this perspective, developers are not so different from investors. When they need to schedule a long-running process, they usually pick a round hour.

Figure: typical overnight system load.
This can cause load spikes at those times. So if you need to schedule a long process, the system is more likely to be idle at other times.
It is also a good idea to use a random delay in the schedule, so that the process does not start at exactly the same time on every run. That way, even if another task is scheduled for the same time, it will not be a big problem. If you use systemd timers, you can use the RandomizedDelaySec option.
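If the job is a SQL script started by cron or another scheduler without built-in jitter, one possible approach is to sleep for a random interval at the top of the script. The sketch below assumes a window of up to ten minutes is acceptable and uses VACUUM ANALYZE only as a stand-in for the real work.

```sql
-- Wait a random 0 to 600 seconds before starting the heavy work.
SELECT pg_sleep(random() * 600);

-- Stand-in for the long-running job.
VACUUM ANALYZE sale_fact;
```

The session holds a connection while it sleeps, so where the scheduler itself can add a delay, that is probably the better place for it.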
Conclusion
In this article I shared tips of varying degrees of obviousness, based on my own experience. Some are easy to implement, and some require a deeper understanding of how the database works. Databases are the backbone of modern systems, so time spent learning how they work is a good investment for any developer.