Channel: 日々の覚書

The innodb_ft_*_stopword_table variables in MySQL 5.6 InnoDB FTS feel a bit off

This is a follow-up to "日々の覚書: MySQL 5.6 InnoDB FTSのストップワードテーブルを設定する" (Setting up a stopword table for MySQL 5.6 InnoDB FTS).

innodb_ft_server_stopword_table exists only as a single global variable for the whole mysqld, which means every table and every index ends up sharing that one stopword table.

innodb_ft_user_stopword_table, on the other hand, has both a global and a session variable and behaves like, say, sort_buffer_size: if the session variable is set, the session value is used, and if it is not set, the global value is used.
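
A minimal sketch of how the two scopes can be inspected from the mysql client. It only reads the variables, so it assumes nothing beyond a MySQL 5.6 server with InnoDB FTS available; the column aliases are just illustrative names.

```sql
-- innodb_ft_user_stopword_table exists in both scopes, so both reads work.
SELECT @@global.innodb_ft_user_stopword_table  AS global_value,
       @@session.innodb_ft_user_stopword_table AS session_value;

-- innodb_ft_server_stopword_table is global-only.
SELECT @@global.innodb_ft_server_stopword_table AS server_value;

-- Reading it with an explicit session scope is expected to fail with an
-- error saying that it is a GLOBAL variable, so it is left commented out.
-- SELECT @@session.innodb_ft_server_stopword_table;
```

Running the first SELECT right after connecting is a quick way to see which value a new FULLTEXT index would pick up.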

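If you create an FT index while innodb_ft_user_stopword_table is set, that value is stored in information_schema.innodb_ft_config (so operation is not affected when the connection is closed and the session value of innodb_ft_user_stopword_table is lost... which is obvious, but still).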


mysql [localhost] {msandbox} (d1) > CREATE TABLE stopwords (value varchar(32) PRIMARY KEY);
Query OK, 0 rows affected (0.11 sec)

mysql [localhost] {msandbox} (d1) > INSERT INTO stopwords VALUES ('MySQL');
Query OK, 1 row affected (0.07 sec)

mysql [localhost] {msandbox} (d1) > SET SESSION innodb_ft_user_stopword_table= 'd1/stopwords';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > CREATE TABLE t1 (val TEXT, FULLTEXT KEY (val)) Engine = InnoDB;
Query OK, 0 rows affected (0.32 sec)

mysql [localhost] {msandbox} (d1) > INSERT INTO t1 VALUES ('I love MySQL');
Query OK, 1 row affected (0.02 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('MySQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('love');
+--------------+
| val          |
+--------------+
| I love MySQL |
+--------------+
1 row in set (0.00 sec)

### restart mysqld here ###

mysql [localhost] {msandbox} (d1) > SET GLOBAL innodb_ft_aux_table= 'd1/t1';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM information_schema.innodb_ft_config;
+---------------------------+--------------+
| KEY                       | VALUE        |
+---------------------------+--------------+
| optimize_checkpoint_limit | 180          |
| synced_doc_id             | 2            |
| stopword_table_name       | d1/stopwords |
| use_stopword              | 1            |
+---------------------------+--------------+
4 rows in set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM information_schema.innodb_ft_index_table;
+------+--------------+-------------+-----------+--------+----------+
| WORD | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
+------+--------------+-------------+-----------+--------+----------+
| love |            1 |           1 |         1 |      1 |        2 |
+------+--------------+-------------+-----------+--------+----------+
1 row in set (0.00 sec)

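The mysqld restart is there so that innodb_ft_index_table shows the inserted row (presumably because new entries sit in the in-memory FTS cache until they are flushed). I haven't looked into it properly.
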
Let's try adding a stopword.

mysql [localhost] {msandbox} (d1) > INSERT INTO stopwords VALUES ('PostgreSQL');
Query OK, 1 row affected (0.03 sec)

mysql [localhost] {msandbox} (d1) > INSERT INTO t1 VALUES ('I love MySQL more than PostgreSQL');
Query OK, 1 row affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('MySQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('PostgreSQL');
+-----------------------------------+
| val                               |
+-----------------------------------+
| I love MySQL more than PostgreSQL |
+-----------------------------------+
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('love');
+-----------------------------------+
| val                               |
+-----------------------------------+
| I love MySQL                      |
| I love MySQL more than PostgreSQL |
+-----------------------------------+
2 rows in set (0.00 sec)

The change does not seem to be picked up dynamically. It takes effect after an OPTIMIZE TABLE or a mysqld restart.


### restart mysqld ###

mysql [localhost] {msandbox} (d1) > SET GLOBAL innodb_ft_aux_table= 'd1/t1';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM information_schema.innodb_ft_index_table;
+------+--------------+-------------+-----------+--------+----------+
| WORD | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
+------+--------------+-------------+-----------+--------+----------+
| love |            1 |           1 |         1 |      1 |        2 |
| love |            2 |           2 |         1 |      2 |        2 |
| more |            2 |           2 |         1 |      2 |       13 |
| than |            2 |           2 |         1 |      2 |       18 |
+------+--------------+-------------+-----------+--------+----------+
4 rows in set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('MySQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('PostgreSQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('love');
+-----------------------------------+
| val                               |
+-----------------------------------+
| I love MySQL                      |
| I love MySQL more than PostgreSQL |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {msandbox} (d1) > DELETE FROM stopwords WHERE value = 'MySQL';
Query OK, 1 row affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('MySQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('PostgreSQL');
Empty set (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('love');
+-----------------------------------+
| val                               |
+-----------------------------------+
| I love MySQL                      |
| I love MySQL more than PostgreSQL |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {msandbox} (d1) > OPTIMIZE TABLE t1;
+-------+----------+----------+-------------------------------------------------------------------+
| Table | Op       | Msg_type | Msg_text                                                          |
+-------+----------+----------+-------------------------------------------------------------------+
| d1.t1 | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| d1.t1 | optimize | status   | OK                                                                |
+-------+----------+----------+-------------------------------------------------------------------+
2 rows in set (0.10 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM t1 WHERE match(val) against ('MySQL');
+-----------------------------------+
| val                               |
+-----------------------------------+
| I love MySQL                      |
| I love MySQL more than PostgreSQL |
+-----------------------------------+
2 rows in set (0.00 sec)
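
Besides OPTIMIZE TABLE or a restart, another way to make an edited stopword list take effect should be to rebuild the FULLTEXT index itself. The following is only a sketch under assumptions, not something verified in this post: it assumes the unnamed FULLTEXT KEY on t1 received the default index name `val` (check with SHOW CREATE TABLE first), and that the session still points at d1/stopwords when the index is re-created.

```sql
-- Confirm the actual index name before dropping anything.
SHOW CREATE TABLE t1;

-- Make sure the session points at the intended stopword table,
-- because the new index records whatever is set at creation time.
SET SESSION innodb_ft_user_stopword_table = 'd1/stopwords';

-- Assumption: the index is named `val` (the default for FULLTEXT KEY (val)).
ALTER TABLE t1 DROP INDEX val;
ALTER TABLE t1 ADD FULLTEXT INDEX val (val);
```

On a large table the rebuild is not cheap, so it is worth editing the stopword table in one batch before rebuilding.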

By the way, when both innodb_ft_server_stopword_table and innodb_ft_user_stopword_table are set, the user_stopword one seems to take precedence.


mysql [localhost] {msandbox} (d1) > SET GLOBAL innodb_ft_server_stopword_table = 'd1/server_stopword';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SET SESSION innodb_ft_user_stopword_table = 'd1/user_stopword';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > CREATE TABLE t3 (val varchar(32), FULLTEXT KEY (val)) Engine = InnoDB;
Query OK, 0 rows affected (0.03 sec)

mysql [localhost] {msandbox} (d1) > SET GLOBAL innodb_ft_aux_table= 'd1/t3';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (d1) > SELECT * FROM information_schema.innodb_ft_config;
+---------------------------+------------------+
| KEY                       | VALUE            |
+---------------------------+------------------+
| optimize_checkpoint_limit | 180              |
| synced_doc_id             | 0                |
| stopword_table_name       | d1/user_stopword |
| use_stopword              | 1                |
+---------------------------+------------------+
4 rows in set (0.00 sec)

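If you carelessly set the global innodb_ft_user_stopword_table, every session that connects afterwards automatically gets its session innodb_ft_user_stopword_table set to that value, which is quite a hassle. In fact, I got bitten by this. Painful.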
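
The trap can be sketched with two connections. This is an illustration under assumptions: d1/user_stopword is taken to exist already, and t4 is a hypothetical table name that does not appear elsewhere in this post.

```sql
-- Connection A: set only the global value.
SET GLOBAL innodb_ft_user_stopword_table = 'd1/user_stopword';

-- Connection B, opened after the SET GLOBAL above.
-- Based on the behaviour described above, the session value is expected
-- to be d1/user_stopword already, although B never ran SET SESSION.
SELECT @@session.innodb_ft_user_stopword_table;

-- A FULLTEXT index created from B then records that table.
-- t4 is a hypothetical table used only for this check.
CREATE TABLE t4 (val varchar(32), FULLTEXT KEY (val)) Engine = InnoDB;
SET GLOBAL innodb_ft_aux_table = 'd1/t4';
SELECT * FROM information_schema.innodb_ft_config
 WHERE `KEY` = 'stopword_table_name';
```

Checking @@session.innodb_ft_user_stopword_table before creating a FULLTEXT index is a cheap way to avoid binding the wrong stopword table by accident.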
