Clickhouse按照插入时的顺序返回数据max_threads参数的使用

Clickhouse默认是多线程处理数据,如果没有指定order by ,则不确认数据的返回顺序。如果希望按照数据插入时的顺序返回,则可以通过设置单线程来实现。本文介绍max_threads参数的使用例子。

另外,单独针对insert,还有个max_insert_threads参数可以使用,针对insert select 这类,可以避免对其它SQl的影响。

注意区别,max_insert_threads是在insert select时,如果并行做,就无法保证数据和原始的一致;而max_thread更多的是在查询时,如果多线程读取,也无法保证返回结果的先后顺序。也就是说。如果你max_insert_threads设置的高了,数据就会顺序不太一致,无论你max_threads是多少,顺序都会乱,它是源头。

max_threads

说明

Clickhouse查询处理线程的最大数目,不包括从远程服务器检索数据的线程。

默认值是物理CPU的核数。

样例

查看参数

其默认值时auto(4). 当前是4核的CPU。

localhost :) show settings like 'max_threads';

SHOW SETTINGS LIKE 'max_threads'

Query id: 87f32c2b-3e68-48d1-bd7f-3fcdc6106faf

┌─name────────┬─type───────┬─value─────┐
│ max_threads │ MaxThreads │ 'auto(4)' │
└─────────────┴────────────┴───────────┘

1 rows in set. Elapsed: 0.003 sec.

测试数据

分4次插入数据, 1,2,3,4,查询时结果与插入顺序不符, 2,1,3,4

localhost :) create table tt(id int)engine=MergeTree order by id;

CREATE TABLE tt
(
    `id` int
)
ENGINE = MergeTree
ORDER BY id

Query id: 3db84ca3-6525-402f-a7be-f82e89351492

Ok.

0 rows in set. Elapsed: 0.017 sec.

localhost :) insert into tt values(1);

INSERT INTO tt VALUES

Query id: 018bdcbc-1a0a-43d6-814c-88cb74b8aa79

Ok.

1 rows in set. Elapsed: 0.018 sec.

localhost :) insert into tt values(2);

INSERT INTO tt VALUES

Query id: 64d9d54a-1207-43ca-8a62-07867a35bb0d

Ok.

1 rows in set. Elapsed: 0.004 sec.

localhost :) insert into tt values(3);

INSERT INTO tt VALUES

Query id: 173b3de2-c4b9-43c6-9b11-22146def8099

Ok.

1 rows in set. Elapsed: 0.004 sec.

localhost :) insert into tt values(4);

INSERT INTO tt VALUES

Query id: 4421807c-5ce4-4e47-bfc9-f0a43d5801ab

Ok.

1 rows in set. Elapsed: 0.003 sec.

localhost :) select * from tt;

SELECT *
FROM tt

Query id: e82d00f5-0e95-40ee-b6b4-f72566609cdb

┌─id─┐
│  2 │
└────┘
┌─id─┐
│  1 │
└────┘
┌─id─┐
│  3 │
└────┘
┌─id─┐
│  4 │
└────┘

4 rows in set. Elapsed: 0.007 sec.

调整参数

localhost :) set max_threads=1;

SET max_threads = 1

Query id: 4ab34649-ee0c-40f7-9f5b-9b307f98a655

Ok.

0 rows in set. Elapsed: 0.004 sec.

localhost :) select * from tt;

SELECT *
FROM tt

Query id: e2b02912-5bdb-434e-b3a0-3152432290b9

┌─id─┐
│  1 │
└────┘
┌─id─┐
│  2 │
└────┘
┌─id─┐
│  3 │
└────┘
┌─id─┐
│  4 │
└────┘

4 rows in set. Elapsed: 0.009 sec.

max_insert_threads

说明

0或者1, 是串行操作insert ;
大于1的正整数,是并行度。

样例

如下用max_insert_threads=4,通过insert select 创建了tt2表,无论max_threads是多少,结果的顺序都是和原始的不同。
而采用max_insert_threads=1时,就保证了新表 tt3的数据顺序,和原始的是一致的。


localhost :) set max_insert_threads=4;

SET max_insert_threads = 4

Query id: 67211b8d-68fc-4e06-b979-92c1d1563ef2

Ok.

0 rows in set. Elapsed: 0.002 sec.

localhost :) create table tt2 engine=MergeTree() order by id as select * from tt;

CREATE TABLE tt2
ENGINE = MergeTree
ORDER BY id AS
SELECT *
FROM tt

Query id: 066a17d8-6dc3-49e1-9c1b-fda0b293cdff

Ok.

0 rows in set. Elapsed: 0.031 sec.

localhost :) select * from tt2;

SELECT *
FROM tt2

Query id: 8b593d4d-c13a-4f1b-84b7-d75b1c8fb14c

┌─id─┐
│  5 │
│  8 │
└────┘
┌─id─┐
│  6 │
│  7 │
└────┘

4 rows in set. Elapsed: 0.004 sec.

localhost :) set max_threads=1;

SET max_threads = 1

Query id: 1e90d5a1-1613-437d-9fd3-96d85b4a1712

Ok.

0 rows in set. Elapsed: 0.007 sec.

localhost :) select * from tt2;

SELECT *
FROM tt2

Query id: faf7c96d-5578-4e55-ae84-bca5c10e6121

┌─id─┐
│  6 │
│  7 │
└────┘
┌─id─┐
│  5 │
│  8 │
└────┘

4 rows in set. Elapsed: 0.004 sec.

localhost :) set max_insert_threads=1;

SET max_insert_threads = 1

Query id: 4d12869d-cbea-4a65-9261-fa28cd33bc2d

Ok.

0 rows in set. Elapsed: 0.002 sec.

localhost :) create table tt3 engine=MergeTree() order by id as select * from tt;

CREATE TABLE tt3
ENGINE = MergeTree
ORDER BY id AS
SELECT *
FROM tt

Query id: c1979f16-36b7-48ea-84f9-32a4b55008b7

Ok.

0 rows in set. Elapsed: 0.024 sec.

localhost :) select * from tt3;

SELECT *
FROM tt3

Query id: f7e7d021-d5c1-48dd-b32d-e60ca6e072b9

┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘

4 rows in set. Elapsed: 0.005 sec.

如果我们再insert一些数据,则会看到区别。 在max_threads=1时,才可以保证查询结果顺序和原始的一样。

localhost :) insert into tt3  select * from tt;

INSERT INTO tt3 SELECT *
FROM tt

Query id: d48198f3-353e-4d03-9056-fa9545af190f

Ok.

0 rows in set. Elapsed: 0.005 sec.

localhost :) select * from tt3;

SELECT *
FROM tt3

Query id: 61a11e94-bb43-4c58-8e90-8f8439c02752

┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘
┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘

8 rows in set. Elapsed: 0.005 sec.

localhost :) insert into tt3 values(10),(11),(12),(13);

INSERT INTO tt3 VALUES

Query id: 203512ef-e2f3-4cb9-87ea-65ea304fb31b

Ok.

4 rows in set. Elapsed: 0.004 sec.

localhost :) select * from tt3;

SELECT *
FROM tt3

Query id: a2ceae92-f509-4190-8b66-f9c2bc30ee16

┌─id─┐
│ 10 │
│ 11 │
│ 12 │
│ 13 │
└────┘
┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘
┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘

12 rows in set. Elapsed: 0.006 sec.

localhost :) set max_threads=1;

SET max_threads = 1

Query id: 78fe8f0f-16a0-4058-bca0-cd740acdeb0d

Ok.

0 rows in set. Elapsed: 0.003 sec.

localhost :) select * from tt3;

SELECT *
FROM tt3

Query id: dd20e166-6f7a-4235-96a3-b9ca8b7342f8

┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘
┌─id─┐
│  5 │
│  6 │
│  7 │
│  8 │
└────┘
┌─id─┐
│ 10 │
│ 11 │
│ 12 │
│ 13 │
└────┘

12 rows in set. Elapsed: 0.005 sec.

localhost :)