Clickhouse Performance

KNAT Network

Let's see if different versions of ClickHouse will performant differently on different machines.

Related post: https://nova.moe/performance-comparison-clickhouse/ (Chinese)

Overall performance between ClickHouse release versions

Y-Axis is the total used time of 44 queies, group by different type of machines, the lower the better.


How these tests are performed

Prepare the dataset for testing.

                mkdir clickhouse && cd clickhouse
                wget https://datasets.clickhouse.com/hits/partitions/hits_100m_obfuscated_v1.tar.xz
                tar xvf hits_100m_obfuscated_v1.tar.xz -C .
                rm -rf ./clickhouse_data/ && mkdir clickhouse_data && rsync -aP ./hits_100m_obfuscated_v1/* ./clickhouse_data/ && chmod -R 777 clickhouse_data
            

For each version, a docker-compose.yml file is created in the following syntax:

                version: '3'

                services:

                clickhouse:
                    image: knatnetwork/clickhouse-server:22.2.3.5-stable
                    restart: always
                    ports:
                    - "127.0.0.1:8123:8123"
                    - "127.0.0.1:9000:9000"
                    - "127.0.0.1:9009:9009"
                    volumes:
                    - ./clickhouse_data:/var/lib/clickhouse
                    - ./clickhouse/clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
                    - ./clickhouse/clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro
                    ulimits:
                    nofile:
                        soft: 262144
                        hard: 262144
            

Then, we do 43 queries on the ClickHouse server, each query is executed three times to get the average time.

                SELECT count() FROM hits_100m_obfuscated;
                SELECT count() FROM hits_100m_obfuscated WHERE AdvEngineID != 0;
                SELECT sum(AdvEngineID), count(), avg(ResolutionWidth) FROM hits_100m_obfuscated ;
                SELECT sum(UserID) FROM hits_100m_obfuscated ;
                SELECT uniq(UserID) FROM hits_100m_obfuscated ;
                SELECT uniq(SearchPhrase) FROM hits_100m_obfuscated ;
                SELECT min(EventDate), max(EventDate) FROM hits_100m_obfuscated ;
                SELECT AdvEngineID, count() FROM hits_100m_obfuscated WHERE AdvEngineID != 0 GROUP BY AdvEngineID ORDER BY count() DESC;
                SELECT RegionID, uniq(UserID) AS u FROM hits_100m_obfuscated GROUP BY RegionID ORDER BY u DESC LIMIT 10;
                SELECT RegionID, sum(AdvEngineID), count() AS c, avg(ResolutionWidth), uniq(UserID) FROM hits_100m_obfuscated GROUP BY RegionID ORDER BY c DESC LIMIT 10;
                SELECT MobilePhoneModel, uniq(UserID) AS u FROM hits_100m_obfuscated WHERE MobilePhoneModel != '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
                SELECT MobilePhone, MobilePhoneModel, uniq(UserID) AS u FROM hits_100m_obfuscated WHERE MobilePhoneModel != '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
                SELECT SearchPhrase, count() AS c FROM hits_100m_obfuscated WHERE SearchPhrase != '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
                SELECT SearchPhrase, uniq(UserID) AS u FROM hits_100m_obfuscated WHERE SearchPhrase != '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
                SELECT SearchEngineID, SearchPhrase, count() AS c FROM hits_100m_obfuscated WHERE SearchPhrase != '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
                SELECT UserID, count() FROM hits_100m_obfuscated GROUP BY UserID ORDER BY count() DESC LIMIT 10;
                SELECT UserID, SearchPhrase, count() FROM hits_100m_obfuscated GROUP BY UserID, SearchPhrase ORDER BY count() DESC LIMIT 10;
                SELECT UserID, SearchPhrase, count() FROM hits_100m_obfuscated GROUP BY UserID, SearchPhrase LIMIT 10;
                SELECT UserID, toMinute(EventTime) AS m, SearchPhrase, count() FROM hits_100m_obfuscated GROUP BY UserID, m, SearchPhrase ORDER BY count() DESC LIMIT 10;
                SELECT UserID FROM hits_100m_obfuscated WHERE UserID = 12345678901234567890;
                SELECT count() FROM hits_100m_obfuscated WHERE URL LIKE '%metrika%';
                SELECT SearchPhrase, any(URL), count() AS c FROM hits_100m_obfuscated WHERE URL LIKE '%metrika%' AND SearchPhrase != '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
                SELECT SearchPhrase, any(URL), any(Title), count() AS c, uniq(UserID) FROM hits_100m_obfuscated WHERE Title LIKE '%Яндекс%' AND URL NOT LIKE '%.yandex.%' AND SearchPhrase != '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
                SELECT * FROM hits_100m_obfuscated WHERE URL LIKE '%metrika%' ORDER BY EventTime LIMIT 10;
                SELECT SearchPhrase FROM hits_100m_obfuscated WHERE SearchPhrase != '' ORDER BY EventTime LIMIT 10;
                SELECT SearchPhrase FROM hits_100m_obfuscated WHERE SearchPhrase != '' ORDER BY SearchPhrase LIMIT 10;
                SELECT SearchPhrase FROM hits_100m_obfuscated WHERE SearchPhrase != '' ORDER BY EventTime, SearchPhrase LIMIT 10;
                SELECT CounterID, avg(length(URL)) AS l, count() AS c FROM hits_100m_obfuscated WHERE URL != '' GROUP BY CounterID HAVING c > 100000 ORDER BY l DESC LIMIT 25;
                SELECT domainWithoutWWW(Referer) AS key, avg(length(Referer)) AS l, count() AS c, any(Referer) FROM hits_100m_obfuscated WHERE Referer != '' GROUP BY key HAVING c > 100000 ORDER BY l DESC LIMIT 25;
                SELECT sum(ResolutionWidth), sum(ResolutionWidth + 1), sum(ResolutionWidth + 2), sum(ResolutionWidth + 3), sum(ResolutionWidth + 4), sum(ResolutionWidth + 5), sum(ResolutionWidth + 6), sum(ResolutionWidth + 7), sum(ResolutionWidth + 8), sum(ResolutionWidth + 9), sum(ResolutionWidth + 10), sum(ResolutionWidth + 11), sum(ResolutionWidth + 12), sum(ResolutionWidth + 13), sum(ResolutionWidth + 14), sum(ResolutionWidth + 15), sum(ResolutionWidth + 16), sum(ResolutionWidth + 17), sum(ResolutionWidth + 18), sum(ResolutionWidth + 19), sum(ResolutionWidth + 20), sum(ResolutionWidth + 21), sum(ResolutionWidth + 22), sum(ResolutionWidth + 23), sum(ResolutionWidth + 24), sum(ResolutionWidth + 25), sum(ResolutionWidth + 26), sum(ResolutionWidth + 27), sum(ResolutionWidth + 28), sum(ResolutionWidth + 29), sum(ResolutionWidth + 30), sum(ResolutionWidth + 31), sum(ResolutionWidth + 32), sum(ResolutionWidth + 33), sum(ResolutionWidth + 34), sum(ResolutionWidth + 35), sum(ResolutionWidth + 36), sum(ResolutionWidth + 37), sum(ResolutionWidth + 38), sum(ResolutionWidth + 39), sum(ResolutionWidth + 40), sum(ResolutionWidth + 41), sum(ResolutionWidth + 42), sum(ResolutionWidth + 43), sum(ResolutionWidth + 44), sum(ResolutionWidth + 45), sum(ResolutionWidth + 46), sum(ResolutionWidth + 47), sum(ResolutionWidth + 48), sum(ResolutionWidth + 49), sum(ResolutionWidth + 50), sum(ResolutionWidth + 51), sum(ResolutionWidth + 52), sum(ResolutionWidth + 53), sum(ResolutionWidth + 54), sum(ResolutionWidth + 55), sum(ResolutionWidth + 56), sum(ResolutionWidth + 57), sum(ResolutionWidth + 58), sum(ResolutionWidth + 59), sum(ResolutionWidth + 60), sum(ResolutionWidth + 61), sum(ResolutionWidth + 62), sum(ResolutionWidth + 63), sum(ResolutionWidth + 64), sum(ResolutionWidth + 65), sum(ResolutionWidth + 66), sum(ResolutionWidth + 67), sum(ResolutionWidth + 68), sum(ResolutionWidth + 69), sum(ResolutionWidth + 70), sum(ResolutionWidth + 71), sum(ResolutionWidth + 72), sum(ResolutionWidth + 73), sum(ResolutionWidth + 74), sum(ResolutionWidth + 75), sum(ResolutionWidth + 76), sum(ResolutionWidth + 77), sum(ResolutionWidth + 78), sum(ResolutionWidth + 79), sum(ResolutionWidth + 80), sum(ResolutionWidth + 81), sum(ResolutionWidth + 82), sum(ResolutionWidth + 83), sum(ResolutionWidth + 84), sum(ResolutionWidth + 85), sum(ResolutionWidth + 86), sum(ResolutionWidth + 87), sum(ResolutionWidth + 88), sum(ResolutionWidth + 89) FROM hits_100m_obfuscated;
                SELECT SearchEngineID, ClientIP, count() AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_100m_obfuscated WHERE SearchPhrase != '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC LIMIT 10;
                SELECT WatchID, ClientIP, count() AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_100m_obfuscated WHERE SearchPhrase != '' GROUP BY WatchID, ClientIP ORDER BY c DESC LIMIT 10;
                SELECT WatchID, ClientIP, count() AS c, sum(Refresh), avg(ResolutionWidth) FROM hits_100m_obfuscated GROUP BY WatchID, ClientIP ORDER BY c DESC LIMIT 10;
                SELECT URL, count() AS c FROM hits_100m_obfuscated GROUP BY URL ORDER BY c DESC LIMIT 10;
                SELECT 1, URL, count() AS c FROM hits_100m_obfuscated GROUP BY 1, URL ORDER BY c DESC LIMIT 10;
                SELECT ClientIP AS x, x - 1, x - 2, x - 3, count() AS c FROM hits_100m_obfuscated GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10;
                SELECT URL, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND notEmpty(URL) GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
                SELECT Title, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT DontCountHits AND NOT Refresh AND notEmpty(Title) GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
                SELECT URL, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND IsLink AND NOT IsDownload GROUP BY URL ORDER BY PageViews DESC LIMIT 1000;
                SELECT TraficSourceID, SearchEngineID, AdvEngineID, ((SearchEngineID = 0 AND AdvEngineID = 0) ? Referer : '') AS Src, URL AS Dst, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 1000;
                SELECT URLHash, EventDate, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND TraficSourceID IN (-1, 6) AND RefererHash = halfMD5('http://example.ru/') GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 100;
                SELECT WindowClientWidth, WindowClientHeight, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND NOT Refresh AND NOT DontCountHits AND URLHash = halfMD5('http://example.ru/') GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10000;
                SELECT toStartOfMinute(EventTime) AS Minute, count() AS PageViews FROM hits_100m_obfuscated WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-02' AND NOT Refresh AND NOT DontCountHits GROUP BY Minute ORDER BY Minute;
            

Raw Data Download

Raw data is a JSON file,looks like below:

                {
                    "clickhouse_commit_date": "2022-02-17T10:21:02Z",
                    "clickhouse_commit_id": "97317e8bb51d2723b539b7863dd6471a68b11d01",
                    "clickhouse_version": "22.2.3.5-stable",
                    "date": "2022-04-18",
                    "image_name": "knatnetwork/clickhouse-server:22.2.3.5-stable",
                    "machine_arch": "ARM64",
                    "machine_type": "c6g.2xlarge",
                    "machine_vendor": "AWS",
                    "results": [
                        {
                            "query": "SELECT count() FROM hits_100m_obfuscated;",
                            "time": 0.0016631285349527996
                        },
                        {
                            "query": "SELECT count() FROM hits_100m_obfuscated WHERE AdvEngineID != 0;",
                            "time": 0.02081441879272461
                        },
                    ]
                }