|
drrtuy, it looks like some problems still remain.
Quick steps to reproduce:
1. Prepare the tpch1.lineitem table from regression test000
2. set group_concat_max_len=3221225472;
3. select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber;
4. Result: ERROR 1815 (HY000): Internal error: InetStreamSocket::readToMagic: Remote is closed. Memory usage grows to 15GB.
Tested on the develop build from 31.01.23.
|
|
Both the develop build from 19 Feb 2023 and 22.08.7 raise the error:
ERROR 1815 (HY000): Internal error: InetStreamSocket::readToMagic: Remote is closed
There is no sign of the OOM killer in syslog:
Feb 20 22:31:38 kirillperov-testvm-ub22 ExeMgr[2882]: 38.362023 |2147483685|0|0| D 16 CAL0041: Start SQL statement: select objectid,columnname from syscolumn where schema='tpch1' and tablename='lineitem' --columnRIDs/FE; ||
Feb 20 22:31:38 kirillperov-testvm-ub22 ExeMgr[2882]: 38.422169 |2147483685|0|0| D 16 CAL0042: End SQL statement
Feb 20 22:31:38 kirillperov-testvm-ub22 ExeMgr[2882]: 38.445229 |37|0|0| D 16 CAL0041: Start SQL statement: select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber; |tpch1|
Feb 20 22:31:39 kirillperov-testvm-ub22 env[2882]: PrimProc: /usr/include/boost/smart_ptr/scoped_array.hpp:81: T& boost::scoped_array<T>::operator[](std::ptrdiff_t) const [with T = unsigned char; std::ptrdiff_t = long int]: Assertion `i >= 0' failed.
Feb 20 22:31:39 kirillperov-testvm-ub22 messagequeue[1208]: 39.239273 |0|0|0| W 31 CAL0000: Client read close socket for InetStreamSocket::readToMagic: Remote is closed %%10%%
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Main process exited, code=killed, status=6/ABRT
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Failed with result 'signal'.
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Consumed 5.747s CPU time.
Feb 20 22:31:39 kirillperov-testvm-ub22 mariadbd[1208]: ClientRotator caught exception: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 96 inet: 127.0.0.1 port: 8601
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Scheduled restart job, restart counter is at 1.
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: Stopped mcs-primproc.
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Consumed 5.747s CPU time.
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: Starting mcs-primproc...
Feb 20 22:31:39 kirillperov-testvm-ub22 env[3134]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 1023706, nt = 32, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 32
Feb 20 22:31:39 kirillperov-testvm-ub22 env[3125]: PrimProc main process has started
Feb 20 22:31:39 kirillperov-testvm-ub22 env[3134]: FairThreadPool started 32 thread/-s.
Feb 20 22:31:39 kirillperov-testvm-ub22 env[3134]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: Started mcs-primproc.
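The distinction drawn above (PrimProc aborted on an assertion rather than being killed by the OOM killer) can be checked mechanically. This is a minimal sketch, assuming syslog lines like the ones pasted above. The function name `classify_primproc_exit` is hypothetical, and the OOM patterns are an assumption based on typical Linux kernel wording, not something taken from this log.

```python
import re

# Assumed patterns: typical kernel OOM-killer wording; adjust for your system.
OOM_PATTERNS = [
    re.compile(r"invoked oom-killer", re.IGNORECASE),
    re.compile(r"Out of memory: Killed process", re.IGNORECASE),
]
# Patterns taken from the syslog excerpt above.
ABORT_PATTERNS = [
    re.compile(r"mcs-primproc\.service: Main process exited, code=killed, status=6/ABRT"),
    re.compile(r"PrimProc: .*Assertion `.*' failed"),
]


def classify_primproc_exit(lines):
    """Return 'oom' if any OOM-killer line is present, 'abort' if only
    assertion/SIGABRT lines are present, otherwise 'unknown'."""
    saw_abort = False
    for line in lines:
        if any(p.search(line) for p in OOM_PATTERNS):
            return "oom"
        if any(p.search(line) for p in ABORT_PATTERNS):
            saw_abort = True
    return "abort" if saw_abort else "unknown"


# Two lines copied from the excerpt above.
sample = [
    "Feb 20 22:31:39 kirillperov-testvm-ub22 env[2882]: PrimProc: /usr/include/boost/smart_ptr/scoped_array.hpp:81: T& boost::scoped_array<T>::operator[](std::ptrdiff_t) const [with T = unsigned char; std::ptrdiff_t = long int]: Assertion `i >= 0' failed.",
    "Feb 20 22:31:39 kirillperov-testvm-ub22 systemd[1]: mcs-primproc.service: Main process exited, code=killed, status=6/ABRT",
]
print(classify_primproc_exit(sample))
```

For the excerpt above this reports an abort, which matches the `status=6/ABRT` line: the process died on the boost `scoped_array` assertion, not from memory pressure.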
|
|
For QA: I have tested the two scenarios mentioned in the comments on this ticket to confirm that the PrimProc crash is fixed. One is from Sergey on 2023-03-17 and the other is from Kirill on 2023-01-31. The lineitem query now executes without a PrimProc crash. On my system with 128 GB of RAM, the lineitem query with group_concat took about 7-8 minutes (with group_concat_max_len=3221225472).
Please reconfirm the above two test cases before and after the fix.
In addition, we need to confirm whether the customer issue reported in CS0512018 is also fixed.
|
|
Build verified:
engine: 499859035c1af97ad1c8ee6d31392735ee348390
server: 11c83d9ae9eb249d00589cc6ab71e7f4e67ffa27
buildNo: 7532
1 GB DBT3 dataset
set group_concat_max_len=3221225472;
Reproduced the issues in release 23.02.2 and verified the fix in the build mentioned above.
Verified on both Rocky 8 and Ubuntu 20.04.
With the fix
VM memory: 8 GB
|
|
MariaDB [tpch1]> select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber;
ERROR 1815 (HY000): Internal error: TupleAggregateStep::threadedAggregateRowGroups()[3] MCS-2003: Aggregation/Distinct memory limit is exceeded.
|
|
|
VM memory: 16 GB
|
|
MariaDB [tpch1]> select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber;
+--------------+---------------------------------+
| l_linenumber | LENGTH(group_concat(l_comment)) |
+--------------+---------------------------------+
|            1 |                        41242078 |
|            2 |                        35359379 |
|            3 |                        29465964 |
|            4 |                        23560639 |
|            5 |                        17676387 |
|            6 |                        11794549 |
|            7 |                         5899421 |
+--------------+---------------------------------+
7 rows in set (1 min 21.366 sec)
|
|
|
MariaDB [mytest]> set group_concat_max_len=3221225472;
Query OK, 0 rows affected (0.000 sec)
|
|
The result matches InnoDB.
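The lengths reported for the lineitem query can be sanity-checked against the configured limit. This is a small sketch of the arithmetic, using only the numbers from the result table and the `group_concat_max_len` value set in the session. The helper name `summarize` is hypothetical.

```python
# Value set with "set group_concat_max_len=3221225472;" in the session.
GROUP_CONCAT_MAX_LEN = 3221225472

# LENGTH(group_concat(l_comment)) per l_linenumber, copied from the result table.
lengths = {
    1: 41242078,
    2: 35359379,
    3: 29465964,
    4: 23560639,
    5: 17676387,
    6: 11794549,
    7: 5899421,
}


def summarize(lengths, limit):
    """Return (largest length, total length, groups at or above the limit)."""
    at_limit = sorted(k for k, v in lengths.items() if v >= limit)
    return max(lengths.values()), sum(lengths.values()), at_limit


largest, total, at_limit = summarize(lengths, GROUP_CONCAT_MAX_LEN)
print(GROUP_CONCAT_MAX_LEN == 3 * 1024**3)  # True: the limit is exactly 3 GiB
print(largest, total, at_limit)             # 41242078 164998417 []
```

The largest group is about 41 MB and all seven together are about 165 MB, so none of the concatenated values comes close to the 3 GiB limit.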
|
|
|
MariaDB [mytest]> CREATE TABLE t(i integer, t text(65000)) ENGINE=columnstore;
Query OK, 0 rows affected (0.220 sec)

MariaDB [mytest]> INSERT INTO t SELECT seq / 10, CONCAT('long enough string to fill the text, sequence is ', seq) FROM seq_1_to_1000000;
Query OK, 1000000 rows affected (3.284 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

MariaDB [mytest]> set group_concat_max_len=3221225472;
Query OK, 0 rows affected (0.000 sec)

MariaDB [mytest]> SELECT i, GROUP_CONCAT(t) FROM t GROUP BY i;
...
| 74546 | long enough string to fill the text, sequence is 745455,long enough string to fill the text, sequence is 745456,long enough string to fill the text, sequence is 745457,long enough string to fill the text, sequence is 745458,long enough string to fill the text, sequence is 745459,long enough string to fill the text, sequence is 745460,long enough string to fill the text, sequence is 745461,long enough string to fill the text, sequence is 745462,long enough string to fill the text, sequence is 745463,long enough string to fill the text, sequence is 745464 |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
100001 rows in set (3.156 sec)
|
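The 100001 rows returned by the test query follow from how `seq / 10` lands in the integer column `i`. This is a sketch of the arithmetic, assuming the decimal quotient is rounded half away from zero when stored in the integer column. That assumption is consistent with group 74546 in the output, which holds sequences 745455 through 745464. `group_of` is a hypothetical helper, not part of any MariaDB API.

```python
from collections import Counter


def group_of(seq):
    """Emulate storing seq / 10 in an INTEGER column for positive seq,
    assuming the decimal quotient is rounded half up."""
    return (seq + 5) // 10


# Same range as seq_1_to_1000000 in the INSERT statement.
groups = Counter(group_of(seq) for seq in range(1, 1_000_001))

print(len(groups))  # 100001

# Which sequence values end up in group 74546?
members = [seq for seq in range(745450, 745470) if group_of(seq) == 74546]
print(members[0], members[-1])  # 745455 745464
```

Under that assumption the groups run from 0 to 100000, with ten rows in each except the first and last, which gives the 100001 rows reported.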
Release 23.02.2 (before the fix)
VM memory: 8 GB
|
|
MariaDB [tpch1]> select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber;
ERROR 1815 (HY000): Internal error: An unexpected condition within the query caused an internal processing error within Columnstore. Please check the log files for more details. Additional Information: error in TupleAggregateSte
|
|
|
VM memory: 16 GB
|
|
MariaDB [tpch1]> select l_linenumber,LENGTH(group_concat(l_comment)) from lineitem group by l_linenumber order by l_linenumber;
ERROR 1815 (HY000): Internal error: InetStreamSocket::readToMagic: Remote is closed
|
|
PrimProc crashed
|
|
MariaDB [mytest]> CREATE TABLE t(i integer, t text(65000)) ENGINE=columnstore;
Query OK, 0 rows affected (0.220 sec)

MariaDB [mytest]> INSERT INTO t SELECT seq / 10, CONCAT('long enough string to fill the text, sequence is ', seq) FROM seq_1_to_1000000;
Query OK, 1000000 rows affected (3.284 sec)
Records: 1000000 Duplicates: 0 Warnings: 0

MariaDB [mytest]> set group_concat_max_len=3221225472;
Query OK, 0 rows affected (0.000 sec)

MariaDB [mytest]> SELECT i, GROUP_CONCAT(t) FROM t GROUP BY i;
ERROR 1815 (HY000): Internal error: InetStreamSocket::readToMagic: Remote is closed
|
|