[MDEV-535] Cassandra SE: Internal error: 'TimedOutException: Default TException errors
Created: 2012-09-14  Updated: 2014-04-15  Resolved: 2014-04-15

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Sergei Petrunia
Assignee: Sergei Petrunia
Resolution: Won't Fix
Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-431 Cassandra storage engine Closed

 Description   

It looks like Cassandra SE has very short timeouts. Any operation that takes longer than a few seconds fails like this:

MariaDB [dbt3]> delete from customer ;
ERROR 1928 (HY000): Internal error: 'TimedOutException: Default TException.'
MariaDB [dbt3]> delete from lineitem ;
ERROR 1928 (HY000): Internal error: 'TimedOutException: Default TException.'
MariaDB [dbt3]> delete from nation ;
ERROR 1928 (HY000): Internal error: 'TimedOutException: Default TException.'
MariaDB [dbt3]> delete from orders ;

I added calls to

s->setConnTimeout(1000*1000);
s->setRecvTimeout(1000*1000);
s->setSendTimeout(1000*1000);

on a created TSocket object s, but this didn't seem to help.

This needs to be investigated.



 Comments   
Comment by Sergei Petrunia [ 2012-09-21 ]

Dunno if this is related, but the asynchronous client also experiences timeout failures. They seem to happen at random, for example:

on_batch_mutate_fail obj 1 reason TimedOutException: Default TException. at 1181256 rows
on_batch_mutate_fail obj 2 reason TimedOutException: Default TException. at 392616 rows
on_batch_mutate_fail obj 1 reason TimedOutException: Default TException. at 343295 rows (9 reactor loops)

Need to find that timeout setting, and maybe even add reconnect/retry functionality.

Comment by Sergei Petrunia [ 2012-09-21 ]

$8 = (org::apache::cassandra::TimedOutException *) 0x7f19a43ea668

Interesting thing: this is not an exception from the communication between the mysqld process and Cassandra. This exception is defined in the cassandra.thrift file:

/** RPC timeout was exceeded. either a node failed mid-operation, or load was too high, or the requested op was too large. */
exception TimedOutException {
}

and it was passed to us from inside Cassandra!

Comment by Sergei Petrunia [ 2014-04-15 ]

Cassandra by its design may return timeouts, and in that case the user is expected to retry the operation. We could have better retry logic in Cassandra SE, if there was enough interest in making use of it.
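
For illustration only, the retry idea could look roughly like the sketch below. This is not code from Cassandra SE: the `TimedOutException` struct is a hypothetical stand-in for the Thrift-generated `org::apache::cassandra::TimedOutException`, and `retry_on_timeout`, its parameters, and the doubling backoff are assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for org::apache::cassandra::TimedOutException,
// so that the sketch compiles without the Thrift-generated headers.
struct TimedOutException : std::runtime_error {
  TimedOutException()
      : std::runtime_error("TimedOutException: Default TException.") {}
};

// Runs op(). If it throws TimedOutException, sleeps and tries again,
// doubling the delay after each failed attempt. Once max_attempts
// attempts have failed, the last exception is rethrown to the caller.
// Returns the number of attempts that were needed.
// (Hypothetical helper; name and signature are assumptions.)
inline int retry_on_timeout(const std::function<void()>& op,
                            int max_attempts,
                            std::chrono::milliseconds initial_delay) {
  assert(max_attempts > 0);
  std::chrono::milliseconds delay = initial_delay;
  for (int attempt = 1;; ++attempt) {
    try {
      op();
      return attempt;
    } catch (const TimedOutException&) {
      if (attempt >= max_attempts) {
        throw;  // give up and let the caller report the error
      }
      std::this_thread::sleep_for(delay);
      delay *= 2;
    }
  }
}
```

A caller would wrap a single Thrift request, e.g. `retry_on_timeout([&] { /* send one batch */ }, 5, std::chrono::milliseconds(100));`. Blind retries are only safe for idempotent operations such as plain deletes and overwrites; counter updates would need different handling.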
