Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
5.5.60-galera
-
None
-
docker mariadb:5.5.60
Description
Hi guys,
I encountered a pretty weird problem that is hard to interpret. I tried to reproduce it but with no luck.
Definitions of tables involved
CREATE TABLE `task_instance` ( |
`task_id` varchar(250) NOT NULL, |
`dag_id` varchar(250) NOT NULL, |
`execution_date` datetime(6) NOT NULL, |
`start_date` datetime(6) DEFAULT NULL, |
`end_date` datetime(6) DEFAULT NULL, |
`duration` float DEFAULT NULL, |
`state` varchar(20) DEFAULT NULL, |
`try_number` int(11) DEFAULT NULL, |
`hostname` varchar(1000) DEFAULT NULL, |
`unixname` varchar(1000) DEFAULT NULL, |
`job_id` int(11) DEFAULT NULL, |
`pool` varchar(50) DEFAULT NULL, |
`queue` varchar(50) DEFAULT NULL, |
`priority_weight` int(11) DEFAULT NULL, |
`operator` varchar(1000) DEFAULT NULL, |
`queued_dttm` datetime(6) DEFAULT NULL, |
`pid` int(11) DEFAULT NULL, |
`max_tries` int(11) DEFAULT '-1', |
PRIMARY KEY (`task_id`,`dag_id`,`execution_date`), |
KEY `ti_dag_state` (`dag_id`,`state`), |
KEY `ti_pool` (`pool`,`state`,`priority_weight`), |
KEY `ti_state_lkp` (`dag_id`,`task_id`,`execution_date`,`state`), |
KEY `ti_state` (`state`), |
KEY `ti_job_id` (`job_id`) |
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
|
CREATE TABLE `dag_run` ( |
`id` int(11) NOT NULL AUTO_INCREMENT, |
`dag_id` varchar(250) DEFAULT NULL, |
`execution_date` datetime(6) DEFAULT NULL, |
`state` varchar(50) DEFAULT NULL, |
`run_id` varchar(250) DEFAULT NULL, |
`external_trigger` tinyint(1) DEFAULT NULL, |
`conf` blob,
|
`end_date` datetime(6) DEFAULT NULL, |
`start_date` datetime(6) DEFAULT NULL, |
PRIMARY KEY (`id`), |
UNIQUE KEY `dag_id` (`dag_id`,`execution_date`), |
UNIQUE KEY `dag_id_2` (`dag_id`,`run_id`), |
KEY `dag_id_state` (`dag_id`,`state`) |
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
|
CREATE TABLE `log` ( |
`id` int(11) NOT NULL AUTO_INCREMENT, |
`dttm` datetime(6) DEFAULT NULL, |
`dag_id` varchar(250) DEFAULT NULL, |
`task_id` varchar(250) DEFAULT NULL, |
`event` varchar(30) DEFAULT NULL, |
`execution_date` datetime(6) DEFAULT NULL, |
`owner` varchar(500) DEFAULT NULL, |
`extra` text,
|
PRIMARY KEY (`id`) |
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
------------------------
|
LATEST DETECTED DEADLOCK
|
------------------------
|
191216 8:52:09
|
*** (1) TRANSACTION:
|
TRANSACTION 4AB6056BF, ACTIVE 0 sec fetching rows
|
mysql tables in use 2, locked 2
|
LOCK WAIT 5 lock struct(s), heap size 1248, 5 row lock(s)
|
MySQL thread id 256467507, OS thread handle 0x7f973d815700, query id 14205100829 127.0.0.1 airflow Sending data
|
UPDATE task_instance, dag_run SET task_instance.state=NULL WHERE task_instance.dag_id IN ('A_Dag', 'B_Dag') AND task_instance.state IN ('queued', 'scheduled') AND dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date AND dag_run.state != 'running'
|
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
|
RECORD LOCKS space id 0 page no 112745 n bits 128 index `PRIMARY` of table `airflow`.`task_instance` trx id 4AB6056BF lock_mode X locks rec but not gap waiting
|
*** (2) TRANSACTION:
|
TRANSACTION 4AB6056BC, ACTIVE 0 sec updating or deleting
|
mysql tables in use 1, locked 1
|
4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 2
|
MySQL thread id 256620571, OS thread handle 0x7f973d531700, query id 14205100833 192.168.1.1 airflow Updating
|
UPDATE task_instance SET start_date='2019-12-16 08:51:35.486074', state='running', try_number=1, hostname='192.168.1.1', unixname='airflow', job_id=119942, operator='PythonOperator', pid=60059 WHERE task_instance.task_id = 'X_Task' AND task_instance.dag_id = 'X_Dag' AND task_instance.execution_date = '2019-12-16 08:50:00'
|
*** (2) HOLDS THE LOCK(S):
|
RECORD LOCKS space id 0 page no 112745 n bits 128 index `PRIMARY` of table `airflow`.`task_instance` trx id 4AB6056BC lock_mode X locks rec but not gap
|
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
|
RECORD LOCKS space id 0 page no 112079 n bits 232 index `ti_state` of table `airflow`.`task_instance` trx id 4AB6056BC lock_mode X locks rec but not gap waiting
|
*** WE ROLL BACK TRANSACTION (1)
|
general.log
191216 8:52:09
|
256467507 Query SELECT 1
|
256620571 Connect airflow@192.168.1.1 as anonymous on airflow
|
256620571 Query set autocommit=0
|
256467507 Query UPDATE task_instance, dag_run SET task_instance.state='failed' WHERE task_instance.dag_id IN ('A_Dag', 'B_Dag') AND task_instance.state IN ('up_for_retry') AND dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date AND dag_run.state != 'running'
|
256620571 Query SELECT 1
|
256620571 Query SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.execution_date AS task_instance_execution_date, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.pid AS task_instance_pid FROM task_instance WHERE task_instance.task_id = 'X_Task' AND task_instance.dag_id = 'X_Dag' AND task_instance.execution_date = '2019-12-16 08:50:00'
|
256467507 Query commit
|
256467507 Query rollback
|
256467507 Query SELECT 1
|
256620571 Query INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) VALUES ('2019-12-16 08:51:35.491292', 'X_Dag', 'X_Task', 'running', '2019-12-16 08:50:00', 'Airflow', NULL)
|
256467507 Query UPDATE task_instance, dag_run SET task_instance.state=NULL WHERE task_instance.dag_id IN ('A_Dag', 'B_Dag') AND task_instance.state IN ('queued', 'scheduled') AND dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date AND dag_run.state != 'running'
|
256620571 Query UPDATE task_instance SET start_date='2019-12-16 08:51:35.486074', state='running', try_number=1, hostname='192.168.1.1', unixname='airflow', job_id=119942, operator='PythonOperator', pid=60059 WHERE task_instance.task_id = 'X_Task' AND task_instance.dag_id = 'X_Dag' AND task_instance.execution_date = '2019-12-16 08:50:00'
|
256620571 Query commit
|
256620571 Query rollback
|
256620571 Quit
|
BTW: The problem happens more frequent if task_instance is large.
BR//Zhidong
Attachments
Issue Links
- relates to
-
MDEV-23338 server crash on select for update
- Open