[MDEV-13995] MAX(timestamp) returns a wrong result near DST change Created: 2017-10-04  Updated: 2020-06-10  Resolved: 2018-12-10

Status: Closed
Project: MariaDB Server
Component/s: Data types, Temporal Types
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.4.1

Type: Bug Priority: Critical
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-4912 Data type plugin API version 1 Closed
blocks MDEV-8894 Inserting fractional seconds into My... Closed
blocks MDEV-11829 Please add support for datetime with ... Open
blocks MDEV-15654 ADDTIME creates invalid timestamp val... Closed
blocks MDEV-16873 Compare two FLOAT expressions as floa... Open
is blocked by MDEV-13997 Change Item_bool_rowready_func2 to ca... Closed
is blocked by MDEV-15758 Split Item_bool_func::get_mm_leaf() i... Closed
is blocked by MDEV-17905 Add class Charset Closed
is blocked by MDEV-17906 Class Binary_string Closed
is blocked by MDEV-17907 Class Static_binary_string Closed
is blocked by MDEV-17928 Conversion from TIMESTAMP to VARCHAR ... Closed
Relates
relates to MDEV-14271 Dynamic SQL: TIMESTAMP parameter valu... Open
relates to MDEV-14278 Bad result for UNIX_TIMESTAMP(zero_ti... Open
relates to MDEV-19961 MIN(timestamp_column) returns a wrong... Closed
relates to MDEV-22828 Bad results near DST change: UNIX_TIM... Open
relates to MDEV-22843 Year 2038 issue executing prepared UP... Open
Sprint: 10.3.6-1

 Description   

This task is now the main show-stopper for MDEV-4912. So changing priority to Critical.

I create a table with a TIMESTAMP column and populate it with special TIMESTAMP values near summer/winter time change:

SET time_zone='Europe/Moscow';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP);
SET timestamp=1288477526;  -- this is summer time
INSERT INTO t1 VALUES (NULL);
SET timestamp=1288477526+3599; -- this is winter time
INSERT INTO t1 VALUES (null);
SELECT a, UNIX_TIMESTAMP(a) FROM t1;

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-31 02:25:26 |        1288477526 |
| 2010-10-31 02:25:25 |        1288481125 |
+---------------------+-------------------+

Notice, the first record has a smaller UNIX_TIMESTAMP value, however the "YYYY-MM-DD hh:mm:ss" representation looks smaller for the second value.
This corner case happens because the first timestamp corresponds to summer time, while the second timestamp corresponds to winter time in the given time zone, when the clock went one hour backward.

Now I want to find the maximum value in the table t1 and store it into another table t2:

CREATE OR REPLACE TABLE t2 (a TIMESTAMP);
INSERT INTO t2 SELECT MAX(a) AS a FROM t1;
SELECT a, UNIX_TIMESTAMP(a) FROM t2;

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-31 02:25:26 |        1288477526 |
+---------------------+-------------------+

Notice, it erroneously took the record with the bigger "YYYY-MM-DD hh:mm:ss" representation, but with the smaller UNIX_TIMESTAMP value.

This is wrong. The expected behavior would be to take the record with 1288481125 timestamp value, which is later in time.

The server code should be changed to handle TIMESTAMP values using my_time_t representation, without conversion to datetime format "YYYY-MM-DD hh:mm:ss".



 Comments   
Comment by Alexander Barkov [ 2017-10-31 ]

This query:

SELECT UNIX_TIMESTAMP(MAX(a)) AS a FROM t1; 

returns

+------------+
| a          |
+------------+
| 1288481126 |
+------------+

Looks wrong. The expected result would be 1288481125.

Comment by Alexander Barkov [ 2017-11-01 ]

A similar problem is repeatable with comparison.

This script:

SET time_zone='Europe/Moscow';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP);
CREATE OR REPLACE TABLE t2 (a TIMESTAMP);
SET timestamp=1288477526;  -- this is summer time
INSERT INTO t1 VALUES (NULL);
SET timestamp=1288477526+3599; -- this is winter time
INSERT INTO t2 VALUES (null);
SELECT UNIX_TIMESTAMP(t1.a), UNIX_TIMESTAMP(t2.a) FROM t1,t2;

returns:

+----------------------+----------------------+
| UNIX_TIMESTAMP(t1.a) | UNIX_TIMESTAMP(t2.a) |
+----------------------+----------------------+
|           1288477526 |           1288481125 |
+----------------------+----------------------+

The timestamp value in t1.a is obviously less than the value in t2.a.

However, this script returns empty set:

SELECT * FROM t1,t2 WHERE t1.a < t2.a;

If I now change the current session time zone, it correctly returns one row:

SET timestamp=DEFAULT;
SELECT * FROM t1,t2 WHERE t1.a < t2.a;

+---------------------+---------------------+
| a                   | a                   |
+---------------------+---------------------+
| 2010-10-30 22:25:26 | 2010-10-30 23:25:25 |
+---------------------+---------------------+

Comment by Alexander Barkov [ 2017-11-01 ]

The same problem is repeatable with SP variables:

DELIMITER $$
BEGIN NOT ATOMIC
  DECLARE a,b TIMESTAMP;
  SET time_zone='+00:00';
  SET a=FROM_UNIXTIME(1288477526);
  SET b=FROM_UNIXTIME(1288481125);
  SELECT a < b;
  SET time_zone='Europe/Moscow';
  SELECT a < b;
END;
$$
DELIMITER ;

+-------+
| a < b |
+-------+
|     1 |
+-------+
1 row in set (0.01 sec)
+-------+
| a < b |
+-------+
|     0 |
+-------+
1 row in set (0.01 sec)

The second result should return 1 rather than 0.

Comment by Alexander Barkov [ 2017-11-01 ]

The same problem is repeatable with BETWEEN:

This scipt correctly returns 1:

CREATE OR REPLACE TABLE t1 (a TIMESTAMP,b TIMESTAMP,c TIMESTAMP);
SET time_zone='+00:00';
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526),FROM_UNIXTIME(1288481125),FROM_UNIXTIME(1288481126));
SELECT b BETWEEN a AND c FROM t1;

+-------------------+
| b BETWEEN a AND c |
+-------------------+
|                 1 |
+-------------------+

Now if I change the time zone and re-run the query, it erroneously returns 0:

SET time_zone='Europe/Moscow';
SELECT b BETWEEN a AND c FROM t1;

+-------------------+
| b BETWEEN a AND c |
+-------------------+
|                 0 |
+-------------------+

Comment by Alexander Barkov [ 2017-11-02 ]

The same problem is repeatable with stored functions returning TIMESTAMP values:

DELIMITER $$
CREATE OR REPLACE FUNCTION f1(uts INT) RETURNS TIMESTAMP
BEGIN
  DECLARE ts TIMESTAMP;
  DECLARE tz VARCHAR(64) DEFAULT @@time_zone;
  SET time_zone='+00:00';
  SET ts=FROM_UNIXTIME(uts);
  SET time_zone=tz;
  RETURN ts;
END;
$$
DELIMITER ;

SET time_zone='+00:00';
SELECT f1(1288477526) < f1(1288481125);

+---------------------------------+
| f1(1288477526) < f1(1288481125) |
+---------------------------------+
|                               1 |
+---------------------------------+

SET time_zone='Europe/Moscow';
SELECT f1(1288477526) < f1(1288481125);

+---------------------------------+
| f1(1288477526) < f1(1288481125) |
+---------------------------------+
|                               0 |
+---------------------------------+

The last query should return 1.

Comment by Alexander Barkov [ 2017-11-02 ]

The same problem is repeatable with LEAST.
This script correctly returns a in LEAST(a,b):

CREATE OR REPLACE TABLE t1 (a TIMESTAMP,b TIMESTAMP);
SET time_zone='+00:00';
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526),FROM_UNIXTIME(1288481125));
SELECT *, LEAST(a,b) FROM t1;

+---------------------+---------------------+---------------------+
| a                   | b                   | LEAST(a,b)          |
+---------------------+---------------------+---------------------+
| 2010-10-30 22:25:26 | 2010-10-30 23:25:25 | 2010-10-30 22:25:26 |
+---------------------+---------------------+---------------------+

Now if I change the time zone, it wrongly returns the value of b in LEAST(a,b):

SET time_zone='Europe/Moscow';
SELECT *, LEAST(a,b) FROM t1;

+---------------------+---------------------+---------------------+
| a                   | b                   | LEAST(a,b)          |
+---------------------+---------------------+---------------------+
| 2010-10-31 02:25:26 | 2010-10-31 02:25:25 | 2010-10-31 02:25:25 |
+---------------------+---------------------+---------------------+

Now I add UNIX_TIMESTAMP to all three fields:

SET time_zone='Europe/Moscow';
SELECT UNIX_TIMESTAMP(a), UNIX_TIMESTAMP(b), UNIX_TIMESTAMP(LEAST(a,b)) FROM t1;

+-------------------+-------------------+----------------------------+
| UNIX_TIMESTAMP(a) | UNIX_TIMESTAMP(b) | UNIX_TIMESTAMP(LEAST(a,b)) |
+-------------------+-------------------+----------------------------+
|        1288477526 |        1288481125 |                 1288477525 |
+-------------------+-------------------+----------------------------+

The UNIX_TIMESTAMP value for LEAST is wrong. It's neither UNIX_TIMESTAMP(a) nor UNIX_TIMESTAMP(b).
The expected value should be 1288477526, i.e. the same with UNIX_TIMESTAMP(a).

Comment by Alexander Barkov [ 2017-11-04 ]

The same problem is repeatable with ORDER BY.
I create a table with two TIMESTAMP values near DST change (Moscow time) and do two SELECT..ORDER BY queries:

  • using the column a directly, and
  • using COALESCE(a):

SET time_zone='+00:00';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP);
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526)); -- this is summer time in Mowcow
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288481125)); -- this is winter time in Moscow
SELECT a, UNIX_TIMESTAMP(a) FROM t1 ORDER BY a;
SELECT COALESCE(a) AS a, UNIX_TIMESTAMP(a) FROM t1 ORDER BY a;

Both SELECT queries return the same correct result:

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-30 22:25:26 |        1288477526 |
| 2010-10-30 23:25:25 |        1288481125 |
+---------------------+-------------------+

Now I change the time zone and rerun the two SELECT..ORDER BY queries:

SET time_zone='Europe/Moscow';
SELECT a, UNIX_TIMESTAMP(a) FROM t1 ORDER BY a;
SELECT COALESCE(a) AS a, UNIX_TIMESTAMP(a) FROM t1 ORDER BY a;

The first SELECT returns a correct result, according to the timestamp value:

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-31 02:25:26 |        1288477526 |
| 2010-10-31 02:25:25 |        1288481125 |
+---------------------+-------------------+

The second SELECT with COALESCE(a) returns a result with a wrong order (the result is ordered by DATETIME instead of TIMESTAMP):

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-31 02:25:25 |        1288481125 |
| 2010-10-31 02:25:26 |        1288477526 |
+---------------------+-------------------+

The second query should return results in the same order with the first query.

Comment by Alexander Barkov [ 2017-11-06 ]

GROUP BY is not affected and correctly returns two rows in this example script:

SET time_zone='+00:00';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP);
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526)),(FROM_UNIXTIME(1288481126));
SET time_zone='Europe/Moscow';
SELECT a, UNIX_TIMESTAMP(a) FROM t1 GROUP BY a;

+---------------------+-------------------+
| a                   | UNIX_TIMESTAMP(a) |
+---------------------+-------------------+
| 2010-10-31 02:25:26 |        1288477526 |
| 2010-10-31 02:25:26 |        1288481126 |
+---------------------+-------------------+

Comment by Alexander Barkov [ 2017-11-06 ]

Searched CASE is affected:

SET time_zone='+00:00';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP, b TIMESTAMP);
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526),FROM_UNIXTIME(1288481126));
SELECT UNIX_TIMESTAMP(a),UNIX_TIMESTAMP(b),CASE a WHEN b THEN 'eq' ELSE 'ne' END AS x FROM t1;
SET time_zone='Europe/Moscow';
SELECT UNIX_TIMESTAMP(a),UNIX_TIMESTAMP(b),CASE a WHEN b THEN 'eq' ELSE 'ne' END AS x FROM t1;

+-------------------+-------------------+----+
| UNIX_TIMESTAMP(a) | UNIX_TIMESTAMP(b) | x  |
+-------------------+-------------------+----+
|        1288477526 |        1288481126 | ne |
+-------------------+-------------------+----+

+-------------------+-------------------+----+
| UNIX_TIMESTAMP(a) | UNIX_TIMESTAMP(b) | x  |
+-------------------+-------------------+----+
|        1288477526 |        1288481126 | eq |
+-------------------+-------------------+----+

The second result is wrong. Both SELECT queries are expected to return ne.

Comment by Alexander Barkov [ 2017-11-06 ]

IN is affected:

SET time_zone='+00:00';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP, b TIMESTAMP,c TIMESTAMP);
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526),FROM_UNIXTIME(1288481126),FROM_UNIXTIME(1288481127));
SELECT UNIX_TIMESTAMP(a),UNIX_TIMESTAMP(b),a IN (b,c) AS x FROM t1;
SET time_zone='Europe/Moscow';
SELECT UNIX_TIMESTAMP(a),UNIX_TIMESTAMP(b),a IN (b,c) AS x FROM t1;

+-------------------+-------------------+---+
| UNIX_TIMESTAMP(a) | UNIX_TIMESTAMP(b) | x |
+-------------------+-------------------+---+
|        1288477526 |        1288481126 | 0 |
+-------------------+-------------------+---+

+-------------------+-------------------+---+
| UNIX_TIMESTAMP(a) | UNIX_TIMESTAMP(b) | x |
+-------------------+-------------------+---+
|        1288477526 |        1288481126 | 1 |
+-------------------+-------------------+---+

The second result is wrong. Both SELECT queries are expected to return 0.

Comment by Alexander Barkov [ 2018-06-28 ]

The same problem is repeatable with subqueries involved in comparison or IN.

This script correctly returns empty set in all three SELECT queries:

SET time_zone='+00:00';
CREATE OR REPLACE TABLE t1 (a TIMESTAMP, b TIMESTAMP);
INSERT INTO t1 VALUES (FROM_UNIXTIME(1288477526),FROM_UNIXTIME(1288481126));
SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t1);
SELECT * FROM t1 WHERE a = (SELECT MIN(b) FROM t1);
SELECT * FROM t1 WHERE a IN ((SELECT MAX(b) FROM t1), (SELECT MIN(b) FROM t1));

Now if I change the time zone, all three queries return the row:

SET time_zone='Europe/Moscow';
SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t1);
SELECT * FROM t1 WHERE a = (SELECT MIN(b) FROM t1);
SELECT * FROM t1 WHERE a IN ((SELECT MAX(b) FROM t1), (SELECT MIN(b) FROM t1));

MariaDB [test]> SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t1);
+---------------------+---------------------+
| a                   | b                   |
+---------------------+---------------------+
| 2010-10-31 02:25:26 | 2010-10-31 02:25:26 |
+---------------------+---------------------+
 
+---------------------+---------------------+
| a                   | b                   |
+---------------------+---------------------+
| 2010-10-31 02:25:26 | 2010-10-31 02:25:26 |
+---------------------+---------------------+
 
+---------------------+---------------------+
| a                   | b                   |
+---------------------+---------------------+
| 2010-10-31 02:25:26 | 2010-10-31 02:25:26 |
+---------------------+---------------------+

Comment by Oleksandr Byelkin [ 2018-11-29 ]

Ideas we discussed:
1. special type for raw native data
2. probably union to have no problem with short types with fixed length like int, double, etc)
3. using raw is not good as it easy can be mixed with row (better native)

Comment by Alexander Barkov [ 2018-12-10 ]

sanja, please review a new patch version:
The patch is here:
https://github.com/MariaDB/server/commits/bb-10.4-mdev13995
https://github.com/MariaDB/server/commit/d9b70161d4b84c50ea15dafbcdbf61a326b018c1

Comment by Oleksandr Byelkin [ 2018-12-10 ]

OK to push!

Generated at Thu Feb 08 08:10:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.