The initially suggested ROW_NUMBER was intuitively understandable, but ERROR_INDEX in the final implementation is not. It is not necessarily an error, and it is certainly not its index.
_can be discussed_
{quote}
A note unrelated to the previous one
The initially suggested {{ROW_NUMBER}} was intuitively understandable, but {{ERROR_INDEX}} in the final implementation is not. It is not necessarily an error, and it is certainly not its index.
{code:sql}
MariaDB [test]> insert ignore into t1 (pk) values (1),(1);
Query OK, 1 row affected, 1 warning (0.017 sec)
Records: 2 Duplicates: 1 Warnings: 1
MariaDB [test]> show warnings;
+---------+------+---------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------+
| Warning | 1062 | Duplicate entry '1' for key 'PRIMARY' |
+---------+------+---------------------------------------+
1 row in set (0.000 sec)
+------+---------------------------------------+
| @num | @msg |
+------+---------------------------------------+
| 2 | Duplicate entry '1' for key 'PRIMARY' |
+------+---------------------------------------+
1 row in set (0.000 sec)
{code}
If something is an "index" here, it's the condition number (1), but the failing row number is 2. And it's a warning.
{quote}
_can be discussed_
{quote}
The initially suggested {{ROW_NUMBER}} was intuitively understandable, but {{ERROR_INDEX}} in the final implementation is not. It is not necessarily an error, and it is certainly not its index.
{code:sql}
MariaDB [test]> insert ignore into t1 (pk) values (1),(1);
Query OK, 1 row affected, 1 warning (0.017 sec)
Records: 2 Duplicates: 1 Warnings: 1
MariaDB [test]> show warnings;
+---------+------+---------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------+
| Warning | 1062 | Duplicate entry '1' for key 'PRIMARY' |
+---------+------+---------------------------------------+
1 row in set (0.000 sec)
ROW_NUMBER sounds good to me in principle, but I'm wondering if it can cause any parser or other difficulties, given that we also have ROW_NUMBER as a window function.
Also, if we want to be "user-focused" from the start, maybe we should ask for an independent opinion, for example, the documentation team could provide valuable input? And of course karlsson as an original creator of the feature task.
Elena Stepanova
added a comment - ROW_NUMBER sounds good to me in principle, but I'm wondering if it can cause any parser or other difficulties, given that we also have ROW_NUMBER as a window function.
Also, if we want to be "user-focused" from the start, maybe we should ask for an independent opinion, for example, the documentation team could provide valuable input? And of course karlsson as an original creator of the feature task.
My reasoning is based on GET DIAGNOSTICS: it seems to me the closest syntax, so I'd try to be consistent with it.
It uses the term CONDITION for errors and warnings, for example:
-- first condition
GET DIAGNOSTICS CONDITION 1 @sqlstate = RETURNED_SQLSTATE
It also uses the unfortunate word NUMBER to mean the conditions count, not a single condition index.
Federico Razzoli
added a comment - - edited I suggest:
Use the term CONDITION instead of ERROR .
Avoid NUMBER .
My reasoning is based on GET DIAGNOSTICS : it seems to me the closest syntax, so I'd try to be consistent with it.
It uses the term CONDITION for errors and warnings, for example:
-- first condition
GET DIAGNOSTICS CONDITION 1 @sqlstate = RETURNED_SQLSTATE
It also uses the unfortunate word NUMBER to mean the conditions count , not a single condition index.
I think you've misinterpreted the question a little bit.
The current syntax already uses the term CONDITION and will continue to do so. It is not similar to GET DIAGNOSTICS, it is exactly the same statement.
GET DIAGNOSTICS CONDITION 1 @n = <property name>
The question is about a name for the new property. In your example above the property name is RETURNED_SQLSTATE. For the newly implemented property, the name is currently ERROR_INDEX, that is
GET DIAGNOSTICS CONDITION 1 @n = ERROR_INDEX
and we're discussing whether we could have a better name than ERROR_INDEX here.
You are right though that ROW_NUMBER can be interpreted as "count of rows". I don't know if it's a reasonable interpretation or not, but it can happen.
Elena Stepanova
added a comment - - edited f_razzoli ,
I think you've misinterpreted the question a little bit.
The current syntax already uses the term CONDITION and will continue to do so. It is not similar to GET DIAGNOSTICS , it is exactly the same statement.
GET DIAGNOSTICS CONDITION 1 @n = <property name >
The question is about a name for the new property. In your example above the property name is RETURNED_SQLSTATE . For the newly implemented property, the name is currently ERROR_INDEX , that is
GET DIAGNOSTICS CONDITION 1 @n = ERROR_INDEX
and we're discussing whether we could have a better name than ERROR_INDEX here.
You are right though that ROW_NUMBER can be interpreted as "count of rows". I don't know if it's a reasonable interpretation or not, but it can happen.
Thanks for explaining the parts I misunderstood. My suggestion about ERROR_INDEX is that something like CONDITION_INDEX could be more consistent (condition seems to mean error or warning).
Federico Razzoli
added a comment - Thanks for explaining the parts I misunderstood. My suggestion about ERROR_INDEX is that something like CONDITION_INDEX could be more consistent ( condition seems to mean error or warning ).
I'm afraid that CONDITION_INDEX would be entirely misleading though. What is most likely thought as a condition index is the number which goes after the CONDITION word in GET DIAGNOSTICS statement, and it's very different from the value which is currently implemented as ERROR_INDEX. Please consider the example:
MariaDB [test]> createtable t (pk intprimarykey, a char(3), check(a isnotnull));
Query OK, 0 rows affected (0.025 sec)
MariaDB [test]> insertinto t values (1,'foo'),(2,'bar');
Query OK, 2 rows affected (0.006 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> insertignoreinto t values (3,'qux'),(1,'foo'),(2,null),(2,'foobar');
For the condition no. 1, the value in question is 2 (2nd row in the VALUES list violates the PK constraint). And for all other conditions, no. 2-4, the value in question is 3 (3rd row in the VALUES list violates the PK restriction, the CHECK constraint, and exceeds the char length):
MariaDB [test]> get diagnostics condition 1 @n1 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 2 @n2 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 3 @n3 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 4 @n4 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> select @n1, @n2, @n3, @n4;
+------+------+------+------+
| @n1 | @n2 | @n3 | @n4 |
+------+------+------+------+
| 2 | 3 | 3 | 3 |
+------+------+------+------+
1 row inset (0.001 sec)
So, only n3 value coincides with the condition index, and it is indeed a pure coincidence. In reality it will rarely happen, it will be as uncorrelated as this:
MariaDB [test]> createtable t (pk intprimarykey);
Query OK, 0 rows affected (0.029 sec)
MariaDB [test]> insertinto t values (20);
Query OK, 1 row affected (0.006 sec)
MariaDB [test]> insertinto t select seq from seq_5_to_50;
MariaDB [test]> get diagnostics condition 1 @n1 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> select @n1;
+------+
| @n1 |
+------+
| 16 |
+------+
1 row inset (0.000 sec)
So, the condition index is 1, the violating PK value is 20, and the property value in question is 16.
Elena Stepanova
added a comment - I'm afraid that CONDITION_INDEX would be entirely misleading though. What is most likely thought as a condition index is the number which goes after the CONDITION word in GET DIAGNOSTICS statement, and it's very different from the value which is currently implemented as ERROR_INDEX. Please consider the example:
MariaDB [test]> create table t (pk int primary key , a char (3), check (a is not null ));
Query OK, 0 rows affected (0.025 sec)
MariaDB [test]> insert into t values (1, 'foo' ),(2, 'bar' );
Query OK, 2 rows affected (0.006 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> insert ignore into t values (3, 'qux' ),(1, 'foo' ),(2, null ),(2, 'foobar' );
Query OK, 1 row affected, 4 warnings (0.004 sec)
Records: 3 Duplicates: 2 Warnings: 4
So, at this point we have 4 conditions:
MariaDB [test]> show warnings;
+ ---------+------+-------------------------------------------------+
| Level | Code | Message |
+ ---------+------+-------------------------------------------------+
| Warning | 1062 | Duplicate entry '1' for key 'PRIMARY' |
| Warning | 4025 | CONSTRAINT `CONSTRAINT_1` failed for `test`.`t` |
| Warning | 1265 | Data truncated for column 'a' at row 3 |
| Warning | 1062 | Duplicate entry '2' for key 'PRIMARY' |
+ ---------+------+-------------------------------------------------+
4 rows in set (0.001 sec)
For the condition no. 1, the value in question is 2 (2nd row in the VALUES list violates the PK constraint). And for all other conditions, no. 2-4, the value in question is 3 (3rd row in the VALUES list violates the PK restriction, the CHECK constraint, and exceeds the char length):
MariaDB [test]> get diagnostics condition 1 @n1 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 2 @n2 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 3 @n3 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> get diagnostics condition 4 @n4 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> select @n1, @n2, @n3, @n4;
+ ------+------+------+------+
| @n1 | @n2 | @n3 | @n4 |
+ ------+------+------+------+
| 2 | 3 | 3 | 3 |
+ ------+------+------+------+
1 row in set (0.001 sec)
So, only n3 value coincides with the condition index, and it is indeed a pure coincidence. In reality it will rarely happen, it will be as uncorrelated as this:
MariaDB [test]> create table t (pk int primary key );
Query OK, 0 rows affected (0.029 sec)
MariaDB [test]> insert into t values (20);
Query OK, 1 row affected (0.006 sec)
MariaDB [test]> insert into t select seq from seq_5_to_50;
ERROR 1062 (23000): Duplicate entry '20' for key 'PRIMARY'
MariaDB [test]> show warnings;
+ -------+------+----------------------------------------+
| Level | Code | Message |
+ -------+------+----------------------------------------+
| Error | 1062 | Duplicate entry '20' for key 'PRIMARY' |
+ -------+------+----------------------------------------+
1 row in set (0.000 sec)
MariaDB [test]> get diagnostics condition 1 @n1 = error_index;
Query OK, 0 rows affected (0.001 sec)
MariaDB [test]> select @n1;
+ ------+
| @n1 |
+ ------+
| 16 |
+ ------+
1 row in set (0.000 sec)
So, the condition index is 1, the violating PK value is 20, and the property value in question is 16.
May be CONDITION_ROW ? But it's RETURNED_SQLSTATE not CONDITION_SQLSTATE.
ROW_NUMBER has an additional benefit of not adding a new keyword into the grammar, so not blowing up the parser (even though the effect of one new keyword is tiny).
Sergei Golubchik
added a comment - May be CONDITION_ROW ? But it's RETURNED_SQLSTATE not CONDITION_SQLSTATE .
ROW_NUMBER has an additional benefit of not adding a new keyword into the grammar, so not blowing up the parser (even though the effect of one new keyword is tiny).
I'm adding one more argument against my previous suggestion: ROW_NUMBER would be, at least, similar to Db2 (DB2_ROW_NUMBER).
PostgreSQL, SQL Server, Oracle and Informix don't seem to have an exact equivalent, so Db2 is the only "reference" I've found.
Federico Razzoli
added a comment - I'm adding one more argument against my previous suggestion: ROW_NUMBER would be, at least, similar to Db2 ( DB2_ROW_NUMBER ).
PostgreSQL , SQL Server , Oracle and Informix don't seem to have an exact equivalent, so Db2 is the only "reference" I've found.
Could you please make a decision and if you decide to change it, push the fix? I'd rather not have it postponed till the last moment.
Elena Stepanova
added a comment - serg , rucha174 ,
Could you please make a decision and if you decide to change it, push the fix? I'd rather not have it postponed till the last moment.
rucha174, please, rename ERROR_INDEX to ROW_NUMBER similar to DB2.
And in MDEV-26635 we'll try to make it 0 when no rows were involved, again, as in DB2.
Sergei Golubchik
added a comment - rucha174 , please, rename ERROR_INDEX to ROW_NUMBER similar to DB2.
And in MDEV-26635 we'll try to make it 0 when no rows were involved, again, as in DB2.
I think it would be good to do the renaming before any other feature-related fixes go to 10.7 main branch. It's bad enough that we'll have a scattered git history of this feature (bugfixes detached from the initial push), but now the patches and commit comments also contain the syntax which is about to change, it will be difficult to search and understand later.
Elena Stepanova
added a comment - I think it would be good to do the renaming before any other feature-related fixes go to 10.7 main branch. It's bad enough that we'll have a scattered git history of this feature (bugfixes detached from the initial push), but now the patches and commit comments also contain the syntax which is about to change, it will be difficult to search and understand later.
Hi elenst , I could change it to ROW_NUMBER since it is more intuitive.