Details
-
Task
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
None
Description
Unify sql_yacc.yy and sql_yacc_ora.yy
Maintaining two independent bison grammar files is difficult.
We should eventually:
- either get rid of the second bison grammar file sql_yacc_ora.yy
- or generate sql_yacc_ora.yy from sql_yacc.yy, for example with help of CMake utilities
In order to make this possible, we should start with making the two grammar as close to each other as possible.
As the measure of similarity we can use the following shell command:
(
|
diff -u sql_yacc.yy sql_yacc_ora.yy |grep @@|wc -l
|
diff -u sql_yacc.yy sql_yacc_ora.yy |wc -l
|
diff -u sql_yacc.yy sql_yacc_ora.yy |wc -c
|
)
|
It returns three numbers:
- the number of chunks in unified diff
- the number of lines in unified diff
- the number of bytes in unified diff
As of 2017-04-18 the above command returns:
112
|
3039
|
94157
|
The difference between two grammar files include:
General
- sql_yacc.yy contains 4 global functions, e.g. turn_parser_debug_on()
- sql_yacc.yy around 11 LEX methods, e.g. LEX::case_stmt_action_expr()
- Different number of shift/reduce conflicts
- Different set of %type directives
- sql_yacc_ora.yy has different rules for allowed keywords in identifiers: SP variables and labels: ident_directly_assignable, keyword_directly_assignable, keyword_directly_not_assignable, keyword, keyword_sp, keyword_sp_data_type
Data types
- sql_yacc_ora.yy additionally understands RAW, NUMBER, VARCHAR2 data types
- sql_yacc_ora.yy translates DATE to DATETIME
- sql_yacc_ora.yy translates BLOB to LONGBLOB
- sql_yacc_ora.yy splits field_type into field_type_numeric, field_type_temporal, field_type_string, field_type_lob, field_type_misc. This is to allow VARCHAR without length in SP parameters.
- row_type_body vs field_type_row
- sp_param_name_and_type in sql_yacc_ora.yy uses a new rule sp_param_type_with_opt_collate instead of type_with_opt_collate, because in stored procedure parameters length for the VARCHAR and VARCHAR2 data type is optional. sp_param_field_type_string, field_type, opt_field_length_default_sp_param_varchar, opt_field_length_default_sp_param_char
- sql_yacc_ora.yy allows to cast as VARCHAR and VARCHAR2: cast_type
Statements
- sql_yacc_ora.yy does not have begin in verb_clause
- sql_yacc_ora.yy does not have a separate rule sp_unlabeled_block_not_atomic. BEGIN starts an anonymous block rather than a transaction
- sql_yacc_ora.yy additionally has set_assign in statement
- sql_yacc_ora.yy additionally has raise_stmt in statement
- sql_yacc_ora.yy extends TRUNCATE with an optional storage clause: truncate, opt_truncate_table_storage_clause
- sql_yacc_ora.yy allows to call stored procedures from other stored routines without using the keyword CALL: sp_statement
Flow control
- sql_yacc_ora.yy additionally supports FOR loops: sp_labeled_control, sp_unlabeled_control.
- WHILE expr DO..END WHILE vs WHILE expr LOOP..END LOOP
- sql_yacc_ora.yy supports GOTO: sp_proc_stmt_goto, sp_proc_stmts1_implicit_block, trigger_tail, sp_tail
- ELSEIF vs ELSIF
Declarations
- sql_yacc_ora.yy supports variable declarations after cursor declarations
- Different position of sp_opt_inout in sp_pdparam
- sql_yacc_ora.yy supports IN OUT as two separate words in sp_opt_inout
- Declarations in sql_yacc_ora.yy reside in a separate block section DECLARE..BEGIN, or AS..BEGIN. In sql_yacc.yy DECLARE is a directive in the beginning of BEGIN..END block.
- sql_yacc_ora.yy has the EXCEPTION section in SP sp_labeled_block, sp_block_statements_and_exceptions, opt_exception_clause,
- sql_yacc_ora.yy supports user defined exception declaration: signal_value, sp_hcond
- TYPE OF name vs name%TYPE and ROW TYPE OF name vs name%ROWTYPE
Functions and operators
- Different semantics and syntax for the function DECODE: column_default_non_parenthesized_expr, decode_when_list
- Different Item types created to handle concatenation: simple_expr
- sql_yacc_ora.yy does not support % as a synonym for MOD. % is used for cursor attributes. '%' has a %left precedence in sql_yacc.yy. '%' is not in a %left directive in sql_yacc_ora.yy. bit_expr
- sql_yacc_ora.yy has implicit cursor attributes: sp_cursor_name_and_offset, explicit_cursor_attr, simple_expr, function_call_keyword
- sql_yacc_ora.yy has explicit cursor attributes: function_call_keyword
Other syntax differences
- sql_yacc_ora.yy support UNIQUE_SYM as a synonym to DISTICT
- sql_yacc_ora.yy additionally supports an optional SP name after the last END keyword, for easier SP code readability: opt_sp_name, sp_tail_standalone, sf_tail_standalone.
- SQLEXCEPTION_SYN vs OTHERS_SYM in sp_hcond
- Different label format: sp_block_label, sp_labeled_control,
- Different PS parameter syntax ? vs :1, :name: param_marker, colon_with_pos, simple_ident_q2, internal_variable_name_directly_assignable
- sql_yacc_ora.yy supports empty SP parameter list without parentheses. This adds rules sp_no_param, opt_sp_parenthesized_fdparam_list, opt_sp_parenthesized_pdparam_list
- sql_yacc_ora.yy removes the rule opt_savepoint and embeds it directly to rollback to remove a shift/reduce conflict
Directions
We can use the following directions to unify the two grammar files:
1. Move the different code from .yy files to .h or .cc files.
Examples:
- global functions, e.g. turn_parser_debug_on()
- LEX methods, e.g. LEX::case_stmt_action_expr()
- Different Item types created to handle concatenation
2. Port missing features from sql_yacc_ora.yy to sql_yacc.yy
- GOTO
- variable declarations after cursor declarations
3. Move the different grammar rules to the end of *.yy files, so they reside in one chunk which can be easily removed or replaced. This will be helpful if we decide to generate sql_yacc_ora.yy from sql_yacc.yy.
4. Token substitution in MYSQLlex() in sql_lex.cc when sql_mode=ORACLE.
Examples:
- BEGIN_SYM could be substituted to ORA_BEGIN_SYM, to handle BEGIN as a transaction start and BEGIN as an anonymous block start.
- The token % can be substituted to ORA_PERCENT_SIGN, to handle the MOD operator vs cursor attributes.
5. Raise syntax errors conditionally, depending on sql_mode
Examples:
- We could unify the grammar for anonymous blocks by adding the DECLARE_SYM branch to sql_yacc.yy, but then generate a syntax error if sql_mode is not ORACLE:
sp_unlabeled_block:
...
| DECLARE_SYM
{
if (!(thd->variables.sql_mode & MODE_ORACLE)
{
thd->syntax_error();
MYSQL_YYABORT;
}
}
6. Copy safe grammar changes between sql_yacc_ora.yy and sql_yacc.yy
Examples:
- The field_type rules in sql_yacc_ora.yy was split into parts:
field_type:
field_type_numeric
| field_type_temporal
| field_type_string
| field_type_lob
| field_type_misc
;
This was done to be able to have VARCHAR without length as an SP parameter. This split is not really necessary in sql_yacc.yy, but it is not harmful and can be safely copied from sql_yacc_ora.yy to sql_yacc.yy.
- The field_type_row rule in sql_yacc_ora.yy includes ROW_SYM:
field_type_row:
ROW_SYM '(' row_field_definition_list ')' { $$= $3; }
;
In sql_yacc.yy it was replaced to row_type_body without ROW_SYM:
row_type_body:
'(' row_field_definition_list ')' { $$= $2; }
;
This was done to avoid shift/reduce conflicts in this rule:
sp_decl_variable_list:
sp_decl_idents_init_vars type_with_opt_collate sp_opt_default {...}
| sp_decl_idents_init_vars TYPE_SYM OF_SYM optionally_qualified_column_ident sp_opt_default {...}
| sp_decl_idents_init_vars ROW_SYM TYPE_SYM OF_SYM optionally_qualified_column_ident sp_opt_default {...}
| sp_decl_idents_init_vars ROW_SYM row_type_body sp_opt_default {...}
;
to handle the following declarations without conflicts:
DECLARE
var ROW (a INT);
var ROW TYPE OF t1.x;
This change is not really necessary for sql_yacc_ora.yy, but it's not harmful and can be copied from sql_yacc.yy to sql_yacc_ora.yy.
Attachments
Issue Links
- blocks
-
MDEV-21073 Collect different grammar rules into a single chunk
- Closed
- is blocked by
-
MDEV-14415 Add Oracle-style FOR loop to sql_mode=DEFAULT
- Closed
-
MDEV-17652 Add sql_mode specific tokens for some keywords
- Closed
-
MDEV-17660 sql_mode=ORACLE: Some keywords do not work as label names: history, system, versioning, without
- Closed
-
MDEV-17661 Add sql_mode specific tokens for the keyword DECODE
- Closed
-
MDEV-17664 Add sql_mode specific tokens for ':' and '%'
- Closed
-
MDEV-17666 sql_mode=ORACLE: Keyword ELSEIF should not be reserved
- Closed
-
MDEV-17669 Add sql_mode specific tokens for the keyword DECLARE
- Closed
-
MDEV-17687 Add sql_mode specific tokens for keywords BLOB, CLOB, NUMBER, RAW, VARCHAR2
- Closed
-
MDEV-17693 Shift/reduce conflicts for NAMES,ROLE,PASSWORD in the option_value_no_option_type grammar
- Closed
-
MDEV-17694 Add method LEX::sp_proc_stmt_statement_finalize()
- Closed
-
MDEV-18767 Port "MDEV-16294: INSTALL PLUGIN IF NOT EXISTS / UNINSTALL PLUGIN IF EXISTS" to sql_yacc_ora.yy
- Closed
-
MDEV-18779 Port sp_suid implementation from sql_yacc_ora.yy to sql_yacc.yy
- Closed
-
MDEV-18789 Port "MDEV-7773 Aggregate stored functions" to sql_yacc_ora.yy
- Closed
-
MDEV-18796 Synchronize PS grammar between sql_yacc.yy and sql_yacc_ora.yy
- Closed
-
MDEV-18813 PROCEDURE and anonymous blocks silently ignore FETCH GROUP NEXT ROW
- Closed
-
MDEV-19533 Add methods make() and append_uniq() to Row_definition_list
- Closed
-
MDEV-19535 sql_mode=ORACLE: 'SELECT INTO @var FOR UPDATE' does not lock the table
- Closed
-
MDEV-20913 sql_mode=ORACLE: INET6 does not work as a routine parameter type and return type
- Closed
-
MDEV-20924 Unify grammar rules: field_type_string and sp_param_field_type_string
- Closed
-
MDEV-20985 Add LEX methods stmt_drop_{function|procedure}() and stmt_alter_{function|procedure}_start()
- Closed
-
MDEV-21023 Move LEX methods and related functions from sql_yacc.yy to sql_lex.cc
- Closed
-
MDEV-21043 Collect different bison %type declarations into a single chunk
- Closed
-
MDEV-21064 Add a new class sp_expr_lex and a new grammar rule expr_lex
- Closed
-
MDEV-21110 Unify turn_parser_debug_on() in sql_yacc.yy and sql_yacc_ora.yy
- Closed
- relates to
-
MDEV-10142 PL/SQL parser
- Closed
-
MDEV-10485 "Unreserve" MariaDB reserved keywords that are not reserved in the other databases
- Open
-
MDEV-10764 PL/SQL parser - Phase 2
- Open