[MCOL-1] Query Failed after a redistributeDB while ddl/dml/queries were active Created: 2016-04-27  Updated: 2023-10-26  Resolved: 2017-03-01

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1.0

Type: Task Priority: Minor
Reporter: David Hill (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

amazon system with 1um and multiple pms using local storage


Sprint: 2016-22, 2016-24, 2016-25, 2017-01, 2017-2, 2017-3, 2017-4, 2017-5

 Description   

Started with a 1um / 2pm system. I added a 3rd pm.
test scenario:

1. started a script that continually did a few queries on um1

2. started a script that ran dbhealth.sh (DDL/DML test) continuously on um1

3. ran ./redistributeDB start // from pm1

While redistributeDB was running, the scripts were passing successfully, but when it completed, the query started failing with:

 
MariaDB [tpch100g]> select count(*) from lineitem;
ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

Queries started working again after a restartsystem was performed.

In my next test, I added another pm and re-ran with only the query script. This time the query continued to work after completion. So this means that no ETL, DDL, or DML changes should be made to the DB while redistributeDB is running.

We could add code to suspend/resume database writes during this time, but cpimport doesn't look at the setting; only DMLProc/DDLProc do.



 Comments   
Comment by David Hill (Inactive) [ 2016-04-27 ]

This will be addressed with a documentation change telling the user not to make any database changes via cpimport, DDL, or DML while the redistributeDB command is running.

So no code changes are required.

Comment by Dipti Joshi (Inactive) [ 2016-05-31 ]

hill: Since redistributeDB is not part of the storage engine or server code, using a different component for this.

Comment by David Hill (Inactive) [ 2016-05-31 ]

FYI - redistributeDB is an Enterprise tool, so it will NOT be included with the MariaDB ColumnStore product since we aren't providing the InfiniDB Enterprise tools.
At least not in the Alpha phase.

Comment by David Thompson (Inactive) [ 2016-11-18 ]

David.Hall - make sure this is covered in the documentation for redistribute.

Comment by David Thompson (Inactive) [ 2016-11-29 ]

Based on our discussion: assuming that redistribute works in read only mode, we should issue an error when the system is not in read only mode, with a descriptive action telling the user to enable it. This is consistent with other command behavior. If it does not work in read only mode, then we can just document the limitation, and we should also understand whether actual writes can be done in parallel.

Comment by David Hall (Inactive) [ 2016-12-15 ]

Regarding read only mode: cpimport does look at the setting and should fail if the system is read only.

All the tests for read only are early in the processing, not at the access level, so redistributeData will work with read only set.

Comment by David Hall (Inactive) [ 2017-01-14 ]

Empirical tests show that redistributeData works fine with suspendDatabaseWrites set on (read only mode). Since redistributeData works in an asynchronous model, it is not appropriate for redistributeData to set this by itself, as there's no way for it to reset it when done. The solution is to add code to check and inform the user what needs to be done.

Comment by David Hall (Inactive) [ 2017-01-17 ]

Added a check for read only mode and won't let redistributeData continue without it.

Comment by David Hall (Inactive) [ 2017-01-20 ]

There's a problem with using isReadOnly() to determine whether suspendwrites was called. Use getSystemSuspended() instead.
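
For readers following the thread, the check described in the last three comments can be sketched roughly as below. This is an illustration in Python, not the engine's actual C++ code. `SystemState` and `check_can_redistribute` are hypothetical names, and modelling isReadOnly() and getSystemSuspended() as two independent flags is an assumption about why the distinction matters. Only the message text is taken from the mcsadmin output recorded in this issue.

```python
from dataclasses import dataclass
from typing import Tuple

# Message text copied from the mcsadmin output in this issue (typo included).
NOT_SUSPENDED_MESSAGE = (
    "The system must be in read only mode for redistribeData to work\n"
    "You must run suspendDatabaseWrites before running redistributeData\n"
    "Be sure to run resumeDatabaseWrites when redistributeData status shows complete"
)


@dataclass
class SystemState:
    """Hypothetical stand-in for the state the real engine exposes."""
    read_only: bool = False         # assumed analogue of isReadOnly()
    system_suspended: bool = False  # assumed analogue of getSystemSuspended()


def check_can_redistribute(state: SystemState) -> Tuple[bool, str]:
    """Return (ok, message); refuse to start unless writes were suspended.

    The decision deliberately ignores `read_only` and looks only at
    `system_suspended`, mirroring the switch to getSystemSuspended().
    """
    if not state.system_suspended:
        return False, NOT_SUSPENDED_MESSAGE
    return True, "Redistribute is started."
```

The function only reports; it does not suspend writes itself, which matches the reasoning above that an asynchronous redistributeData has no point at which it could safely resume them.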

Comment by Daniel Lee (Inactive) [ 2017-03-01 ]

Build tested: GitHub source

[root@localhost mariadb-columnstore-server]# git show
commit 3da188e5c8a2630019ea810fb8c1bd3ece5e058b
Merge: 5d9686c 53c1df7
Author: Andrew Hutchings <andrew@linuxjedi.co.uk>
Date: Fri Feb 10 15:07:31 2017 +0000

Merge pull request #31 from jbfavre/fix_deb_package_dependency

MCOL-562 Fix Debian package dependencies

[root@localhost mariadb-columnstore-server]# cd mariadb-columnstore-engine/
[root@localhost mariadb-columnstore-engine]# git show
commit 16cef50caedd9ec7585b04c096996a9441bdf2d5
Author: David Hill <david.hill@mariadb.com>
Date: Wed Mar 1 10:39:11 2017 -0600

change the check for prompt back to the previous code

mcsadmin> redistributedata start
redistributedata Wed Mar 1 16:50:16 2017
redistributeData START
Source dbroots: 1
Destination dbroots: 1

WriteEngineServer returned status 1: Cleared.
WriteEngineServer returned status 2: Redistribute is started.
mcsadmin> resumeDatabaseWrites
resumedatabasewrites Wed Mar 1 16:58:11 2017

This command resumes the DDL/DML writes to the MariaDB ColumnStore Database
Do you want to proceed: (y or n) [n]: y

Resume MariaDB ColumnStore Database Writes Request successfully completed
mcsadmin> redistributedata start
redistributedata Wed Mar 1 16:58:19 2017
redistributeData START
Source dbroots: 1
Destination dbroots: 1

The system must be in read only mode for redistribeData to work
You must run suspendDatabaseWrites before running redistributeData
Be sure to run resumeDatabaseWrites when redistributeData status shows complete
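
The order of operations that this message asks for can be sketched as a small driver. This is a hedged illustration, not a tested tool: `run` is a hypothetical callback that sends one mcsadmin command and returns its output (the real commands prompt for confirmation, which the callback would have to handle), `is_complete` is a hypothetical predicate over the status output, whose format is not shown in this issue, and the `redistributeData status` command string is inferred from the message above.

```python
import time
from typing import Callable, List


def redistribute_with_writes_suspended(
    run: Callable[[str], str],
    is_complete: Callable[[str], bool],
    max_polls: int = 100,
    poll_seconds: float = 5.0,
) -> List[str]:
    """Suspend writes, start redistribute, and resume only after completion.

    Returns the list of commands issued, in order. If completion is never
    reported, writes are left suspended and TimeoutError is raised, since
    resuming while data is still moving is the failure this issue describes.
    """
    issued: List[str] = []

    def issue(command: str) -> str:
        issued.append(command)
        return run(command)

    issue("suspendDatabaseWrites")
    issue("redistributeData start")
    for _ in range(max_polls):
        if is_complete(issue("redistributeData status")):
            issue("resumeDatabaseWrites")
            return issued
        time.sleep(poll_seconds)
    raise TimeoutError(
        "redistributeData did not report completion; writes are still suspended"
    )
```

Polling is used because redistributeData runs asynchronously, so the start command returning does not mean the data has finished moving.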
