[MCOL-3992] insert is too slow with S3 Created: 2020-04-06  Updated: 2021-04-19  Resolved: 2021-01-04

Status: Closed
Project: MariaDB ColumnStore
Component/s: Storage Manager
Affects Version/s: 1.4.3
Fix Version/s: 5.6.1

Type: Bug Priority: Major
Reporter: Allen Lee (Inactive) Assignee: Ben Thompson (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: performance
Environment:

Red Hat Enterprise Linux
MariaDB Column Store 10.4.12-6


Attachments: File gc.sql    
Issue Links:
Relates
relates to MCOL-3751 Reduce duration or type of locks held... Closed

 Description   

Customer reported that when inserting into ColumnStore with S3 storage, 100 records take 1 hour to complete. They also tried cpimport to import data from S3 into the attached schema.



 Comments   
Comment by Patrick LeBlanc (Inactive) [ 2020-05-11 ]

I gave Faisal a few suggestions re SM configuration at the time; it is unclear whether they helped or to what degree.

My suspicion is that SM flushes to S3 are causing excessive contention in this use case. DML constantly writes 8KB blocks to the version buffer, and the version buffer is a single large monolithic file. When SM flushes (every 10s, or when the write cache is exhausted), it currently has to lock at the file level, which blocks all DML writes for the duration of the flush. This isn't a problem when not using DML, because the 'regular' database files are neither large nor monolithic.
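The contention pattern described above can be illustrated with a minimal sketch. This is not ColumnStore code; the class name and structure are hypothetical, and it only models the behavior claimed in the comment: one lock covers the whole version-buffer file, so every 8KB DML write and the entire flush serialize on the same lock.

```python
import threading

class CoarseLockedFile:
    """Hypothetical model of the described SM behavior: a single
    file-level lock shared by DML block writes and the flush."""

    def __init__(self):
        self.lock = threading.Lock()   # one lock for the whole file
        self.blocks = []               # stand-in for cached 8KB blocks

    def write_block(self, block):
        # Every DML write must take the file-level lock.
        with self.lock:
            self.blocks.append(block)

    def flush(self, upload):
        # The flush holds the *same* lock while uploading, so all
        # write_block() callers stall until the S3 upload finishes.
        with self.lock:
            upload(list(self.blocks))
            self.blocks.clear()
```

Finer-grained locking (per block range) or snapshotting the cache before uploading would let DML writes proceed while the flush is in flight, which is what the lock-granularity tickets mentioned below are about.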

I believe there are multiple overlapping tickets for reducing lock granularity in SM and for improving DML efficiency; we should find them and make the proper associations.

Comment by Patrick LeBlanc (Inactive) [ 2020-05-19 ]

Had a thought while brainstorming a problem. If we always write to a tmp location, then move a file into the proper place on success, the window for a race on the data is greatly reduced. This should allow us to reduce some critical sections substantially.
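The write-to-tmp-then-rename idea is the standard atomic-replace pattern; a minimal sketch follows. The helper name is hypothetical and the sketch is Python rather than SM's C++, but the mechanism is the same: because the rename is atomic on POSIX filesystems, readers only ever see the old file or the complete new one, so the critical section shrinks to the rename itself.

```python
import os
import tempfile

def atomic_write(path, data):
    """Write data to a temp file in the target directory, then
    atomically move it into place. Readers never observe a
    partially written file."""
    dir_ = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_)  # same fs, so rename is atomic
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ensure data is durable before rename
        os.replace(tmp, path)     # atomic on POSIX, even if path exists
    except BaseException:
        os.unlink(tmp)            # don't leak the temp file on failure
        raise
```

The temp file must live in the same directory (same filesystem) as the destination; `os.replace` across filesystems is not atomic.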

Generated at Thu Feb 08 02:46:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.