[MCOL-3561] Processing uninit'd data in PrimProc Created: 2019-10-15  Updated: 2023-03-06  Resolved: 2023-03-06

Status: Closed
Project: MariaDB ColumnStore
Component/s: N/A
Affects Version/s: 1.2.5
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Patrick LeBlanc (Inactive) Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None


 Description   

I ran valgrind (VG) over PrimProc to validate my changes. VG is quite noisy, but it is correct about all the things we are doing wrong.

This ticket is for fixing the list of columns involved in processing FE1s (pre-join functions & expressions) and FE2s (post-join functions & expressions). PrimProc copies the list of columns required to evaluate those, but that list is apparently incorrect, because VG complains that the FE instances are processing uninit'd data. If the list were correct, the data would not be uninit'd. IIRC, the list comes from joblistfactory.
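
One way to test the column-list hypothesis is to compare the columns an expression actually reads against the columns that were populated for it. The sketch below is only an illustration of that check: missingColumns and its two parameters are hypothetical names and are not part of ColumnStore, and it assumes columns can be identified by plain integer indexes.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical debugging helper (not ColumnStore code): returns the column
// indexes that an expression reads but that are absent from the list of
// columns that were populated before evaluation.
std::vector<std::uint32_t> missingColumns(std::vector<std::uint32_t> readByExpr,
                                          std::vector<std::uint32_t> populated)
{
    // std::set_difference requires sorted input ranges.
    std::sort(readByExpr.begin(), readByExpr.end());
    readByExpr.erase(std::unique(readByExpr.begin(), readByExpr.end()),
                     readByExpr.end());
    std::sort(populated.begin(), populated.end());

    std::vector<std::uint32_t> missing;
    std::set_difference(readByExpr.begin(), readByExpr.end(),
                        populated.begin(), populated.end(),
                        std::back_inserter(missing));
    return missing;
}
```

An empty result would suggest the list is complete and that the uninit'd data comes from somewhere else.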

Found these errors running the queries in working_tpch/misc.

An example of one of the complaints:
{{==7705== Conditional jump or move depends on uninitialised value(s)
==7705== at 0xC22DA57: internal_ascii_loop (loop.c:298)
==7705== by 0xC22DA57: __gconv_transform_internal_ascii (skeleton.c:609)
==7705== by 0xC2C4EF4: wcsrtombs (wcsrtombs.c:110)
==7705== by 0xC24AB20: wcstombs (wcstombs.c:34)
==7705== by 0x6FF9CE8: wcstombs (stdlib.h:154)
==7705== by 0x6FF9CE8: idb_wcstombs (utils_utf8.h:165)
==7705== by 0x6FF9CE8: funcexp::Func_lcase::getStrVal[abi:cxx11](rowgroup::Row&, std::vector<boost::shared_ptr<execplan::ParseTree>, std::allocator<boost::shared_ptr<execplan::ParseTree> > >&, bool&, execplan::CalpontSystemCatalog::ColType&) (func_lcase.cpp:80)
==7705== by 0x6516038: execplan::FunctionColumn::getStrVal[abi:cxx11](rowgroup::Row&, bool&) (functioncolumn.h:210)
==7705== by 0x65258B2: execplan::PredicateOperator::getBoolVal(rowgroup::Row&, bool&, execplan::ReturnedColumn*, execplan::ReturnedColumn*) (predicateoperator.h:462)
==7705== by 0x6FBC63F: getBoolVal (parsetree.h:273)
==7705== by 0x6FBC63F: evaluate (funcexp.h:112)
==7705== by 0x6FBC63F: funcexp::FuncExpWrapper::evaluate(rowgroup::Row*) (funcexpwrapper.cpp:119)
==7705== by 0x175294: primitiveprocessor::BatchPrimitiveProcessor::execute() (batchprimitiveprocessor.cpp:1565)
==7705== by 0x176336: primitiveprocessor::BatchPrimitiveProcessor::operator()() (batchprimitiveprocessor.cpp:2217)
==7705== by 0x18905E: primitiveprocessor::BPPSeeder::operator()() (bppseeder.cpp:288)
==7705== by 0xB28D2CA: threadpool::PriorityThreadPool::threadFcn(threadpool::PriorityThreadPool::Priority) (prioritythreadpool.cpp:191)
==7705== by 0x9AF7BCC: ??? (in /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1)}}

In my version of the code, batchprimitiveprocessor.cpp:1565 is {{if (fe1->evaluate(&fe1In))}}.

Other errors point to the call where the fe2 instances are evaluated.



 Comments   
Comment by Patrick LeBlanc (Inactive) [ 2019-10-17 ]

I looked into this a little more yesterday, and it may be a lower-level bug than that. I suspect that if the bug had to do with the list of columns sent to PrimProc, the error would happen sooner. The errors often occur in a handful of string-processing functions (funcexp::Func_lcase::getStrVal() in the example above).

Anyway, the approach here should be to validate the string processing functions that pop up in valgrind, then work your way down the stack.
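
As a reference point when validating the string processing functions, here is a minimal sketch of a lower-casing routine that goes through wide characters without leaving any part of its buffers uninitialised. It is not the ColumnStore implementation: lowerViaWide is a hypothetical name, it uses the standard mbstowcs/wcstombs rather than the idb_ wrappers, and it relies on the POSIX extension where a null destination returns the required length.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not ColumnStore code): lower-cases a multibyte string
// in the current C locale by converting to wide characters and back.
std::string lowerViaWide(const std::string& in)
{
    const std::size_t kError = static_cast<std::size_t>(-1);

    // POSIX extension: with a null destination, mbstowcs returns the number
    // of wide characters needed, or (size_t)-1 on an invalid sequence.
    const std::size_t wlen = std::mbstowcs(nullptr, in.c_str(), 0);
    if (wlen == kError)
        throw std::runtime_error("invalid multibyte sequence");

    // Value-initialised buffer: every element is L'\0' before conversion.
    std::vector<wchar_t> wbuf(wlen + 1);
    const std::size_t converted = std::mbstowcs(wbuf.data(), in.c_str(), wbuf.size());
    if (converted == kError)
        throw std::runtime_error("invalid multibyte sequence");

    // Touch only the characters that were actually converted.
    for (std::size_t i = 0; i < converted; ++i)
        wbuf[i] = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(wbuf[i])));
    wbuf[converted] = L'\0';

    // Same two-step pattern for the conversion back to multibyte.
    const std::size_t mblen = std::wcstombs(nullptr, wbuf.data(), 0);
    if (mblen == kError)
        throw std::runtime_error("unrepresentable wide character");

    std::string out(mblen + 1, '\0');
    const std::size_t written = std::wcstombs(&out[0], wbuf.data(), out.size());
    if (written == kError)
        throw std::runtime_error("unrepresentable wide character");
    out.resize(written);
    return out;
}
```

The points to compare against the real functions are the ones valgrind cares about: whether the conversion return values are checked, and whether the length passed to the second conversion covers only initialised characters.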
