MariaDB ColumnStore / MCOL-5666

PP freezes waiting for SM to answer over a socket

Details

    Description

      Query processing gets stuck on a cluster that uses MinIO as its S3 implementation.
      PrimProc (PP) waits indefinitely for StorageManager (SM) to return data. Here is some of the information collected:

      TID 3950697:
      #0  0x00007fd55013345c pthread_cond_wait@@GLIBC_2.3.2
      #1  0x0000556694d242eb primitiveprocessor::prefetchBlocks(unsigned long, int, unsigned int*)
      #2  0x0000556694d24e50 primitiveprocessor::loadBlocks(long*, BRM::QueryContext, int, int, unsigned char**, unsigned int*, bool, unsigned int, unsigned int, bool*, bool, std::tr1::unordered_map<long, BRM::VSSData, std::tr1::hash<long>, std::equal_to<long>, std::allocator<std::pair<long const, BRM::VSSData> > >*)
      #3  0x0000556694d08822 void primitiveprocessor::ColumnCommand::_loadData<8>()
      #4  0x0000556694d0545d primitiveprocessor::ColumnCommand::issuePrimitive()
      #5  0x0000556694d01ef4 primitiveprocessor::ColumnCommand::projectIntoRowGroup(rowgroup::RowGroup&, unsigned int)
      #6  0x0000556694ceb50f primitiveprocessor::BatchPrimitiveProcessor::execute()
      #7  0x0000556694cee1a1 primitiveprocessor::BatchPrimitiveProcessor::operator()()
      #8  0x0000556694cfb48d primitiveprocessor::BPPSeeder::operator()()
      #9  0x00007fd54fed8ac3 threadpool::FairThreadPool::threadFcn(threadpool::PriorityThreadPool::Priority)
      #10 0x0000556694d42c97 thread_proxy
      #11 0x00007fd55012d1ca start_thread
      #12 0x00007fd54eb3de73 __clone
      ..
      TID 3950869:
      #0  0x00007fd550136ab4 read
      #1  0x00007fd54def394b idbdatafile::SocketPool::send_recv(messageqcpp::ByteStream&, messageqcpp::ByteStream*)
      #2  0x00007fd54dee912e idbdatafile::SMComm::pread(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void*, unsigned long, long)
      #3  0x0000556694d54fac (anonymous namespace)::thr_popper(dbbc::ioManager*)
      #4  0x0000556694d59970 boost::detail::thread_data<dbbc::LambdaKludge>::run()
      #5  0x0000556694d42c97 thread_proxy
      #6  0x00007fd55012d1ca start_thread
      #7  0x00007fd54eb3de73 __clone
      

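      As the backtraces show, the PP processing thread waits on a condition variable for an asynchronous data load to complete. The I/O thread performing that load is in turn blocked in `read`, waiting for SM to return a message.
      The SM thread backtraces contain no thread that is servicing this `idbdatafile::SMComm` request, so the reply never arrives.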
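
      The hang pattern above (a plain `read` on a socket whose peer never answers) can be illustrated with a small sketch. It is not the ColumnStore code: `readWithDeadline` is a hypothetical helper, and whether `idbdatafile::SocketPool::send_recv` should bound its wait this way is an assumption, not something the backtraces establish. The sketch only shows how `poll` with a deadline turns an indefinite block into a reportable timeout.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstddef>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of ColumnStore): read up to `len` bytes from
// `fd`, giving up once `timeoutMs` milliseconds have elapsed in total.
// Returns the number of bytes read; a value smaller than `len` means the peer
// closed the connection early. Returns -1 with errno set on error, with
// errno == ETIMEDOUT when the deadline passes before all data arrives.
ssize_t readWithDeadline(int fd, void* buf, size_t len, int timeoutMs)
{
    assert(buf != nullptr && timeoutMs >= 0);
    using Clock = std::chrono::steady_clock;
    const auto deadline = Clock::now() + std::chrono::milliseconds(timeoutMs);
    size_t done = 0;

    while (done < len)
    {
        // Remaining budget; a silent peer exhausts it instead of blocking forever.
        const auto left = std::chrono::duration_cast<std::chrono::milliseconds>(
                              deadline - Clock::now()).count();
        if (left <= 0)
        {
            errno = ETIMEDOUT;
            return -1;
        }

        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLIN;
        const int ready = poll(&pfd, 1, static_cast<int>(left));
        if (ready < 0)
        {
            if (errno == EINTR)
                continue;  // interrupted by a signal: recompute the budget and retry
            return -1;
        }
        if (ready == 0)
        {
            errno = ETIMEDOUT;  // nothing arrived within the remaining budget
            return -1;
        }

        const ssize_t n = read(fd, static_cast<char*>(buf) + done, len - done);
        if (n < 0)
        {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            return static_cast<ssize_t>(done);  // peer closed the connection
        done += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(done);
}
```

      With a helper like this, a caller that gets `ETIMEDOUT` can log the stalled request and fail or retry it rather than leaving the thread in `pthread_cond_wait` parked behind it. What the right recovery would be for PP is a separate question that this sketch does not answer.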

          People

            drrtuy Roman