[MDEV-38485] Pluggable Full-Text Search Framework with BM25 Ranking

Details

Type: New Feature
Status: Open
Priority: Critical
Resolution: Unresolved
Fix Version/s: ROADMAP
Component/s: None
Labels: None
Epic Link: Multi-Valued Indexes
Sprint: Q2/2026 Server Development

Description

Introduce a pluggable full-text search framework in MariaDB with native BM25 relevance scoring, enabling modern, extensible text search that ranks results more accurately than the existing natural language and boolean modes.

Problem Statement

MariaDB’s current full-text search capabilities rely on basic natural language and boolean modes that:

Do not account for document length normalization or inverse document frequency

Are difficult to extend or customize

As a result:

Search results are often poorly ranked for content-heavy applications

Users must integrate external search engines for acceptable relevance

MariaDB is less competitive for applications requiring high-quality text search

User & Use Case

Primary Users

MariaDB developers
Database administrators
Platform engineers building content-driven applications

Primary Use Cases

Ranking blog posts, documentation, or articles by relevance
Searching user-generated content (comments, reviews, messages)
Enabling in-database search for applications that cannot rely on external search services

Secondary Use Cases

Hybrid relational + search workloads
AI-assisted search pipelines that require deterministic relevance scoring at the database layer

Competitive Research & Market Context

PostgreSQL

Supports relevance ranking via ts_rank and ts_rank_cd; BM25 ranking is available only through extensions
Strong extensibility through custom ranking functions

Limitations: complexity of configuration, fragmented ecosystem, limited pluggability at the index engine level

MySQL

Supports full-text search with basic ranking
No native BM25 implementation
Limited extensibility and weak relevance tuning

Elasticsearch / OpenSearch

Native BM25 with advanced relevance tuning
Highly configurable and scalable

Limitations: operational complexity, separate infrastructure, eventual consistency, cost

Key Market Gaps

In-database BM25 with first-class support
Pluggable architecture without requiring external search systems
Simpler operational model than dedicated search engines

Feature Behavior & Scope

In Scope

Pluggable full-text index framework
Native BM25 ranking implementation
Configurable scoring parameters (e.g., k1, b)
SQL-level configuration and usage
Compatibility with existing full-text index syntax where feasible
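
To make the role of the k1 and b parameters concrete, here is a minimal Python sketch of Okapi BM25 scoring over pre-tokenized documents. It is an illustration only, not the proposed server implementation: the function name `bm25_scores`, the defaults k1=1.2 and b=0.75, and the Lucene-style IDF variant are assumptions chosen for the example.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` (Okapi BM25).

    Hypothetical helper for illustration; defaults are common choices,
    not values specified by this issue.
    """
    if not docs:
        return []
    n_docs = len(docs)
    # Average document length; fall back to 1 if every document is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # b controls how strongly long documents are penalized (0 disables it).
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Lucene-style IDF variant (assumed here); it never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls how quickly repeated occurrences stop adding score.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores
```

With b=0 the document length is ignored entirely, so two documents with the same term frequency receive the same score; raising b toward 1 increasingly favors the shorter document.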

Behavior

Users can create a full-text index specifying BM25 as the ranking algorithm
Query execution uses BM25 scoring by default for supported indexes
Framework allows future ranking models or third-party implementations to be plugged in
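
The pluggability described above can be sketched as a small registry that maps an algorithm name to a scoring function. This is a language-neutral illustration written in Python, not the MariaDB plugin API: `register_ranker`, `get_ranker`, the per-term function signature, and the "tfidf" example are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict

# Hypothetical per-term ranker signature:
# (term frequency, document length, average document length, idf) -> score
Ranker = Callable[[int, int, float, float], float]

_RANKERS: Dict[str, Ranker] = {}

def register_ranker(name: str):
    """Decorator that registers a ranking function under a case-insensitive name."""
    def decorator(fn: Ranker) -> Ranker:
        key = name.lower()
        if key in _RANKERS:
            raise ValueError(f"ranking algorithm already registered: {name}")
        _RANKERS[key] = fn
        return fn
    return decorator

def get_ranker(name: str) -> Ranker:
    """Look up a registered ranking function, as an index definition might."""
    try:
        return _RANKERS[name.lower()]
    except KeyError:
        raise LookupError(f"unknown ranking algorithm: {name}") from None

@register_ranker("bm25")
def bm25_term(tf, doc_len, avg_len, idf, k1=1.2, b=0.75):
    # First-party ranker: BM25 contribution of a single query term.
    norm = 1 - b + b * doc_len / avg_len
    return idf * tf * (k1 + 1) / (tf + k1 * norm)

@register_ranker("tfidf")
def tfidf_term(tf, doc_len, avg_len, idf):
    # Stand-in for a third-party ranker: plain TF-IDF, no length normalization.
    return tf * idf
```

The point of the design is that the query path only calls `get_ranker(...)`, so adding a new ranking model means registering one more function rather than changing the index or query code.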

Acceptance Criteria

A pluggable full-text index framework is implemented
Native BM25 ranking is available as a first-party implementation
Users can configure BM25 parameters at index or global level
Query plans clearly indicate when BM25-based ranking is used

Issue Links

relates to: MDEV-25848 Support for Multi-Valued Indexes (Open)

People

Assignee: Unassigned

Reporter: Adam Luciano

Votes: 0

Watchers: 1

Dates

Created: 2026-01-05 18:27

Updated: 2026-01-08 21:29
