venturebeat
Databricks' Instructed Retriever beats traditional RAG data retrieval by 70% — enterprise metadata was the missing link

A core element of any data retrieval operation is the use of a component known as a retriever. Its job is to retrieve the relevant content for a given query. In the AI era, retrievers have been used as part of RAG pipelines. The approach is straightforward: retrieve relevant documents, feed them to an LLM, and let the model generate an answer based on that context.While retrieval might have seemed like a solved problem, it actually wasn't solved for modern agentic AI workflows.In research published this week, Databricks introduced Instructed Retriever, a new architecture that the company claims delivers up to 70% improvement over traditional RAG on complex, instruction-heavy enterprise question-answering tasks. The difference comes down to how the system understands and uses metadata. [...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat
Enterprises are measuring the wrong part of RAG

Enterprises have moved quickly to adopt RAG to ground LLMs in proprietary data. In practice, however, many organizations are discovering that retrieval is no longer a feature bolted onto model inferen [...]

Match Score: 304.67

venturebeat
Databricks research shows multi-step agents consistently outperform single-turn RAG when answers span databases and documents

Data teams building AI agents keep running into the same failure mode. Questions that require joining structured data with unstructured content, sales figures alongside customer reviews or citation co [...]

Match Score: 293.74

venturebeat
Databricks: 'PDF parsing for agentic AI is still unsolved' — new tool replaces multi-service pipelines with single function

There is a lot of enterprise data trapped in PDF documents. To be sure, gen AI tools have been able to ingest and analyze PDFs, but accuracy, time and cost have been less than ideal. New technology fr [...]

Match Score: 216.13

venturebeat
This tree search framework hits 98.7% on documents where vector search fails

A new open-source framework called PageIndex solves one of the old problems of retrieval-augmented generation (RAG): handling very long documents.The classic RAG workflow (chunk documents, calculate e [...]

Match Score: 184.98

venturebeat
Six data shifts that will shape enterprise AI in 2026

For decades the data landscape was relatively static. Relational databases (hello, Oracle!) were the default and dominated, organizing information into familiar columns and rows.That stability eroded [...]

Match Score: 173.15

venturebeat
Databricks built a RAG agent it says can handle every kind of enterprise search

Most enterprise RAG pipelines are optimized for one search behavior. They fail silently on the others. A model trained to synthesize cross-document reports handles constraint-driven entity search poor [...]

Match Score: 156.43

venturebeat
How xMemory cuts token costs and context bloat in AI agents

Standard RAG pipelines break when enterprises try to use them for long-term, multi-session LLM agent deployments. This is a critical limitation as demand for persistent AI assistants grows.xMemory, a [...]

Match Score: 156.10

venturebeat
Databricks set to accelerate agentic AI by up to 100x with ‘Mooncake’ technology — no ETL pipelines for analytics and AI

Many enterprises running PostgreSQL databases for their applications face the same expensive reality. When they need to analyze that operational data or feed it to AI models, they build ETL (Extract, [...]

Match Score: 146.40

venturebeat
From shiny object to sober reality: The vector database story, two years later

When I first wrote “Vector databases: Shiny object syndrome and the case of a missing unicorn” in March 2024, the industry was awash in hype. Vector databases were positioned as the next big thing [...]

Match Score: 144.73