Agent Bricks: Databricks Knowledge Assistant.
deep dive ... ish
So, recently, I had the pleasure of poking and prodding that new kid on the block, Agent Bricks from Databricks. We live in the age of Agents, that much is clear, in case you’ve missed the news. You can be an AI skeptic or lover; I care not, but Agents are coming and are already solving and automating all sorts of use cases.
I’m no AI doomer; I think the future is bright for good Data Engineers. We build Data Platforms and massage the data that enables such tools to be built at scale.
Everyone is pumping out their Agentic tooling and frameworks as fast as those engine… I mean, Claude can spit out that code. Agentic AI is becoming commoditized; there’s no longer a technical barrier to building AI systems.
The question is, can something useful be built?
Talk about ease of use, building Agents, and Agent Bricks fulfills that promise.
If you’ve been reading my content for any length of time, you will know that I’ve given a reasonable tip of the hat to all things AI, Agents, ChatBots, Vectors, LLMs, and other such tomfoolery. Call it self-preservation if you want, but I keep my options open.
…
Yeah, yeah, I hear ya, “Stop your jabbering and get to it, we want some Agents!” Jeez, fine.
A one-day conference for data engineers and architects.
- Speakers to include Joe Reis
- No vendors, no salesfolk
- Completely free
- March 26, in SF
- Limited to 100 attendees; register for free here
Agent Bricks from … Databricks.
We have reached the point in our collective AI/LLM journey that the pace of innovation, if you want to call it that, is hard to keep up with. It’s easy to get suck in the echo chamber and get caught up in the marketing hype of people who’ve got some money or skin in the game.
The future of Data Engineering lies in being able to build, and speak to, the infastructure that underpins Agentic systems.
So, back to Agent Bricks, Databricks provides a whole host of “out-of-the-box” Agents that can be built from within the UI. Talk about approachable, eh??
Agent Bricks consist of the following categories to choose from …
Document Parsing
- Parse and visualize document structure with AI.
Information Extraction
- Extract key information and insights into structured JSON.
Knowledge Assistant
- Turn your docs into an expert AI Chatbot.
AI/BI Genie
- Turn your tables into an expert AI Chatbot.
Supervisor Agent
- Design a AI system with Genie, agents, tools.
Custom LLM
- Specialize an LLM to perform custom text tasks.
Code Your Own Agent
- Build with OOS libraries and Agent Frameworks. This makes perfect sense. You have the ability to build pretty much any Agentic system you want, custom or otherwise, within these categories, including multi-Agent systems.
One could ask, so what’s so special about this, big deal. Yes, and no. The compaines that will win the AI and Agentic race to the bottom, are the one’s that simplify and obvuscate away the tricky parts … aka … the infastructure.
This is one thing Databricks is good at: think about how they brought Spark to the masses and made it as easy as a few clicks. All they have done is apply that to Agents.
Building a Knowledge Assistant.
Maybe you haven’t had the chance to build any Agents in a production-like environment, no personal projects to work on, relegated to being an onlooker to those new-fangled AI Engineers doing all the fun stuff.
Well, never fear, we can still learn and build things.
I will use my personal Databricks account (you can have one too, for a few bucks a month) and build a Knowledge Assistant Agent using a few of my blog posts. I’m sure you have some PDFs or something lying around, or could think of a similar use case.
What we really have is an obfuscated RAG system.
This is probably one of the most under-appreciated Agent systems yet, one that most businesses would benefit from building. A wealth of internal knowledge, once documented, became a 24/7 Agentic Expert.
Building a Knowledge Assistant with Agent Bricks is simple and effective.
Build a Databricks Volume or Cloud Storage with our documents
Built a Knowledge Assistant
Sync our documents to the Knowledge Assistant
Done
A couple of side notes, although this is constantly changing.
1. txt, pdf, md, ppt/pptx, and doc/docx.
2. Files larger than 50MB won't be indexes.
3. The first inital sync can take some time.
4. You can an adjust the agent's behavior based on natural language feedback.
5. You can import pre-labled datasetsBasically, you’ve got everything you need here to build a reliable and easy-to-use Knowledge Assistant (RAG) that can be used in and of itself or integrated into a multi-agent system!
Create the Knowledge Base
Ok, so let’s see how easy it is to create a managed Databricks Volume to store some documents, in my case, a few blog posts as TXT files.
Remember, as we do this simple exercise, we are thinking bigger picture and trying to learn from what we are doing. In a real production system (I’ve built ones that look similar to this), we might have all sorts of internal knowledge docs in various forms that wewould need to gather and store in a logical way.
This is a problem in and of itself, managing and organizing large amounts of various documents, pdf, txt, doc, ppx, etc.
A Knowledge Assistant is only as good as the work put into it. You need to spend the time to gather, or write down, knowledge that has been spread around … gather it into a single place where it can be managed … and then connect it into Agentic systems. This is no small feat.
Ok, so we have our docs in a Managed Databricks Volume. Let’s create our Knowledge Assistant and connect our data.
Create the Knowledge Assistant
Now the important and easy part, where the magic happens. If this part is simple and uncomplicated, that is a good thing. It shows how far we have come in a few short years in building RAG and Agentic systems that focus on delivering value rather than complex infrastructure.
I told you this would be the easy part. Click a few buttons, point it at our Volume, which serves as the source of our knowledge, and wait for the sync to complete.
Once we have built the Agent, there is nothing more much to do, besides put it to use. You can add optional instructions to let the Knowledge Assistant know how you expect it to act.
Also, you can chat with the Knowledge Assistant in the Databricks Playground. You can see here that I asked about Databricks cost savings; it pulls one of the blogs I wrote and summarizes that information for me.
You know the other great part? We can interact with this Agent via Python and an Endpoint.
Of course, we could integrate this ourselves or use an Agent Bricks Supervisor Agent for some multi-agent workflow if we wanted. Easy as pie.
Look, I know everyone wants to look cool and make everything seem to be more complicated than it is. And sure, if we were having this conversation 2 years ago, it would have been much harder and more involved to build RAG systems.
Don’t get me wrong, you can still build Agentic systems the hard way, as I have done recently.
But AI infrastructure and systems are just like any other technology we deal with. Over time, as things solidify in the community and frameworks, things start to harden and become more “commoditized.”
Agent Bricks is the perfect example of that. The barrier to building helpful and useful Agentic systems is so low that it’s hard to understand why people are not building useful tools.
Knowledge Assistants (classic RAG) are among the easiest and most obvious entry points for using AI in the business. Every team and company has a lot of knowledge scattered across different places and brains. There were enough excuses a few years ago, but now it’s almost TOO easy to build these Agents.
What I also want you to think about is the infrastructure and engineering behind all of this.
Building “Chat” interfaces
Serving RAG and LLM as endpoints
The compute required
Building storage for knowledge bases
Managing and cataloging documents
Embedding documents
Instructions and context
Observability and monitoring
… of course the list goes on.
Agentic AI is within reach and not rocket science
I encourage you to explore these different Agentic offerings, Agent Bricks, and others. Try to understand what it takes to collect, gather, and massage data into usage formats that can be ingested into these AI systems.
Understand the pros and cons of each approach. What infrastructure is required for each layer of the AI workload? What tools, frameworks, and companies can take away some of the burden from each layer? What price do you pay for such ease? Do you loose flexbiliity?
We live in a brave new world full of exciting new technologies. It’s not the time to shy away or stick our collective heads in the mud, but to push forward into the future.













