I'm on the Python advocacy team at Microsoft, so I've been experimenting a bit with the new framework. It works pretty well, and is comparable to LangChain v1 and Pydantic-AI, but has tighter integrations with Microsoft-specific technologies. All the frameworks have very similar Agent() interfaces as well as graph-based approaches (Workflow, LangGraph, Graph).
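To make the convergence concrete, here's a rough sketch of that Agent() shape using Pydantic-AI (names differ slightly in the other frameworks, but the create-an-agent-with-model/instructions/tools pattern is the same; the tool and prompt below are made up for illustration):

    # Minimal sketch using Pydantic-AI; LangChain v1 and the other frameworks look very similar.
    from pydantic_ai import Agent

    def get_weather(city: str) -> str:
        """Toy tool the agent can call."""
        return f"It's sunny in {city}."

    agent = Agent(
        "openai:gpt-4o-mini",
        system_prompt="You are a concise assistant.",
        tools=[get_weather],
    )

    result = agent.run_sync("What's the weather in Oakland?")
    print(result.output)  # .data in older pydantic-ai versions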
At Microsoft, that's all baked into Azure AI Search - hybrid search does BM25, vector search, and re-ranking, just by setting a few booleans to true.
It also has a new agentic retrieval feature that does query rewriting and parallel search execution.
So few developers realize that you need more than just vector search, which is why I still spend many of my talks emphasizing the FULL retrieval stack for RAG.
It's also possible to do it on top of other DBs like Postgres, but it takes more effort.
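For reference, the hybrid + re-ranking combo with the azure-search-documents Python SDK looks roughly like this (index name, field names, and the embedding call are placeholders, not a real setup):

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.models import VectorizedQuery

    def embed(text: str) -> list[float]:
        """Placeholder: call your embedding model here."""
        raise NotImplementedError

    search_client = SearchClient(
        endpoint="https://<service>.search.windows.net",
        index_name="docs-index",
        credential=AzureKeyCredential("<api-key>"),
    )

    query = "how do I rotate my API keys?"
    results = search_client.search(
        search_text=query,                      # BM25 keyword search
        vector_queries=[VectorizedQuery(
            vector=embed(query),                # vector search
            k_nearest_neighbors=50,
            fields="embedding",
        )],
        query_type="semantic",                  # turn on the re-ranker
        semantic_configuration_name="default",
        top=5,
    )
    for doc in results:
        print(doc["title"])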
I am working on search too, though for text-to-image retrieval. Nevertheless, I am curious whether by "that's all baked into Azure AI Search" you also meant the synthetic query generation from the grandparent comment. If so, what's your latency for this? And do you extract structured data from the query? If so, do you use LLMs for that?
Moreover, I am curious why you guys use BM25 over SPLADE?
Yes, AI Search has a new agentic retrieval feature that includes synthetic query generation: https://techcommunity.microsoft.com/blog/azure-ai-foundry-bl...
You can customize the model used and the max # of queries to generate, so latency depends on those factors, plus the length of the conversation history passed in. The model is usually gpt-4o or gpt-4.1 or the -mini of those, so it's the standard latency for those.
A more recent version of that feature also uses the LLM to dynamically decide which of several indices to query, and executes the searches in parallel.
That query generation approach does not extract structured data. I do maintain another RAG template for PostgreSQL that uses function calling to turn the query into a structured query, such that I can construct SQL filters dynamically.
Docs here:
https://github.com/Azure-Samples/rag-postgres-openai-python/...
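The gist of that approach looks roughly like this (a sketch, not the actual template code; the tool name and schema are made up for illustration):

    import json
    import openai

    client = openai.OpenAI()

    # Hypothetical tool schema: the model fills in a query plus optional filters.
    tools = [{
        "type": "function",
        "function": {
            "name": "search_items",
            "description": "Search the catalog, with an optional price filter",
            "parameters": {
                "type": "object",
                "properties": {
                    "search_query": {"type": "string"},
                    "max_price": {"type": "number"},
                },
                "required": ["search_query"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "climbing shoes under $50"}],
        tools=tools,
    )
    # Assuming the model chose to call the tool:
    args = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
    # e.g. {"search_query": "climbing shoes", "max_price": 50}
    sql_filter = "WHERE price <= %(max_price)s" if "max_price" in args else ""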
Got it. I think this might make sense for a "conversation" type of search, but not for an instant search feature, because even the lowest latency is gonna be too high IMO.
Fair point on latency; we (Azure AI Search) target both scenarios with different features. For instant search you can just do the usual hybrid + rerank combo, or if you want query rewriting to improve user queries, you can enable QR at a moderate latency hit. We evaluated this approach at length here: https://techcommunity.microsoft.com/blog/azure-ai-foundry-bl...
Of course, agentic retrieval is just better quality-wise for a broader set of scenarios - the usual quality-latency trade-off.
We don't do SPLADE today. We've explored it and may get back to it at some point, but we ended up investing more in reranking to boost precision, since we've found we have fewer challenges on the recall side.
I know :( But I think vector DBs and vector search got so hyped that people thought you could switch entirely over to them. Lots of APIs and frameworks also used "vector store" as the shorthand for "retrieval data source", which didn't help.
The AI Search team's been working with the SharePoint team to offer more options, so that devs can get the best of both worlds. Might have some stuff ready for Ignite (mid-November).
No, we have a Microsoft Graph connector which inserts externalItems into the Graph; Copilot is able to surface these, probably via the same semantic search database.
The capability was there for years, but it was expensive - something like $0.60 per 1,000 items indexed. Then sometime after Copilot was added it became free for up to 50 million items, and now it's free for unlimited items - you just can't beat that for price... https://techcommunity.microsoft.com/blog/microsoft365copilot...
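For anyone curious, pushing an item into a connection looks roughly like this against the Graph connectors REST API (connection ID, property schema, and auth token are placeholders):

    import requests

    GRAPH = "https://graph.microsoft.com/v1.0"
    connection_id = "my-connection"           # hypothetical connection
    token = "<access token acquired via MSAL>"

    item = {
        "acl": [{"type": "group", "value": "<aad-group-id>", "accessType": "grant"}],
        # properties must match the schema registered for the connection
        "properties": {"title": "Quarterly report", "url": "https://example.com/q3"},
        "content": {"type": "text", "value": "Full text of the document..."},
    }
    resp = requests.put(
        f"{GRAPH}/external/connections/{connection_id}/items/doc-123",
        headers={"Authorization": f"Bearer {token}"},
        json=item,
    )
    resp.raise_for_status()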
I believe that Azure AI Search currently uses Lucene for BM25, hnswlib for vector search, and the Bing re-ranking model for semantic ranking. (So, no, it does not, though the features are similar.)
Love this! Relatedly, does anyone have a suggestion for an outdoor solar-powered web camera that I could point at the critters in my garden? I'd love to stream a MonarchCam or MantisCam some day.
Ooo bobcats! I live in the Bay Area near Tilden Park, and I spent a while on iNaturalist trying to figure out where the bobcats hang out, as my 6-year-old is very interested in wild cats. I realized sadly that bobcats are usually out in the morning/evening, when we are not in the parks. Still used the bobcat stalking as an excuse to take a walk in Tilden today though.
What's your approach to finding the bobcat locations for your shot?
I'm going up to Point Reyes with a guide and a tracker, so that I'll at least have a pretty good chance of seeing the cats. Getting good shots is on me though!
"Look in your community. Find users of your product or users of your competitor’s product. "
I'm a current DevRel-er myself, and someone recently reached out looking to fill a DevRel role. I told them that I wouldn't actually be a good fit for their product (a CLI tool, and I'm not as die-hard of a CLI user as other devs), and suggested they look within their current user community. That's not always possible, especially for new products, but if a tool is sufficiently used, it's really nice to bring in someone who's genuinely used and loved the product before starting the role.
My hiring history:
* Google Maps DevRel, 2006-2011: I first used Google Maps in my "summer of mashups", just making all kinds of maps, and even used it in a college research project. By the time I started the role, I knew the API quite well. Still had lots to learn in the GIS space, as I was coming from web dev, but at least I had a lot of project-based knowledge to build on.
* Microsoft, 2023-present: My experience was with VS Code and GitHub, two products that I used extensively for software dev. Admittedly, I'd never used Azure (only Google App Engine and AWS), so I had to train up on that rapidly. Fortunately, my experience with the other clouds has helped me with the MS cloud.
I was on the Wave team! Our servers didn't have enough capacity; we launched too soon. I was managing the developer-facing server for API testing, and I had to slowly let developers in to avoid overwhelming it.
Neat, thanks for sharing this tidbit of history. Hey, what did the team think of the decision to build it on GWT at the time? (From the outside, it seemed like an enabling approach, but a bit like building an engine and airframe all at once.)
Hm, I didn't work on the frontend, but I don't particularly remember griping. GWT had been around for ~5 years at that point, so it wasn't super new: https://en.wikipedia.org/wiki/Google_Web_Toolkit
I always personally found it a bit odd, as I preferred straight JS myself, but large companies have to pick some sort of framework for websites, and Google already used Java a fair bit.
It was fun! Now we still see Wave-iness in other products: Google Docs uses the Operational Transform (OT) algorithm for collab editing (or at least it did, last I knew), and non-Google products like Notion, Quip, Slack, and Loop from Microsoft all have some overlap.
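A toy sketch of the OT idea (nothing like the production implementations): two concurrent inserts get transformed against each other so both replicas converge.

    def transform_insert(op_a, op_b):
        """Adjust op_a's position to account for a concurrent insert op_b."""
        pos_a, text_a = op_a
        pos_b, text_b = op_b
        if pos_b <= pos_a:
            return (pos_a + len(text_b), text_a)
        return op_a

    def apply_insert(doc, op):
        pos, text = op
        return doc[:pos] + text + doc[pos:]

    base = "hello world"
    op_a = (5, ",")     # user A inserts "," after "hello"
    op_b = (11, "!")    # user B appends "!" concurrently

    # Each replica applies its local op first, then the transformed remote op.
    replica_a = apply_insert(apply_insert(base, op_a), transform_insert(op_b, op_a))
    replica_b = apply_insert(apply_insert(base, op_b), transform_insert(op_a, op_b))
    assert replica_a == replica_b == "hello, world!"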
We struggled with having too many audiences for Wave - were we targeting consumer or enterprise? email or docs replacement? Too much at once.
How do you determine if the tools access private data? Is it based solely on their tool description (which can be faked) or by trying them in a sandboxed environment or by analyzing the code?
It is based on what the MCP server reports to us. As with most current LLM clients we assume that the user has checked the MCP servers they're using for authenticity.
Both humans and coding agents have their strengths and weaknesses, but I've been appreciating help from coding agents, especially with languages or frameworks where I have less expertise, and the agent has more "knowledge", either in its weights or in its ability to more quickly ingest documentation.
One weakness of coding agents is that sometimes all they see is the code, and not the outputs. That's why I've been working on agent instructions/tools/MCP servers that empower them with all the same access that I have.
For example, this is a custom chat mode for GitHub Copilot in VS Code:
https://raw.githubusercontent.com/Azure-Samples/azure-search...
I give it access to run code, run tests and see the output, run the local server and see the output, and use the Playwright MCP tools on that local server. That gives the agent almost every ability that I have - the only tool that it lacks is the breakpoint debugger, as that is not yet exposed to Copilot. I'm hoping it will be in the future, as it would be very interesting to see how an agent would step through and inspect variables.
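As a tiny illustration of the MCP-server side of that (not my actual setup, just the pattern, using the official Python SDK): expose a tool whose output the agent can read.

    import subprocess
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("local-dev")

    @mcp.tool()
    def run_tests() -> str:
        """Run the test suite and return its output so the agent can see it."""
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        return result.stdout + result.stderr

    if __name__ == "__main__":
        mcp.run()   # stdio transport by default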
I've had a lot more success when I actively customize the agent's environment, and then I can collaborate more easily with it.
I have a repository here with similar examples across all those frameworks: https://github.com/Azure-Samples/python-ai-agent-frameworks-...
I started comparing their features in more detail in a gist, but it's WIP: https://gist.github.com/pamelafox/c6318cb5d367731ce7ec01340e...
I can flesh that out if it's helpful. I find it fascinating to see where agent frameworks converge and diverge. Generally, the frameworks are converging, which is great for developers, since we can learn a concept in one framework and apply it to another, but there are definitely differences as you get into the edge cases and production-level sophistication.