> One of the main benefits of Semgrep is its unified DSL that works across all supported languages.
> People can disagree, but I'm not sure that tree-sitter S-expressions are an upgrade over a DSL.
100% agree: a DSL is a better user experience for sure. But we made a deliberate choice not to invent a new DSL and to use tree-sitter natively instead. We've directly addressed this and agree that the S-expressions are gnarly, but we're optimizing for a scenario where you wouldn't need to write them by hand anyway.
It's a trade-off. We don't want to spend time inventing a DSL and porting every language's idiosyncrasies to it. We'd rather improve our runtime and add support for things that other tools either don't support or support only on a paid tier (like cross-file analysis, which you can do on Globstar today).
> I'd love to hear how this project differs from Bearer, which is also written in Go and based on tree-sitter? https://github.com/Bearer/bearer
The primary difference is that we're optimizing for users to write their custom rules easily. We do plan to ship built-in checkers [1] so we cover at least OWASP Top 10 across all major programming languages. We're also truly open-source using the MIT license.
> Regardless, considering there is a large existing open-source collection of Semgrep rules, is there a way they can be adapted or transpiled to tree-sitter S-expressions so that they may be reused with Globstar?
I'm pretty sure there should be a way to make that work. We believe writing checkers (and having a long list of built-in checkers) will be a commodity in a world where AI can generate S-expressions (or tree-sitter node queries in Go) for any language with very high accuracy, which is where we have an advantage over tools that use a custom DSL. To that end, we're focused on improving the runtime itself so we can support complex use cases from our YAML and Go interfaces. If the community can help us port rules from other sources to our built-in checkers, we'd love that!
Thanks! We still have a long way to go and a pretty extensive roadmap.
> Is there a general way to apply/remove/act on taint in Go checkers? I may not be digging deeply enough but it seems like the example just uses some `unsafeVars` map that is made with a magic `isUserInputSource` method. It's hard for me to immediately tell what the capabilities there are, I bet I'm missing a bit.
Assuming you're looking at the guide [1], the `isUserInputSource` is just a partial example and not a magic method (we probably should have used a better example there).
The AST for each node, along with the context, is exposed in the `analysis.Pass` object [2]. We don't have an example for taint analysis, but here's an example [3] of state tracking that can be used to achieve this. This is a little tedious at the moment and you'll have to do the heavy lifting in the Go code, but improving this is on our roadmap. We want to expose a lot more helpers to make things like taint analysis easy.
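To make the state-tracking idea concrete, here is a minimal sketch. It is not Globstar's API: it uses Go's standard `go/ast` package on Go source as a stand-in for the tree-sitter nodes you'd get from `analysis.Pass`, and the names `findTaintedSinkCalls`, `readInput`, and `exec` are made up for illustration. It keeps a map of tainted variable names while walking the tree and reports sink calls that receive one. It ignores scopes and branches, so treat it as the shape of the approach and not a real taint analysis.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a hypothetical snippet: readInput is the source, exec is the sink.
const sample = `package p

func handler() {
	q := readInput()
	safe := "constant"
	alias := q
	exec(safe)
	exec(alias)
	q = "reset"
	exec(q)
}
`

// findTaintedSinkCalls returns the line numbers of calls to sinkFn that receive
// a variable whose most recent assignment came from sourceFn (directly or via
// another tainted variable). Scope- and branch-insensitive by design.
func findTaintedSinkCalls(src, sourceFn, sinkFn string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}

	tainted := map[string]bool{} // state carried across visited nodes
	var lines []int

	// isCallTo reports whether e is a plain call like name(...).
	isCallTo := func(e ast.Expr, name string) bool {
		call, ok := e.(*ast.CallExpr)
		if !ok {
			return false
		}
		id, ok := call.Fun.(*ast.Ident)
		return ok && id.Name == name
	}
	// isTainted reports whether e is a source call or a tainted variable.
	isTainted := func(e ast.Expr) bool {
		if isCallTo(e, sourceFn) {
			return true
		}
		id, ok := e.(*ast.Ident)
		return ok && tainted[id.Name]
	}

	ast.Inspect(file, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.AssignStmt:
			if len(node.Lhs) != len(node.Rhs) {
				return true // skip multi-value assignments in this sketch
			}
			for i, rhs := range node.Rhs {
				if lhs, ok := node.Lhs[i].(*ast.Ident); ok {
					// Reassignment from a clean value removes the taint.
					tainted[lhs.Name] = isTainted(rhs)
				}
			}
		case *ast.CallExpr:
			if !isCallTo(node, sinkFn) {
				return true
			}
			for _, arg := range node.Args {
				if id, ok := arg.(*ast.Ident); ok && tainted[id.Name] {
					lines = append(lines, fset.Position(node.Pos()).Line)
				}
			}
		}
		return true
	})
	return lines, nil
}

func main() {
	lines, err := findTaintedSinkCalls(sample, "readInput", "exec")
	if err != nil {
		panic(err)
	}
	fmt.Println("tainted sink calls on lines:", lines)
}
```

On the sample above, only `exec(alias)` is reported: `safe` never touches the source, and `q` is overwritten with a constant before it reaches the sink.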
Here's another idea [4] we're exploring to make the YAML interface more powerful: adding support for utilities (like entropy calculation) that you can call and perform a comparison.
Not at the moment, but we'll put something up soon.
We're focused on keeping Globstar lightweight, so a hosted runtime is not on the roadmap (although we'll add support for running Globstar checkers natively on our commercial product, DeepSource). You should be able to write any checker in Globstar that you can write in the other tools you've listed.
Our goal is to make it very easy to write these checkers — so we'd be optimizing the runtime and our Go API for that.
Supporting C / C++ is on our roadmap. It needs some additional work to handle preprocessor directives [1] [2], which is why we didn't focus on it for the initial release.
Congrats on the launch! I tried this on a migration project I'm working on (which involves a lot of rote refactoring) and it worked very well. I think you've nailed the ergonomics for terminal-based operations on the codebase.
I've been using Zed editor as my primary workhorse, and I can see codebuff as a helper CLI when I need to work. I'm not sure if a CLI-only interface outside my editor is the right UX for me to generate/edit code — but this is perfect for refactors.
Amazing, glad it worked well for you! I main VSCode but tried Zed in my demo video and loved the smoothness of it.
Totally understand where you're coming from, I personally use it in a terminal tab (amongst many) in any IDE I'm using. But I've been surprised to see how different many developers' workflows are from one another. Some people use it in a dedicated terminal window, others have a vim-based setup, etc.
This is impressive! We use Metabase and I've been wanting this exact user experience for quite some time. So far, I've been dumping our Postgres schema into a Claude project and asking it to generate queries. This works surprisingly well, save for the tedious copy-paste between the two tabs. The Chrome extension workflow makes perfect sense.
Is there a way to select which model is being used? Anecdotally, I've found that Claude 3.5 Sonnet works incredibly well with even the most complex queries in one shot, which is not something I've seen with GPT-4o.
Haha, yes! We were doing the exact same thing. Also, there is so much context you can't capture with just the table schema that you can capture if you integrate the extension deep into the tool. It also unlocks cross-app contexts (we're working on a way to import context from a doc to a Metabase query, or from a sheet/dashboard to a Jupyter notebook, etc.).
> Is there a way to select which model is being used?
Not at the moment, but this is in our pipeline! We will enable this (and the ability to edit the prompts, etc.) very soon.
There's a lot of value in making information buried in videos accessible in an organization. What tools like Notion did to documentation, I think there's a big opportunity to do that for videos. Congratulations on the launch!
Thank you for your insightful comment! We agree that there's a big opportunity to make information buried in videos accessible within organizations. With Spiti, we aim to revolutionize how teams unlock and utilize the knowledge lost in videos.
Thank you for your support. We're here to listen and assist with any questions or feedback!