Hacker News | nico's comments

Does this need depth data capture as well? The phrase “casual captures” makes it seem like it only needs images, but apparently they are using depth data as well.

Also, can it run on Apple silicon?


Nope, only needs depth for ground truth.

It's designed to run on top of a SLAM system that outputs a sparse point cloud.

At the top right of page 4 you can see how the point cloud is then fed into the object generator: https://cdn.jsdelivr.net/gh/facebookresearch/ShapeR@main/res...


Judging from the parameters in the docs, I think it does use depth data: python infer_shape.py --input_pkl <sample.pkl> (the input is possibly achievable using software like MapAnything). I believe it is CUDA only.

Yeah they confirm that at the bottom of the linked page

> Furthermore, by leveraging tools like MapAnything to generate metric points, ShapeR can even produce metric 3D shapes from monocular images without retraining.


Nice. Would be great to have this on Slack or WhatsApp

No affiliation, and I haven't gotten around to using it yet, but clawdbot[1] does WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Microsoft Teams, and WebChat.

[1]: https://github.com/clawdbot/clawdbot


A much simpler setup can achieve "I want [coding agent messages] on slack"

I wrote an MCP server that lets Claude Code talk to you (and receive messages back) over Slack, with the caveat that it can't initiate a session. It was pretty easy to build with Claude, but it's probably not the best approach.

Yeah, I agree that's something. Maybe in the future!

Very insightful, thank you

How do you implement the permission scoping to the step? Do you have any shareable code or examples?


Are you doing that as your own personal tooling? Are you open sourcing it? Would be happy to take a look and maybe contribute as well

I am doing this already for an internal tool that accesses a BigQuery data warehouse. Long term this will be a feature my company sells.

I will not open source it since it is a paid feature, but I use sqlglot for most of the query parsing, validating and rewriting.


> Don't

Do you mean "Don't give it more autonomy", or "Don't use it to access servers/dbs" ?

I definitely want to be cautious, but I don't think I can go back to doing everything manually either


Why aren't you using the tools we already have: ansible, salt, chef, puppet, bcfg2, cfengine... every one of which was designed to do systems administration at scale.

"Why would you use a new tool when other tools already exist?".

Agents are here. Maybe a fad, maybe a mainstay. Doesn't hurt to play around with them and understand where you can (and can't) use them


Play and production need to be separate domains. Otherwise, you don't have production, you only have play.

Okay...? Agreed. I still don't think the answer to "How are you guys giving LLMs access to your DBs?" is "Don't".

Nowhere did OP or any of the comments in the chain specify they were testing Claude in production.


You have to choose between laziness or having systems that the LLM can't screw up. You can't have both.

You can have it write code that you review (with whatever level of caution you wish) and then run that on real data/infrastructure.

You get less leverage that way, but it's still better than letting the AI use your keys and act with full autonomy on stuff of consequence.


I mean, both, but in this case I'm saying "don't use it to access any kind of production resource", with a side order of "don't rely on simple sandboxing (e.g. command patterns) to prevent things like database deletions".

You are right, and that's great for queries

How do you provide db access? For example, to access an RDS db, you have to connect from within the AWS/EC2 environment, which means either providing the agent ssh access to a server, from which it can run psql, or creating a tunnel

Additionally, with multiple apps/dbs, that means having to do the setup multiple times. It would be nice to be able to only configure the agent instead of all the apps/dbs/servers


Can't you set up an SSH tunnel yourself, locally, and give the agent the forwarded port for that database?
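For illustration, a tunnel like that could be started by a small wrapper such as the sketch below. The bastion host, database endpoint, and ports are made-up placeholders, and `build_tunnel_argv` and `open_tunnel` are hypothetical helpers; only the `ssh` options (`-N`, `-L`, `-o ExitOnForwardFailure=yes`) are standard OpenSSH.

```python
import subprocess

def build_tunnel_argv(bastion, db_host, db_port, local_port):
    """Build an ssh command line that forwards a local port to the database.

    All arguments are placeholders supplied by the operator, not the agent.
    """
    # Bind to loopback only, so the forwarded port is not exposed on the network.
    forward = f"127.0.0.1:{local_port}:{db_host}:{db_port}"
    return [
        "ssh",
        "-N",                               # no remote command, forwarding only
        "-o", "ExitOnForwardFailure=yes",   # fail fast if the port cannot be bound
        "-L", forward,
        bastion,
    ]

def open_tunnel(bastion, db_host, db_port, local_port):
    """Start the tunnel in the background and return the process handle."""
    return subprocess.Popen(build_tunnel_argv(bastion, db_host, db_port, local_port))
```

The agent would then only be told to connect to `127.0.0.1` on the local port, ideally with a read-only database user, and never sees the SSH credentials.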

"aws iam service accounts"

This is probably the safest thing to do, and also the most time-consuming

It would be nice to just be able to solve it through instructions to the agent, instead of having to apply all the other things for each application/server/database that I'd like to give it access to


The restrictions have to be enforced by deterministic, non-LLM control logic (in the OS, the database, other software, or the agent's control plane). Verbal instructions are not enough: you cannot expect the LLM to never generate certain sequences of tokens.

What I can imagine is instructing an agent to help you set up the restrictions for the various systems, to reduce the toil. But you should still review what the agent is going to do and make sure it does nothing stupid (like using regexes to filter out restricted commands).
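As a small, self-contained illustration of enforcement inside the database layer rather than in the prompt, here is a sketch using SQLite's authorizer hook from the Python standard library. SQLite is only a stand-in chosen so the example runs anywhere (the thread is about RDS, MySQL, and BigQuery), and the table and the `demo` function are invented for the example.

```python
import sqlite3

# Only plain reads are permitted; every other action code is denied by the engine.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite while it compiles each statement."""
    if action in ALLOWED_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY

def demo():
    conn = sqlite3.connect(":memory:")
    # Setup happens before the restriction is installed.
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('ada')")
    conn.commit()

    # From here on, this connection can only read.
    conn.set_authorizer(read_only_authorizer)
    rows = conn.execute("SELECT name FROM users").fetchall()
    try:
        conn.execute("DELETE FROM users")
        blocked = False
    except sqlite3.DatabaseError:
        blocked = True
    return rows, blocked
```

The point of the design is that the check runs on the statement the engine actually compiled, so it does not matter how the SQL text was phrased or obfuscated.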


That would be nice. If only the agent had the ability to limit itself to your instructions.

Shouldn't you already be using low privilege accounts for stuff like gathering information about prod?

Overprivileged accounts are a huge anti-pattern for humans too. People make mistakes. Insider threats happen. Part of ops is making it so users don't have privileges to do damage without appropriate authorization.


Yeah but this is like exposing `sudo eval $input` as a web service and asking the clients to please, please, not do anything bad.

You can create scripts, or use tools like Nix, Terraform, or Ansible, to automate the provisioning of restricted read-only accounts for your servers and DBs.
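A script of that kind might look like the sketch below, which only generates the statements for a read-only PostgreSQL role so that a human can review them before running them. PostgreSQL is an assumption (the thread mentions psql), and the role, database, and schema names are placeholders.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def read_only_role_sql(role: str, database: str, schema: str = "public") -> list[str]:
    """Return PostgreSQL statements that create a SELECT-only login role.

    The password (or IAM mapping) is deliberately left out; set it separately.
    """
    r, d, s = quote_ident(role), quote_ident(database), quote_ident(schema)
    return [
        f"CREATE ROLE {r} LOGIN;",
        f"GRANT CONNECT ON DATABASE {d} TO {r};",
        f"GRANT USAGE ON SCHEMA {s} TO {r};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {r};",
        # Covers tables created later by the role that runs this statement.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {s} GRANT SELECT ON TABLES TO {r};",
    ]
```

Running the same function for each database keeps the per-application setup to one reviewed loop rather than repeated manual work.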


That's equivalent to client-side security.

Do you use Claude Code? Do you say "Yes, and don't ask again" for all the commands, since you don't mind breaking things inside the container?

> claude --dangerously-skip-permissions

But do not run this on prod servers! You cannot prompt your way into the agent not doing something stupid from time to time.

Also, blacklisting commands doesn't work (agents will try different approaches until something works).
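To make the blacklisting point concrete, here is a deliberately naive filter of the kind being warned against, plus a few destructive commands that pass straight through it. The pattern and the example commands are invented for illustration; this is a demonstration of the failure mode, not a recommendation.

```python
import re

# Hypothetical blocklist: reject commands that mention well-known destructive tools.
BLOCKLIST = re.compile(r"\b(rm|rmdir|dd|mkfs)\b")

def naive_filter_allows(command: str) -> bool:
    """Return True if the naive blocklist would let the command run."""
    return BLOCKLIST.search(command) is None

# Each of these destroys data, yet none of them matches the blocklist.
DESTRUCTIVE_BUT_ALLOWED = [
    "find /data -type f -delete",
    "python3 -c \"import shutil; shutil.rmtree('/data')\"",
    "truncate -s 0 /data/db.sqlite",
    "r''m -rf /data",  # the shell strips the empty quotes and runs rm
]
```

The obvious case is caught (`naive_filter_allows("rm -rf /data")` is false), but every entry in the list is allowed, which is why the restriction has to live in permissions rather than in string matching.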


Great pointers, thank you

How would you go about allowing something like `ssh user@server "ls somefolder/"` but disallowing `ssh user@server "rm"`?

Similarly, allow `ssh user@server "mysql \"SELECT...\""`, but block `ssh user@server "mysql \"[UPDATE|DELETE|DROP|TRUNCATE|INSERT]...\""` ?

Ideally in a way that it can provide more autonomy for the agent, so that I need to review fewer commands


Sounds like this might help: https://www.gnu.org/software/bash/manual/html_node/The-Restr...

I'm not familiar with rbash, but it seems like it can do (at least some of) what you want.


I don't know; I've never done something like that. If no one else answers, you can always ask Claude itself (or another chatbot). This kind of thing seems tricky to get right, so be careful!

Yup, definitely tricky. Unfortunately Claude sucks at answering questions about itself; I've usually had better luck with ChatGPT. Will see how it goes

If you control the SSH server, it can be configured to allow only what you want. It's certainly tedious, but I would consider it worthwhile given that agents are, well, agentic.
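One way to do that with OpenSSH is a forced command: the agent's key is pinned to a gate script in `authorized_keys`, and sshd passes whatever the client asked for in the `SSH_ORIGINAL_COMMAND` environment variable. The sketch below is one possible gate, not a hardened implementation. The script path, the allowlist, and the argument rule for `ls` are assumptions made up for the example.

```python
import shlex
import subprocess
import sys

# Hypothetical authorized_keys entry on the server (key material elided):
#   command="/usr/local/bin/ssh_gate.py",restrict ssh-ed25519 AAAA... agent

def _ls_args_ok(args):
    """Allow only relative paths that do not climb out of the home directory."""
    return all(not a.startswith("/") and ".." not in a for a in args)

# Allowlist: program name -> predicate over its arguments.
ALLOWED = {"ls": _ls_args_ok}

def parse_allowed(original_command: str):
    """Return argv if the requested command is allowlisted, else None."""
    try:
        argv = shlex.split(original_command)
    except ValueError:  # e.g. unbalanced quotes
        return None
    if not argv or argv[0] not in ALLOWED:
        return None
    if not ALLOWED[argv[0]](argv[1:]):
        return None
    return argv

def run_gate(environ) -> int:
    """Run the requested command without a shell, or refuse it."""
    argv = parse_allowed(environ.get("SSH_ORIGINAL_COMMAND", ""))
    if argv is None:
        print("command not allowed", file=sys.stderr)
        return 1
    # No shell is involved, so ';', '&&' and '|' are just literal arguments.
    return subprocess.run(argv).returncode
```

When installed as the forced command, the script would end with `raise SystemExit(run_gate(os.environ))` (after `import os`). With this in place `ssh user@server "ls somefolder/"` runs, while `ssh user@server "rm ..."` is refused no matter how the agent phrases it. For the MySQL case, a database user that only has SELECT privileges is a more reliable control than parsing the query text.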

Would love something like this for Fusion 360: being able to just prompt the UI to create or edit objects. It would be cool if you could use the mouse to click/select context objects for the prompt to execute with, the way coding agents let you add context using @filepath.
