No, that logic doesn't follow. If your application is so hopelessly vulnerable as to benefit from such naive filtering of the text "/etc/hosts", then your application is still going to be vulnerable in precisely the same ways, with just slightly modified inputs.
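To make that concrete, here's a toy version of such a filter next to an input that defeats it (a sketch for illustration; no real WAF works exactly like this):

```haskell
import Data.List (isInfixOf)

-- Toy "WAF": block any request whose body contains the literal string.
naiveBlock :: String -> Bool
naiveBlock body = "/etc/hosts" `isInfixOf` body

main :: IO ()
main = do
  print (naiveBlock "cat /etc/hosts")    -- True: caught
  print (naiveBlock "cat /etc/./hosts")  -- False: same file on any POSIX
                                         -- system, walks straight past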
It is net zero for security and net negative for user experience, so having it is worse than not having it.
The way I assume it works in practice on a real team is that after some time, most of your team will have no idea how the WAF works and what it protects against, where and how it is configured… but they know it exists, so they will no longer pay attention to security because “we have a tool for that”, especially when they should have finished that feature a week ago…
Are LLM "jailbreaks" still even news, at this point? There have always been very straightforward ways to convince an LLM to tell you things it's trained not to.
That's why the mainstream bots don't rely purely on training. They usually have API-level filtering, so that even if you do jailbreak the bot, its responses will still get blocked (or flagged and rewritten) for containing certain keywords. You've experienced this if you've ever seen a response start to generate and then suddenly disappear and change to something else.
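I'd guess the output-side check is roughly this shape (the term list and replacement text are made up; vendors obviously don't publish theirs):

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical post-generation filter: scan the completed (or streaming)
-- response and replace it wholesale if it trips a keyword list.
flaggedTerms :: [String]
flaggedTerms = ["forbidden-topic-a", "forbidden-topic-b"]  -- placeholders

moderate :: String -> String
moderate response
  | any (`isInfixOf` map toLower response) flaggedTerms =
      "Sorry, I can't help with that."  -- the "suddenly disappears" effect
  | otherwise = response

main :: IO ()
main = putStrLn (moderate "Here is how to do forbidden-topic-a ...")
```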
Well, yeah. The filtering is a joke. And, in reality, it's all moot anyways - the whole concept of LLM jailbreaking is mostly just for fun and demonstration. If you actually need an uncensored model, you can just use an uncensored model (many open source ones are available). If you want an API without filtering, many companies offer APIs that perform no filtering.
It's not really security theater because there is no security threat. It's some variation of self-importance or hyperbole, claiming that information poses a "danger" to make AI seem more powerful than it is. All of these "dangers" would essentially apply to Wikipedia.
> It seems like a short term solution to this might be to filter out any prompt content that looks like a policy file
This would significantly reduce the usefulness of the LLM, since programming is one of their main use cases. "Write a program that can parse this format" is a very common prompt.
Amazon Prime Video is already this. You can subscribe to Max, Peacock, Crunchyroll, etc. from within the Prime Video app, and watch content normally exclusive to those services.
I'm not sure how this contradicts what they said. AI would likely lower the number of paid opportunities.
Additionally, art requires practice. Sure, some "lower-tier" artists may produce work that AI could replace without anyone noticing. But by removing that step, we risk fewer truly great artists ever emerging.
If you expect to live off typing letters and numbers on a keyboard (or off the labour of others, while you siphon up the lion's share of their productive surplus), you are doing it wrong.
That's the point: for almost everyone it's not a career. It's a hobby. Like some people have a career researching physics because they're extremely good at it and society has decided it makes sense to have a few. Then there's people like me who learn what they can of it in their free time, but I do something else as a career because realistically very few people have need of someone who's familiar with the Dirac equation or whatever. Among the general population I'm probably in the 99th percentile of math/physics knowledge/ability, but I don't do that for work because we don't need 1% of the population working on such things. And that's for a skill that causes most people to get anxiety; the demand mismatch is probably even greater for things that average people actually enjoy.
But it also has the potential to make the experience of creative pursuits better. E.g., have it listen to your playing of an instrument and give feedback on how to improve your technique. Or have it be an always-available multi-instrumentalist partner for a jam session: you start playing and it just rolls with it, and maybe inspires you in a way you wouldn't have thought of alone.
People are so weird about how to view ML/advanced signal processing. Don't look at things through the myopic lens of "prompt ChatGPT and it responds poorly". Look at it as an auto-complete, or a better form of on-the-fly procedural generation. Remember, e.g., Audiosurf creating levels from your music? Make it happen on the fly. Maybe you could even create an interactive game where one person plays an instrument and the other does some kind of Beat Saber or Dance Dance Revolution thing based on it analyzing and anticipating what's going to be played. The game scores you on how well the group was able to get into a groove together, or something.
It feels to me like people get upset about ML encroaching on creative endeavors because they're not sufficiently creative to see how it could augment those fields and be a tool to make these things more interesting instead. Corporations will use it for cheaper slop, but slop was already what they wanted from humans anyway. People who are actually interested in the artistic or social side will have new tools.
> The sheer amount of observability data you can collect in wide events grows incredibly fast and most of it ends up never being read.
That just means you have to be smart about retention. You don't need permanent logs of every request that hits your application. (And, even if you do for some reason, archiving logs older than X days to colder, cheaper storage still probably makes sense.)
> That just means you have to be smart about retention.
It's not a problem of retention. It's a problem caused by the sheer volume of data. Telemetry data must be stored for at least N days in order to be useful, and if you decide to track telemetry data of all types involved in "wide events" throughout this period, then you need to make room to persist it. If you're bundling efficient telemetry types like metrics with data-intensive telemetry like logs in events, then the data you need to store quickly adds up.
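Rough numbers make the point (the traffic rate, event size, and window below are all assumptions):

```haskell
-- Back-of-envelope: wide events at a modest request rate for N = 30 days.
main :: IO ()
main = do
  let eventsPerSec  = 2000 :: Double  -- assumed request rate
      bytesPerEvent = 4096            -- wide events carry hundreds of fields
      days          = 30              -- retention window N
      totalBytes    = eventsPerSec * bytesPerEvent * 86400 * days
  putStrLn $ "~" ++ show (totalBytes / 1e12) ++ " TB before compression"
  -- ~21 TB; plain metrics over the same period would be a rounding error
```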
Agreed. A new wide-event pipeline should fully utilize cheaper storage options, i.e. object storage like S3, for both cold and hot data, while maintaining performance.
HDD-based persistent disks usually have much lower I/O latency than S3 (milliseconds vs. hundreds of milliseconds). This can improve query performance a lot.
sc1 HDD-based volumes are cheaper than S3, while st1-based volumes are only 2x more expensive than S3 ( https://aws.amazon.com/ebs/pricing/ ). So there is little economic sense in using S3 over HDD-based persistent volumes.
I'm totally in favor of cold storage. Just be aware of how you are storing data, the granularity of the files, and how frequently you think you'll eventually want to access that data, because what kills you in these services is the API cost. Oh, and deleting data also triggers API costs AFAIK, so there is that too...
> supernormalize everything into the relations object(id, type) and edit(time, actor_id, object_id, key, value)
I frankly hate this sort of thing whenever I see it. Software engineers have a tendency to optimize for the wrong things.
Generic relations reduce the number of tables in the database. But who cares about the number of tables in the database? Are we paying per table? Optimize for the data model actually being understandable and consistently enforced (+ bonus points for ease of querying).
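To illustrate what gets lost, here's the same idea written both ways as types (the field and table names are invented). In the generic form every value is a string, so nothing can enforce that an "amount" is a number, or that a key isn't a typo:

```haskell
-- The "supernormalized" shape: one generic object table, one generic edit table.
data Object = Object { objectId :: Int, objectType :: String }
data Edit   = Edit   { editTime  :: Int, actorId :: Int
                     , editObjId :: Int, key :: String, value :: String }

-- Nothing objects to a typo'd key or a wrong-typed value:
oops :: Edit
oops = Edit 0 1 42 "amount_cnets" "banana"

-- The explicit shape: the schema itself documents and enforces the model.
data Invoice = Invoice { invoiceId   :: Int, customerId :: Int
                       , amountCents :: Int, paidAt :: Maybe Int }

main :: IO ()
main = putStrLn (key oops ++ " = " ++ value oops)  -- happily accepted
```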
> At this point you might wonder if Haskell has some sort of pipelining operator, and yes, it turns out that one was added in 2014! That’s pretty late considering that Haskell exists since 1990.
The tone of this (and the entire Haskell section of the article, tbh) is rather strange. Operators aren't special syntax and they aren't "added" to the language. Operators are just functions that by default use infix position. (In fact, any function can be called in infix position. And operators can be called in prefix position.)
The commit in question added & to Data.Function in base. But if you wanted & (or any other character) to represent pipelining, you have always been able to define it yourself.
Some people find this horrifying, which is a perfectly valid opinion (though in practice, when working in Haskell it isn't much of a big deal if you aren't foolish with it). But at least get the facts correct.
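For reference, the whole thing is a one-line definition of reverse application, identical in spirit to what Data.Function gained in 2014:

```haskell
-- Define the pipeline operator yourself: reverse application.
infixl 1 &
(&) :: a -> (a -> b) -> b
x & f = f x

main :: IO ()
main = print ([1..10] & filter even & map (*2) & sum)  -- 60
```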
It's a horrible practice with adverse incentives, and one of the reasons I'm glad I no longer work there.
(and easily gameable, anyways - people would just DM each other patches they were unsure of, before submitting an actual CR)