Good morning HN! This tool lets you play with awk, grep, sed, and jq commands right in your browser. Start from the examples and explore from there!
To get these tools running in the browser, I compiled them to WebAssembly (see https://github.com/biowasm/biowasm for details). That way, the commands you type run instantaneously, and it doesn't cost me an arm and a leg to host servers that execute arbitrary commands from users :)
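For the curious, the pattern looks roughly like this (a rough sketch, not the actual sandbox.bio/biowasm code; the module factory name, file names, and build flags are assumptions on my part):

    // Assumes jq was compiled with Emscripten along the lines of:
    //   emcc ... -o jq.js -s MODULARIZE=1 -s INVOKE_RUN=0 \
    //     -s FORCE_FILESYSTEM=1 -s EXPORTED_RUNTIME_METHODS=FS,callMain
    import createJqModule from "./jq.js"; // Emscripten-generated glue (placeholder name)

    async function runJq(filter: string, json: string): Promise<void> {
      const jq = await createJqModule({
        print: (line: string) => console.log(line), // capture stdout line by line
      });
      // Write the input into Emscripten's in-memory virtual file system...
      jq.FS.writeFile("/input.json", json);
      // ...then invoke main() just like on the command line: jq '.name' /input.json
      jq.callMain([filter, "/input.json"]);
    }

    runJq(".name", '{"name": "sandbox.bio"}'); // prints "sandbox.bio"

Everything happens inside the page: no network round-trip, and no server to protect.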
Hey HN, I recently came across a very useful tutorial about jq (https://earthly.dev/blog/jq-select/) and created an interactive tutorial based on it. It features a command-line sandbox where you can safely explore and run arbitrary jq commands (it runs jq in your browser using WebAssembly).
Check it out if you want to learn how to filter, process, and wrangle JSON data. And if you happen to be getting started in bioinformatics, check out the rest of the tutorials on sandbox.bio :)
Thanks! It's generally easiest to do for C/C++/Rust/AssemblyScript code (other languages often need to ship a garbage collector or an interpreter alongside the compiled code). But even with those languages, it's not always trivial to support browser features like file systems, threads, SIMD, etc.
It may be worth noting that, while you do get some built-in marketing/advertising by going through a publisher, you probably shouldn't overestimate how much. You're still going to be doing most of the marketing yourself, whether self-published or with a publisher.
For me: plugging the book at conferences. Trying to get people I know who read it to post reviews on Amazon. (In my experience, far more people say they will than actually do.) I was able to arrange a book signing at one event. An email list/newsletter if you have one. Social media. Website/blog. Etc.
One of the listed features is direct uploads to S3. I haven't looked, but I'd wager uploading to other S3-compatible services is either supported or easy to add!
Whether or not this is possible, it seems like a serious DDoS risk: someone with malicious intent could run you up a huge hosting bill. And it might be difficult to apply IP-based rate limiting without a server.
With S3, your server supplies the client with signed time-limited upload endpoint URLs (which is a feature S3 supports). So the server is effectively authorizing the client to upload direct to S3. You could do this only for "logged in" users, or use your own IP-based rate-limiting, or whatever else you wanted to do. Front-end direct-to-cloud doesn't necessarily mean "without a server", as you can set it up so it has to be authorized by the server, and this is generally how you'd do it.
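To make that concrete, here's a minimal sketch of the server-side half, assuming Node with AWS SDK v3 (the bucket name and key scheme are made up):

    import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
    import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

    const s3 = new S3Client({ region: "us-east-1" });

    // Runs server-side, only for authenticated users: the server stays in
    // control of who may upload, even though the bytes go straight to S3.
    async function presignUpload(userId: string, filename: string): Promise<string> {
      const command = new PutObjectCommand({
        Bucket: "my-upload-bucket",         // hypothetical bucket
        Key: `cache/${userId}/${filename}`, // temporary "cache" prefix
      });
      // URL is valid for 5 minutes; the client PUTs the file directly to it.
      return getSignedUrl(s3, command, { expiresIn: 300 });
    }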
I don't know if GCS support is built into uppy at present (contrary to another comment, I don't believe GCS could be called "S3-compatible"), but I suspect there's a way to use uppy's hooks to add it. As long as GCS supports buckets where uploads are allowed only via signed time-limited URLs, the same approach could be used.
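For reference, generating such a URL with Google's own client library looks roughly like this (@google-cloud/storage; the bucket name and expiry are placeholders):

    import { Storage } from "@google-cloud/storage";

    const storage = new Storage();

    // V4 signed URL allowing a single PUT for the next 15 minutes.
    async function presignGcsUpload(filename: string): Promise<string> {
      const [url] = await storage
        .bucket("my-upload-bucket") // hypothetical bucket
        .file(`cache/${filename}`)
        .getSignedUrl({
          version: "v4",
          action: "write",
          expires: Date.now() + 15 * 60 * 1000,
        });
      return url;
    }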
Where you put the file in cloud storage and what you do with it is, I believe, not uppy's concern. But if you are, for instance, using the Ruby shrine file-attachment library (which is built out with examples supporting uppy and direct-to-S3 as a use case), shrine strongly encourages a two-stage/two-location flow: any front-end-uploaded file first lands in a temporary 'cache' storage, which on S3 you might clean up with lifecycle rules that automatically delete anything older than X. Files are only moved to more permanent storage on some other event.
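The expiry half of that flow can be a one-time bucket configuration; a sketch with AWS SDK v3 (bucket and prefix names are made up):

    import {
      S3Client,
      PutBucketLifecycleConfigurationCommand,
    } from "@aws-sdk/client-s3";

    const s3 = new S3Client({ region: "us-east-1" });

    // Auto-delete anything under the temporary "cache/" prefix after a day;
    // files promoted to permanent storage live under another prefix.
    await s3.send(new PutBucketLifecycleConfigurationCommand({
      Bucket: "my-upload-bucket", // hypothetical bucket
      LifecycleConfiguration: {
        Rules: [{
          ID: "expire-stale-cache-uploads",
          Status: "Enabled",
          Filter: { Prefix: "cache/" },
          Expiration: { Days: 1 },
        }],
      },
    }));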
Once you get into it, it turns out all the concerns of file handling can get pretty complicated. But having the front-end upload directly to cloud storage can be a pretty great thing, depending on your back-end architecture: it keeps your actual app 'worker' processes/threads from being tied up handling file uploads, dealing with slow clients, etc. It can make proper sizing and scaling of your back-end a lot more straightforward, and its resource usage easier to bound.
There are two ways to upload directly to S3 buckets (or GCS):
1) You allow append-only access for the world, maybe combined with an expiry policy. Only useful for a few use cases, I'd say.
2) You deploy request signing, and only sign for users who are logged in or otherwise match criteria important to your app (client side sketched just below). A bit more hassle, and it still requires server-side code (whether traditionally hosted or 'serverless'), but at least your servers aren't receiving the actual uploads, removing a potential SPOF and bottleneck.
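A hedged sketch of the client side of option 2 (the /api/sign-upload route is a made-up stand-in for whatever endpoint your server exposes):

    // Ask our server for a signed URL, then PUT the file straight to the bucket.
    async function uploadDirect(file: File): Promise<void> {
      const res = await fetch(`/api/sign-upload?filename=${encodeURIComponent(file.name)}`);
      const { url } = await res.json();

      // The upload bypasses our servers entirely; S3 verifies the signature.
      const put = await fetch(url, { method: "PUT", body: file });
      if (!put.ok) throw new Error(`Upload failed: ${put.status}`);
    }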
That said, I'm not sure how serious you are about handling file uploads, but uploading directly to buckets often means uploading to a single region (on AWS, a bucket may be hosted in us-east-1, for instance, meaning high latency for folks in e.g. Australia). This may or may not be a problem for your use case, but it did bring us complaints when we had that setup.
> That said, I'm not sure how serious you are about handling file uploads, but uploading directly to buckets often means uploading to a single region (on AWS, a bucket may be hosted in us-east-1, for instance, meaning high latency for folks in e.g. Australia). This may or may not be a problem for your use case, but it did bring us complaints when we had that setup.
S3 Transfer Acceleration uses CloudFront's distributed edge locations: as data arrives at an edge location, it's routed to Amazon S3 over an optimized network path. This costs more money, though.
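If I recall the setup correctly, it's a per-bucket switch plus a client option (AWS SDK v3; the bucket name is a placeholder):

    import {
      S3Client,
      PutBucketAccelerateConfigurationCommand,
    } from "@aws-sdk/client-s3";

    const admin = new S3Client({ region: "us-east-1" });

    // One-time: enable Transfer Acceleration on the bucket.
    await admin.send(new PutBucketAccelerateConfigurationCommand({
      Bucket: "my-upload-bucket", // hypothetical bucket
      AccelerateConfiguration: { Status: "Enabled" },
    }));

    // Uploaders then target the accelerated (edge) endpoint:
    const s3 = new S3Client({ region: "us-east-1", useAccelerateEndpoint: true });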
These are valid considerations, but not tied to S3 (or uploading).
It's probably a happy problem to have if S3 ends up being the most DDoS-able part of your system.
Running up hosting bills is a scenario that can be addressed with various technical means (as a sibling comment explains). Many people seem to judge the expected damage (probability * impact) too small to be worth a lot of preemptive effort. It's basically a question of how much damage is done before your monitoring catches it. AWS has also been known to "forgive" bills that were caused by malicious attackers in some situations.
WebAssembly is an awesome technology and I'm super excited about it, but IMHO it's probably best to assume you don't need it unless you can convince yourself otherwise.
That said, here are a few examples of where it shines (warning: shameless plugs incoming):
- It's super helpful for porting games from languages like C/C++ to the web without having to rewrite them from scratch. Here's a post I wrote about porting a game like Asteroids to the web: https://medium.com/@robaboukhalil/porting-games-to-the-web-w...
- WebAssembly is really good for building playgrounds around command-line tools. Here's how I built one for the jq CLI: https://opensource.com/article/19/4/command-line-playgrounds... The benefits: it's more secure than executing the commands on a backend, and the runtime is much faster.
- I'm also excited to see how WebAssembly will impact work outside the browser (https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webas...), especially with Function-as-a-Service offerings that support Wasm, like Cloudflare and Fastly workers. The benefits there: I can write my function in any language, and the time to initialize the function could be much faster.
Another shameless plug: if you want a practical intro to WebAssembly, I self-published a book about it recently: http://levelupwasm.com/!
This is my second self-published book and I've loved this approach. It lets me focus on writing and marketing, and I get to craft every aspect of the book.
Do you know what typical minimum page-count requirements are? I have a non-technical, more program-management-focused book of about 100 pages. I was told by a publisher it's too short... wondering how you approached that when self-publishing, since your post mentions yours is about 100-150 pages. Did you get any feedback from readers that it's too short?
I have not received feedback that my book was too short. In my own experience, I've rarely ever completed a technical book--in fact, I've often wished they were shorter!