Rethinking of CGI as a selfhosted lambda server (reddec.net)
121 points by reddec on June 1, 2020 | 64 comments


nginx Unit (https://unit.nginx.org/configuration/#process-management) does something very similar: with the `spare = 0` config option it will start app processes lazily and shut them down when idle.

There is support for Ruby, Python, PHP, Java, Node, and Go runtimes.
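
For reference, a minimal sketch of Unit's JSON config for that (the app name and paths are made up; `spare = 0` plus `idle_timeout` gives the lazy start/stop behavior):

    {
      "listeners": {
        "*:8080": { "pass": "applications/myapp" }
      },
      "applications": {
        "myapp": {
          "type": "python",
          "path": "/srv/myapp",
          "module": "wsgi",
          "processes": {
            "max": 4,
            "spare": 0,
            "idle_timeout": 20
          }
        }
      }
    }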


You can also use socket activation in uWSGI for this, together with a reasonable value for die-on-idle.

What's nice about it is that applications run unmodified.

uWSGI really is the Swiss Army toolchain for web operations.
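
A rough sketch of that setup (the module name is hypothetical; `idle` sets the inactivity window, and `die-on-idle` makes the instance exit instead of merely suspending, so the activating socket - e.g. a systemd .socket unit - respawns it on the next request):

    ; myapp.ini - started via socket activation
    [uwsgi]
    module = myapp:application
    master = true
    ; exit after 60 seconds without requests; the next connection re-activates it
    idle = 60
    die-on-idle = true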


Interesting. It could be useful for high load, but for a quick setup it feels a little bit complicated. Are there any known UIs or scheduler settings?

The following is a general response to similar questions.

I'm pretty sure it's possible to do the same thing as trusted-cgi by combining several tools: cron + nginx/nginx Unit + some deployment tooling + stats + whatever.

It could be faster and more flexible, but... I am not sure that I would personally want to set it up each time for new boxes, or write auto-install scripts. I would like to write logic, not build environments.

The benefits of trusted-cgi are: low resource usage (very low), easy setup, a web UI, and a focus on bundling several features in one package.

It's possible to find better solutions for each part, but I would rather have a product that provides a complete solution instead of just "bricks". I am a big fan of the UNIX way, but sometimes we need an assembled thing instead of another piece of a construction kit.


When I published webhub @ href.com in 1995 we called these 'custom runners'. Looks like you'd have approved of that concept :)


Thanks for the tip: several ideas from there look very promising.


OP, your site is unusable on mobile. I wasn’t able to read about your project as scrolling causes it to force half the text and diagrams out of frame.


I hope I fixed it.


Fine for me, Firefox/Android.


Having first met Apache when Perl CGI scripts were bleeding-edge fancy "web app" technology, I hoped to read a satirical article, but it's actually an earnest suggestion to go full circle.

It does less than proper CGI, due to giving up HTTP (particularly response headers, status codes, and content types) in order to process constrained JSON, and it does it in a more complicated way (UI? Scheduler? Git repositories?), but it's still better than other options.


To be fair, I am not trying to say that CGI is a bleeding-edge technology :)

I ran into a problem: low-end devices that should host a large number of small pieces of logic. To solve it, I looked back into history, found CGI, and adapted it to this specific case.

As someone said earlier in the comments: it's just reinvented "good old technology", but I can't find anything bad in that. Old technology means plenty of documentation, and it means it is easy to explain.

Of course, there are a number of issues that caused the technology to become outdated, but in some cases (defined by the target audience) it is acceptable. Sometimes it is even better than newer solutions.

More than that, if you look at modern technology (serverless/lambda functions) you will probably find the same ideas, just wrapped in modern solutions (containers and cloud).


Nice idea: bring back the old Common Gateway Interface for web development.


You’ve just invented the next big thing: Serverful Architectures!


If you think about it, the old virtualhost hosting providers were the original serverless providers: you didn’t run a server, you just uploaded your code and they ran it on their infrastructure.


You mean SPAs that render on servers?


Won't happen for anything serious, because a process per request doesn't scale. FastCGI is probably as close as anything modern will ever get.


I think the word you're after is latency, because scalability isn't the issue. CGI-hosted web apps tend to use fronting HTTP caches plus memcache-like distributed caches or RDBMSs for app data once a request has actually resulted in a process being spawned.

Historically, what's slow about pure CGI/process-per-request architectures is dynamic language runtimes that need to parse the entire CGI implementation code on each request (like old mod_php/mod_perl). That source of slowness can be eliminated entirely by using natively compiled CGIs.

IMO FastCGI, or any other architecture inviting huge long-running userland processes without GC, always ends in robustness problems, memory fragmentation, and grave security issues due to the lack of process isolation, and still has about the same overhead as process creation in process-per-request architectures. What may help is a way to supply CGI params (PATH_INFO, QUERY_STRING, etc.) not via environment variables but via e.g. sockets, such that a number of pooled CGI processes can be started ahead of time, before a request comes in.
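
A toy sketch of that last idea in Go: ship the CGI params to a pre-started worker over a Unix socket using SCGI-style netstring encoding instead of environment variables (the socket path and params here are made up):

    // sketch: CGI params over a socket instead of environment variables
    package main

    import (
        "fmt"
        "net"
        "strings"
    )

    // encodeParams builds an SCGI-style netstring: "<len>:k\x00v\x00...,"
    func encodeParams(params map[string]string) string {
        var b strings.Builder
        for k, v := range params {
            b.WriteString(k)
            b.WriteByte(0)
            b.WriteString(v)
            b.WriteByte(0)
        }
        body := b.String()
        return fmt.Sprintf("%d:%s,", len(body), body)
    }

    func main() {
        // hypothetical pooled worker listening ahead of time
        conn, err := net.Dial("unix", "/run/cgi-worker.sock")
        if err != nil {
            panic(err)
        }
        defer conn.Close()
        conn.Write([]byte(encodeParams(map[string]string{
            "PATH_INFO":    "/hello",
            "QUERY_STRING": "name=world",
        })))
    }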


Indeed, we ran a ~2m-user webmail service as a CGI written in C++ ~20 years ago. We addressed latency aggressively by statically linking and never explicitly freeing memory unless we really had to - the processes were short-lived; better to let the OS just dispose of everything at once.

The process overhead was not a big deal even on 20-year-old hardware, and it saved us from dealing with all kinds of awful isolation issues. We discussed FastCGI or the like and dismissed it because the latency savings were much smaller than one might expect, exactly for the reasons you mention: the problem was much less the process creation overhead than the overhead of dynamic runtimes.

People also seem to have forgotten what was expected back then. The time it takes to load Gmail, for example, would have been totally unacceptable. Our biggest latency limitation was not the web server/CGI but optimizing the mail storage backends, so that is where we spent our effort.


The problem with FastCGI is that writing a full web server is only very slightly more complex than writing a FastCGI server.


I would like to take the opportunity to praise PHP (for once) for having PHP-FPM built in since PHP 5.4: https://php-fpm.org/
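
FPM's `ondemand` process manager even gives you the lazy start/stop behavior discussed upthread - a sketch of a pool config (paths and limits are illustrative):

    ; pool config sketch, e.g. /etc/php/fpm/pool.d/myapp.conf
    [myapp]
    user = www-data
    group = www-data
    listen = /run/php/myapp.sock
    ; spawn workers only when requests arrive, reap them after 10s of idle
    pm = ondemand
    pm.max_children = 8
    pm.process_idle_timeout = 10s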


In my opinion, PHP edged out Perl for a similar reason, via mod_php. With mod_perl you had to write well-thought-out classes, configure the server, etc., whereas with mod_php you just uploaded the .php files and got (comparatively) blazing-fast performance that Perl CGI couldn't match.


The only reason PHP edged out Perl is that one could mix and match HTML and PHP in the same page. Which is ironic, when professional PHP today tries so hard to look like JEE.


IMHO not quite; Perl's Mason was actually very easy to use. The thing was that mod_perl was a major PITA: you had to restart the server after each change to the code; it was integrated deeply into the Apache request flow, giving you millions of ways to shoot yourself in the foot; shared state meant it was super easy to exhaust memory; etc. It was just too complicated and messy, and often unstable. On the other hand you had PHP, which was simple, had a clean state flow and fast execution, and was much more HTML/web-oriented: tons of ready-to-use functions, easy access to GET and POST variables, cookies, etc. At that moment it was a blessing.


Exactly. You could just as well use a load balancer to proxy incoming requests to your own pool of backend "functions"/processes, each linked against e.g. nghttp2. That would have the benefit that your "functions" can be executed independently, or from the command line, like CGIs.


The point is to use low-end machines for a lot of logic that 99% of the time will do nothing. For example: webhooks. In this scenario there is no need for high performance/low latency, but there is a high demand for low resource usage. There is, however, a way to scale it horizontally using shared storage, but that is out of scope.

Btw, I am the author, and I will be happy to answer any questions. Thanks for your interest!


There also was SCGI as something easier to implement than FastCGI, but allowing for persistent processes. So I'm sure one could come up with yet another CGI version that's slightly better tuned for a different need and still retains the massive name-brand recognition of "CGI".


It scales if you let userland do thread scheduling (instead of having green threads on top of OS threads)

Something like https://docs.microsoft.com/en-us/windows/win32/procthread/us...

See also https://news.ycombinator.com/item?id=6726357


I'm not sure how threading is relevant. CGI inherently requires a new process per request, regardless of any choices you may make about threading.


> because a process per request doesn't scale

Tell me more about why. The fork time isn't that many multiples more than spawning a thread. COW ensures you aren't using much more memory. You can always cap it at a pool of ~64 processes to handle requests.
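
A sketch of that cap in Go, for illustration (the handler binary path is hypothetical; a buffered channel serves as the ~64-slot semaphore):

    package main

    import (
        "net/http"
        "os/exec"
    )

    // at most 64 concurrent handler processes, as suggested above
    var sem = make(chan struct{}, 64)

    func handler(w http.ResponseWriter, r *http.Request) {
        sem <- struct{}{}        // acquire a slot; blocks while the pool is full
        defer func() { <-sem }() // release the slot when the process is done

        cmd := exec.Command("/usr/lib/cgi-bin/app") // hypothetical CGI-style binary
        cmd.Stdin = r.Body                          // pipe the request body to the child
        out, err := cmd.Output()
        if err != nil {
            http.Error(w, "handler failed", http.StatusBadGateway)
            return
        }
        w.Write(out)
    }

    func main() {
        http.HandleFunc("/", handler)
        http.ListenAndServe(":8080", nil)
    }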


It's easier than you might expect not to need to scale beyond what a 2017 OS can handle with a process per dynamic request.


Moving from CGI to FastCGI is pretty easy.


I am pretty sure I have some old scripts using python's cgi module in production somewhere. They get about seven hits a day and have never gotten more than ten, nor will they ever, so scale here is irrelevant. Sad to hear that it has been deprecated.


The cycle of technology: what is old will be new again.


Exactly! Just adapted for a new reality. And I found it quite exciting to adapt old, proven technology to a new stack.


I did something very similar for handling webhooks (but it can also be used for anything else); check it out at https://github.com/adnanh/webhook

People have used it in various settings, from actual deployments on push to home automation; someone even wrote a guide on how to control the Xiaomi Robot Vacuum using the Amazon Dash button :-)
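
For the curious, hooks are defined declaratively; a minimal hooks.json looks roughly like this (the id and script path are made up), and each hook becomes reachable over HTTP at /hooks/<id>:

    [
      {
        "id": "redeploy",
        "execute-command": "/opt/scripts/redeploy.sh",
        "command-working-directory": "/opt/scripts"
      }
    ]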


Great project! However, I think our projects are similar but not the same. Looks like we can exchange some ideas with each other.


Yeah, my guess is that they share the same concept - expose triggers via HTTP(S) to run user commands. Webhook handling put some additional requirements on my project: to go a little bit beyond piping the input to the script :-)


Right. Trusted-cgi also adds security (with various checks), "actions", a scheduler, and a web UI =)

I would suggest we stop measuring the benefits of both projects and just "steal" ideas from each other, making the end-user experience better for both projects ;-)


Losing the ability to stream the request and response is a big downgrade from normal CGI. I'm struggling to see how this improves at all on it.


Why would you lose it? As far as I understand, bodies are still streamed through stdin/stdout. Are you concerned about the headers? You should have all of those anyway before dispatching.


...also added the constraint of non-deterministic micro-timeouts. This is some hackery, which is fun, but not something I would want to run.


Isn't this just OpenFaaS?

OpenFaaS released a single-binary distribution that doesn't require Docker/k8s, called faasd:

https://github.com/openfaas/faasd


> I also tried self-hosted solutions based on k3s but it too heavy for 1GB server (yep, it is, don’t believe in marketing).

ow.


inetd was like what, eBPF hacks and sidecars?


Is this compatible with AWS Lambdas on some level? Could it be? What about FastCGI/WSGI/... compatibility? I'm thinking it could be really nice for hobby projects and for developing on a laptop.


I've been hacking on some Go code to put together a set of services to run lambda functions locally.

It's a hodgepodge of different services.

https://github.com/silasb/lambda-engine
https://github.com/silasb/lambda-scheduler
https://github.com/silasb/deno-aws-lambda-example

Since this is a POC it's probably not worth looking at or trying to use as the instructions are in bad shape.


Any script/application that can parse STDIN and write something to STDOUT is suitable for the project.

Hobby projects or projects with a low number of requests are an ideal fit. And it can run on very low-end devices: a Raspberry Pi, the cheapest VPS, etc.
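
For example, a complete "function" can be as small as this sketch - a Go program that reads a JSON body from STDIN and writes a JSON response to STDOUT (the field names are made up):

    // minimal stdin-to-stdout handler sketch
    package main

    import (
        "encoding/json"
        "os"
    )

    func main() {
        var req struct {
            Name string `json:"name"` // hypothetical request field
        }
        _ = json.NewDecoder(os.Stdin).Decode(&req)
        _ = json.NewEncoder(os.Stdout).Encode(map[string]string{
            "message": "hello, " + req.Name,
        })
    }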


OpenFaaS (openfaas.com) enables CGI-style applications to be run on k8s.


Heavy, very heavy, even with k3s. I tried it first. Even plain k3s will run slowly on a 1GB machine (the target audience).


Yeah, I was thinking about this as a migration path for scaling-out CGI scripts


We have gone full circle c:


I'm still trying to understand the difference between a .php file and a lambda "serverless" application.


Well a php file (that isn't a library, so one that produces output) is one of the things that you could write to be a lambda function. But not every lambda function can be expressed as a php file. So I could download some video filtering lambda functions and they can be written in anything, then I can link them against my application, which I write in PHP.

The language/platform/dependency freedom is what takes it from literal CGI script to lambda function.


It took a moment of thinking to decide which of "literal CGI script" or "lambda function" you were attributing "language/platform/dependency freedom" to.

I would've attributed them to CGI.


Ah, that's fair. I was mostly attempting to highlight that a php script is a subset of lambda functions (from a conceptual viewpoint). Of course it's also a subset of CGI scripts as well (in a much more literal viewpoint).


This does not sound right. Platform independence and freedom are the big win for Lambda, when it is basically the perfect vendor lock-in? With serverless you give up a lot of independence.


This is a strange observation to make about an opensource "selfhosted lambda server".


a) This site is very hard to navigate; on mobile the text jumps around whenever you touch and it's usually cut off past the edge of the screen

b) I couldn't figure out what this is. At first I thought it was computer graphics rendering (what "CGI" usually means), but after skimming several paragraphs I don't think that's what it is? But I'm still not sure.

c) Clicking the github link tries to download a .zip, which beyond making it harder to figure out what this is, was mildly distressing.

Edit: re: the downvotes, presumably the author would want to know that their site is broken on mobile and that the content is inaccessible to the average reader who might be curious to learn about it, in ways that could be easily remedied.


I think your downvotes are for your comment about CGI; while the term is ambiguous, in the context of serving the web it refers to Common Gateway Interface, the original method for serving anything other than static files.

https://en.wikipedia.org/wiki/Common_Gateway_Interface

For many HNers, this is an understood term.


I don't think it's unreasonable to ask that acronyms in general be expanded towards the top, especially when there's a commonly-known overload in the same general subject area. Personally, I've been working in the field for six years and studied in it for six years before that, and I'm pretty sure I've never heard that term. Simply writing out the full phrase "Common Gateway Interface" would a) tell me it's not what I thought, and b) give me something to google so I can learn about the subject and/or decide if I'm interested in continuing to read.


There's some advantage to abbreviations/acronyms (since titles have a length limit), but in general, I agree. I propose the first place we start is to agree that "crypto" doesn't always mean "cryptocurrency" :-)


If you don't already know what 'CGI' means in this context, I'm skeptical that "Common Gateway Interface" would actually resolve much confusion.


It would tell me that it doesn't have to do with graphics, and that it probably has to do with networking. That's a whole lot more than is made clear as it is.


It's about the Lambda product of Amazon Web Services. No relation to Church's lambda calculus or lambda constructs in programming. The article does not make that clear at all.

(One of the privileges of being a really big company is that you can name your products using common terms and take over that term. Like, "Windows". Even though Sun used that first.)


I did actually follow the "Lambda" portion of it, for what it's worth




