I find the site a little tongue-in-cheek, but I genuinely like the ideas behind it. I honest-to-$DEITY prefer programming in C to any sort of Javascript. C is kinda hellish for large, complex projects, which is good. The world needs fewer large and complex websites. OpenBSD also provides very good security features that mitigate the inherent security challenges of writing C. pledge(2) is just a very good idea.
I generally think websites should do their processing in the backend, whenever possible. Running code to generate the website should not be the user's problem. This also comes with the advantage of not being constrained to *Script languages. C is alright, but if you're of the defeatist camp who thinks writing C safely is impossible, you can adapt the BCHS philosophy just as well to C++, Rust, Go, D, Python, AWK, Common Lisp…
Objective-C with a reasonable Foundation is such a great language for this...
It has essentially the predictability and simplicity of C, just with a minimal amount of dynamic binding to make it comfortable. And NSString variants tend to take care of string handling and encodings.
At that level, you can just switch to the clang compiler based Swift language, which is much nicer/modern. And look, there are already Swift web frameworks like https://vapor.codes
I really dislike the Swift/Scala/Rust/Dart/Kotlin etc. style, heavy and centralised syntax. ObjC is not very beautiful, but it has an elegant way of combining composable primitives (functions and objects). The same kinda sorta goes for Perl too, which I think is a nicer alternative to C in this stack.
Er...not really. First of all, the context was "C is pretty nice for this". Objective-C adds just a tiny bit on top of C, including safe/correct string handling, and is otherwise very similar in its characteristics.
Swift is an entirely new language, more like Rust or Kotlin. It's also not based on the clang compiler, clang is a C/Objective-C/C++ front end for LLVM.
As to "nicer/modern", well...it does clean up some of the effects of having a hybrid (syntax duplication). Other than that, it is in many ways a step back towards C++ style languages, static/brittle, incredibly complex and incredibly slow to compile.
I also remember when Apple was pushing Java as an option for writing Cocoa apps. What little I did back then never felt like a first class citizen compared to ObjC.
I recently wrote a couple dozen lines of code around Windows Fibers for cooperative multitasking. This makes for much nicer code than the callback style approach from Javascript.
But anyway for a larger codebase that is about GUI interaction I figure you want more control than what these models provide. You don't want to write event handlers that "push" actions. You want to pull them and process them in a known context.
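A minimal sketch of the idea (not the parent's actual code; it assumes the standard Win32 fiber API):

    /* Cooperative multitasking with Windows fibers: the worker yields
       control explicitly instead of registering callbacks. */
    #include <windows.h>
    #include <stdio.h>

    static LPVOID main_fiber;

    static void CALLBACK worker(LPVOID arg)
    {
        (void)arg;
        puts("worker: step 1");
        SwitchToFiber(main_fiber);   /* yield; we resume exactly here later */
        puts("worker: step 2");
        SwitchToFiber(main_fiber);
    }

    int main(void)
    {
        main_fiber = ConvertThreadToFiber(NULL);
        LPVOID w = CreateFiber(0, worker, NULL);
        SwitchToFiber(w);            /* runs the worker until it yields */
        puts("main: between steps, in a known context");
        SwitchToFiber(w);            /* resumes the worker after its yield */
        DeleteFiber(w);
        return 0;
    }

The point is the control flow: the main loop pulls the worker forward at moments of its choosing, rather than having events pushed into handlers.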
String handling in C is in general much better than its reputation. You just need to write a couple lines that format to allocated buffers. And you want to error out in case of OOM. Can't check each allocation on the spot. And you need some sort of pool that can release all allocations of the same lifetime at once. (I don't know that this approach is practical for typical web applications, which are extremely string-heavy. It is indeed quite practical for games or compilers.)
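A sketch of that style, with hypothetical names: format into a freshly allocated buffer, error out on OOM in exactly one place, and tie every allocation to a pool that is released in one call at the end of the lifetime.

    #include <err.h>
    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct pool { void **items; size_t n, cap; };

    static void *pool_alloc(struct pool *p, size_t size)
    {
        if (p->n == p->cap) {
            p->cap = p->cap ? p->cap * 2 : 16;
            p->items = realloc(p->items, p->cap * sizeof *p->items);
            if (p->items == NULL)
                err(1, "realloc");             /* OOM handled once, here */
        }
        void *m = malloc(size);
        if (m == NULL)
            err(1, "malloc");
        p->items[p->n++] = m;
        return m;
    }

    /* printf into a pool-owned buffer */
    static char *pool_fmt(struct pool *p, const char *fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        int n = vsnprintf(NULL, 0, fmt, ap);   /* measure */
        va_end(ap);
        if (n < 0)
            errx(1, "vsnprintf");
        char *s = pool_alloc(p, (size_t)n + 1);
        va_start(ap, fmt);
        vsnprintf(s, (size_t)n + 1, fmt, ap);  /* format into owned buffer */
        va_end(ap);
        return s;
    }

    static void pool_release(struct pool *p)   /* one lifetime, one free */
    {
        for (size_t i = 0; i < p->n; i++)
            free(p->items[i]);
        free(p->items);
        *p = (struct pool){0};
    }

With this, per-request code never calls free(): everything formatted during the request dies together in pool_release().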
If you want simplicity as your #1 objective then I would recommend Flask and Python. If speed is your #1 objective then you can try Go.
Maintaining a C webapp is terrible.
You don't have any primitives, you need to manage your own memory, which is extremely dangerous, and the number of best practices you need to learn so you won't shoot yourself in the foot and create an exploitable/unstable application is immense.
If you want good performance, then, for non-trivial sites, you cannot afford to spawn the interpreter on each request, so you have to use some asynchronous tooling. Similarly, spawning connections is much heavier-weight than it should be, so you end up with backend connection pools.
At work, we used fabric for this, and it blew up in our face at scale, and we had to rewrite.
In order to get off the ground, you also need to use pip, which causes an inordinate number of problems vs. the BSD ports tree or .deb files. Managing dependencies and versions between the host OS and pip space still seems to be an open problem. You can use virtual environments for each script, but then patching zero days becomes infeasible.
If you want basic type safety, you need to rely on external tooling. In my experience, third-party Python libraries evolve types and APIs without warning. This shows up at runtime, so it introduces security holes.
> If you want good performance, then, for non-trivial sites, you cannot afford to spawn the interpreter on each request, so you have to use some asynchronous tooling.
Which Python framework is spawning one interpreter per request? Most frameworks rely on WSGI, where requests are served by a fixed pool of long-lived interpreter processes. This approach achieves respectable performance. See for example https://klen.github.io/py-frameworks-bench/.
> At work, we used fabric for this, and it blew up in our face at scale, and we had to rewrite.
What is fabric here? In the Python world, Fabric is an automation tool for SSH; it seems unlikely to be solving the problem of serving web requests.
> "you cannot afford to spawn the interpeter on each request"
You mention this as a case against using Python, but this is entirely unrelated to Python as a framework. In other words, had they used PHP or Node or %s, they'd still have to face the issue of optimizing the request-to-response path.
This relates much more to issues that often come with the common practice of separating your web server from your business logic (Flask, in our case).
> "so you have to use some asynchronous tooling"
Solutions such as mod_cgi and wsgi have been the standard for years at this point. That's, in many ways, a solved problem.
> "In order to get off the ground, you also need to use pip, which causes an inordinate number of problems vs. the bsd ports tree or .deb files."
Any specific issues you can expand on? Because this is a rather vague form of criticism.
Using pip is ridiculously easy. I've been using Python extensively in various production environments and rarely had to point a finger at pip. Package and dependency management in Python are solid (that isn't to say things can't improve).
> "You can use virtual environments for each script, but then patching zero days becomes infeasible."
Again, this is an entirely subjective form of criticism. I haven't had any issues with keeping my virtualenvs up to date.
> "If you want basic type safety, you need to rely on external tooling."
Again, you just seem to favor strongly typed languages and that's fine, but that's hardly a point against using Python here. Type safety is simply not a Python thing, much like weak typing is not a C thing.
> "In my experience, third party python libraries evolve types and api’s without warning. This shows up at runtime, so it introduces security holes."
Developers will sometimes break APIs. How is this a Python-specific issue? These things are literally happening everywhere.
I can recommend Snyk[0] (I am not affiliated, have used their products with clients before). They've got a fully functional free tier, and they have a ton of relevant features. And yes, there are other, competing products out there you could use.
"Developers will sometimes break APIs. How is this a Python-specific issue? These things are literally happening everywhere."
That's actually not happening literally everywhere; it's part of the development culture surrounding (not only) Python. In the C world, maintaining a stable API is the norm, not the exception.
Yes, PHP and Node.js and Ruby are all just as bad or worse than Python. That's the whole point!
> "Yes, PHP and Node.JS and Ruby all are just as bad or worse than Python. That's the whole point!"
The original comment I was replying to was taking a stab at Python in particular. All I'm saying is that in the open-source community, in that specific regard, Python doesn't stand out. Many developers care about backwards compatibility; many (dare I say, most) others simply don't.
We had a large Java repo break recently, after we updated some of our Maven deps to slightly newer versions (one of them a very large, well-known and widely used package). So those things happen in the "Java world" too.
You've mentioned many issues you've had with your Python code base. Would you kindly expand on how/why C++ or Rust solve those in your particular case?
I would recommend Crystal via the Amber Framework. Really no trade-offs there as you get ruby-like syntax with C performance, type checking, and a compiled language.
I'm still surprised when I hear from friends about C, C#, etc. codebases for web development. They are out there.
Edit: Reading through the thread, it depends on what you need. I suggest you take some time to look at the tooling provided and at what you need (if you have prior experience in the field, great). Don't let the hype carry you. Still, it's a good idea to listen, read, or chat with people who have had a similar problem to solve (with a grain of salt).
PHP makes building web pages easy. I'm not so convinced it makes building web applications easy, as they require another level of performance, security, and tooling that a web page does not. While you can obviously write a web app in PHP (heck, you could write one in CGI/Bash if you felt inclined), there are far better suited languages to get that job done. And ultimately, once you factor in the extended requirements and maintainability, what was previously seen as easy can quickly become problematic.
I really hope other systems will adopt pledge. Of all the selective privilege dropping mechanisms I know of, pledge seems to be the least complicated to use.
Mojave has a new security-aware runtime for its applications, bringing sandboxing to all applications and requiring entitlement lists and digital signing.
Microsoft has decided if the apps don't come to the store, the store goes to the apps, and are merging Win32 with UWP sandboxing concepts via Windows containers and the new MSIX package format.
Android sandboxing is still the most complicated one.
We could argue that modern static analysis tools and the compiler itself are good at catching a lot of errors, and that high level languages have bugs too. C is a dangerous beast, but with good practices it isn't that dangerous. Of course you have to be extra careful when dealing with user content, like JSON payload, but many decoders in other languages are written in C.
Rust is a good solution, but it isn't trivial to learn, and might not suit all situations. I think it's great to have a C solution like this.
I don't think any non-GC language is trivial to learn, C included. Manual memory management is hard, and either you learn it from the system (the C way) or learn it upfront from the language (the Rust way); it's still the same number of concepts you have to digest to write practical software.
It's not just that C is hard. I can wrangle together pretty much any program in C, since I used to program in C about a decade ago. However, it having been so long, my greatest fear is that I wouldn't even be aware of the footguns I had introduced in my code. C is littered with best practices which are absolutely not obvious, and which can be massive security risks unless you follow them. And to a slightly lesser extent the same is true of C++ as well (although my friends tell me newer versions of C++ are much better at this; I worked with versions prior to C++11).
That's true, but the original idea of Unix is to have many small program invocations work together, not to erect a monolithic long-running daemon that does it all in a single address space. Memory management for one-shot command line apps isn't hard at all and can often get away with static allocations. Even if you're screwing up your heap, process isolation will take care of recycling memory when your program terminates.
Eh, that's the original idea of the Unix shell, sure. But Unix has many ideas. Long running daemons have been a part of Unix systems since the beginning.
BSD 4.3 introduced inetd along with TCP/IP to mainstream Unix. Quoting from [1]:
> When a TCP packet or UDP packet arrives with a particular destination port number, inetd launches the appropriate server program to handle the connection. For services that are not expected to run with high loads, this method uses memory more efficiently, since the specific servers run only when needed. Furthermore, no network code is required in the service-specific programs, as inetd hooks the sockets directly to stdin, stdout and stderr of the spawned process.
I'm familiar with inetd, but I don't understand your point. Unix is more than a dozen years older than 4.3BSD. Claiming that long-lived daemons are somehow anti-Unix is absurd.
My point was that the idea of small, self-contained apps that do one thing, and do it well, was in no way limited to shell programming.
Init and getty are small, self-contained system daemons that are part of the O/S, rather than application servers inviting long-running, single address-space processes for business logic.
The many-small-programs model simply doesn't scale to the modern web, where just the baggage associated with spinning up processes will kill you at any reasonable load. UNIX has nice ideas, but recognise that they were founded in the shared-computer-among-a-few-dozen-users environment of the '70s, not in today's world of web servers serving thousands of requests a second.
What do you mean by modern web? I don't see how modernity implies more load, at least for fork() bottlenecked programs specifically.
If you're serving thousands of requests per second, you probably need a pretty beefy server anyway, regardless of the stack you're running. Forking makes good use of multiprocessor capabilities if nothing else.
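As a sketch, the fork-per-connection accept loop under discussion is only a few lines; each request gets its own process, so all cores get used and a crash takes down one request rather than the server:

    #include <sys/socket.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    void serve(int listen_fd)
    {
        signal(SIGCHLD, SIG_IGN);       /* auto-reap exited children */
        for (;;) {
            int fd = accept(listen_fd, NULL, NULL);
            if (fd == -1)
                continue;
            switch (fork()) {
            case 0:                     /* child: handle one request */
                close(listen_fd);
                dprintf(fd, "HTTP/1.0 200 OK\r\n\r\nhello\r\n");
                close(fd);
                _exit(0);
            default:                    /* parent (or failed fork): keep listening */
                close(fd);
            }
        }
    }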
Modern language runtimes used in the server space like Go basically multiplex green threads across processes to minimize context switches between concurrent execution paths. By leveraging the fact that I/O is much slower than processor ops and switching green threads on I/O without necessarily always switching OS process, they easily scale to hundreds of thousands of concurrent requests. I don't really know about Erlang's model but it also has green threads. There's a reason why high traffic sites use languages like these.
No, it really doesn't. Forking is expensive. Running a multithreaded (even if it's green threads rather than OS threads) http server and web application makes far better use of multi-processor capabilities.
Shell scripting is a handy tool but it is also slow so you wouldn't write hot paths in shell scripts. And for the same reasons you wouldn't write busy web servers in CGI. Both are fork() heavy.
CGI is also the way the web used to work. Spawning processes has only gotten faster since then. The entire "process spawning doesn't scale to the modern web" argument is completely and totally bogus. Today, spawning a process in Linux is only 10-20 microseconds slower than creating a thread: http://www.bitsnbites.eu/benchmarking-os-primitives/
What you are missing here is the fact that I/O is orders of magnitude slower than the CPU. Most of the time in servers is spent waiting for I/O, and multiplexing green threads onto processes without context switches while some execution paths are waiting on I/O gives you far higher capacity on the same hardware. See the other comment thread on the GP comment.
Serverless is primarily a decentralisation/small-scale/unpredictable-workloads play, where you trade off the fixed costs of managing a dedicated instance for a higher variable per-call cost. At any reasonable scale and predictability, running a dedicated server is cheaper.
The folks who started small on serverless but see traffic growing, and are in the middle space before jumping to dedicated servers, have made an art form of keeping their functions "hot", since both Lambda and App Engine keep a function loaded for some time once it has been spun up.
> Rust is a good solution, but it isn't trivial to learn, and might not suit all situations. I think it's great to have a C solution like this.
Isn't the learning curve of C pretty similar to, or even higher than, Rust's if you consider all the best practices you'll have to learn, understand, and use? Imho in that regard Rust pays off very well (and fast), because its compiler _enforces_ most of these "best practices" for everyone.
I think the parent is being a little tongue-in-cheek, but it’s a valid point. C is a dangerously inappropriate language to use... except when it isn’t.
C still has some unique portability characteristics that are very hard to match in other languages. If your requirements lead you down this path, it can be done.
You rightly point out however that this isn’t a path one travels lightly. Maybe the flip side of all this is, “if you’re prepared to invest the kind of rigor into testing that SQLite has, then rock on.”
Yeah, if you run into a security vulnerability caused by SQLite, tens of thousands of websites, desktop apps, and virtually every iOS and Android app will be in the exact same boat.
Rust is a good solution, but it isn't trivial to learn
Maybe another opportunity to mention Zig again as a possible good alternative in the learning curve front? (I learned about it when it appeared recently in the HN front page)
C is a dangerous beast, but with good practices it isn't that dangerous. Of course you have to be extra careful when dealing with user content, like JSON payload, but many decoders in other languages are written in C.
It isn't hard to build tools like Valgrind and AFL into your C workflow. I would take that over simply taking for granted that a higher-level language was secure just because it is interpreted. After all it's probably C under the hood anyway.
This is a good point: you can run your CI tests under Valgrind and with various sanitizers and -Wall -Wextra. For a small codebase, given a few cores, one can compile and run the whole test suite in less time than it takes a Django app to start.
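For example (hypothetical file names; note that Valgrind and AddressSanitizer don't mix, hence the two builds):

    cc -Wall -Wextra -g -fsanitize=address,undefined -o app_san app.c tests.c && ./app_san
    cc -Wall -Wextra -g -o app app.c tests.c && valgrind --error-exitcode=1 ./app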
Yes but they have well defined and well tested places where that happens.
For example, you could implement a string class in C++ using strlen, strdup, etc. if you were feeling insane, and it would probably be fine, since you only use strdup once and then all the users of your class don't have to worry about getting it wrong.
If you write in C you have to use strdup every time you copy a string and there's no way you get it right 1000 times.
- forget to #include string.h, so strdup is implicitly declared, so its return type is int, which implicitly converts to char*, so everything still compiles -> possible UB
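This is exactly why even plain C codebases tend to grow one well-tested wrapper so the dance happens in a single place; a sketch:

    #include <err.h>
    #include <stdlib.h>
    #include <string.h>     /* the include the bullet above is about */

    /* Hypothetical xstrdup: the one place strdup is called, so callers
       can't forget the OOM check or the include. */
    char *
    xstrdup(const char *s)
    {
        char *p = strdup(s);
        if (p == NULL)
            err(1, "strdup");
        return p;
    }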
Why did they also wrap them in new syntax and abstraction?
Why not just build the ecosystem to do that and still have it all be “C”?
Why not some fixes to the standard and update then old but useful code, and work on better compilers?
This is what’s been confusing me for a while
If you can write safe enough C for the core of an interpreted language, why not abstract that into tools and patterns that generate better, safer C, and learn how to do that over the last 30 years?
Instead of JS and dozens of flavors, Python, ruby, lua...
DRY right?
My suspicion is “vanity projects generate a sense of novelty that’s easier to sell.”
But if the OS and bulk of the stack are “C inside anyway” why the extra nonsense?
What is your point?
My point is that if you use Node.js you are still running "unsafe" C somewhere.
Also, what is the point of calling out projects that are not using "cool" languages? You don't agree with the developer's choice, and that is OK; that developer probably doesn't agree with your choice either and has other priorities, and if it is an open-source project he probably also wants to have fun while coding it.
The unsafe C in popular language runtimes has hopefully been written or at least reviewed by someone who knew what they were doing, and it presents a relatively controllable interface.
Most people who write code in C are not that good at avoiding memory safety issues. The process is mostly to write some code, run it, then fix it until it doesn't crash immediately anymore.
Have you ever been the first person to run valgrind on a codebase? Lots of uninitialized reads and use-after-free issues will be flagged, because they don't normally crash the program and therefore go undetected without using an additional checker.
It can be fun to watch the error messages flow by, until you realize that now you have to file tickets for all of them. And some people won't even see the problem with those errors that don't crash the program, because potentially exploitable vulnerabilities are not as obviously bad as crashes.
I agree, but this does not mean that some experienced developers can't have some fun creating some tools in C; they may even be less buggy and more performant than some of the other tools written in high-level languages.
Btw, I am not a C fan; in fact, I think I have no favorite language. I use what I have to for the project.
This isn't about cool this is about dangerous. C makes shooting yourself in the foot so easy and efficient it even asks for artillery support.
The time and effort to secure a C program against malicious input is absolutely huge. It would be quicker to write a secure DSL than to use raw C. Which brings us back to scripting languages, which have that implemented already as a library you load to run your requests through.
As much as I align myself philosophically with most things this project stands for, I wouldn't trust myself to write public-facing web applications in C. The same philosophy could still be held up while replacing the last components with a less dangerous language like Rust (with the trade-off of added complexity, and possibly giving up a bit of control).
OpenBSD as a base operating system, running relayd and httpd to face the hostile web proxying for efficient services written in Rust would be my ideal choice. SQLite for most simple data store needs, and Postgres if the data management gets big enough that it might spill to another machine.
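A sketch of the relayd side of that, assuming the Rust service listens on 127.0.0.1:8080 (the address macro is hypothetical):

    ext_ip = "192.0.2.10"               # hypothetical public address
    table <rust_app> { 127.0.0.1 }

    relay "www" {
        listen on $ext_ip port 80
        forward to <rust_app> port 8080
    }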
Two points. First, it would be nice if there were an AMI on AWS for this stack.
Second, we do a lot of RESTful microservices, which might be a good fit in the sense that with C you would want to keep your code small (since there are no classes and memory management is manual). But what people do not always appreciate at first about microservices is that you need to rely heavily on libraries (either custom or external, like Spring Security) to handle cross-cutting concerns, or else you end up writing the same code multiple times. Essentially, if you write a piece of code that you expect will go on more than one microservice, it should go into a library. So my concern is: how easy would it be to create custom libraries to deal with cross-cutting concerns like logging and security in this stack, AND does the C ecosystem offer external libraries for the common concerns that a modern RESTful microservice would encounter?
If you want to do web backend in C, this approach is far from the only game in town. For instance, there is this: https://kore.io/
I like the idea of a web framework having a C API. We can make bindings to it in different languages and have a kind of semantic standard across the board, like is done with numerous other things: numeric libraries, crypto, GUI, ...
I suspect “unsafe” C leads to safer server side code when combined with a few, simple libraries.
One issue that seems to fly over the head of "safe" high-level language proponents is that you need to know what data types you are manipulating in order to write safe code.
Also, language-level isolation is much harder to implement correctly than process-level isolation (and OpenBSD has spent tons of time hardening process-level isolation).
I'd love to see a security bake-off of a few mature applications built with this and random-language-du-jour.
> I suspect “unsafe” C leads to safer server side code when combined with a few, simple libraries.
As many mentioned this might be true, but only if you don't have to deal with user input, strings, unicode and anything but extremely simple manual memory management.
The essence is, C and similar languages like Rust are great for writing an OS or a browser, but you really, really, really do not want to write complex server side web apps in those languages.
Sure, you can force your way through and if you have 30+ years of C in your head, it might even be safe. But it is not reasonable nor advisable for the general case. IMHO.
Not sure why you're lumping Rust in with C here. Rust has automatic memory management as part of the language, without a runtime, and it has excellent libraries for dealing with Strings, unicode, and concurrency and asynchronous functions. It's a perfect language for writing a complex server side web app.
Sorry, my ignorance. I've had only a brief look at Rust; it didn't 'click' in my head. On second thought, it seems much more viable than this plain C mockery, especially with the right support, e.g. https://rocket.rs
I am seeing SQLite pop up more and more in articles that tout its effectiveness and simplicity for many use cases over popular alternatives like PostgreSQL. It has definitely persuaded me to consider it on future projects.
Oh yeah. SQLite is incredibly versatile and for most web applications it's more than enough. The only pain point I have with it is that all data is stored as a string no matter the table schema types specified, this can lead to some bugs, but if you have a good library in your language for SQLite then it's not a problem.
> all data is stored as a string no matter the table schema types specified
The docs seem to suggest otherwise:
> if a column is of type INTEGER and you try to insert a string into that column, SQLite will attempt to convert the string into an integer. If it can, it inserts the integer instead.
>> if a column is of type INTEGER and you try to insert a string into that column, SQLite will attempt to convert the string into an integer. If it can, it inserts the integer instead.
There's nuance to your quote. From my recollection, this means that "321a" will be inserted as "321", but "foo" will be inserted as "foo" (into an INTEGER column). Definitely a wart, on an otherwise fantastic system.
The expression "CAST('321a' AS INTEGER)" will do as you suggest and ignore the trailing 'a' character, yielding an integer 123 result. But that only happens for an explicit CAST. Automatic type conversions must be reversible. That means that '321a' is inserted as a string in an INTEGER column, but '321' (without the trailing 'a') will be converted into an integer 123.
PostgreSQL, MySQL, and SQL Server do exactly the same thing for the '321' case. For the '321a' case, the other three throw an error whereas SQLite just cancels the type conversion and inserts the original string.
Either way, the fact that you can end up with strings in an Integer column is certainly surprising...
sqlite> create table test (foo INTEGER);
sqlite> insert into test (foo) values (123);
sqlite> insert into test (foo) values ("blah");
sqlite> insert into test (foo) values ("123a");
sqlite> select * from test;
123
blah
123a
I am currently working with a data set that is billions of rows and tens of GB. It will be interesting to see how SQLite handles it, which I plan to try. All of that in one DB file!
I think the web library, if it supports SQLite, must manage the writes to that database. Django uses SQLite by default, but if you have it running on just one server, I would expect that Django coordinates the needed writes from all its simultaneous users in a queue or something, one at a time.
Has anyone verified this is the case? I tend to doubt Django does it. Using a few tricks, SQLite easily beats Postgres insert performance; the tricks are just a little obscure.
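For what it's worth, a sketch of the usual tricks (my guess at what the parent means): WAL mode, a busy timeout, and batching inserts into a single transaction, via the C API.

    #include <sqlite3.h>

    int main(void)
    {
        sqlite3 *db;
        sqlite3_stmt *ins;

        if (sqlite3_open("bulk.db", &db) != SQLITE_OK)
            return 1;
        sqlite3_busy_timeout(db, 5000);                          /* wait on locks instead of erroring */
        sqlite3_exec(db, "PRAGMA journal_mode=WAL;", 0, 0, 0);   /* readers stop blocking the writer */
        sqlite3_exec(db, "PRAGMA synchronous=NORMAL;", 0, 0, 0); /* fewer fsyncs; acceptable in WAL */
        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(x INTEGER);", 0, 0, 0);

        sqlite3_prepare_v2(db, "INSERT INTO t VALUES(?)", -1, &ins, NULL);
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);                     /* one fsync for the whole batch */
        for (int i = 0; i < 100000; i++) {
            sqlite3_bind_int(ins, 1, i);
            sqlite3_step(ins);
            sqlite3_reset(ins);
        }
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);
        sqlite3_finalize(ins);
        sqlite3_close(db);
        return 0;
    }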
SQLite is a beautiful little jewel! Every time I think "does it have X feature" it always delights me that it does! And the code is exemplary as well, you can learn loads from just reading it. If you don't need concurrent writers, then SQLite goes a very, very long way.
I've used it for many things but for Web stuff it's the django default database. I've actually put that into internal production but I used only the test Web server (one instance). Is it possible to just use gunicorn or something with multiple workers and sqlite with no problems?
The example says no mysticism but is using at least 4 undeclared variables. I’m not a C developer so I’m sure I’m completely wrong, but where do pledge, puts, EXIT_SUCCESS/EXIT_FAILURE come from?
I assume from importing stdlib etc, which probably give you additional variables to use, that you would need to know about explicitly. One thing I like about zeit.co’s micro server (js) was the argument about not using bodyparser and magically having res.body available, but using async/await and assigning it to a variable. This is popping up more with things like es module imports, render props in react, etc. They make it clear where variables come from and allow you to avoid stepping on existing variables. What other 10-100 variables can I step on potentially with global patterns as a new user?
> I’m not a C developer so I’m sure I’m completely wrong, but where do pledge, puts, EXIT_SUCCESS/EXIT_FAILURE come from?
- pledge: unistd.h[1]
- puts: stdio.h[2]
- EXIT_SUCCESS and EXIT_FAILURE: stdlib.h[3]
You can probably safely assume a C programmer is familiar with puts and EXIT_SUCCESS/EXIT_FAILURE and that an OpenBSD programmer has heard of pledge(2).
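For reference, a minimal sketch of the kind of example under discussion, with the headers spelled out (pledge(2) is OpenBSD-only):

    #include <stdio.h>      /* puts */
    #include <stdlib.h>     /* EXIT_SUCCESS, EXIT_FAILURE */
    #include <unistd.h>     /* pledge, on OpenBSD */

    int
    main(void)
    {
        /* From here on, only stdio-related syscalls are permitted. */
        if (pledge("stdio", NULL) == -1)
            return EXIT_FAILURE;
        puts("Content-Type: text/plain\r\n\r\nHello, BCHS.");
        return EXIT_SUCCESS;
    }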
I really don't like OpenBSD's httpd. Last I checked (three months ago?), it couldn't do something as simple as adding a custom header to a static file. Why you'd use it over Apache or nginx is beyond me.
I'd love to know what header you desperately need just for a static file, that would necessitate a behemoth such as nginx, or Apache and all its CVEs?
Simply having all the bells, whistles, and free candy doesn't determine 'toy' status.
It's definitely a legitimate question - not everybody knows that AWS needs some silly header, nor why. OpenBSD people will be the first to call out unnecessary/outlandish requests unless you can back it up with a good reason outright.
I appreciate the fact that httpd doesn't attempt to be the end-all server for everybody - it's primarily a lightweight way to run things like the BGPd looking glass, other tools, a simple website, or something via FastCGI. The term they often use when denying pull requests is 'featuritis', which Apache & Nginx suffer from.
I may be young and dumb, but C programming seems very simple and straightforward to me. With abstraction comes lack of knowledge of implementation in favor of ease of use; but in C you can build your own abstractions quickly and reach a level close to even JS (opaque types, function pointers). At least in C you know exactly what is going on.
Can someone briefly summarize the security issues in C? If you manage memory properly and take a conservative approach to handling input, where is the risk?
Like I said, I'm young and have only been programming in C for ~3 years.
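For what it's worth, a sketch of the opaque-type pattern mentioned above (hypothetical names; header and implementation shown together):

    /* counter.h -- callers see only a handle, never the layout */
    typedef struct counter counter;
    counter *counter_new(void);
    void     counter_inc(counter *c);
    long     counter_get(const counter *c);
    void     counter_free(counter *c);

    /* counter.c -- the only file that knows the representation */
    #include <stdlib.h>
    struct counter { long n; };

    counter *counter_new(void)             { return calloc(1, sizeof(counter)); }
    void     counter_inc(counter *c)       { c->n++; }
    long     counter_get(const counter *c) { return c->n; }
    void     counter_free(counter *c)      { free(c); }

Callers can't touch the internals, which gives you much of the encapsulation higher-level languages provide.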
C's biggest issue is well-warned (lazy?) programmers repeating the same mistakes that have been documented ages ago. In the analogy of shooting yourself in the foot, why is anyone using a gun without the safety training first??
If you're serious about writing safer C code, certainly check out the following two resources from Robert Seacord:
In C you think you know what is going on, which is especially bad when developers mix compiler-specific behavior with the standard and then try to write portable code.
- Decays of arrays into pointers.
- Decays of enumeration into numeric types
- Default signedness of char is implementation-defined
- Implicit conversions (see the sketch after this list)
- Currently ISO C11 lists about 200 UB cases, C2X plans to list even more
- Strings are a pointer to somewhere in memory that you hope the caller actually terminated them.
- While the pre-processor seems basic, compared with real macros, you can still be very creative with it
- No way to validate security issues in binary libraries
The LLVM and PVS Studio blogs have quite a few examples of C gotchas.
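To pick one item from that list, a sketch of how the implicit conversions bite:

    #include <stdio.h>

    int main(void)
    {
        unsigned int u = 1;
        int i = -1;
        /* The usual arithmetic conversions turn i into unsigned,
           so -1 becomes UINT_MAX and the comparison is false. */
        if (i < u)
            puts("what many expect");
        else
            puts("what actually happens");  /* this branch runs */
        return 0;
    }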
Not only does this survey some interesting C behaviors in practice, but it also asks "Do you know of real code that relies on it?", and there's always a number of "yes" / "yes, but it shouldn't" and also "no, that would be crazy" answers to each :D
Lol, about 10 years ago I wrote a CGI framework in C. It was fun, and the minimalist in me loves the lark, but for productive work, especially working with other people, opting for a stack with next to no abstraction is crazy.
Put it this way: would you invest in a startup intending to use this stack? Can this stack ever be the result of pragmatic decision making?
If you can genuinely answer yes to both in your use case, then why not ...
People generally invest in startups because they believe the sales/customer numbers. Not because they pay attention to the implementation of every single component which is used by the company.
That said, OkCupid famously used a custom HTTP server to power their site.
> C is a straightforward, non-mustachioed language
Maybe non-mustachioed out of the box but I've done the mustachio thing to generate code for my ASDL parser backend.
Well...I guess it was a C++ library since it was "header only" but I'm sure I could have found a C library if part of the project wasn't to learn boost::spirit.
Well, if you wrote the array, or your program did, then you do know the length. Not a problem. Would it be nice if an array internally stored that? Sure. Does it? No. Is that a real problem? No, not really - it's been solved and worked around for decades.
I found the opposite, and a quote from the standard: " A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to type’’, where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation."
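A small illustration of that adjustment rule:

    #include <stdio.h>

    static void f(int a[10])           /* adjusted to: int *a */
    {
        printf("%zu\n", sizeof a);     /* sizeof(int *), e.g. 8 */
    }

    int main(void)
    {
        int a[10];
        printf("%zu\n", sizeof a);     /* 40: the whole array */
        f(a);                          /* decays to &a[0]; the length is lost */
        return 0;
    }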
If you're looking to develop a website exclusively with the tools that ship with OpenBSD, Perl would probably be a much saner choice than C, not to mention one with a long history of use in web development.
I'm loving the spirit if not the snark, but it's still odd that OpenBSD's httpd (not to be confused with Apache httpd) calls its CGI gateway "slowcgi" (and kind of derides CGI programming). I mean, it being slow is entirely the fault of the O/S, isn't it? And using C gets a whole level more challenging with long-running processes and async I/O because of manual memory management. Native CGI programming IMHO is best enjoyed with a reverse cache if you can help it, such as with Apache's mod_cache.
It's slowcgi.. you know, as opposed to fastcgi. It's just a name. It was implemented so that httpd(8) didn't have to support executing CGI programs itself and could speak only FastCGI.
I was referring to the release announcement back in 2014 or so, which I'm unfortunately not able to find (there was more than a bit of snarkiness towards CGI programming).
I'm aware what FastCGI is ;) Btw. if you're interested in native HTTP I'd be looking into nghttp2 rather than FastCGI and OBSD's httpd.
httpd(8) only supports the FastCGI protocol. If you want to run your "legacy" CGI programs you need a wrapper. The chosen name for this wrapper is slowcgi(8).
CGI always spawns off an new process for each request, and always was slow compared to FastCGI, so slowcgi(8) seems like a good name.
I'm not sure that sentence makes much sense though since the O/S is responsible for creating processes after all. There's an inherent overhead in the POSIX process model; but there's also overhead in the kind of manual memory management needed in support for evented I/O considering typical mixed web server workloads.
Well untrained programmers who have no business writing C apps in the first place, for starters.
Everyone here is completely glossing over this as if it were suggested that this is intended for everybody, for every solution. If you're not extremely confident in your C skills, or have the experience to back it up, This is NOT for you! If you think you can only do it with another language, then This is NOT for you! If you think you're gonna write the next Amazon, This is NOT for you!
And let's not forget just how many things are written in C and still running the internet. It's just another tool in your toolbox.
Linus Torvalds is "extremely confident in his C skills". Doesn't stop hundreds of CVEs per year targeted at the kernel.
Sick of the C Apologism Task Force's idea that "if only rockstar programmers wrote code we wouldn't have any problems", completely ignoring the reality of software development and the evidence of 50 years of exploits.
50 years of the same bugs over and over - the language might let you shoot yourself, but that doesn't absolve the programmer of continuing education, code review, or testing. Seacord has a whole book on avoiding these; arguably anyone without it isn't doing all they can to write safe code, and might as well be considered uneducated. You can write unsafe code in any language if you don't know what's going on, 'rockstar' or not.
As for Linus, the kernel was written when he was a student, and I doubt he personally still introduces the same low-hanging-fruit vulnerabilities that are the big issue.
Everybody makes mistakes, no matter how confident they are in C. And while there are numerous things written in C, there is also a continuous stream of bugs and vulnerabilities as well.
That’s not to say that other languages are perfect, but there are much safer options that are still very fast, such as Rust, or even C# or Java.