
I'm not a C engineer, but I think I have an interesting recommendation to consider.

Since you are already familiar with other languages, you don't really need to relearn the basics of C; the basics are the same everywhere. Instead, you'll likely want to build the mindset for writing good software in C.

I suggest you read other people's code and try to deep dive into the whys.

I've heard Redis is well-written software.

    git clone https://github.com/redis/redis.git
    git log –reverse
    git checkout ed9b544e...
We can now see the very first commit made by Salvatore Sanfilippo. From here, we can start exploring. By looking at the commit diffs and git log messages, you'll be able to understand the motivation behind the changes.

When you read a book, the most interesting parts are usually after all the theory is explained at the beginning. Somewhere in the middle of the book they start showing you code samples and that's when things become really interesting.

What I like about the reverse git-log approach is that it immediately throws you into the middle section of the book. This approach is both fun and eye-opening.

Good luck!



For me, when I learned C more than 20 years ago, reading source and manpages from Linux and other Unix-like projects was a source of inspiration.

I'd recommend OpenBSD libc. https://github.com/openbsd/src/tree/master/lib/libc

String functions especially are a good way to get into the C way of thinking, since C's approach to strings is unique.

I also learned a lot by reading manpages of libc functions or Unix utilities and thinking about how they were implemented, and writing my own little versions.
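For example, here is a rough sketch of that exercise with strlen (my own toy version, not the OpenBSD implementation): walk the string until the terminating NUL and count the bytes along the way.

    #include <stdio.h>

    /* Toy re-implementation of strlen: walk the buffer until the
       terminating '\0' and count how many bytes we passed. */
    static size_t my_strlen(const char *s)
    {
        const char *p = s;
        while (*p != '\0')
            p++;
        return (size_t)(p - s);
    }

    int main(void)
    {
        printf("%zu\n", my_strlen("hello"));  /* prints 5 */
        return 0;
    }

Comparing a toy like this against the real libc source and the manpage's edge cases is where most of the learning happens.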


This. But for C++ and Python.


Fully agree with the principle but maybe let's not call Redis "well-written" :)

Redis is a great piece of software because of how much it impacted the "evolution" of the Internet, not because of its code quality; Redis's code resembles spaghetti code, a lot.

I'll go so far as to say that nowadays people would get fired for writing code like that!

And before someone starts throwing stones (just for the sake of it), here are a few examples:

- The Redis source code is FULL of GIANT files with thousands and thousands of lines of code; decoupling and separation of concerns are basic things. A few examples:

-- https://github.com/redis/redis/blob/unstable/src/module.c
-- https://github.com/redis/redis/blob/unstable/src/cluster.c
-- https://github.com/redis/redis/blob/unstable/src/redis-cli.c
-- https://github.com/redis/redis/blob/unstable/src/networking....

- Speaking of auto-generated code: the repo contains committed auto-generated code, and a valid reason for it really slips my mind (yes, of course I can think of a few, but ... really? :)))

- https://github.com/redis/redis/blob/9c7c6924a019b902996fc4b6... ... that struct feels like an "Italian minestrone", basically a kind of soup where you throw in, literally, all the vegetables you have around; it's great for a soup, a bit less so for software engineering though :)

- the naming is a bit random at times: there is a mix of camelCase, camelCase mixed with underscores from time to time, and PascalCase (see the sketch below)
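To give a feel for what I mean (hypothetical identifiers, not symbols taken from the Redis tree), it's the kind of codebase where you can find all three styles next to each other:

    /* Hypothetical declarations, only to illustrate the mixing of styles;
       these are not actual Redis identifiers. */
    void clusterSendPing(void);            /* camelCase */
    void clusterHandle_slot_update(void);  /* camelCase mixed with underscores */
    void ClusterBroadcastPong(void);       /* PascalCase */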

This kind of stuff might be alright for a 0.3 or 0.4, maybe even for a 1.0, but definitely not for a 7.x that has had 15 years to evolve :/ Even in cachegrand, which hasn't reached v0.2 yet, I tried to avoid these kinds of dramas; in C it's quick and easy to end up there if you are not careful.

There are plenty of other small things, but most of the big dramas are the ones above.


Re: auto-generated code: it saves CI time, which is a pretty big benefit. You still need to regenerate the code to make sure it's up to date, but you can do that in parallel with the tests that rely on that code.


This is a valid reason, but it makes sense for codebases where the generation takes tens of minutes or more, not seconds.


Here's one example of when to commit auto-generated code: if you use a tool like Cython and want to distribute sources to an end user, it is frequently recommended to distribute the C sources generated by Cython instead of the Cython code itself. The given reason is that the end user may not have access to Cython, but will likely have access to a C compiler. (This is frequently the case on scientific clusters where users don't have superuser privileges.)


Importing a C file into a Python session is a practice that can *easily* cause *a lot* of headaches: the packages required to build the code as a Python library and make it importable will vary between systems, and that makes it hard to debug in case of crashes. I personally would distribute pre-compiled binaries; CI + Docker makes that much easier.

The fact that it "can be done" doesn't necessarily mean that "it's the right way" to do it.

Going back to embedding auto-generated code, and focusing on this specific case: are you telling me that a line in requirements.txt, on a machine where the user is expected to have access to a compiler, is the reason to keep auto-generated code in the repository? Sorry, personally that seems like a flaky reason.

Also, the very fact that you mention that users don't have superuser privileges should be even more of a reason to use virtual environments, where you don't need superuser privileges; or, if you can't or don't want to use virtual environments, you can still put multiple paths in the PYTHONPATH env variable and install packages into a specific path with pip's --target option or the PYTHONUSERBASE env variable, which solves the issue as well.

I would rather suggest that my users improve their runtime environment than enable them to add to the chaos.


I don't agree that it's a good practice; I'm simply quoting one example since you mentioned you were not aware of any. IIRC, you can find many GitHub issue threads on the Cython repo where they go into more detail. It has been a while since I spent any time thinking about this, so the details are no longer fresh.

Unfortunately, suggesting that users improve their runtime environment is frequently a nonstarter for different reasons. (One obvious reason: users either don't have time or technical experience to do so.)


Well, I tried it, and ed9b544e... is god-awful; the documentation has so many typos. A god file 'redis.c' with 3k lines, even containing commented-out code and some comments that would not pass my code review.

But I love it, and it's a great recommendation.

Going to move through the commits to see how it unfolds :)


I think this is important just to get down the idioms that you otherwise only pick up on the job. You could pick almost any big, active C project. I used to do this, and I'd see some construct I didn't understand and consult my K&R book to learn it. I'd do this in parallel with a book though, for sure, K&R preferably. C is so basic that if you pursue it further, it will be about learning the project's/company's way of using C.
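One example of the kind of construct I mean (the names here are made up, but the idiom is standard C) is the do { ... } while (0) wrapper around multi-statement macros; it rarely shows up in intro material, but you see it constantly in real projects:

    #include <stdio.h>

    /* Wrapping a multi-statement macro in do { ... } while (0) makes it
       behave like a single statement, so it stays correct even inside an
       if/else without braces. */
    #define LOG_AND_COUNT(counter, msg) \
        do {                            \
            printf("%s\n", (msg));      \
            (counter)++;                \
        } while (0)

    int main(void)
    {
        int errors = 0;
        if (errors == 0)
            LOG_AND_COUNT(errors, "first pass");
        else
            printf("already failed\n");
        return errors;
    }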


I've got a single-digit number of contributions to projects that I like, mostly because of how daunting it is to familiarize oneself with a codebase, and this 'git log --reverse' trick seems incredible as an entry point to a new project.

That being said, how do you feel about using this technique for repos with 2k+ commits? It seems unrealistic to read all of them, or am I being short-sighted here? I get demotivated just thinking about it.


Reading 2k+ commits is a big waste of time.

Forcing yourself to read every commit to a project you "like" in order to contribute to it is putting the cart before the horse. Just because you "like" a project doesn't mean you will have the skills to contribute usefully to it. This isn't a bad thing. I "like" GCC, but I don't have the skills to contribute to it. I'm not losing any sleep over this, and I'm definitely not thinking about reading every commit to GCC in chronological order.

If you use a project, you might find a bug or some deficiency in it. Then it might happen that you have the skills to quickly fix this problem. Either that, or this bug causes you so many problems or slows you down so much that it becomes worth it for you to develop the skills to fix the problem.


I like this approach to FOSS contributions, thanks for the answer, sfpotter.


I'm not sure why GP's post is upvoted so much; IMO it's a bad way to learn. I would instead find an entry point: a self-contained task that doesn't touch a lot of the codebase and that you can use to learn something. Then, as you get more comfortable with some part of the codebase, you can expand your scope by looking at tasks that touch other things. In general, reading code is 50% of the work; the other 50% is playing with the codebase or trying to make changes to it and seeing what happens.


That sounds reasonable, and I do agree that inching forward seems like a more viable approach to contributing in FOSS.

Thanks for the answer.


It's funny, because I would never have thought that this is a good way to learn a codebase. Code changes completely over time, and early phases where nothing is settled can be extremely messy and filled with constant refactorings. Commit messages are rarely good. I would rather try to learn a codebase by looking at the latest state and the docs.


> It's funny because I would never think that this is a good way to learn a codebase.

It is NOT for learning the codebase.

It is for learning how to program in C.


Ah I see. I might try that with another language then! That makes more sense.


I did that with Git itself. As a bonus you'll learn more about a tool you're probably using every day. If I remember correctly, even the early commits were pretty high quality.


FYI, missing a second dash

  git log --reverse


Damn, what a great comment.



