Hacker News | coyotebush's comments

No, those two statistics alone don't say anything about the total number of CS majors. Per the article, CS represents ~20% (of students with declared majors).


CS is 214 of an estimated female student body of 3500.

That still seems exceptionally low. Posted this upthread but in 2010 20% of the student body was in Biology (once topical sub-specialties were aggregated).

20% of ~3500 would be closer to ~700, which is 3.3x the number of declared comp sci students.

Or maybe my math is wrong? I'm more interested in the actual knowledge being put in and attention being paid out than a critique of the degree naming conventions.


Thanks, I got the numbers twisted in my head.


This list appears to include only software packaged in Debian, but that fact does get a mention at https://www.debian.org/intro/about#history


Diceware is great, though the paper fairly points out that algorithms like that and https://xkcd.com/936/ necessarily often choose very uncommon words.

I suppose the Markov model could just as well use whole words (as done in all sorts of other scenarios), at the expense of substantially longer passwords.


I took a look at whole words. Things got a lot longer. Like, twice as long. Here, let me generate some. These are 56-bits each:

'(#" that looked up to Darnay: you not?\" \"Very willingly,\" pointing"
  #" that it was confirmed. \"He may not--thou wouldst rescue this is touched the child"
  #" that direction through his, and said Stryver, \"that, there! I"
  #" that criminal in Lombard-street, out at neighbouring streets that she"
  #" that he,\" said Darnay. Released yesterday. I.\" It was set, and--in a highly"
  #" that although Sydney Carton.\" This must have to finish that way"
  #" that Madame Defarge's wine-rotted fragments of his. \"Carton, idlest and its accessories"
  #" that every rapid movement, and, as the executioner showed a high grass and")

Really, I would not like to type in "that every movement, and, as the executioner showed a high grass and" every time I opened my laptop.


Yes, as long as you have a sound random source (whitened via a good CSPRNG, if you're not using real dice), and a correct unbiased bijective mapping to the wordlist (not just a biased modulo), then it is secure.
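To illustrate the "unbiased mapping, not a biased modulo" point, here is a minimal Python sketch using rejection sampling. The function names `unbiased_index` and `pick_words` are made up for this example; in practice the standard library's `secrets.randbelow` already provides an unbiased draw, so this is only meant to show the idea.

```python
import secrets

def unbiased_index(n: int) -> int:
    """Return a uniformly distributed index in range(n).

    Draws just enough random bits to cover 0..n-1 and rejects any
    draw that falls outside the range, instead of reducing it with
    a modulo (which would favour the low indices whenever n is not
    a power of two).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    k = max(1, (n - 1).bit_length())  # bits needed to represent n-1
    while True:
        r = secrets.randbits(k)
        if r < n:  # accept only in-range draws
            return r

def pick_words(wordlist, count):
    """Pick `count` words independently and uniformly from `wordlist`."""
    return [wordlist[unbiased_index(len(wordlist))] for _ in range(count)]
```

With a 7776-word list this draws 13 bits per attempt and rejects the 416 values from 7776 to 8191, so on average only about 5% of draws are retried.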

A friend of mine already wrote a generator for his own use along the Diceware lines, which I don't think he'd mind me linking, although it's a work-in-progress: https://qrmn.uk/dwr/ - you can feed it a custom wordlist, too, by just feeding it a textfile and it'll digest it.

(It's fairly easy to read, uses getRandomValues, a simple CSPRNG of my design based on djb's ChaCha20, and seems to do the selection correctly, I think: the trials and tribulations of JavaScript crypto, potential TLS MiTMs, the attack surface of the browser, etc. are of course well known.)

The trick to memorable passphrases seems to be getting a really good wordlist. The standard ones of 7776 words fit well with 5d6, but don't cut it well for heading to the 100-bit range, and some of the words are strange. A 19973-word list would allow for 7 words to be 100 bits. To get that to 6 words, we'd need 104032 words in the list, which is unlikely to be a memorable English corpus. If we're willing to have 100 bits = 8 words, 5793 words are sufficient.
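The wordlist sizes above follow from requiring N^k >= 2^bits for a list of N words and a passphrase of k words. A small Python sketch of that arithmetic, assuming uniformly and independently chosen words (the helper names are made up for this example):

```python
import math

def min_wordlist_size(target_bits: int, num_words: int) -> int:
    """Smallest list size N with N**num_words >= 2**target_bits."""
    target = 1 << target_bits
    n = int(2 ** (target_bits / num_words))  # floating-point estimate
    # Correct the estimate with exact integer arithmetic.
    while n ** num_words < target:
        n += 1
    while n > 1 and (n - 1) ** num_words >= target:
        n -= 1
    return n

def words_needed(target_bits: int, list_size: int) -> int:
    """Number of words from a list of `list_size` needed for target_bits."""
    return math.ceil(target_bits / math.log2(list_size))

for k in (6, 7, 8):
    print(k, "words:", min_wordlist_size(100, k))
print("7776-word list:", words_needed(100, 7776), "words for 100 bits")
```

Under these assumptions the loop reproduces the 104032, 19973, and 5793 figures, and the standard 7776-word list (about 12.9 bits per word) needs 8 words to reach 100 bits.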

If anyone knows of any particularly nice English corpora that are free for use and would work well for this, I'd appreciate a tip.

A pox on those with "you-must-have-a-capital/symbol/dingbat" (it doesn't help!) or short maximum password lengths (I'm looking squarely at you, Microsoft Passport/Windows Live).


For my own use and amusement I wrote a Diceware-inspired program in Scheme. It produced random passwords like "luthier-beige-6139" or "unintimated-clamp-3529". The word-number pattern and separators could be varied.

Like you said, the issues are good RNG and word list. For the latter I used the "web2" list from FreeBSD /usr/share/dict directory. The word list was filtered to remove words that were capitalized, too short (< 5 chars), hyphenated, etc., leaving a final list containing about 151000 entries.

I estimated entropy for generated passwords at ~40 bits. To get 100 bits would require 5 or 6 words. A problem with this method is having passwords composed of obscure terms, reducing acceptability. A more carefully culled list would be smaller, but there's the tradeoff--password legibility vs. length.

Yes, limiting max password length to < 16 chars isn't very smart. Really that should be the minimum rather than maximum.


You can use a 10-bit "common english" wordlist that has no exceptionally rare words and code your 56 bit values with just 6 words.
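As a rough sketch of that encoding: a 1024-word list carries 10 bits per word, so 6 words hold 60 bits, enough for a 56-bit value with 4 bits to spare. The placeholder wordlist and the function names below are hypothetical; a real list would contain 1024 common English words.

```python
import secrets

def encode_value(value: int, wordlist, num_words: int = 6):
    """Encode a non-negative integer as `num_words` words, most significant first."""
    bits_per_word = len(wordlist).bit_length() - 1
    if len(wordlist) != 1 << bits_per_word:
        raise ValueError("wordlist length must be a power of two")
    if value < 0 or value >> (bits_per_word * num_words):
        raise ValueError("value does not fit in the requested number of words")
    mask = len(wordlist) - 1
    return [
        wordlist[(value >> (bits_per_word * i)) & mask]
        for i in reversed(range(num_words))
    ]

def decode_words(words, wordlist) -> int:
    """Inverse of encode_value."""
    bits_per_word = len(wordlist).bit_length() - 1
    index = {w: i for i, w in enumerate(wordlist)}
    value = 0
    for w in words:
        value = (value << bits_per_word) | index[w]
    return value

# Placeholder list (assumption): stands in for 1024 common English words.
wordlist = [f"word{i:04d}" for i in range(1024)]
secret = secrets.randbits(56)
phrase = encode_value(secret, wordlist)
print("-".join(phrase))
```

Because the mapping is a plain base-1024 expansion, every 56-bit value maps to exactly one 6-word phrase and can be decoded back.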

It's not clear that the Markov property helps memorability much when you're still left with lots of structurally incorrect punctuation.


Try it out! I found this style of password (horse-battery-staple) to be much longer and less memorable. The proof, of course, is in the pudding. The next step (for me) is to conduct a reasonable experiment to discover whether or not these passwords are memorable.


Agreed that the expandable sections feel a little unusual for this type of documentation. They're great for drilling down to particular information, but not so much for skimming or reading straight through.

I might even suggest putting the sub-section navigation in the sidebar a la readthedocs.



Cool to watch, though this is a fairly straightforward model of complex contagion on a finite grid. It's similar to Conway's Game of Life in that there's a grid and local rules, but beyond that, the nature of the rules results in a different kind of behavior.

https://en.wikipedia.org/wiki/Complex_contagion

http://www.ladamic.com/netlearn/NetLogo4/DiffusionCompetitio...
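For contrast with Game of Life, here is a minimal Python sketch of a generic threshold model of complex contagion on a finite grid. It is not the rule set of the linked simulation; the 4-neighbour neighbourhood, the default threshold of 2, and the function name `step` are assumptions chosen for illustration.

```python
def step(grid, threshold=2):
    """Advance a threshold contagion by one synchronous step.

    grid is a list of rows of 0/1 values. An inactive cell becomes
    active when at least `threshold` of its up/down/left/right
    neighbours are active; active cells never revert.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                continue
            active = 0
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                # Cells outside the finite grid count as inactive.
                if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc]:
                    active += 1
            if active >= threshold:
                new[r][c] = 1
    return new

seed = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
for row in step(seed):
    print(row)
```

With `threshold=1` this reduces to simple contagion, where a single active neighbour is enough. With a threshold of 2 a lone seed never spreads, which is the "needs reinforcement from multiple sources" behaviour that distinguishes complex contagion.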


I believe using those together counts as two user-defined conversions and is rejected. But one or both of those might as well be declared explicit.


> I wonder if they weed out duplicate copies of, say, jQuery.

Linguist tries to filter out things like that:

https://github.com/github/linguist/blob/master/lib/linguist/...


"Tries" is the operative word here. There is a documented history of utter incompetence in Linguist.


Likewise for most of GitHub's other libraries. Sundown is some of the worst code I've ever seen.


> Migrate to a cmake-based build

> Legacy support and compile-time features

> Platform-specific code

Sounds like well-justified cleanup, although it's possible this project is underestimating the usefulness of feature selection.

> New plugin architecture

> New GUI architecture

Now that's cool and ambitious.

I wonder whether Bram has an opinion on this?


Twilight (color adjustment is in the free version) works nicely too.

