More

reality_czech · on Sept 12, 2018

Friends don't let friends depend on Google products.

z0r · on Sept 13, 2018

Friends don't let friends depend on web apps.

reality_czech · on Aug 10, 2018

Wow. I don't even know where to start on this one.

My experience with academia is that everyone is scrambling to get grants and get published. Nobody ever asked questions about where the grants came from. A lot (probably even the majority at the time) was from the department of defense, and explicitly targetted to create weapons.

Professors spent a huge amount of their time writing grant proposals. It's like pitching to a bunch of VCs, only you do it every month, and the amounts of money are much smaller.

And this is the reward for a lifetime of achievement. If you're starting at the bottom today, conditions are positively Dickensian. The average (not the maximum, the average!) PhD in CS took 6 years. During that time you'll be paid almost nothing, no matter what the cost of living is around you. And you are essentially an indentured servant of the professor. If he wants you to do a routine task that has nothing to do with your research, you have to do it. Cumulatively these tasks could add up to years of delays. After you graduate, you'll probably have to take multiple postdoc jobs, often at very low salaries, in hopes of getting a faculty position. Sometimes the hopes come true, but very often not.

And from what I understand, CS is actually one of the "good" subjects to go to graduate school for. Things are much, much worse in the humanities.

It's truly incredible that anyone would hold this up as a better system than how industry works. Hmm, let's see... a two week interview process, after which the company will tell the applicant whether they're hired. Or, a two year postdoc after which the university may choose to throw them away like garbage. Spending half your time writing grants, versus spending a few minutes a week writing a status report. Come on.

Also, the section about how "the students will suffer" from industry partnerships reads like a bad joke. Students suffer because the most universities hire faculty purely based on research, and not at all based on teaching. Full stop. The top research schools have contempt for teaching undergrads; that's why they hire adjuncts to do it at minimum wage. (Well, they also dump some of the burden on graduate students, too.)

azhenley · on Aug 10, 2018

It seems like once a week I see comments like this. This was not my experience at all.

With internships, I made over 60k a year in grad school. I worked on projects of my choice. I did not do a postdoc after graduating. I graduated from a small, unranked department and got a tenure-track position at an R1 university in a top 75 department.

s0rce · on Aug 11, 2018

I don't know a single person who has gotten a recent (5 years) faculty position without one or more postdocs. I'd say this is extremely uncommon experience. Nor have a seen salaries more than 40k for students, even postdocs don't always make 50k.

azhenley · on Aug 11, 2018

All of my friends on the market this year from a variety of schools got faculty positions without postdocs. It isn’t uncommon in CS.

My salary as a student wasn’t 40k. My income from my RA position plus my internship was more than 60k.

s0rce · on Aug 11, 2018

I guess I'm not in CS, maybe thats the difference.

sid-kap · on Aug 10, 2018

60k per year doing internships? I thought a typical tech internship paid $5-8k/month, which means at most $25k over the summer. (Maybe my number is for undergrads, and grad student interns get paid significantly more?)

azhenley · on Aug 11, 2018

$6-9k a month for research internships with the ability to extend them beyond 3 months.

It worked out well since I would be doing basically the same research whether I was at the company or at my university.

closed · on Aug 10, 2018

What field? Even at top tier institutions I've never seen grad students make more than 40k, but I'm thinking of research heavy PhDs, where you continue your research over the summer.

Edits: ah, just saw in your profile it was software engineering!

lazyjeff · on Aug 10, 2018

I had the same experience as the grandparent, and averaged 70K a year. I had a lot of great opportunities during my Ph.D so I was really fortunate, but I felt that many students I met during internships were serial interns and got fellowships too, so it seemed at the time like most strong PhD students in Computer Science averaged about 60K a year and enjoyed it. Also note that you don't pay FICA (Social Security / Medicare) on your PhD stipend so 25K is like 30K.

I graduated in 5 years into my dream tenure-track job, and 7/8 of the students in my cohort got good tenure-track jobs too (the remaining one went back to running a successful business unrelated to her research). The school I'm at pays $40K/year stipends for PhD students, including the summer; so I think many people only count the 9-month stipend for PhD students which is about 27K, and not the summer salary.

azhenley · on Aug 10, 2018

I continued my research at companies every summer. Seems fairly common in CS, so you get access to real data/systems and good salaries.

grigjd3 · on Aug 11, 2018

I think this is fairly unique to CS. In physics, I had much the other experience.

reality_czech · on July 16, 2018

Expect an insta-shutdown in a few months.

reality_czech · on July 2, 2018

Yahoo hasn't had its own search engine for years. In 2010, they became essentially a frontend for Bing. In a later 2015 deal they switched the backend to using Google.

Duckduckgo is a metasearch engine, technically, but mostly it delegates to Bing.

As far as I can tell, there are only two and a half real search engines that still exist: Bing, Google, and Wolfram Alpha. (I count Alpha as a half because it's not really what most people are looking for.) I'm curious if anyone else knows of other real search engines still in existence.

cobookman · on July 2, 2018

Yahoo still uses Bing for majority of their searches. https://arstechnica.com/information-technology/2015/04/micro...

DrScump · on July 2, 2018

Check the date on that again; it doesn't contradict the parent comment.

cobookman · on July 2, 2018

https://searchengineland.com/yahoo-google-search-deal-233963

Unless something has changed it seems bing still gets at least 51% of traffic.

lqdc13 · on July 2, 2018

I just tried hotbot and it seems to be better for programming-related questions than duckduckgo.

I wonder if they use mostly Google for the backend.

anothergoogler · on July 2, 2018

Baidu, Yandex, Naver, probably quite a few others.

mda · on July 2, 2018

What percentage of traffic comes from Bing though?

hawkice · on July 2, 2018

Woah woah woah Duckduckgo delegates to Bing? If many/most searches are unique that means they can't live up to their privacy statements.

zaptheimpaler · on July 2, 2018

If they hide the users IP, HTTP headers etc. and proxy searches to Bing through their own servers it would be anonymous.

fragsworth · on July 2, 2018

Unless your query leaks private information.

Or a series of unique queries could leak private information.

jstanley · on July 2, 2018

Bing would be unable to associate series of queries with users.

As long as DDG are doing it properly (and I believe they are), Bing would only learn that the contents of each individual query are associated together, they would learn nothing about which other queries were performed by the same user.

tedsanders · on July 2, 2018

I think the concern isn't necessarily that Bing would associate query X with person Y. The concern is that Bing would even know that query X exists. For example, if Bing saw a spike in searches for "Aramco IPO July 4, 2018" and were to reveal it to a human or store it, that might be a serious leak of non-public information. Many searches reveal private information, even when they aren't associated with a user.

Sean1708 · on July 2, 2018

> if Bing saw a spike in searches for "Aramco IPO July 4, 2018" and were to reveal it to a human or store it, that might be a serious leak of non-public information

Maybe I'm missing something obvious here, but how is that any different from Google or DuckDuckGo seeing the same spike?

sammorrowdrums · on July 2, 2018

Well you might trust DDG as a good actor but not a third party. To discover that this information is discoverable to a third party (even if un-attributable) would breach their trust in DDG. Whether that's reasonable or DDG are misleading people in that regard is another matter. Personally I still use them a lot, and will continue.

I just think there is a point to be made here. Even generally it's often opaque what third parties have what data and I don't really think GDPR has fixed that. It's surprising for people the Bing might have the contents of their DDG search history, somewhere in the huge dataset of DDG searches that pass through.

Also they might not want to help improve Bing search but I'm guessing they do inadvertently?

FartyMcFarter · on July 2, 2018

There's no practical technology that would get around that problem.

Homomorphic encryption might do the trick (?), but it's too slow at the moment.

qznc · on July 2, 2018

Intel SGX is only answer at the moment. The Signal messenger uses it, do address book matching is private. It requires the user to trust the server hardware vendor (Intel) instead of also the cloud provider.

jstanley · on July 2, 2018

That would not stop the Bing query matcher (or indeed the Signal address book matcher) from being able to look at the contents of its own secure enclave.

qznc · on July 2, 2018

The trick is that every user uploads his own matcher. The server only sees encrypted matchers, feeds them data and returns the encrypted results. You as a user decrypt your results and nobody (except Intel) was able to see them.

y4mi · on July 2, 2018

thta has little to do with privacy though. and i believe thats the only thing they assure?

lowkeyokay · on July 2, 2018

I think that’s pretty far fetched. Such a spike would most likely be the product of an already very well known rumour.

Vinnl · on July 2, 2018

IIRC DDG delegates the crawling to Bing, but does the actual searching itself.

mda · on July 2, 2018

I really doubt it. They never tell what happens in the background. Probably because it would spoil their "magic".

Kiro · on July 2, 2018

How would you do that with the Bing API? Really don't think that's true.

larkeith · on July 2, 2018

Another half (read: limited-scope) search engine that springs to mind is Shodan.

PhasmaFelis · on July 2, 2018

Did Yahoo ever have its own search engine, technically? In the early web it was a directory maintained by humans, which made sense at a time when the total number of pages in existence on any given subject was no more than a few hundred; I thought that when that era passed they went straight into licensing other search engines' results.

KMag · on July 2, 2018

I worked at Google on search indexing at the time Yahoo switched from their own search engine to using Bing. At the time, by most of Google's own search metrics, Yahoo had a product superior to Bing. If Bing had been spun off as a separate company, or otherwise hadn't had access to Microsoft's deep pockets and default IE search status, it's likely Yahoo would have fared better.

toast0 · on July 2, 2018

I was at Yahoo during that time, although not in web search. From what I could tell, company leadership was frustrated with lack of growth in search market share, and didn't want to invest in it anymore.

Yahoo was running user studies where they would put Google results and Yahoo results side by side but switch the branding; while Yahoo results were ranked better than Google for most of the tested queries, results with Google branding ranked better than with Yahoo branding, regardless of whose results they were.

The plan was to just use Google, but the DOJ (or FTC?) put out guidance that that would be anti-competitive, so Bing was it. This might have worked out anyway, but the expected cost savings from outsourcing search didn't actually happen that I saw, but I left in late 2011, and stopped following closely after that. Web search was also linked with search ads, which Bing did poorly at too.

KMag · on July 2, 2018

Google also ran similar user studies, sometimes between Google and other search engines, and sometimes between production Google and a proposed change.

One tough thing is there isn't one search quality metric. It's important to have the search results page look good with its snippets, and another thing to have people actually look at the linked pages and compare the usefulness of the linked pages.

Common vs. uncommon searches are also important. It's not difficult to write a search engine that badly over-fits on the most common searches. However, for market share, it's important to do well enough on the common searches that users don't leave, and do well enough on tough long-tail searches that you pick up users that leave other search engines on tough queries. The idea is to be pretty good at the common searches, but the best at the kinds of searches that cause people to try other search engines. Naive frequency-weighted metrics will get this totally wrong.

It's also more important to get useful information in the first 2 or 3 links. If Google links to the second-best link at result #1 and puts the best link off the first results page, but Yahoo puts the best link down at #7 and second-best at #8, the user may lose interest before following a really good link.

I don't think Google took the union of front-page search results between two competitors and asked humans to hand-order the (up to 40) pages for how well they fit the query. But, that seems like a good way to test the actual usefulness of search results. You'd probably especially want to keep track of the percentage of the top 3 search results that were filled by top-5 (guessing at 5) useful links.

Anyway, inside Google it was well-known that Yahoo was the competitor to worry about in terms of search quality.

saalweachter · on July 2, 2018

I would note it is possible (and even likely?) that each search engine performed better for its own traffic-weighted query stream.

Endy · on July 2, 2018

Nope, they had a spider-crawler and a full engine along with the human-curated Directory.

wumpus · on July 2, 2018

The sequence was: Yahoo Directory -> Yahoo showing Google results -> Yahoo builds their own search engine -> Yahoo uses bing

anothergoogler · on July 2, 2018

Yahoo search was provided by Altavista for some time as well.

wumpus · on July 2, 2018

Yes, before they used google. It's a pretty interesting story, actually, how Yahoo felt that they should use the best underlying search engine with a "white label" approach, and how Google succeeded in eventually building a very strong brand despite being invisible.

boyter · on July 2, 2018

Didn’t they use inktomi for a while?

carlivar · on July 3, 2018

Yes, and then they bought inktomi.

scruffyherder · on July 2, 2018

Baidu

reality_czech · on June 25, 2018

Does this mean we can have more than 16GB in our laptops now?

Also, is "stratechery" named based on the old Celebrity Jeopardy skits on SNL?

reality_czech · on June 11, 2018

The mind boggles at how the dongle mongerers hobbled these Mac models. From every new Mac, a tangle of dongles hang at odd angles. What a boondoggle.

reality_czech · on May 4, 2018

Scott's complaint is basically that the IRB regulations don't match the regulations that apply to standard psychiatric treatment.

For example, the department had a huge amount of patient information stored in their regular computer system. But the IRB demands an archaic pen and paper system for storing patient records, rather than using the computer system that exists. You could argue that maybe the department computer system was bad, but in that case it should be fixed, rather than creating a parallel system.

Similarly with the issue of consent. The doctors were already giving questionnaires to the patients and already using them to determine what the treatment should be (counselling, medication, etc.) Why do the consent requirements suddenly become completely different when it's part of a study? Should the doctors be asking patients to fill out a consent form each time they give a questionnaire? Should the forms be in pen, even though mental patients can use pends to hurt themselves? I don't know, but I do know that having bizarrely different criteria for a study versus normal medical practice seems unwise.

I agree that it would be good to have better-defined criteria for stopping the study. It sounds like Scott had an informal goal of 100 data points, but didn't manage to hit this goal.

I'm sure IRBs do some good. But it's really hard to take the IRB's side in this particular case. They weren't even following their own rules (the person asking him to take ethics training hadn't taken the training herself-- which would have probably torpedoed his study even if he had done everything else right.)

Fomite · on May 4, 2018

"For example, the department had a huge amount of patient information stored in their regular computer system. But the IRB demands an archaic pen and paper system for storing patient records, rather than using the computer system that exists. You could argue that maybe the department computer system was bad, but in that case it should be fixed, rather than creating a parallel system."

The department's clinical activities and research activities are subject to different Federal regulations, and the IRB doesn't have any authority over standard clinical care.

"Similarly with the issue of consent. The doctors were already giving questionnaires to the patients and already using them to determine what the treatment should be (counselling, medication, etc.) Why do the consent requirements suddenly become completely different when it's part of a study?"

Because it's part of a study. An entirely different ethical framework applies when you are trying to treat someone vs. when you're trying to learn how to treat them. Sometimes those criteria are laxer (for example, you can use investigational protocols and things that aren't yet standard of care). Sometimes they're not.

"Should the doctors be asking patients to fill out a consent form each time they give a questionnaire? Should the forms be in pen, even though mental patients can use pends to hurt themselves?"

Again, different contexts. They happen to be in the same locations, but very different contexts. For example, your doctor can access your entire medical record vs. a hospital's EHR system for your care. That does not mean that the doctor should be able to access all your information, for all time, for any reason.

"I don't know, but I do know that having bizarrely different criteria for a study versus normal medical practice seems unwise."

Most IRB regulations exist for a reason, and no, study vs. normal medical practice are different, because while the risks may be the same, who bears the benefits are not. The entire purpose of the IRB is to ensure the risks to the study subjects are kept to a minimum because they are also unlikely to bear most of the benefits. Medical ethics works on the ethics of a single patient.

They are different by design, and through long experience.

"I agree that it would be good to have better-defined criteria for stopping the study. It sounds like Scott had an informal goal of 100 data points, but didn't manage to hit this goal."

I meant that it would be halted if the patients become "Violent". Violent is not a clear definition.

Though he also probably should have had power calculations in his protocol. IRBs really don't like studies that don't have a chance of producing information.

"I'm sure IRBs do some good. But it's really hard to take the IRB's side in this particular case. They weren't even following their own rules (the person asking him to take ethics training hadn't taken the training herself-- which would have probably torpedoed his study even if he had done everything else right.)"

The person asking the questions are not necessarily the people who make the decisions. For example, most of the questions at an IRB I worked with were asked by admins, because they've got dedicated time to do it, and can go through a checklist for things like "Is violence well defined?" or "You seem to have deviated from the normal way this goes..."

After that, it goes to a formal committee who take a lot of training, who either approve, disapprove, or ask other questions.

reality_czech · on May 4, 2018

I don't think "an entirely different ethical framework" should apply when you're "treating someone versus learning to treat them." First of all, "learning to treat someone" is not clearly separable from treating them. That's the whole point of medical residency-- you spend a bunch of time watching and helping doctors treat people, so that you can become one yourself. Ethical standards don't radically shift the moment a resident enters the room.

The concept is absurd. "Whoa guys! A resident just entered the room! Now we have to fill out all our forms in pen rather than pencil because the ethical standards just radically shifted! Now we're learning how to treat rather than treating!"

I think you are also missing one of the big points here, which is that the study protocol was exactly the same as what the doctors were doing anyway. So philosophical tangents like whether risks should be minimized more in research or treatment are irrelevant in this particular case. The risks are the same because the protocol is the same.

Fomite · on May 7, 2018

Residency is not research.

You can disagree that they should be entirely different ethical frameworks, but at the moment they are by Federal law.

And the risks might be the same, but the benefits may not be. And beyond that, it's the researcher's burden to show that, not the IRBs to take their word for it.

reality_czech · on April 24, 2018

Yes, publishers do certainly value a platform that is able to combat privacy. Although lately the combat has been a little one-sided.

reality_czech · on April 11, 2018

Python broke compatibility a lot more than once. Python 3.5 and Python 3.6 are not compatible, for example. There are even cases where code written for an older version of Python 3 will not work on a newer version of Python 3. For example, see PEP 479.

reality_czech · on March 25, 2018

Now in Smell-o-Vision, so that you can really take in the "ambiance"