moinnadeem's comments

I would hold off on the skepticism for the moment.

I know the authors of the blog post quite well. Say what you will about the firm, but one of the authors has been investing in machine learning since 2016, and another has a PhD in CS (including a SIGCOMM Test of Time award!).

I come from a strong ML background (multiple publications, PhD dropout), and I would say that the canon is actually quite good.


> and another has a PhD in CS

Sorry to say but 'big deal'.

> one of the authors has been investing in machine learning since 2016

Ditto.

I have been doing something (in another field) since the mid-'90s. I would say most people would consider me an expert. I get referrals for what I do from 'top' people investors in tech. I also went to what most would consider 'a top college'. I would never want to be positioned as being right or an expert because of the amount of time I spent doing something, the college that I went to, or who trusts me, but rather because of actual things that I have done that point to my expertise (not a halo of some type).


Medium matters. Anyone making public statements should understand as much.


I agree with both you and the commenters roasting a16z, tbh.


I wouldn't go so far. I know the authors quite well, and as someone who has multiple publications in machine learning conferences (and started a PhD in ML), I can say they know their stuff well.


OK, thanks for reassuring.


Disclosure: I work at MosaicML

Yeah, I strongly agree. While Nvidia is working on better hardware (and they're doing a great job at it!), we believe that better training methods should be a big source of efficiency. We've released a new PyTorch library for efficient training at http://github.com/mosaicml/composer.

Our combinations of methods can train models ~4x faster to the same accuracy on CV tasks, and ~2x faster to the same perplexity/GLUE score on NLP tasks!


I've been seeing a lot more about MosaicML on my Twitter feed. Just wanted to ask -- how are your priorities different than, say, Fastai?


Instagram is down for me (located in the midwest). Reports "5xx server error"


Curious if anyone knows why they would use xx instead of the number?


The HTTP response says 503, so the text is just generic for any 5xx error.


How do you know the HTTP response was 503? (sorry for the silly questions, I'm not a software engineer)


I looked at the headers returned from the server using developer tools. 503 is "Service Unavailable" and you often see it when a proxy can't reach the backend server.
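For anyone curious how a specific code like 503 relates to the generic "5xx" label, here is a minimal Python sketch. The helper names `status_class` and `describe` are made up for illustration; only `http.HTTPStatus` comes from the standard library, and none of this reflects how Instagram actually builds its error page.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Return the generic class label for a status code, e.g. 503 -> '5xx'."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # The first digit identifies the class: 4 = client error, 5 = server error.
    return f"{code // 100}xx"


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its class label."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # In-range codes that the standard library does not name.
        phrase = "Unknown"
    return f"{code} {phrase} [{status_class(code)}]"


if __name__ == "__main__":
    for code in (500, 502, 503, 504):
        print(describe(code))
```

All four codes in the loop share the "5xx" label, which is why a page can show one generic message while the response headers still carry the exact number.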


WhatsApp is reporting the same error here (India): https://web.whatsapp.com/status.json


But not FB blue, interestingly enough for me.


What do you think about the train/test discrepancy? I.e., will practitioners have to fine-tune NUBIA's models on their training dataset in order to evaluate on their test dataset?


Thanks for the clarification, mhkane.

At least 3 datasets go into making a NUBIA model:

- The general dataset used to train the language model before it is fine-tuned to extract semantic similarity, logical entailment and grammaticality (i.e., Wikipedia)

- The dataset used to fine-tune the semantic similarity module and logical inference scorer

- The dataset used to predict human judgement

So far, the experiments have actually shown that without any fine-tuning, the NUBIA model trained to assess machine translations does better at agreeing with human judgement for image captions than the metrics specifically designed to assess image captions.

For more advanced cases like, say, scoring medical reports where, for example, grammaticality doesn't matter as much, it may have to be fine-tuned. This is not unlike human training actually where experts are trained on "what to look for".

The nice thing with this modular architecture and the interpretable scores is that it can provide a lot of flexibility to study individual components and their emergent properties and make a judgement call on whether or not to fine-tune.


The aggregators in NUBIA are pretrained to correlate with human judgement, so it should only be used for inference, but the idea is that you can use it as a loss function to optimize translation/image captioning/summarization. It’s too big for that as is, but that’s what we’re working towards.


I think the question here is more along the lines of "If I now have, say, radiology reports, do I use NUBIA out of the box, or do I need to make it read radiology reports and get a sense of what high-quality radiology reports look like before using it?"


oh I see thanks! will clarify.


MIT student here, there's really no rhyme or reason. Generally, higher numbers mean more advanced courses, but that's about it.


Jenks High School class of 2016, MIT class of 2020. Honestly, Jenks was an incredible high school to go to since it was public; having a $20M Math and Science center let me go to places like MIT.


My high school having a brand new computer lab completely rebuilt every year, an agricultural sciences program that put most corn country colleges to shame, and a $xx million a year budget surplus let me drop out at 16.


That’s good to hear. I went to OSSM in the 90s and the legislature tried to defund it.


> having a $20M Math and Science center let me go to places like MIT.

No it didn't.


My school not having something like that led to me being unable to apply to MIT.



Honestly, I think Cambridge MA may give you that environment.


Median home price in Cambridge is $800k. Certainly cheaper than the bay, but I wouldn’t consider that cheap by any stretch.


It _sounds_ like he plans on quitting his job and trying to "get by" for a while, hence needing the cheaper cost of living. While Cambridge would definitely scratch the intellectual itch I wouldn't exactly call it "cheap"...


Freshman at MIT here taking this class -- the lectures are actually taught in a flipped classroom format, so I wouldn't imagine they would release the course considering there are no lectures to follow. I could see them releasing problem sets, however.


Interesting! Who is your instructor? Tell them to release a hard copy!


Anyone have thoughts / ideas on what people need that developers could create for times like these? I.e., what are the biggest problems in today's political climate which a developer may be able to solve?


The EFF probably has some things. https://github.com/EFForg/

