
No, your constant output example isn’t what people are talking about with “hallucination.” It’s not about destroying information from the input: if you asked me a question and I just ignored you, I wouldn’t in general be hallucinating. Hallucinating is more about sampling from a distribution which extends beyond what is factually true or actually exists, such as citing a non-existent paper or inventing a historical figure.


Sure, but if we label the text as “factually accurate” or “logically sound” (or “unsound”), etc., then we can presumably greatly increase the probability of producing text with the targeted properties.


What on Earth makes you think that training a model on all factual information is going to do a lick in terms of generating factual outputs?

At that point, clearly our only problem has been we've done it wrong all along by not training these things only on academic textbooks! That way we'll only probabilistically get true things out, right? /s


A pixel (in the context of an image) could be “wrong” in the sense that its assigned value could lead to an image that just looks like a bunch of noise. For instance, suppose we set every pixel in an image to some random value. The resulting image would look like total noise, and we humans wouldn’t recognize it as a sensible image. By providing a corpus of accepted images, we can train a model to generate images (arrays of pixels) which look like images and not, say, random noise. Now it could still generate an image of some place or person that doesn’t actually exist, so in that sense the pixels are collectively lying to you.
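To make that concrete, a throwaway sketch (assuming numpy; the 256x256 size is arbitrary):

  import numpy as np

  # Every pixel set to an independent random value: pure noise, nothing a
  # human would recognize as a sensible image.
  noise = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
  print(noise.shape, noise.dtype)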


My recollection of CLIP is that it’s more of an image-text co-embedding, where you have two transformers, one which encodes images into vectors and one which encodes captions into vectors. Through a contrastive loss (positive pairs are images with their own captions, negative pairs are random image-caption combinations), embeddings of positive image-caption pairs are brought together (i.e., made similar) and embeddings of negative image-caption pairs are made more dissimilar.

https://openai.com/index/clip/
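Roughly, the symmetric contrastive objective looks something like this (an untested sketch assuming PyTorch; clip_loss and the tensor shapes are placeholders for illustration, not OpenAI’s actual code):

  import torch
  import torch.nn.functional as F

  def clip_loss(image_emb, text_emb, temperature=0.07):
      # image_emb, text_emb: (batch, dim); row i of each comes from the same image-caption pair
      image_emb = F.normalize(image_emb, dim=-1)
      text_emb = F.normalize(text_emb, dim=-1)
      logits = image_emb @ text_emb.t() / temperature  # pairwise cosine similarities
      targets = torch.arange(logits.shape[0])          # matched pairs sit on the diagonal
      # cross-entropy in both directions pulls positive pairs together
      # and pushes the random (negative) pairings apart
      return (F.cross_entropy(logits, targets) +
              F.cross_entropy(logits.t(), targets)) / 2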


I would assume some of them can write Verilog code


Well, I clearly had to try it even though I don't know my verilog from my veri-println. Can someone who does know, rank this?

--

Prompt: verilog code for 4 bit binary adder

[I'm skipping the English description of the code, this is just the code part of free ChatGPT's response, no cherry picking]

  module binary_adder(A, B, Cin, S, Cout);
    input [3:0] A, B;
    input Cin;
    output [3:0] S;
    output Cout;
    
    wire C0, C1, C2;
    
    // Full adder for bit 0
    full_adder FA0(A[0], B[0], Cin, S[0], C0);
    
    // Full adder for bit 1
    full_adder FA1(A[1], B[1], C0, S[1], C1);
    
    // Full adder for bit 2
    full_adder FA2(A[2], B[2], C1, S[2], C2);
    
    // Full adder for bit 3
    full_adder FA3(A[3], B[3], C2, S[3], Cout);
    
  endmodule
  
  module full_adder(A, B, Cin, S, Cout);
    input A, B, Cin;
    output S, Cout;
    
    assign S = A ^ B ^ Cin;
    assign Cout = (A & B) | (A & Cin) | (B & Cin);
    
  endmodule


It seems correct, but verbose. GPT-4 gives the more concise and idiomatic:

  module full_adder(input [3:0] a, input [3:0] b, input cin, output reg [3:0] sum, output reg cout);

  always @(*) begin
      {cout, sum} = a + b + cin;
  end

  endmodule


We found that Codex was pretty bad at Verilog and so we fine-tuned our own models on Verilog code, starting from the SalesForce CodeGen model. The paper will be at DATE: https://arxiv.org/abs/2212.11140

The largest model we trained is here: https://huggingface.co/shailja/fine-tuned-codegen-16B-Verilo...


Data can be crowdsourced, too. Wikipedia demonstrated that crowdsourced data can be pretty competitive.

More recently the open LAION data sets have become widely used by both tech giants and independent researchers.


> Wikipedia demonstrated that crowdsourced data can be pretty competitive.

The problem is DL is really sensitive to dirty data, disproportionately so.

At $DAYJOB, once we cleaned the dataset and removed a few mislabeled identity/face pairs (very few, about 1 in 1e4), the metrics went up a lot.


You need to be very careful about making sweeping generalizations based on a single personal anecdote. The really large data sets typically have very high error rates and sample biases. For instance, Google’s JFT-300M is far noisier than ImageNet, which itself is hardly free of errors and biases. Any data set with hundreds of millions to billions of images will generally contain a large proportion of images and labels scraped from the web, w/ automatic filtering or pseudolabeling, perhaps w/ some degree of sampled verification by human labelers.

In fact, generally DL is quite tolerant to label noise, especially using modern training methods such as SSL pretraining.

https://arxiv.org/pdf/1705.10694.pdf

https://proceedings.neurips.cc/paper/2018/file/a19744e268754...

https://proceedings.mlr.press/v97/hendrycks19a.html


If it faithfully memorized and reproduced a set of watermarks, it would be premature to conclude that it hadn’t memorized other (non-generic) graphical elements.


Yes, the future is now. It’s still early; extracted model quality will undoubtedly improve dramatically over time.

https://blogs.nvidia.com/blog/2022/09/23/3d-generative-ai-re...

https://research.nvidia.com/publication/2021-11_extracting-t...


The first AI generated feature length film I’d want to see would be, “It’s a Wonderful Life, Calculus,” where we get to see what human history would’ve been like if we’d never stumbled across Calculus.

So the basic idea at the heart of Calculus is that when you break a problem up into tiny enough pieces, in the right way, the pieces become simpler to analyze and approximations become much more accurate. For example, if I break a complex surface into triangles, and the triangles are small enough, many of the physical properties of interest can be computed using the little triangles.

The two main manifestations of this principle are differentiation and integration. With differentiation, the simplification which comes from breaking a function into tiny pieces is that the pieces behave asymptotically like linear functions.
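A quick way to see the “asymptotically linear” part (plain Python; f(x) = x^2 and x0 = 1 are just an arbitrary example):

  # Zoom in on f(x) = x^2 near x0 = 1: over a small enough piece it behaves
  # like the straight line with slope f'(x0) = 2.
  f = lambda x: x ** 2
  x0 = 1.0
  for h in (1e-1, 1e-3, 1e-6):
      slope = (f(x0 + h) - f(x0)) / h
      print(h, slope)  # slopes approach 2.0 as the piece shrinks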

For integration, the simplification is that the tiny fragments eventually have approximately uniform density, so the mass of the whole body is the sum of the masses of all the tiny bits, each of which is just the volume of the fragment times its density.
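The mass picture as a sketch (plain Python; the density rho(x) = 2 + sin(x) on a rod of length 3 is made up purely for illustration):

  import math

  # Mass of a thin rod on [0, 3] with varying density rho(x) = 2 + sin(x):
  # chop it into tiny pieces, treat each piece as uniform, and add them up.
  N = 1_000_000
  dx = 3.0 / N
  mass = sum((2 + math.sin((i + 0.5) * dx)) * dx for i in range(N))
  print(mass)  # approaches the exact integral 6 + (1 - cos 3) ≈ 7.99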

Differentiation and Integration are in some sense inverses, or opposite sides, of a single greater idea. In one dimension, the derivative of the (indefinite) integral of a function equals the original function, and the integral of the derivative of a function is that function plus a constant. Almost inverses of each other, but not quite, and this nuance is where some care is required in developing your understanding of the subject.
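You can check the “almost inverses” claim numerically (assuming numpy; accuracy is limited by the step size):

  import numpy as np

  x = np.linspace(0.0, 2.0 * np.pi, 10_001)
  f = np.sin(x)

  # Build an approximate antiderivative F by accumulating trapezoids,
  # then differentiate it again: we recover f, up to discretization error.
  F = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))
  dF = np.gradient(F, x)
  print(np.max(np.abs(dF - f)))  # tiny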


That about matches my fuzzy understanding, but the big picture is the struggle. Like the comment submitted here about the venerable for loop being rooted in calculus: it makes sense when it's explicitly pointed out, but I haven't established a framework to draw those connections intuitively.

To generalize further: given a solution where the use of calculus is explained, that isolated example is comprehensible; but given a problem, there is nothing that sparks an "I know, I can solve this with calculus!" I don't feel the same way about other mathematical disciplines that I have studied, even those I have spent far less time studying.

Perhaps it's simply harder to understand than other areas of math, but then I think that goes against the idea that it is easy for kids to understand.


Here's how I see the relationship:

Calculus helps define the underlying rules for the higher-level (simpler by appearance) math we use daily. "I know, I can solve this with calculus!" is unlikely to ever come up, but the vague idea that there's something there you can dig into when you need to can be helpful in rare edge cases, where other people might be lost.

An example using programming languages: If all you've ever been exposed to was python, and no CS, you may never have considered why using "insert" on a list may be slow. Python presents it as a single function call, so you probably think of it as a single operation and don't go any further. That's the equivalent of the higher-level (simpler by appearance) math. But if you've been exposed to something lower-level, like C where you may well have implemented "insert" yourself on an array, or general CS concepts where you had to use big-O notation, you'll probably have in the back of your mind "yeah, that's not a single operation, it's doing more stuff in the background". Usually not something you need to think about, until you hit that edge case where it's suddenly running really slowly.
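A rough illustration of that hidden cost (nothing rigorous, just timeit on a throwaway list; the sizes are arbitrary):

  import timeit

  for n in (1_000, 10_000, 100_000):
      xs = list(range(n))
      # insert at the front has to shift every existing element; append (amortized) does not
      t_insert = timeit.timeit(lambda: xs.insert(0, 0), number=100)
      t_append = timeit.timeit(lambda: xs.append(0), number=100)
      print(n, t_insert, t_append)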

Remember very early on in education when you had to memorize various equations like area of a circle? Those equations can be generated from basic calculus. One I could never remember was area of a sphere, until one day when I was bored at my part-time job, found a pencil and scrap of paper, and decided to see if I could use what I'd just learned in class to derive it. And it worked, and I've never forgotten that equation since, because instead of it just being a series of numbers and letters to memorize, each part now has meaning.
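That kind of derivation is easy to replay symbolically these days. One way to get the sphere's surface area (assuming sympy; integrating the surface element R^2 sin(theta) dtheta dphi, with the phi integral contributing 2*pi):

  import sympy as sp

  R, theta = sp.symbols('R theta', positive=True)
  # the phi integral of the surface element R^2 sin(theta) dtheta dphi gives 2*pi*R^2*sin(theta)
  area = sp.integrate(2 * sp.pi * R**2 * sp.sin(theta), (theta, 0, sp.pi))
  print(area)  # 4*pi*R**2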


That makes sense, but it suggests that calculus is perhaps the most difficult concept to wrap one's head around, which flies in the face of the idea that it is easy to teach to children. It is not clear where the breakdown occurs here.


Calculus is not easy to teach to children. We fail to teach it adequately to most college students in their first two years of study. Even a few historically noteworthy mathematicians failed their first contact with the subject.

Elementary linear algebra is far easier to understand and motivate. We can deal with finite, concrete examples without having to delve into the subtle complexities of limits, continuity, and infinity.


But it was said, "since kids can easily understand it," and it is not clear how that resolves.


That statement was clearly false. With an amazing teacher, an extremely bright student, focus, patience, time, etc., sure, we can have the next Galois… but in the vast majority of cases, we should avoid setting kids up for failure by expecting them to easily grasp things which took humanity’s greatest minds centuries to grasp. Newton “invented” Calculus in the 17th Century, but these ideas had been percolating since Archimedes and even earlier, going back two millennia.


Calculus was quite difficult for human civilization to get a logically sound handle on; it took the better part of two centuries, from Newton’s original formulation of Calculus in the 17th Century to the work of Cauchy, Gauss, Weierstrass, Dedekind, Riemann, etc., working throughout the 19th Century, to develop rigorous foundations for Analysis (the modern name for the subject). That’s where all the epsilon-delta business comes in. But this machinery is totally overwhelming for the vast majority of children and teenagers without significant context, motivation, and guidance.

The analogy between a ‘for loop’ in programming and a mathematical summation is pretty direct. For S := 0, i := 1 .. N { S := S + a[i] } differs only in notation from the standard sigma notation for a finite series.
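Spelled out in Python, the loop and the sigma are literally the same computation (the list a is just an arbitrary finite sequence):

  a = [3, 1, 4, 1, 5, 9, 2, 6]  # any finite sequence a_1 .. a_N

  S = 0
  for a_i in a:        # S := 0; for i := 1 .. N { S := S + a[i] }
      S += a_i

  assert S == sum(a)   # the built-in sum() is the sigma notation as a function
  print(S)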

You don’t need to look far for applications of Calculus. Any simulation of a physical system, such as the Solar System, navigating the DART space vehicle into an asteroid; modeling climate, nuclear explosions, fluid dynamics, structural stability, propagation of sound through matter, modeling and manipulating the properties of semiconductors through controlled diffusion of dopants, chemical engineering, thermodynamics, electrodynamics, optics, quantum mechanics, hypersonic missile flight, even Computer Graphics has the Rendering Equation at its heart …

Mastery of Calculus (and the ability to apply it to model dynamics) is pretty much the dividing line between Classical Ancient civilizations and the modern world.


I would actually start with a first pass at elementary linear algebra, before taking a serious crack at even single variable Calculus. Then let the key ideas marinate in your brain; develop some facility and intuition / insight.

Multivariable Calculus hardly makes any sense without a basic amount of linear algebra. It’s a bit of a historical accident that (Western) mathematics developed Calculus extensively prior to linear algebra taking shape as a significant subject. Don’t try to tackle Multivariable Calculus without a decent grip on matrices. Maybe do some computer graphics and learn how to build 2D & 3D shapes and animate them using matrices.
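For the graphics suggestion, the flavor is something like this (assuming numpy; just a toy 2D rotation, not any particular graphics API):

  import math
  import numpy as np

  theta = math.pi / 4  # rotate the shape by 45 degrees
  R = np.array([[math.cos(theta), -math.sin(theta)],
                [math.sin(theta),  math.cos(theta)]])

  square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]]).T  # columns are the corners
  print(R @ square)  # the rotated corners; animating is just varying theta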

