Hacker News | taw1285's comments

For the less experienced devs, how should I be thinking about choosing between this vs Amazon Aurora?


I don't think either is a bad choice, but Aurora has some advantages if you're not a DB expert. Starting with Aurora Serverless:

- Aurora storage scales with your needs, meaning that you don't need to worry about running out of space as your data grows.

- Aurora will auto-scale CPU and memory based on the needs of your application, within the bounds you set. It does this without any downtime, or even dropping connections. You don't have to worry about choosing the right CPU and memory up-front, and for most applications you can simply adjust your limits as you go. This is great for applications that are growing over time, or for applications with daily or weekly cycles of usage.

The other Aurora option is Aurora DSQL. The advantages of picking DSQL are:

- A generous free tier to get you going with development.

- Scale-to-zero and scale-up, on storage, CPU, and memory. If you aren't sending any traffic to your database it costs you nothing (except storage), and you can scale up to millions of transactions per second with no changes.

- No infrastructure to configure or manage, no updates, no thinking about replicas, etc. You don't have to understand CPU or memory ratios, think about software versions, think about primaries and secondaries, or any of that stuff. High availability, scaling of reads and writes, patching, etc. are all built-in.


PlanetScale has a very nice comparison in terms of performance and price: https://planetscale.com/benchmarks/aurora


It will be faster and a lot easier to use than Aurora.


Curious if anyone has applied this "Skills" mindset to how you build tool calls for your LLM agent applications?

Say I have a CMS (I use a thin layer of the Vercel AI SDK) and I want to let users interact with it via chat: tag a blog post, add an entry, etc. Should those actions be organized into discrete skill units like that? And how do we go about adding progressive discovery?
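One pattern that maps onto the "Skills" idea is a two-level tool layout: a cheap index tool that puts only skill names and one-line summaries into context, plus a loader tool that returns full instructions for whichever skill the model picks. A minimal sketch (in Python rather than the Vercel AI SDK, and all skill names, schemas, and wording here are invented for illustration):

```python
# Hypothetical CMS "skills" registry. The agent first sees only summaries;
# detailed instructions are fetched on demand (progressive discovery).
SKILLS = {
    "tag_blog": {
        "summary": "Attach one or more tags to an existing blog post.",
        "instructions": "Call with post_id and a list of tag names; "
                        "create any missing tags before attaching them.",
        "parameters": {"post_id": "string", "tags": "list[string]"},
    },
    "add_entry": {
        "summary": "Create a new blog entry with a title and body.",
        "instructions": "Call with title and a markdown body; returns the new post_id.",
        "parameters": {"title": "string", "body": "string"},
    },
}

def list_skills():
    # First-level tool: only names and one-line summaries enter the context.
    return {name: s["summary"] for name, s in SKILLS.items()}

def load_skill(name):
    # Second-level tool: full instructions and schemas are loaded only for
    # the skill the model actually selected.
    return SKILLS[name]
```

The payoff is context economy: ten skills cost ten short summary lines up front, and only the chosen skill's full instructions ever get expanded into the prompt.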


This tracks for me. I have deleted TikTok and Instagram but now I find myself browsing X short videos!! Addiction is a crazy thing.

I have a daily 30-minute one-way commute. I usually put on a YouTube video about startups or a tech talk. But I find myself forgetting it all the day after. I'm curious how you go about remembering the content without being able to take notes while driving.


Information obtained for its own sake doesn't have any lasting effect, so it makes sense that you forget it. Try to take in the information and have it cue a relation to your life, have it spark some internal thought. I'm struggling to articulate this; I've always been "a thinker" and just think about things all day. I rarely finish books because whatever I read, I think about it for so long.

This is just my own personal reflection on information, knowledge, and learning. I hesitated to write this comment, but I did on the chance it helps.

Information is basically a commodity these days. The leverage is in how the info informs your thoughts.


are you watching talks while driving?

One thing I've tried recently was going with nothing at all while driving: no music, no radio, nothing, just me and my thoughts.

It's been immensely pleasurable, like I've rediscovered myself.

But I still have an issue with finding a good long form video to watch while washing up, or shorts while I'm waiting for CI to finish at work, etc. I need to find something else to do.

It's something along the lines of "you can't remove an addictive habit, you can only replace it".


I have YouTube.com and X.com IP blocked on this computer for exactly that reason.

Because I noticed I have zero self-control with the short-form video format. So now I don't touch it and consider it similar to cigarettes.


You don't. This is where taking public transport to work shines.


Thank you for this article. I have yet to discuss this with my doctor. But I have noticed several areas where I'm severely lacking compared to my peers:

1. My brain drifts away very easily. Even in an important work conversation, my brain just starts thinking about a completely different project or an upcoming meeting.

2. I have a hard time remembering things/events that my spouse and others can easily recall (e.g., which restaurants we have been to).

3. I can't seem to form an opinion on very basic things, like: do you like restaurant A or restaurant B better? Do you like option A or option B? I can't decide or come up with any heuristics.

At first I chalked it up to being too critical of myself and assumed others have the same issues. But that doesn't seem to be the case. Can these all be rolled up into the same conversation with my doctor?


Probably all part of possible ADHD, except point 2, which might be a sign of SDAM (Severely Deficient Autobiographical Memory). It's a recent diagnosis and not well known.


This is so amazing. Are there any resources or blogs on how people do this for production services? In my case, I need to rewrite a big chunk of my commerce stack from Ruby to TypeScript.


Your comment really helps me improve my mental model of LLMs. Can someone smarter help me verify my understanding:

1) At the end of the day, we are still sending raw text to the LLM as input and getting text back as the response.

2) RAG/embeddings are just a way to identify a "certain chunk" to include in the LLM input so that you don't have to dump the entire ground-truth document into the LLM. Take Everlaw, for example: all of their legal docs are stored as embeddings, and RAG/tool calls retrieve the relevant documents to feed into the LLM input.

So in that sense, what do these non-foundation-model startups mean when they say they are training or fine-tuning models? Where is the line between feeding data into the LLM's input vs. having it baked into the model weights?
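For what it's worth, point (2) can be made concrete with a toy sketch: retrieval just ranks stored chunks against the query and pastes the winners into the prompt. Real systems use learned embeddings from a model; here a bag-of-words word-count vector stands in so the sketch stays self-contained, and the Everlaw-style document snippets are invented:

```python
import math
import re
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: a sparse word-count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

chunks = [
    "Deposition transcript of witness A regarding the merger.",
    "Invoice records for office supplies, Q3.",
    "Email thread discussing the merger timeline.",
]
vectors = [embed(c) for c in chunks]

def retrieve(query, k=2):
    # The whole "RAG" step: rank stored chunks by similarity to the query.
    q = embed(query)
    ranked = sorted(zip(chunks, vectors),
                    key=lambda cv: cosine(q, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Only the top-k chunks get pasted into the prompt; the rest never reach the LLM.
context = "\n".join(retrieve("What did witnesses say about the merger?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
```

Nothing about the model changes here; RAG is purely input selection, which is exactly the line the question is asking about.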


(1) and (2) are correct (well, I don't know the specifics of Everlaw). Fine-tuning is something different: you incrementally train the model itself further using more inputs, so that given the same input context it will produce better output for your use case.

To be more precise, you seldom directly continue training the full model, because it's much cheaper and easier to add some small extra layers to the big model and train those instead (see LoRA or PEFT).

Something like Everlaw might do all three: fine-tune a model to do better at discovery retrieval, then build a RAG system on top of that.
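The LoRA idea mentioned above reduces to simple arithmetic: the pretrained weight matrix W stays frozen, and you train a low-rank correction B·A, applying W + (alpha/r)·BA at inference. A pure-Python sketch with made-up toy values (real code would use torch and the peft library):

```python
# Pure-Python LoRA arithmetic sketch; all numbers are illustrative.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, r, alpha = 4, 1, 2  # hidden size 4, rank-1 adapter, scaling alpha=2

# Frozen pretrained weight (identity here, just as a stand-in).
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

B = [[0.5], [0.0], [0.0], [0.0]]  # d x r, trainable
A = [[0.0, 1.0, 0.0, 0.0]]        # r x d, trainable

delta = matmul(B, A)  # rank-1 correction to W
W_adapted = [[W[i][j] + (alpha / r) * delta[i][j] for j in range(d)]
             for i in range(d)]

# Why it's cheap: full fine-tuning would update d*d = 16 weights per layer,
# while the adapter trains only d*r + r*d = 8 (and far fewer proportionally
# at realistic sizes, e.g. d=4096 with r=8).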


I am fairly new to all these data pipeline services (Databricks, Snowflake, etc.).

Say right now I have an e-commerce site with 20K MAU. All metrics go to Amplitude, and we can use that to see DAU, retention, and purchase volume. At what point in my startup's lifecycle do we need to enlist these services?


A non-trivial portion of my consulting work over the past 10 years has been working on data pipelines at various big corporations that move absurdly small amounts of data around using big-data tools like Spark. I would not worry about purchasing services from Databricks, but I would definitely try to poach their salespeople if you can.


Just curious, what would you consider "absurdly small amounts of data around using big data tools like spark", and what do you recommend instead?

I recently worked on some data pipelines with Databricks notebooks a la Azure Fabric. I'm currently using ~30% of our capacity and starting to get pushback to run things less frequently to reduce the load.

I'm not convinced I actually need Fabric here, but the value for me has been that it's the first time the company has been able to provision a platform that can handle the data at all. I have a small portion of it running into a database as well, which has drawn constant complaints about volume.

At this point I can't tell if we just have unrealistic expectations about the cost of having this data that everyone wants, or if our data engineers are just completely out of touch with the current state of the industry and Fabric is simply the price we have to pay to keep up.


One financial services company has hundreds of Glue jobs that use PySpark to read and write less than 4 GB of data per run. These jobs run every day.
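For a sense of scale, a daily 4 GB batch like that is comfortably within reach of a single process that streams the file row by row, with no cluster involved. A hypothetical sketch with invented column names:

```python
# Streaming aggregation over a CSV: constant memory regardless of file size.
# Column names ("region", "amount") are made up for illustration.
import csv
import io
from collections import defaultdict

def daily_revenue_by_region(lines):
    totals = defaultdict(float)
    for row in csv.DictReader(lines):  # reads one row at a time
        totals[row["region"]] += float(row["amount"])
    return dict(totals)

# Tiny in-memory stand-in for the real file:
sample = io.StringIO("region,amount\nus,10.0\neu,5.5\nus,2.0\n")
print(daily_revenue_by_region(sample))
```

Swap the StringIO for `open(path)` and this handles a multi-gigabyte file on a laptop; the Spark/Glue overhead only starts paying for itself orders of magnitude beyond that.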


I'm aware of a government agency with a few hundred GB of data using Mongo and Databricks that was being pushed toward Snowflake as well. It boggles the mind.


I used to do similar work. Back in the day I used 25 TB as the cutoff point for a single-node design. It's certainly larger now.


Which is also a reason to not use Databricks, as they will cost your company money by selling gullible users things they don’t need.


This is very interesting to me. From this thread: https://news.ycombinator.com/item?id=43472971, I am wondering if there are anecdotal stories of how equity is handled after a split.

On one hand, if the leaving co-founder retains all equity, it creates a sandbagging situation on a cap table that's no longer useful to the business. On the other hand, it feels right for the leaving co-founder to enjoy some upside for the years they put in.


This problem is what vesting is for.


Standard 4-year vesting doesn't work well for this situation. A founder leaving a worthless startup with 20% of the equity is a huge problem. The remaining founder will need that equity to offer outsize offers to senior (and eventually C-level) folks to replace them. And it's demoralizing for the remaining founder and team to be working extremely hard to make the company successful, while the departed founder reaps the rewards with no effort.

Opinions will differ here, but I think if you're leaving a pre-PMF startup you've created essentially no durable value, and should return nearly all of your equity.

I've heard of startups doing 10 year vesting for founders (with double trigger) to align this better.
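The arithmetic behind that point is simple. Assuming standard monthly vesting with a one-year cliff (schedule lengths and dates here are illustrative), a founder leaving at the two-year mark keeps half of a 4-year grant but only a fifth of a 10-year one:

```python
# Vested fraction under monthly vesting with a cliff; numbers illustrative.
def vested_fraction(months_served, total_months=48, cliff_months=12):
    if months_served < cliff_months:
        return 0.0  # nothing vests before the cliff
    return min(months_served, total_months) / total_months

print(vested_fraction(24))                    # leave at 2 years, 4-year grant
print(vested_fraction(24, total_months=120))  # same departure, 10-year grant
```

That gap is the whole argument: stretching the schedule shrinks what a pre-PMF departure walks away with, without changing what a founder who stays the course ultimately keeps.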


I'm sympathetic. We tried to get 10 year vesting together at my last company. But 4/1 is industry standard, and my rebuttal is that partners just bake 4/1 expectations into their decisions. If the partnership is uncertain about a member 6 months in, and kicks the can down the road another 7 months, that's on the partnership, not the structure.


If you can't do 10 up front, you can usually reset founder vesting back every funding round to slow it down. This is fairly common.


You're counting on one reticent founder not jamming everything up, though.


Yeah, but this also offers a clear exit opportunity (during the raise), and limits the "blast radius" to time-since-last-raise, rather than progress against the first four years of the company.


I want to get better at taking project notes for work via Obsidian. I'm curious whether you have a different page per project or just put everything in one giant log. I like the idea of organizing it, but it takes me a bit of time to figure out which notebook a note should go under.


The beauty of Obsidian is that you don't have to file under a single notebook/directory. If a note belongs to more than one category, you can tag it or link it from both relevant notebooks.


Agentultra was talking about paper notebooks imho.


Love it! It would be cool to be able to auto-tag cuisine type. Did you use an LLM to scrape and parse recipe details?


Actually adding that as we speak! Categories, cuisine, dietary, etc. Yeah, I am using OpenAI; I tested with Claude but it was actually less reliable. FWIW, I am extracting the HTML and doing some cleanup on it before passing it to the LLM with instructions.

