
What's generally the go-to these days for non-serverless?

We use Vultr. So far all is good. It may be a few bucks a month more, but that's very minor in the grand scheme of things.

Hetzner?

"Loved by Teams Worldwide"

I searched three of the names and couldn't find any match.


The issue I have with these wrapper services is privacy.

> We collect several types of information from and about users of our Services:

> ...

> Prompts and Queries: The text prompts and questions you submit to AI models


I built something similar for my own personal use case: it lets any LLM float on top of all windows so you can chat with it. I do not collect such info; your data only goes to the LLM company you are interacting with:

https://apps.apple.com/ca/app/select-to-search-ai-assistant/...


Stay tuned: we have a plan, which we anticipate will be available in a year or less, that will not store anything on our servers.


If it is live, oh man, they'd better have a good moderation system.


p99 has been widely used in monitoring systems and benchmarks for the same reason.
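
For illustration, a minimal sketch of computing p99 from a set of latency samples (the numbers are made up); the point is that the mean would hide the slow tail that p99 surfaces:

    import numpy as np

    # Hypothetical latency samples in milliseconds.
    latencies_ms = np.array([12, 15, 14, 13, 250, 16, 14, 13, 15, 900])

    # p99: the value below which ~99% of samples fall; unlike the mean,
    # it exposes tail latency caused by a handful of slow requests.
    p99 = np.percentile(latencies_ms, 99)
    mean = latencies_ms.mean()
    print(f"mean: {mean:.1f} ms, p99: {p99:.1f} ms")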


It's a rolling buffer, so it just upserts at index % 4 in this case.
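
A minimal sketch of that idea (the names and the size 4 are just for illustration): each new item overwrites the slot at index % 4, so only the most recent 4 items are ever kept.

    # Fixed-size rolling buffer of 4 slots.
    buffer = [None] * 4

    def push(index, token):
        # "Upsert" into the slot for this index; older entries get overwritten.
        buffer[index % 4] = token

    for i, tok in enumerate(["a", "b", "c", "d", "e", "f"]):
        push(i, tok)

    print(buffer)  # ['e', 'f', 'c', 'd'] -- the last 4 tokens, in rotated order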


Thanks, so does that mean position within the buffer is irrelevant?


It does feel like it. The position eventually loses its meaning as more and more data gets crunched by the training process; eventually it feels like it's just a context of the past 4 tokens.


The annoying (?) part of Scala Spark is the lack of a notebook ecosystem. Also, spark-submit requires a compiled JAR for Scala, yet only the main Python script for Python. I would've loved Scala Spark if the ecosystem was in place.


What about Zeppelin?


Why does this look extremely sketchy?


One significant disadvantage of PySpark is its reliance on py4j to serialize and deserialize objects between Java and Python when using Python UDFs. That constant exchange becomes a burdensome overhead as data volume increases. However, I am glad to see efforts to create a data pipeline framework using Python and Ray.

~One suggestion: a Scala/Java Spark run of those benchmarks would be a valid baseline to compare against as well, instead of only PySpark.~ Ah, it's Spark SQL, so the execution probably wouldn't have much py4j involvement, except for the collect.
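
To make the overhead concrete, here's a minimal sketch (column name and row count are just illustrative): a plain Python UDF forces every row through row-at-a-time serialization between the JVM and the Python workers.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1_000_000)  # illustrative size

    # Plain Python UDF: each value is serialized from the JVM to a Python
    # worker, processed one row at a time, and serialized back.
    @udf(returnType=LongType())
    def plus_one(x):
        return x + 1

    df.select(plus_one("id").alias("id_plus_one")).count()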


There are also pandas UDFs, which use Arrow as the exchange format. I assume the data still has to be copied (?), but it makes the (de)serialization fast and allows for vectorized operations.

https://spark.apache.org/docs/3.0.0/sql-pyspark-pandas-with-...
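
For contrast, a minimal sketch of the pandas UDF variant (same illustrative column name and size as above): batches are exchanged as Arrow record batches and processed with vectorized pandas operations instead of row at a time.

    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf
    from pyspark.sql.types import LongType

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1_000_000)

    # pandas UDF: whole batches arrive as pandas Series via Arrow, so the
    # per-row Python serialization cost is amortized across the batch.
    @pandas_udf(LongType())
    def plus_one_vectorized(s: pd.Series) -> pd.Series:
        return s + 1

    df.select(plus_one_vectorized("id").alias("id_plus_one")).count()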


None of the benchmarks involved any UDFs.


One thing I find to be a nice touch is how the virtual 3D camera view tries to mimic the real-world camera's angle and position.

