More

init · 2026-01-03T14:24:56 1767450296

I've built and worked on this exact problem before at bigtech, startup and personal projects.

Regex works well if you have a very limited set of sender and recipient accounts that don't change often

Bayesian or DNN classifiers work well when you have labeled data.

LLMs work well when you have a lot of data from lots of accounts.

You can even combine these approaches for higher accuracy

init · 2025-11-05T12:29:11 1762345751

  Location: Sweden, EU
  Remote: Yes (can work within UTC +/- 5)
  Willing to relocate: No
  Technologies: C++, Java, Python, Go, Linux, GCP, AWS, Terraform, Bazel, CI/CD, Github Actions, WebRTC, Data processing (big and small), Machine Learning and AI (Tensorflow, LLM APIs and local models).
  Résumé/CV: Available on request
  Email: (initproc) at (proton.me)

Looking for short-term or part-time consulting projects.

Background: 10+ years experience across startups and big tech (ex Google and Stripe). Lead and contributed to projects in compliance/security, distributed systems (Fintech, payments processing, real-time communication), machine learning and AI, as well as developer productivity and release engineering.

init · on March 13, 2025

6700 / 10000. I guess because I didn't zoom on the map and got the time off by 34 years of the actual event.

The scoring could be improved.

samplank2 · on March 13, 2025

Thanks for the feedback. It's a hard game!

init · on Oct 14, 2024

This reminds me of Microsoft FrontPage more than 20 years ago.

ok_dad · on Oct 14, 2024

Well now I have to try it. I used Frontpage for all of my web sites as a kid and I miss it a lot.

hbogert · on Oct 16, 2024

The html it spit out was horrendous, but it was a good gateway to web development.

init · on July 21, 2024

> Can you explain in what way did this manifest in Sweden for instance? Or have any data/evidence besides some semi-vague claims about antisemitism and some immutable characteristics of European societies (which are hardly monolithic to begin with).

If you live in Sweden then you will also know that the state puts refugees in the same areas (Rinkeby, Vivalla, Tensta, etc ...). These areas are then labeled as unsafe because of a slightly elevated crime rate and because they're labeled unsafe, swedes start moving out and quality of services and house prices drop and the downward spiral continues until the area becomes a ghetto even though they're usually not that bad.

Although SFI exists to teach Swedish to immigrants, the quality of the teaching is not great in most schools.

That's where the integration effort stops.

Even professionals who move to Sweden for work have a hard time integrating in Swedish society. That's how you end up with people living in upscale parts of Stockholm for 10+ years and can barely form a sentence in Sweden.

RubberbandSoul · on July 21, 2024

> state puts refugees in the same areas (Rinkeby, Vivalla, Tensta, etc ...).

No it doesn't. Refugees are placed in municipalities all over Sweden but most choose to move to the big cities as soon as they can and end up in these districts because they are the cheapest.

> slightly elevated crime rate.

Citation needed. Compared to what? Casual crime is very high compared to traditional Swedish society. Also a lot of crime goes unreported because the locals don't trust the police to be able to do anything.

> That's where the integration effort stops.

Simply not true. There are oodles of integration efforts all over Sweden at many levels; public projects, local initiatives and on top of that immigration heavy areas gets more public funding than average for schools, after-school activities, park/street cleanings, etc.

> Even professionals who move to Sweden for work have a hard time integrating in Swedish society.

That's because Swedish is a small language and most professionals don't plan on staying. Most Swedish professionals speak English on a native speaker level and most large Swedish companies has English as the official corporate language. In my experience most non-English speakers that comes to Sweden spend their efforts on becoming fully proficient in English while the English speakers are delighted to find that they can use English everywhere in society. Learning Swedish has a very low priority and after a couple of years most expats grows tired of the cold, darkness, taxes, low salaries, etc.

EGreg · on July 21, 2024

In my opinion, refugees should be spread out and placed among neighbors who are willing to interact positively with them and invite them to stuff so they can integrate, and NOT allowed to relocate their residence for 5-10 years. That will be better for the country. Beggars can’t be choosers, they’re happy to get asylum.

How to enforce that: fine whoever sells/rents to them outside where they are supposed to live. And threaten to deport them if they move without the years passing or showing they’ve integrated / learned the language / culture etc.

Obviously, exceptions can be made for reasons of safety or being closer to a job they got, but then the same procedure should be followed (spread out and surrounded by neighbors willing to help integrate them).

They should also have access to resource to accelerate the cultural integration, like meetups and schools etc.

RecycledEle · on July 21, 2024

I like the idea of setting an immigration quota based on how many meighborhoods overwhelmingly vote to welcome immigrants, and then requiring the immigrants to live in those neighborhoods as a condition if their immigration status.

If the immigrants enrich the community, those who welcomed them get the enrichment.

If the immigrants bring crime and disease, those who welcomes them get the crime, disease, and decreased property values.

I love solutions that work whether my views are right or wrong.

May I steal that for part of my political platform?

kgwgk · on July 21, 2024

It's what has been happening lately in the US. Not based on quotas but on putting people on buses:

https://www.nytimes.com/2024/07/20/us/abbott-texas-migrant-b...

It has had interesting consequences:

https://www.nytimes.com/2024/02/27/nyregion/adams-deport-mig...

https://www.nytimes.com/2024/03/18/nyregion/shelters-are-ove...

https://www.nytimes.com/2024/05/25/nyregion/hotels-prices-mi...

EGreg · on July 21, 2024

Sure. It sounds very bottom-up and libertarian, the kind of libertarian I am is exactly this thing … making a new bottom-up system with software, giving people the tools to self-organize, get critical mass in various local areas, and then using the new tech to bring about change by working with the old top-down structures.

Facebook and Uber and AirBNB did it in social networking and transportation and housing, respectively, starting in colleges and cities.

I am doing it for all kinds of things, and sometimes selling it to political campaigns such as I did with https://qbix.com/yang2020.pdf but that is not really my goal, just to help some politicians was never my goal

I put out a few apps like Groups on iOS and so far we attracted a million small community leaders in 100 countries, who have our app on their phones. So the first phase (bottom-up) is under way

I even launched blockchain applications worldwide, that are actually helpful: https://intercoin.org/applications

including working on launching a fund for refugees that will be crowdfunded by people worldwide: https://community.intercoin.app/t/fund-for-refugees/2688

Years I go I met with Rohingya Project guys and working together to create the R Coin, Identity and Academy on decentralized platform for the Rohingya refugees: https://rohingyaproject.com/platform/

Now this year for the first time, we got a VC (Balaji’s fund) leading our round for Network States. Balaji is a big proponent of these (kind of like Estonia’s e-residency), his fund also has Naval Ravikant, Fred Wilson and others on their investment commitee… basically a lot of people involved in Web3 (CoinBase, CoinList, etc.)

I’m going to Singapore on Sept 22nd for their conference to meet with Vitalik and others: https://balajis.com/p/network-state-conference

So if you’re serious about doing the first part of the solution (software) I recommend you can do it in software, and working on the ground with small towns and neighborhoods. I already have a platform doing just that, so if you want to do it locally, we can reach out about doing something together. We’re eventually looking to go to every part of the world, but currently we’re at a stage of just doing local pilots. Look at my profile and you can email me.

And/or come to the Singapore conference on September 22nd and let’s all meet and discuss there in person :)

But PS: our platform isn’t only about resettling refugees, although it is a big part. It’s about dating, job boards, local currencies, and much more. I think that if Donald Trump and Co get into office again, there will be a huge “crypto summer” but we need to use crypto for actual applications like the one I mentioned, with global donation crowdfunding and transparency and benefitting the stateless people on the ground, instead of crazy ponzi schemes round 4 LOL.

qb1 · on July 21, 2024

Do it! Start-up meet-ups and find a way to make the labor, especially the idle labor, more productive. This is entreprenuerism. It is also hard and then there are the costs--who pays?

KennyBlanken · on July 21, 2024

> Citation needed. Compared to what? Casual crime is very high compared to traditional Swedish society. Also a lot of crime goes unreported because the locals don't trust the police to be able to do anything.

Citation needed? You haven't provided a single citation for any of the wild claims you've made.

ryan93 · on July 21, 2024

Don’t need a citation for that

Ma8ee · on July 21, 2024

It’s not the state that put the immigrants there, but outside those areas it’s almost impossible to find something to rent, and as an immigrant without a steady income it’s kind of impossible to buy anything.

init · on April 24, 2024

Making their APIs easy to use like they used to be 10 years ago will be equivalent to releasing their core assets. In the past you could do almost anything from the Facebook API that you could do with their web or mobile app.

They release a lot of open source stuff as other commenters have mentioned but you can't build a Facebook or Instagram competitor just by integrating those components.

init · on April 10, 2024

I've used both Code Search and Livegrep. No, Livegrep does not even come close to what Code Search can do.

Sourcegraph is the closest thing I know of.

isker · on April 10, 2024

Agreed. There are some public building blocks available (e.g. Kythe or meta's Glean) but having something generic that produces the kind of experience you can get on cs.chromium.org seems impossible. You need such bespoke build integration across an entire organization to get there.

Basic text search, as opposed to navigation, is all you'll get from anything out of the box.

init · on April 10, 2024

In a past job I built a code search clone on top of Kythe, Zoekt and LSP (for languages that didn't have bazel integration). I got help from another colleague to make the UI based on Monaco. We create a demo that many people loved but we didn't productionize it for a few reasons (it was an unfunded hackathon project and the company was considering another solution when they already had Livegrep)

Producing the Kythe graph from the bazel artifacts was the most expensive part.

Working with Kythe is also not easy as there is no documentation on how to run it at scale.

isker · on April 10, 2024

Very cool. I tried to do things with Kythe at $JOB in the past, but gave up because the build (really, the many many independent builds) precluded any really useful integration.

I did end up making a nice UI for vanilla Zoekt, as I mentioned elsewhere: https://github.com/isker/neogrok.

birktj · on April 10, 2024

I see most replies here ar mentioning the the build integration is what is mainly missing in the public tools. I wonder if nix and nixpkgs could be used here? Nix is a language agnostic build-system and with nixpkgs it has a build instructions for a massive amount of packages. Artifacts for all packages are also available via hydra.

Nix should also have enough context so that for any project it can get the source code of all dependencies and (optionally) all build-time dependencies.

jeffbee · on April 10, 2024

Build integration is not the main thing that is missing between Livegrep and Code Search. The main thing that is missing is the semantic index. Kythe knows the difference between this::fn(int) and this::fn(double) and that::fn(double) and so on. So you can find all the callers of the nullary constructor of some class, without false positives of the callers of the copy constructor or the move constructor. Livegrep simply doesn't have that ability at all. Livegrep is what it says it is on the box: grep.

humanrebar · on April 10, 2024

The build system coherence provided by a monorepo with a single build system is what makes you understand this::fn(double) as a single thing. Otherwise, you will get N different mostly compatible but subtly different flavors of entities depending on the build flavor, combinations of versioned dependencies, and other things.

jeffbee · on April 10, 2024

Sure. Also, if you eat a bunch of glass, you will get a stomach ache. I have no idea why anyone uses a polyrepo.

humanrebar · on April 10, 2024

The problem with monorepos is that they're so great that everyone has a few.

refulgentis · on April 10, 2024

God that is good.

yencabulator · on April 12, 2024

Nix builds suck for development because there is no incrementality there. Any source file changes in any way, and your typical nix flake will rebuild the project from scratch. At best, you get to reuse builds of dependencies.

tayo42 · on April 10, 2024

Is there like a summary of what's missing from public attempts and what makes it so much better?

sdesol · on April 10, 2024

The short answer is context. The reason why Google's internal code search is so good, is it is tied into their build system. This means, when you search, you know exactly what files to consider. Without context, you are making an educated guess, with regards to what files to consider.

riku_iki · on April 10, 2024

How exactly integration with build system helps Google? Maybe you could give specific example?..

isker · on April 10, 2024

Try clicking around https://source.chromium.org/chromium/chromium/src, which is built with Kythe (I believe, or perhaps it's using something internal to Google that Kythe is the open source version of).

By hooking into C++ compilation, Kythe is giving you things like _macro-aware_ navigation. Instead of trying to process raw source text off to the side, it's using the same data the compiler used to compile the code in the first place. So things like cross-references are "perfect", with no false positives in the results: Kythe knows the difference between two symbols in two different source files with the same name, whereas a search engine naively indexing source text, or even something with limited semantic knowledge like tree sitter, cannot perfectly make the distinction.

dmoy · on April 11, 2024

Yes the clicking around on semantic links on source.chromoum.org is served off of an index built by the Kythe team at Google.

The internal Kythe has some interesting bits (mostly around scaling) that aren't open sourced, but it's probably doable to run something on chromium scale without too much of that.

The grep/search box up top is a different index, maintained by a different team.

sdesol · on April 10, 2024

If you want to build a product with a build system, you need to tell it what source to include. With this information, you know what files to consider and if you are dealing with a statically typed language like C or C++, you have build artifacts that can tell you where the implementation was defined. All of this, takes the guess work out of answering questions like "What foo() implentation was used".

If all you know are repo branches, the best you can do is return matches from different repo branches with the hopes that one of them is right.

Edit: I should also add that with a build system, you know what version of a file to use.

j2kun · on April 10, 2024

Google builds all the code in its momnorepo continuously, and the built artifacts are available for the search. Open source tools are never going to incur the cost of actually building all the code it indexes.

DannyBee · on April 11, 2024

The short summary is: It's a suite of stuff that someone actually thought about making work together well, instead of a random assortment of pieces that, with tons of work, might be able to be cobbled together into a working system.

All the answers about the technical details or better/worseness mostly miss the point entirely - the public stuff doesn't work as well because it's 1000 providers who produce 1000 pieces that trade integration flexibility for product coherence. On purpose mind you, because it's hard to survive in business (or attract open source users if that's your thing) otherwise.

If you are trying to do something like make "code review" and "code search" work together well, it's a lot easier to build a coherent, easy to use system that feels good to a user if you are trying to make two things total work together, and the product management directly talks to each other.

Most open source doesn't have product management to begin with, and the corporate stuff often does but that's just one provider.

They also have a matrix of, generously, 10-20 tools with meaningful marketshare they might need to try to work with.

So if you are a code search provider are trying to make a code search tool integrate well with any of the top 20 code review tools, well, good luck.

Sometimes people come along and do a good enough job abstracting a problem that you can make this work (LSP is a good example), but it's pretty rare.

Now try it with "discover, search, edit, build, test, release, deploy, debug", etc. Once you are talking about 10x10x10x10x10x10x10x10 combinations of possible tools, with nobody who gets to decide which combinations are the well lit path, ...

Also, when you work somewhere like Google or Amazon, it's not just that someone made those specific things work really well together, but often, they have both data and insight into where you get stuck overall in the dev process and why (so they can fix it).

At a place like Google, I can actually tell you all the paths that people take when trying to achieve a journey. So that means I know all the loops (counts, times, etc) through development tools that start with something like "user opens their editor". Whether that's "open editor, make change, build, test, review, submit" or "open editor, make change, go to lunch", or "open editor, go look at docs, go back to editor, go back to docs, etc".

So i have real answers to something like "how often do people start in their IDE, discover they can't figure out how to do X, leave the IDE to go find the answer, not find it, give up, and go to lunch". I can tell you what the top X where that happens is, and how much time is or is not wasted through this path, etc.

Just as an example. I can then use all of this to improve the tooling so users can get more done.

You will not find this in most public tooling, and to the degree telemetry exists that you could generate for your own use, nobody thinks about how all that telemetry works together.

Now, mind you, all the above is meant as an explanation - i'm trying to explain why the public attempts don't end up as "good". But myself, good/bad is all about what you value.

Most tradeoffs here were deliberate.

But they are tradeoffs.

Some people value the flexibility more than coherence. or whatever. I'm not gonna judge them, but I can explain why you can't have it all :)

init · on Oct 30, 2023

Henry Kissinger has a heavy German accent that didn't prevent him from becoming one of the most influential American politicians of the 20th century.

fsckboy · on Oct 30, 2023

Henry Kissinger was also known as a ladies man. Once he was in his hotel room with a pretty woman when a world crisis broke out that required his attention. However, he was not answering the phone, so a desk clerk was sent up to the room. He knocked on the door and said "Mr Kissinger, I have a message for you". From behind the door he heard, "Go avey!" but it was important so he knocked again and said "Mr Kissinger, it is urgent that I speak to you!" and again "Go avey!" so for the third time he said "It is urgent, are you Kissinger!?" and the reply "No! I'm fuckingher! Now go avey!"

My gf's mother told me that joke back in the day, with a very heavy South American accent, but it still worked, maybe a little better because she said "Kissin-gher".

I saw Henry himself just a few years ago, right before Covid, in a NYC restaurant. He's extremely old, but he seemed very together.

You might call him a "statesman", but he wasn't precisely a politician. Also, the post WWII/Cold War era opened the door, so to speak, to a large number of displaced Europeans, scholars, to give advice about East and Middle European issues, advanced science, etc. Zbigniew Brzezinski, Werner von Braun also come to mind.

init · on July 12, 2023

I worked on something like this more than a decade ago back when bayesian classifiers were SOTA for sentiment analysis. It's relatively easy to use an existing model to bootstrap and fine tune a new model. The hardest part is collecting and cleaning up the data for training.

PaulHoule · on July 13, 2023

That’s exactly it, the thought of finding and validating 5000 angry toots makes me feel sick!

init · on July 10, 2023

Unfortunately this has already been happening for a decade or more since the NATO "intervention" in Libya and it has little to do with climate change.

csomar · on July 10, 2023

The NATO intervention has nothing to do with this. The previous dictator created the threat to dissuade the EU from removing him. Now that he's out of the picture, the inevitable happened.

A good lesson is to stop propping dictators in exchange for short-term stability. Based on the current events, the EU did not learn that lesson.

atlantic · on July 10, 2023

No, the lesson is to stop interfering in other countries' internal affairs. The so-called refugees all originate from countries that suffered western military operations. "Humanitarian" intervention is the new colonialism.

sharikous · on July 10, 2023

Pakistan did not suffer "Western military operations", at least not recently and at a big scale, and half of the refugees in the ship were probably Pakistani.

"Humanitarian" intervention has always been an excuse to conquer, at least from the times of the Persian Empire or so.

And there have been refugee waves wherever there had been war and famine - that is all over the globe, with nothing explicitly indicating the "western interventions" are outsiders

atlantic · on July 10, 2023

The vast majority of refugees crossing the Mediterranean are from countries torn apart by western-instigated civil unrest or war: Syria, Iraq, Afghanistan, Lybia, and various north African countries subject to "color revolutions". This isn't even remotely controversial. Pakistanis, Indians, Nepalese, generally fly in with tourist or student visas, and then try to find work.

sharikous · on July 10, 2023

Read the article, in that specific ship it is believed that half were from Pakistan

atlantic · on July 10, 2023

Even if this particular ship is an outlier, the broader point stands regarding the make-up of refugee populations.

And concerning Pakistan, former PM Imran Khan stated his fall from power was caused by US interference. That counts as major.