RegExr: Learn, Build and Test Regex

nerdponx · on Jan 27, 2022

I (and most people I talk to about this stuff) tend to use Regex101 (https://regex101.com) for this purpose. It will be interesting to spend some time with RegExr and compare the two tools.

I am also aware of RegexBuddy (https://www.regular-expressions.info/regexbuddy.html), whose author also publishes very good regex learning content on their site. It looks great, but it's a closed-source Windows-only application, which means it's something I'll never be able to benefit from.

bloblaw · on Jan 27, 2022

RegexBuddy works perfectly on wine: https://www.regexbuddy.com/wine.html

Why does it matter if it's closed source?

I am a paying customer of RegexBuddy. Best $39 I've ever spent on software.

RegexBuddy's `debug` feature has no equal in open-source or commercial software: https://www.regexbuddy.com/debug.html

It has many more features than regex101 AND it keeps all data local. Maybe regex101 keeps things locally too, but I have to run it in a browser and I'm not about to put sensitive test data there.

That being said, regex101 is very well done, but I paid $39 for RegexBuddy 14 years ago (and price is still the same) and last year paid $19 for the optional upgrade from v3 to v4...but I really only did that to support the developer. v3 still met all my needs.

teh_klev · on Jan 28, 2022

I bought my copy of RegexBuddy in 2008. I should probably upgrade as well, if only to support the developer.

For me it's definitely been one of those "best money spent ever" tools as well.

fiatjaf · on Jan 27, 2022

https://www.regular-expressions.info/ is the best place to learn though, not because of the tools, but because the text is so good, so clear, you can learn without helper tools, just by reading, and become a regex master in one day.

libraryatnight · on Jan 27, 2022

Last time a regex conversation came up on HN someone turned me on to https://regexcrossword.com - which is good fun if you're someone who enjoys regex :)

geenew · on Jan 28, 2022

I’ve thought for a while that regex could make for a good competition. Given an input, write a regex that will get some output. Multiple competitors get the same test, first one to produce the output wins. In a tie, shortest regex wins :)

Would make for an entertaining show as the various malformed regexii produce partially correct outputs.

l30n4da5 · on Jan 27, 2022

I've used RegExr for a few years now. No real reason other than it was the first tool for writing regex that I found.

blahyawnblah · on Jan 27, 2022

This one is my go-to. But it would be a lot better if it didn't display an alert when you try to leave the site. I've started to try and find alternatives.

cdolan · on Jan 27, 2022

I don't get a warning to leave the site unless I've input data into the form. I find that to be a valuable feature and not a nag.

dmitriid · on Jan 27, 2022

> I've started to try and find alternatives.

https://regex101.com/

kroltan · on Jan 27, 2022

In Firefox you can set dom.disable_beforeunload to true in your about:config to disable this behaviour (globally)

a99c43f2d565504 · on Jan 27, 2022

I'd like a tool similar to this but for sed or awk usage. Like to see interactively what would be the output for given input and command. Particularly for the toolchain distributed with Ubuntu. I believe demand for such tool would be greater than just me. Let me know if you know one!

kbouck · on Jan 28, 2022

Ultimate Plumber https://github.com/akavel/up

HN discussion: https://news.ycombinator.com/item?id=26644110

laacz · on Jan 28, 2022

How about this one? https://awk.js.org/

nerdponx · on Jan 27, 2022

I was just talking about this yesterday. I too would really appreciate a tool like this!

viggity · on Jan 27, 2022

I think one reason why most people have a hard time reading regex is because they don't use any indentation or linebreaks. Honestly, if a buddy came to you and asked you to help him debug a javascript method and all 15 statements were on the same line, would you offer to help him, or tell him to fix his shit first so you can read it? What if it was all on one line and his variable names were all "v1", "v2", etc. Would you help him then? fuck no. And yet, this is standard operating procedure with regex, except you don't even get "v1", "v2" because nothing is labeled at all. v1/v2/... would be an improvement!

This is how most people write a simple date regex:

\d{1,2}/\d{1,2}/(\d{4}|\d{2})

And mind you, this is a very simple scenario. Here is how you would write it if you treated it like actual code:

(?<month>\d{1,2})

/

(?<day>\d{1,2})

/

(?<year>\d{4}|\d{2})

First off, you can know what my intent is when I'm capturing each group. Maybe this code gets used by a european where the month and day switch places. They can figure out how to fix it in like two seconds. Secondly, the forward slashes are not lost in a sea of characters anymore because we use whitespace like a civilized developer, not a regex savage.

If you want to keep things simple with regular expressions:

* Be liberal with what your pattern matches and use a normal programming language for your complicated conditional logic to filter out crap you don't want

* Don't be afraid to break up the search with multiple regular expressions

* Ignore pattern whitespace and use it to visually break up your pattern. Nobody would agree to debug javascript that has been minified, yet people do this all the time with regex

* For the love of all that is holy, USE NAMED GROUPS. It is a fantastic way to document your intent.
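A minimal Python sketch of the advice above, assuming Python's `re` module, which spells named groups `(?P<name>...)` and ignores pattern whitespace under `re.VERBOSE`. The helper name `parse_date` and the rule that two-digit years mean 20xx are illustrative assumptions, not part of the original comment.

```python
import re
from datetime import date

# The date pattern from above, written with re.VERBOSE so whitespace and
# comments inside the pattern are ignored. Python spells named groups
# (?P<name>...) rather than (?<name>...).
DATE_RE = re.compile(
    r"""
    (?P<month>\d{1,2})     # deliberately liberal: accepts 0-99
    /
    (?P<day>\d{1,2})
    /
    (?P<year>\d{4}|\d{2})
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return a date for the first M/D/Y match in text, or None.

    Hypothetical helper: the regex stays permissive, and ordinary code
    rejects impossible dates such as 13/45/2022.
    """
    m = DATE_RE.search(text)
    if m is None:
        return None
    year = int(m.group("year"))
    if year < 100:
        year += 2000  # assumption: two-digit years mean 20xx
    try:
        return date(year, int(m.group("month")), int(m.group("day")))
    except ValueError:
        return None
```

Swapping the `month` and `day` groups for a day-first locale is a two-line change, and the range checking lives in `date(...)` instead of in the pattern.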

nicoburns · on Jan 27, 2022

Part of the reason for this is that many languages don’t support these features. For example, JavaScript supports neither extra whitespace nor named capture groups in its regexes.

s1mon · on Jan 27, 2022

Rubular: a Ruby regular expression editor (https://rubular.com/r/mP6IRzteSm) is another option which is pretty minimal but useful.

freedomben · on Jan 28, 2022

I love Rubular and use it often. The "Regex Quick Reference" block is so incredibly useful too when you don't completely remember that one thing.

However it is closed source, and it sends your info to the server for processing. This is of course not (likely) nefarious, but it does give me a little pause, and means that you should never use it if the data you are putting into it is sensitive! For example, don't test your API key parsing regex on it with real/active keys! Also for corporate dev you are (probably) violating company policy by using it.

Rubulex[1] is a neat open source clone that I use a lot. The only downside is I have to start it locally. One of these days I'll stand up a permanent instance, though I don't want to do that without auditing the code and I simply haven't had time to do that. If anyone has done so I'd love to hear about it. Scriptular[2] is an open source clone that uses javascript.

Side note: If anyone knows of or wants to build an Elixir regex tool in Phoenix/LiveView, I'd be willing to collaborate a bit and willing to host/maintain (and I'll pay for the VM/domain). I've already got some Phoenix apps running in prod so if you get it to work with `mix phx.server` I can take it from there (I know that operationalizing and devops isn't what usually interests most people). A non LiveView version would be cool too, but it seems like such a cool project to build with LiveView to be super responsive and show off what you can do (and also learn Elixir/Phoenix/LiveView with a simple app). Would be really neat to release an elixir-desktop[3] version too! My email is in my profile if you are interested.

[1]: https://github.com/ofeldt/rubulex

[2]: https://github.com/jonmagic/scriptular

[3]: https://github.com/elixir-desktop/desktop

avgcorrection · on Jan 27, 2022

(I’m too much of an idealist for my own good.)

I’m sure that these resources are great. And it’s not their fault that the regex family of languages evolved in the way that they did. But the simpler regex languages (without backreferences and other stuff… that I might not even know about) seem simple at first glance. In a perfect world I want to just spend an hour internalizing them forever. But in practice it seems that doubt always grips me, mostly because of the meta-syntax problem: did I unintentionally use some metacharacter in this part of the string which I meant to be “fixed”? So then I feel I have to “validate” it with some external tool. And suddenly it feels like this seemingly terse and agile language is just making me second-guess myself.

Cyberdog · on Jan 27, 2022

> did I unintentionally use some metacharacter in this part of the string which I meant to be “fixed”?

When in doubt, just throw a backslash in front of it, which always means "the next character is to be interpreted literally," even in cases where it's not necessary.

(Well, not always; the backslash will invoke a special character when thrown before some letters; eg, "\t" means the tab character. But normal letters never need to be escaped; just punctuation.)
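For the "did I accidentally use a metacharacter?" worry, many regex libraries can do the escaping for you. A small sketch in Python, assuming the standard `re.escape` function; the sample strings are made up for illustration.

```python
import re

# A literal string that happens to contain metacharacters ($ . ( )).
literal = "price: $5.00 (approx.)"

# re.escape backslash-escapes the characters the regex engine would treat
# as special, so the resulting pattern matches the string literally.
pattern = re.compile(re.escape(literal))
matched = pattern.search("Total price: $5.00 (approx.) today") is not None

# Without escaping, "." matches any character, so "5.00" also matches "5x00".
unescaped_matches_other = re.search(r"5.00", "5x00") is not None
escaped_matches_other = re.search(re.escape("5.00"), "5x00") is not None
```

Here `matched` and `unescaped_matches_other` are true while `escaped_matches_other` is false, which is the difference between the "fixed" string and the accidental pattern.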

burntsushi · on Jan 27, 2022

Fun fact: Rust's regex crate won't let you do this. If you try to escape a character that isn't a meta character, you get an error. So in cases like this, it will erase your doubt.

(There is ongoing discussion about relaxing this rule for some characters, since it is so common in some cases. For example, escaping / is so common that folks try to do it with the regex crate and are surprised when it returns an error. / is rarely a regex meta character; rather, it tends to denote the start and stop of regexes, e.g., in Javascript or sed.)

somehnguy · on Jan 27, 2022

I switch between this site and debuggex.com depending on what I’m doing at the time. I find they both have their strengths for specific tasks.

scottc · on Jan 27, 2022

What's the diff? I love regexr and credit that site to my finally understanding regex.

In fact, I was just on it this morning!

somehnguy · on Jan 27, 2022

I think regexr does a better job of helping you to understand exactly what is happening in the regex. I typically reach for debuggex when I already have a decent understanding of how to accomplish what I want and just want a simple way to edit & test; I think the interface is less busy for that case.

funstuff007 · on Jan 28, 2022

debuggex makes really nice railroad diagrams.

mkdirp · on Jan 28, 2022

Regexr was great. It's a shame development has stalled. I've started using regex101 instead, mostly because of support for more than javascript and php. There is plenty of opportunity to add some additional languages through webassembly but sadly I don't think this is gonna happen.

newusertoday · on Jan 27, 2022

Is there a list of regex patterns for common use cases like IMEI, geo coordinates, etc. somewhere? My Google searches lead me either to regex tutorial sites or to regex libraries. There are a handful of results for emails, URLs, etc., but I'm not finding an exhaustive list.

nerdponx · on Jan 27, 2022

Do geographical coordinates have a standard layout? I'd expect that you have to look at your particular data for cases like this.

Something like IMEI should be pretty easy, if Wikipedia [0] is to be trusted (e.g. in Python):

    import re

    # Matches IMEI and IMEISV
    imei_pattern = re.compile(r"\d{2}-\d{6}-\d{6}-\d\d?")

You could write a big monster pattern that sets up capture groups for all the different TAC and Check Digit variants, but why bother? Just slice off what you need from the result after matching.
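A rough sketch of the "match first, slice afterwards" approach, reusing the pattern above. The helper name `split_imei` and the field labels are illustrative assumptions; check the labels against the Wikipedia layout before relying on them.

```python
import re

imei_pattern = re.compile(r"\d{2}-\d{6}-\d{6}-\d\d?")


def split_imei(text):
    """Hypothetical helper: find the first IMEI/IMEISV, then split it."""
    m = imei_pattern.search(text)
    if m is None:
        return None
    # Four hyphen-separated fields; the dictionary keys are illustrative labels.
    prefix, tac_rest, serial, tail = m.group(0).split("-")
    return {
        "tac": prefix + tac_rest,   # assumed: first eight digits
        "serial": serial,
        "tail": tail,               # one digit (IMEI) or two (IMEISV)
    }
```

The pattern stays short, and the length of `tail` tells the two variants apart without any extra capture groups.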

0: https://en.wikipedia.org/wiki/International_Mobile_Equipment...

brendanfalk · on Jan 27, 2022

RegExr is my go-to tool for testing regex. It’s always been able to solve my needs and I’ve never had any need to change. In saying that, I’ve always wondered if regexr is missing out on something that other regex build/test tools have?

What other regex tools do people use and why?

Glant · on Jan 27, 2022

I typically use Regex101. It's been good enough for my needs and I'm just used to it at this point. Looking at RegExr, it seems like the only big difference is that Regex101 supports substitution, though I think I've only used it once.

https://regex101.com/

walls · on Jan 27, 2022

It also lets you switch between regex implementations in different languages.

ask_b123 · on Jan 27, 2022

What do you mean by substitution? Replacing a match with a string/pattern?

If so, RegExr does have that tool (at the middle/bottom, tools bar), and it probably is the functionality I most often use.

Glant · on Jan 27, 2022

Ah, I see it now. Usually when I do replace/substitution it's in a larger project so it's not a feature I typically use on the site. Most of my Regex101 use is just figuring out why a regex I wrote isn't executing like I expected.

ask_b123 · on Feb 10, 2022

I normally use the replace/substitution in situations such as: I have a list of, say, 1000+ things/rows in a single column. I want to put them on a single line, separated by commas. It is pretty easy to go there and replace \n with ", ".

That said, that's also doable in a text editor, but I like that I can visualize the pattern and results more easily.
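The same column-to-line substitution as a short Python sketch, assuming `re.sub` from the standard library; the sample rows are made up.

```python
import re

rows = "alpha\nbeta\ngamma\n"

# Strip the trailing newline first so the result does not end in ", ",
# then replace each remaining line break (\n or \r\n) with ", ".
one_line = re.sub(r"\r?\n", ", ", rows.strip())
```

For input this simple, `", ".join(rows.splitlines())` gives the same result without a regex; the regex version earns its keep once the separator pattern gets more involved.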

Lindrian · on Feb 3, 2022

regex101 has a few more distinguishing features:

- more flavor support,

- a regex debugger,

- code generator, with support for a lot of languages,

- a complete quick reference with examples,

- an extensive regex library,

- a regex quiz for "golfing" and learning purposes,

Perhaps, most importantly, it runs entirely client side and does not submit any information to the server unless you hit save (which returns a delete link to remove all data). You can even run the website and (most) of its features offline.

Regexr submits all input to the server for processing.

But, I'm biased, since I wrote regex101 :)

Zababa · on Jan 27, 2022

I use regex101.com, that looks a lot like RegExr. I remember using RegExr at some point, and I think I default to regex101 just because the name is a bit easier to remember. I'll try to remember to switch to RegExr since it's open source.

lvl100 · on Jan 27, 2022

Are there any regex builders that work with natural language inputs?

maciejgryka · on Jan 27, 2022

If I understood what you mean, then yes, I built one https://regex.help/ (powered by https://github.com/pemistahl/grex doing the heavy lifting).

Hjfrf · on Jan 27, 2022

Like this? https://devblogs.microsoft.com/powershell/convertfrom-string...

blable2 · on Jan 27, 2022

Just thinking out loud... can't I just ask the google "Hey Google, in PERL, show me a regex to find the first occurrence of a semicolon to the next period."?

Mockapapella · on Jan 27, 2022

Regex seems like a good use case for GPT3. Most people that I've seen use regex use it so rarely that they end up having to relearn the syntax each time they use it.

Hjfrf · on Jan 27, 2022

There are some regex AI tools already - e.g. https://devblogs.microsoft.com/powershell/convertfrom-string...

It's what excel uses for power query's "column from examples" too.

janpot · on Jan 27, 2022

I sometimes use https://regexper.com/ to debug hard to understand regexes.

bloblaw · on Jan 27, 2022

If you paste a regex into RegexBuddy, it will explain each portion of the regex. Clicking on parts of the regex will highlight its meaning.

https://www.regexbuddy.com/analyze.html

Or it has the best regex debugger I've ever seen in my 20 year career: https://www.regexbuddy.com/debug.html

angryGhost · on Jan 27, 2022

This site is a rite of passage for any programmer, ever.

yashg · on Jan 27, 2022

I've used regexr for years. It has helped me build some really complex expressions.

jklinger410 · on Jan 27, 2022

Validating regex with some visualization is cool and all...but I actually value my time. I want a WYSIWYG regex builder, not a tool that helps me learn it.

bloblaw · on Jan 27, 2022

Oh, you should check out RegexMagic. That's exactly what it does: https://www.regexmagic.com/

Written by the author of RegexBuddy and O'Reilly's Regex Cookbook: https://learning.oreilly.com/library/view/regular-expression...

jklinger410 · on Jan 27, 2022

Ah this is exactly what I am looking for! I'll have to see if the free version will run in Wine.