
It is possible to do so, but it stands out enough that most people don't do it with direct language features unless necessary. If you have `strict` and `warnings` enabled, which is recommended, the interpreter will give runtime warnings or compile-time errors if you try to manipulate the symbol table or redefine a function. If you still want to do those things on a case-by-case basis, you have to turn off those pragmas within a scope. So the language has built-in ways of discouraging their arbitrary use.
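The scoped opt-out looks something like this (a minimal sketch; the sub name `greet` is made up for illustration):

```perl
use strict;
use warnings;

sub greet { return "hello" }

{
    # Lexically relax the pragmas only where the monkeypatching happens;
    # outside this block, strict/warnings are back in force.
    no strict 'refs';
    no warnings 'redefine';
    *{"main::greet"} = sub { return "patched" };
}

print greet(), "\n";  # prints "patched"
```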

In tests, you can use monkeypatching as a quick way to mock, but typically, you use modules that wrap up the functionality and make it cleaner. The mocks get cleaned up on scope exit.
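The core mechanism those wrapper modules build on can be sketched with `local` on a typeglob (the `Service::fetch` names here are made up for illustration):

```perl
use strict;
use warnings;

package Service;
sub fetch { return "live data" }

package main;

sub run_mocked {
    # `local` on the glob swaps in the mock and restores the original
    # sub automatically when this scope exits, even on exceptions.
    local *Service::fetch = sub { return "mock data" };
    return Service::fetch();
}

print run_mocked(), "\n";       # mock data
print Service::fetch(), "\n";   # live data (restored on scope exit)
```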

There are also more principled ways of having composable behavior, such as <https://metacpan.org/pod/Class::Method::Modifiers> (advice/method combination).

There is another approach that takes advantage of the variety of scoping mechanisms Perl has. Using `local` indicates dynamic scope rather than the standard lexical scope of `my`. This can be used for things like setting an environment variable temporarily and having it revert automatically when leaving the scope. But this is not common.
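A sketch of the environment-variable case (the variable name `APP_MODE` is made up):

```perl
use strict;
use warnings;

$ENV{APP_MODE} = 'production';

{
    # Dynamic scope: the %ENV entry holds the temporary value for the
    # duration of this block and any code called from it, then reverts.
    local $ENV{APP_MODE} = 'testing';
    print "inside:  $ENV{APP_MODE}\n";   # testing
}

print "outside: $ENV{APP_MODE}\n";       # production
```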


PDL also has support for many of those distributions beyond the common ones. All of the GSL ones, in fact, except Wishart, which didn't get a binding because it was only added to GSL in 2018. So thanks! I'll add the one line needed to bind that to PDL now and check if others are missing.


> For instance, there’s no official Jupyter notebook support for Perl

Not sure how official support would work in the Jupyter Project since anybody can write a kernel. I wrote the Perl one (IPerl) and that has existed since 2014 (when Jupyter was spun off from IPython). It supports graphics and has APIs for working with all other output types.

Now I do need to help make it work with Binder, but it does work.

---

The other point about MCMC samplers is valid. This is why I wrote a binding to R to access everything available in R and why I use Inline::Python sometimes. I should create a binding for Stan --- should not be hard --- at least for CmdStan at first, then Stan C++ next.


Yep, there's more than one way to do things and PDL wraps all the same GSL functions <https://metacpan.org/pod/PDL::GSL::RNG#ran_gumbel1>.

Also note that PDL does automatic broadcasting of input variables so it does an entire C loop for an array of values being evaluated. See this example <https://gist.github.com/zmughal/fd79961a166d653a7316aef2f010...> for how that applies to all GSL functions that are available in PDL. Though I do notice that some of the distributions available at <https://docs.scipy.org/doc/scipy/reference/stats.html#contin...> are not in GSL.
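As a small sketch of the broadcasting behavior with plain PDL arithmetic (assuming PDL is installed; the GSL-bound functions broadcast the same way):

```perl
use PDL;

# One expression runs the whole computation as a C loop over the
# ndarray; there is no Perl-level iteration per element.
my $x = pdl(1, 2, 3, 4);     # promoted to an ndarray of doubles
my $y = $x**2 + 1;           # elementwise, evaluated in C
print $y, "\n";              # [2 5 10 17]
```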

Though when I do stats, I often reach for R and have done some work in the past to make PDL work with the R interpreter (it currently has some build bitrot and I need to fix that).


Hi, PDL core dev here. Feel free to ask me anything about it.

The last release wasn't in February, it was just last week! <https://metacpan.org/release/ETJ/PDL-2.050>.

I agree with many of the commenters here that Python has a lot of great libraries and is a major player for scientific computing these days. I also code in Python from time to time, but I prefer the OO modelling and language flexibility features of Perl.

Speaking for myself and not the other PDL devs, I don't think this is an issue for Perl-using scientists, as Perl can actually call Python code quite easily using Inline::Python. In the future I will be working on better interoperability between the two, specifically for NumPy / Pandas. This is also the path being taken by Julia and R.


Looks great! I used perl a lot when I started programming and it is lovely to see it alive and kicking with scientific computing!

As a "heavy" user of scientific computing, I must say that the name "data language" is a bit disheartening... It echoes of useless "data frames", not of cool "sparse matrices", which is what I actually need. Does PDL support large sparse matrices? I grepped around the tutorial and the book and the word "sparse" is nowhere to be found. Yet it is an essential data structure in scientific computation. Are there any plans to, e.g., provide an interface to standard libraries like SuiteSparse?


I plan to improve that, but will need to figure out the design (perhaps with something from Eigen). There is <https://metacpan.org/pod/PDL::CCS>, but it is not a real full PDL ndarray and is actually a wrapper around the PDL API.


Very interesting, thank you!

Do you have a tutorial and some examples? If not, could you write one?

I sometimes deploy Perl code at large scale for financial computing where only performance matters: with XS the overhead is low while retaining language flexibility.

Even in 2021, this is usually faster than alternatives by orders of magnitude.

PDL could be a good addition to our toolset for specific workloads.


Here is a link to the PDL book <http://pdl.perl.org/content/pdl-book-toc.html>.

I can share some examples of using PDL:

- Demos of basic usage <https://metacpan.org/release/ETJ/PDL-2.050/source/Demos/Gene...>

- Image analysis <https://nbviewer.ipython.org/github/zmughal/zmughal-iperl-no...> (I am also the author of IPerl, so if you have questions about it, let me know. My top priority with IPerl right now is to make it easy to install.)

- Physics calculations <https://github.com/wlmb/Photonic>

- Access to GSL functions for integration and statistics (with comparisons to SciPy and R): <https://gist.github.com/zmughal/fd79961a166d653a7316aef2f010...>. Note how PDL can take an array of values as input (which gets promoted into a PDL of type double) and then returns a PDL of type double of the same size. The values of that original array are processed entirely in C once they get converted to a PDL.

- Example of using Gnuplot <https://github.com/PDLPorters/PDL-Graphics-Gnuplot/blob/mast...>.

---

Just to give a summary of how PDL works relative to XS:

PDL allows for creating numeric ndarrays of any number of dimensions of a specific type (e.g., byte, float, double, complex double) that can be operated on by generalized functions. These functions are compiled using a DSL called PP that generates multiple XS functions by taking a signature that defines the number of dimensions the function operates over for each input/output variable and adding loops around it. These loops are quite flexible and can be made to work in-place so that no temporary arrays are created (which also allows for pre-allocation). The loops will run multiple times over that same piece of memory --- this is still fast unless you have many small computations.

And if you do have many small computations, the PP DSL is available for the user to use as well so if they need to take a specific PDL computation written in Perl, they can translate the innermost loop into C and then it can do the whole computation in one loop (a faster data access pattern). There is a book for that as well called "Practical Magick with C, PDL, and PDL::PP -- a guide to compiled add-ons for PDL" <https://arxiv.org/abs/1702.07753>.
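To make the PP part concrete, here is a hedged sketch of what a `pp_def` in a `.pd` file looks like (the function name `double_it` is made up; this fragment is processed by PDL's build machinery rather than run as a standalone script):

```perl
# Inside a .pd file processed by PDL::PP at build time.
pp_def('double_it',
    Pars => 'a(n); [o] b(n);',      # signature: input a over dim n, output b
    Code => q{
        loop(n) %{
            $b() = 2 * $a();        # innermost loop body, emitted as C
        %}
    },
);

pp_done();                          # finish generating the XS module
```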

---

I'm also active on the `#pdl` IRC channel on <https://www.irc.perl.org/>, so feel free to drop by.


Now you just need to port it to Raku. (Maybe you have).


I would really like to do some scientific computing in Raku. It has crossed my mind that I can maintain both Perl5 and Raku ports of some of the library code I'm writing. I just haven't worked through the tooling.


Thank you for your work. I used PDL in the early 2000s when working in the bioinformatics area.

I did not know any of the specialized languages at the time, so initially, approaching the project, I was very concerned about how to deal with matrices, but as I got to understand PDL better, I got better and better at it.

If I may suggest something (this is based on old experience, though):

a) some 'built-in' way to seamlessly distribute work across processes and machines.

b) some seamless excel and libreoffice calc integration.

Meaning that I should be able to 'release' my programs as Excel/LibreOffice files, where I code in PDL but leverage the spreadsheet as a UI plus calc runtime.

So that when I run my 'make' I get an Excel/LibreOffice file that I can version and distribute into user or subsequent compute environments, where the PDL code is translated into the runtime understood by the spreadsheet engine.

I know this is a lot to ask, and may be not in the direction you are going, but wanted to mention still.


Good ideas!

a)

A built-in way would be good. There is some work being explored in using OpenMP with Perl/PDL to get some of that. In the meantime, there is MCE, which does distribute across processes, and there are examples of using it with PDL <https://github.com/marioroy/mce-cookbook#sharing-perl-data-l...>, but I have not had an opportunity to use it.

b)

Output for a spreadsheet would be difficult, if I understand the problem correctly. It would be more about creating a mapping from PDL function names to spreadsheet function names --- not all PDL functions exist in spreadsheet languages. It might be possible to embed a Perl interpreter or do IPC with one, the way <https://www.pyxll.com/> does for Python, but I don't know how easy that would be to deploy when distributing to users.

Am I understanding correctly?

Interestingly enough, creating a mapping of PDL functions would be useful for other reasons, so the first part might be possible, but the code might need to be written in a certain way that makes writing the dataflow between cells easier.


Perl's CPAN has tooling for diffing versions through the use of MetaCPAN (a top-notch site which every language should try to emulate). For example, here is a diff of the URI distribution:

https://metacpan.org/diff/file?target=ETHER%2FURI-1.71%2F&so...

This information is also available through an API for integration into command line tools.


IIRC, and I could be wrong, CPAN was the first to go down the route that many modern toolchains now follow. We've looked at it.

In fact, the original creator of Glide (Go package manager) wrote about Perl and CPAN when talking about Go at http://technosophos.com/2015/09/02/dont-let-go-be-condemned-....


Yeah, I think it was. The only thing older was probably CTAN, but it didn't have the same structure.

Other things from the Perl ecosystem that should be copied are:

- CPAN Testers which automatically tests every package on multiple systems from Windows to Solaris. This helps identify portability, backcompat, and regression issues.

- CPAN mirrors, which ensure that there isn't a single point of failure. This might not be as important now with fast networks and high uptimes, but it also ensures that everyone can replicate all of the code at any time. I believe R's CRAN does this.


CPAN Testers also test on different operating system versions. And hence also test the Perl binaries, for a lot of different versions, by literally running the combined test suites for all of the (public) Perl code.

I really, really don't get why this isn't copied to the rest of the open source language environments. (Masochism? :-) )


Well, yes. But CPAN does not provide source URLs via git (GitHub, SourceForge, ...). You need to upload your tar.gz to a central repo, which is then distributed via mirrors.

All modern package managers now provide such source URLs; Go in particular went ahead with this idea.


You never know when that pull request could turn into co-maintainer permissions. ;-) That happened to me with 3 packages last month!


Well volunteered!

