Union Types have been accepted for PHP 8

yagodragon · on Nov 9, 2019

When I started programming a few years back, I was told that PHP is legacy tech and a terrible language I should avoid like the plague. After years of battling with Node.js on the backend, I took a closer look at modern PHP and the Laravel framework and I was amazed by how good the developer experience was. The language itself feels like a lite version of Java. I believe everyone starting out in web development should know PHP. It's widely deployed, it has a very mature and rich ecosystem, and it's a great language for building your side project/business without spending your time on meaningless tasks. Don't be driven by FOMO like I was; there's no such thing as a "perfect programming language". Every language has its quirks and that's fine. PHP might be old and boring but it helps you get things done faster than using the new coolest language.

ashton314 · on Nov 9, 2019

> there's no such thing as a "perfect programming language". Every language has its quirks and that's fine. PHP might be old and boring but it helps you get things done faster than using the new coolest language.

I disagree with you on this point. I worked with PHP for two years on some legacy systems, as well as some Laravel-based systems. While Laravel yields results that are worlds away from PHP built in the '90s, I still wasted dozens of hours because of poor constructs that the language is unfortunately coupled to. I documented each instance where I wasted a significant amount of time due to scoping issues, closure problems, arrays-that-are-arrays-sometimes-but-maps-other-times, etc.

PHP has gotten better. But there are so many good choices that provide much better tooling, guarantees, etc. Examples: Ruby, Go, and Elixir. Heck, even Perl is more sane about its data types in some ways by enforcing consistent comparisons with `==` vs. `eq` and much more robust data structures.

I agree that every language has its quirks. But I think PHP has so many that they often seriously get in the way. I don't think it offers any significant velocity gains over using, say, Ruby.

uryga · on Nov 10, 2019

> arrays-that-are-arrays-sometimes-but-maps-other-times

god, the time i spent going paranoid debugging a stray "undefined array index 0" just to find out

  array_filter(
    ['a', 'B', 'c', 'D'],
    'ctype_upper'
  )

returns

  [1 => 'B', 3 => 'D']

and not

  ['B', 'D']

like every other `filter(...)` i've used...

EDIT to be fair, i see the rationale – having the index of the filtered element is useful sometimes, and requires some contortions with the usual impl of filter. it's quite neat, because you get both the index and the elements! but it's just... surprising as the default, and PHP's conflation of maps and lists obscures it – all the docs say is "array keys are preserved", which makes sense in retrospect, but doesn't really jump out for something with this much impact

ashton314 · on Nov 10, 2019

This is exactly the sort of thing that bites. It totally breaks your functional pipelines: now you have to thread everything through `array_values`. Ew ew ew. No thank you.

astrodev · on Nov 10, 2019

On the other hand, having learned PHP inside-out as my first language, this is perfectly natural to me, and I miss the flexibility of PHP arrays in other languages :)

dabernathy89 · on Nov 10, 2019

Never heard anyone complain about scoping or closure issues with PHP. I'd be interested to hear more if you recall.

jamroom · on Nov 10, 2019

I don’t believe there are maps in PHP - just plain arrays which to be honest are pretty simple to use. Unlike say JavaScript where arrays are always numerically indexed, PHP can use any value as an index.

azernik · on Nov 10, 2019

JavaScript, by contrast, doesn't have arrays - it just has objects, which can have numerical keys.

  a = []
  a.foo = 'bar' // works!

  b = {}
  b[0] = 42 // also works!

I happen to think these designs are both INSANE, but even if you like them, PHP has no advantage here.

roblabla · on Nov 10, 2019

I’m pretty sure this is wrong. JS objects and arrays are different. The latter yield only the elements of the array itself when iterated and have a consistent iteration order, while the former do not.

azernik · on Nov 10, 2019

That's because arrays happen to override the special [Symbol.iterator] method:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

Arrays are objects of class Array which have special literals, override the toString() method, and update the non-enumerable "length" property on certain operations.

Try calling Object.keys([1, 2, 3]), or even better try Object.getOwnPropertyNames([1, 2, 3])
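A quick sketch of what those calls show, runnable in Node.js or a browser console. The variable names are only illustrative and not from the thread:

```javascript
// Arrays are objects: their indices are ordinary own properties
// whose keys are strings.
const arr = [1, 2, 3];

// Enumerable own keys: just the indices.
const enumerableKeys = Object.keys(arr);
console.log(enumerableKeys); // ["0", "1", "2"]

// All own string keys, including the non-enumerable "length".
const allKeys = Object.getOwnPropertyNames(arr);
console.log(allKeys); // ["0", "1", "2", "length"]

// A named property becomes a key too, but for...of skips it, because
// the array iterator walks indices 0..length-1 only.
arr.foo = 'bar';
const keysWithFoo = Object.keys(arr);
const iterated = [];
for (const value of arr) {
  iterated.push(value);
}
console.log(keysWithFoo); // ["0", "1", "2", "foo"]
console.log(iterated); // [1, 2, 3]
```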

TurningCanadian · on Nov 10, 2019

Order is consistent in both

https://www.stefanjudis.com/today-i-learned/property-order-i...
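A small illustration of that ordering, assuming a current engine that follows the ES2015 own-property order: integer-like keys come first in ascending numeric order, then the remaining string keys in insertion order. The object and key names are made up for the example:

```javascript
// Insert keys in a deliberately scrambled order.
const obj = {};
obj.b = 1;
obj[2] = 1;
obj.a = 1;
obj[1] = 1;

// Integer-like keys are sorted numerically and listed first;
// the other string keys keep their insertion order.
const order = Object.keys(obj);
console.log(order); // ["1", "2", "b", "a"]
```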

mncharity · on Nov 10, 2019

https://www.ecma-international.org/ecma-262/#sec-array-exoti...

cutler · on Nov 10, 2019

JS arrays are array-like objects. That's the official line anyway.

uryga · on Nov 10, 2019

so isn't it like PHP, which doesn't have lists, only maps with integer keys?

azernik · on Nov 10, 2019

With the exception that unlike (I think?) PHP, JS also uses maps (which can only have string keys [1]) to implement objects and classes. Hence their being called objects, not maps.

[1] So how do arrays have numerical keys? Well... actually they don't. I was trying to make JS look better than it is in my original comment. Actually, array keys are stringified versions of numbers, and values are automatically cast to string when you put them in indexing brackets.

  Object.keys([1, 2])
  --> ["0",  "1"]

As you can imagine, this leads to gotchas when you assume that a JS object can have, say, a date as a key

  b = {}; b[new Date()] = 10;
  Object.keys(b)
  --> [
    "Sat Nov 09 2019 23:15:58 GMT-0800 (Pacific Standard Time)"
  ]
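A minimal sketch of the same gotcha with other key types, plus the ES2015 `Map` for contrast. `Map` is not mentioned above and is only one possible workaround; all names here are illustrative:

```javascript
const plain = {};

// A number and its string form are the same key.
plain[1] = 'number key';
plain['1'] = 'string key'; // overwrites the entry above

// Distinct objects collapse to one key: both stringify to "[object Object]".
const k1 = { id: 1 };
const k2 = { id: 2 };
plain[k1] = 'first';
plain[k2] = 'second'; // overwrites 'first'

const plainKeys = Object.keys(plain);
console.log(plainKeys); // ["1", "[object Object]"]

// A Map keeps keys as they are, so a Date (or any object) works as a key.
const when = new Date(0);
const byKey = new Map();
byKey.set(when, 10);
byKey.set(k1, 'first');
byKey.set(k2, 'second');

const mapSize = byKey.size;
const valueForDate = byKey.get(when);
console.log(mapSize); // 3
console.log(valueForDate); // 10
```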

uryga · on Nov 10, 2019

i think that's just a matter of naming – afaik PHP "arrays" are what you'd call a map/dict/hash in other languages

capn_cabbage · on Nov 10, 2019

I believe it is just a naming issue. Not even sure that PHP devs have agreed on how to name the type. Internally, PHP calls them both arrays and hash tables if I understand correctly.

In Zend/zend_types.h[1] of PHP source:

    typedef struct _zend_array zend_array;
    typedef struct _zend_array HashTable;
    struct _zend_array {
        ...
    };

That being said AFAICT, HashTable and zend_array are used interchangeably throughout the source. I am not a C programmer, but I did write a couple of PHP extensions and that was my general understanding. Perhaps it is a compatibility issue or just used to abstract types differently in various areas of the C API.

Check out [2] for a deeper understanding of how arrays are handled internally in PHP.

[1]: https://github.com/php/php-src/blob/php-7.3.11/Zend/zend_typ...

[2]: https://nikic.github.io/2014/12/22/PHPs-new-hashtable-implem...

igouy · on Nov 11, 2019

rtm ?

"An array in PHP is actually an ordered map."

https://www.php.net/manual/en/language.types.array.php

idoubtit · on Nov 9, 2019

PHP has come a long way, and nowadays it's in the same category as Python. Worse in many ways, but better in others, like the type declarations and their (runtime) enforcement.

Yet there are still many ugly sides to PHP, and Laravel illustrates most of them. There is so much magic that IDEs can't follow: some classes have a `__call()` magic function that redirects method calls to other instances. Some functions return values of varying types, with no common interface.

I've worked with several PHP frameworks, and Laravel is by far the worst. Its awful documentation plays a big role in this (no real reference doc, just a tutorial; no links to classes or functions in the doc; the API doc is a joke; the acclaimed "laracasts" are useless for serious work). The fact that this framework is dominant in the PHP community is worrying.

thdrdt · on Nov 9, 2019

Laravel is popular because it allows you to setup your project very quickly.

And then the trouble starts. Remember what properties your models had? No? Well, neither does your IDE. There is just too much magic going on to keep things maintainable.

If you like a framework like Laravel you should go with Symfony instead. But don't use annotations. Keep things separated so you and your colleagues can find routing and database information in logical places instead of scattered all over the place in classes.

xellisx · on Nov 10, 2019

Laravel, in my own experience, is full of deep-rooted magic and assumptions. It also throws SOLID out the window.

We use Symfony at work; we turned off the annotations package and use XML mapping for our models.

Of course, Symfony and Doctrine do have some frustration points.

thanato0s · on Nov 12, 2019

I have the same experience. Laravel is way too magical for me.

I have used Symfony for quite some time and at one point I stopped using the Doctrine data mapper. The DBAL and the query builder are enough.

xellisx · on Nov 12, 2019

If it were up to me, it would all be SQL queries. Maybe even use the library I rebuilt - https://github.com/ellisgl/GeekLab-GLPDO2

kyriakos · on Nov 10, 2019

For model properties and a few other IDE-related issues with Laravel, you can use IDE Helper, which generates annotations on your models by inspecting the schema in the database. The IDE then uses the annotations and gives you property autocomplete, among other things.

https://github.com/barryvdh/laravel-ide-helper

melicerte · on Nov 10, 2019

I would recommend you give Yii[0] a try. It is a solid, consistent and well-thought-out framework with very good documentation. Unfortunately, it has always been eclipsed by new-kids-on-the-block frameworks like Laravel, but it really deserves more credit (provided you are willing to let some of the Laravel magic go).

I work for a web development company and we have been successfully using Yii for years now, including for medium-size projects (in terms of LOC and exposure).

Just my two cents

[0] https://www.yiiframework.com

GeneralTspoon · on Nov 9, 2019

Regarding model properties - there’s a standard documentation syntax that can be used to document that and give you autocomplete in your IDE (Laravel IDE helper can also auto generate these for you).

Not ideal - since it’s not enforced in any way (the same as local scope type doc declarations). But it can be useful.

TrispusAttucks · on Nov 10, 2019

I agree with regard to Laravel. It was a breath of fresh air in the early days, but they have baked so much magic into it and break backwards compatibility so constantly that I have long given up even considering it. I will second the recommendation for Symfony [1]. It lets you compose your app from just the components you need instead of the everything-and-the-kitchen-sink approach. Composable systems are my preference, though they may not be for everyone.

[1] https://symfony.com/

CodeWriter23 · on Nov 9, 2019

How long ago was your experience? I started using Laravel early last year, took a hiatus and have been actively developing in it for the past 7 months. This is coming from years with Perl Dancer / Dancer2. I have not run into these issues. But I have to admit my own bias in preferring the “show me how” instructional method over “tell me everything, no matter how esoteric”. I’ve also found Laravel superior to Dancer in every way. But I did have to get into a different mindset.

lone_haxx0r · on Nov 10, 2019

> But I have to admit my own bias in preferring the “show me how” instructional method over “tell me everything, no matter how esoteric”.

I don't think it's about preferring one approach or the other.

A good library/framework should have both instructional documentation and reference documentation. They have different use-cases and are not interchangeable.

cutler · on Nov 10, 2019

Comparing Dancer with Laravel is a bit apples to oranges, no? Mojolicious or Catalyst would make a better comparison with Laravel.

dabernathy89 · on Nov 10, 2019

> Its awful documentation plays a big role in this

Laravel is lauded by many for its fantastic documentation. It's not comprehensive, but it's still pretty extensive, easy to read, and the gaps are only for niche situations.

As far as the API docs, what is your complaint about them?

kyriakos · on Nov 10, 2019

Where are the api docs?

peterkelly · on Nov 10, 2019

I know absolutely nothing about Laravel and haven't touched PHP in nearly 20 years - but within about 10 seconds on Google I found the following, which at first glance looks pretty comprehensive:

https://laravel.com/api/4.2/

tazard · on Nov 10, 2019

That is an old version, the current being https://laravel.com/api/6.x/ but I have to agree, I have found them to be great. There is no reference to the API docs from the main website though, which I have always found odd.

dabernathy89 · on Nov 18, 2019

There is - I guess it's not super easy to find. But if you go to the main documentation page, it's under the "Prologue" heading.

clairity · on Nov 9, 2019

php is definitely better these days (i used to use it early in my career), but unless you're already knee-deep in php, i can't think of a web project where php/laravel would generally be a better choice than ruby/rails. alas, ruby has become passé too, in favor of javascript, a decidedly unrefined language.

javascript has too many hidden ways to shoot yourself in the foot, and just hasn't evolved for developer happiness the way php and ruby have.

sowhatquestion · on Nov 10, 2019

As a Laravel dev who almost went down the Rails path, I’m curious: what advantages do you think Rails has over Laravel at this point?

clairity · on Nov 10, 2019

honestly i haven't looked at laravel recently enough to remember the specifics (although others here seem to have), but 2 general strengths are (1) ruby is more consistent, intuitive and fun vs. php, and (2) rails has reasonable defaults, easy configurability on top of that, and a nice ecosystem of both built-in and community-built widgets. it's so fast to get up and running with rails.

but i'd pick php over js any day! (probably even over python/django, unless data mining is a core concern)

TrispusAttucks · on Nov 10, 2019

Definitely not performance [1]. Maybe aesthetics?

[1] https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

todotask · on Nov 10, 2019

Which part of that benchmark, which shows the average time, tells you it's not performance? I have run the benchmark on my own and PHP took longer than Ruby with JIT.

TrispusAttucks · on Nov 10, 2019

Every benchmark on that page is faster for PHP

todotask · on Nov 10, 2019

"Every benchmark on that page is faster for PHP"

Let me advise you to benchmark it accurately on your own machine setup; don't rely on what is on that page.

Check for errors with --jit-verbose=1, which will show whether it uses MJIT correctly.

ruby --jit-verbose=1

- N-Body single core

Ruby 2.6.3 6:22.00s

RubyJIT 2.6.3 3:58.18s

PHP 7.3.xx 4:10.90s

igouy · on Nov 11, 2019

> will show whether it uses MJIT correctly

    …
    JIT success (511.7ms): initialize@nbody.rb:14 -> /tmp/_ruby_mjit_p5804u105.c
    JIT success (397.5ms): block in offset_momentum@nbody.rb:68 -> /tmp/_ruby_mjit_p5804u106.c
    JIT success (607.0ms): block in energy@nbody.rb:50 -> /tmp/_ruby_mjit_p5804u108.c
    JIT compaction (53.6ms): Compacted 111 methods -> /tmp/_ruby_mjit_p5804u111.so
    Successful MJIT finish

    real 6m4.201s
    user 6m42.813s
    sys 0m4.041s



    $ time /opt/src/php-7.3.11/bin/php -n  nbody.php 50000000
    -0.169075164
    -0.169059907

    real 5m24.915s
    user 5m24.808s
    sys 0m0.020

todotask · on Nov 23, 2019

On my machine, Ruby is slightly faster at 4:05 - 4:09, which is close to PHP's timing.

    ruby 2.7.0dev (2019-11-23T07:06:30Z master b563439274) [x86_64-darwin18]

    gtime -v /usr/local/bin/ruby --jit -W0 nbody.rb 50000000
    -0.169075164
    -0.169059907
    Command being timed: "/usr/local/bin/ruby --jit -W0 nbody.rb 50000000"
    User time (seconds): 249.30
    System time (seconds): 0.58
    Percent of CPU this job got: 100%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 4:08.97

---

    PHP 7.3.11 (cli) (built: Oct 24 2019 11:29:52) ( NTS )
    Copyright (c) 1997-2018 The PHP Group
    Zend Engine v3.3.11, Copyright (c) 1998-2018 Zend Technologies
        with Zend OPcache v7.3.11, Copyright (c) 1999-2018, by Zend Technologies

    gtime -v php -n nbody.php 50000000
    -0.169075164
    -0.169059907
    Command being timed: "php -n nbody.php 50000000"
    User time (seconds): 248.02
    System time (seconds): 0.49
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 4:09.82

cutler · on Nov 10, 2019

Polish and refinement. Ruby's expressiveness and clean design make possible a level of elegance in Rails that I haven't seen in any other web framework. Ruby lends itself to DSLs much more than most other languages, Clojure excepted, and its meta-programming makes possible a number of shortcuts which help you to get a project up and running in record time. Add to that the magnitude of commits to the Rails project over its 14-year history and you have an ideal tool for SME web projects.

cutler · on Nov 10, 2019

Considering PHP's pseudo-Java verbosity I wouldn't exactly classify it as a fast-to-develop language. Doc-block comments are an abomination if you come from Ruby, Python or JavaScript, and if you follow the numerous PSRs to the letter you're lucky if you can fit a couple of lines of real code on a screen, surrounded as they will be by space-wasting blank lines, doc-block annotations and K&R-style braces.

melicerte · on Nov 10, 2019

I guess you have not read the post you are commenting on. One of the rationales behind union types is to reduce the doc-block comments. It is also, I believe, one of the ideas behind introducing a stronger type system in PHP.

From the original post: [quote] Supporting union types in the language allows us to move more type information from phpdoc into function signatures, ... [/quote]

lone_haxx0r · on Nov 10, 2019

There's no such thing as the perfect programming language, but there are languages that are plainly bad.

For example: I don't think anyone would seriously defend the idea that brainfuck is a reasonable language to write production code in.

brenden2 · on Nov 9, 2019

Whether you love it or hate it, it's cool that the PHP project has come so far. I know a lot of people still use PHP every day, so it's good to see the language continues to improve. I imagine a significant percentage of web services are still PHP based to this day.

bpicolo · on Nov 9, 2019

Over 30%, thanks to Wordpress.

tooop · on Nov 9, 2019

w3techs reports that PHP is around 78% in Alexa Top 10mil sites.

tambourine_man · on Nov 9, 2019

Facebook, Wikipedia

no_wizard · on Nov 9, 2019

I believe that Facebook is fully transitioning to hacklang

https://hacklang.org/

captn3m0 · on Nov 9, 2019

Doesn’t that count as a PHP dialect?

krapp · on Nov 9, 2019

Depends on what you mean by "dialect." I believe complete backwards compatibility with PHP is no longer a goal of Hack.

todotask · on Nov 9, 2019

The “significant percentage” is more about products; the majority of web services are in Java, C#, Go, etc., which can make use of RPC and meet security needs.

TrispusAttucks · on Nov 9, 2019

It's nice to start out typeless during problem domain discovery and proof of concept phase then quickly introduce types when you better understand the problem. This change will help that phase transition with enforcement right in the runtime as opposed to simple doc block annotations.

peterkelly · on Nov 10, 2019

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

― Linus Torvalds

Discussion: https://news.ycombinator.com/item?id=4560334

TrispusAttucks · on Nov 10, 2019

Yes. We want to spend as little time as possible on code so we can quickly find the data structure that best solves the problem we have yet to understand.

  // I know nothing. Not sure what types would work best...
  public function method($data)

  // I know more
  public function method(array $data)

  // More
  public function method(array|false $data)

  // Even more
  public function method(array|false $data) : int

  // Ah. Okay. I understand the problem domain and have proper data structures to solve it
  public function method(array|false $data) : int|float

Notice the type system never got in our way. We organically grew into the type system as our understanding of the data evolved.

Supermancho · on Nov 10, 2019

This is misused a bit, since the derivative corollary is:

After you optimize the data structures and have established relationships, worry about the code.

ssijak · on Nov 9, 2019

Types are much better utilised in/while modeling domain.

TrispusAttucks · on Nov 9, 2019

Care to explain why types are better during domain modeling? Do you think it affects velocity? No benefits to loose types during discovery?

k__ · on Nov 9, 2019

Idiomatic dynamically typed code looks rather different from idiomatic statically typed code.

Dynamic is like only using untagged unions and primitives.

TrispusAttucks · on Nov 10, 2019

So little meaning in that comment.

Paraphrasing:

1. Dynamic and Static lang code look different

2. Dynamic lang code can use variables of undeclared type

Neither statement adequately answers the parent comment.

ilovecaching · on Nov 9, 2019

I wonder how stock PHP is stacking up against Hack along the dimensions of efficiency, safety, and productivity these days.

TrispusAttucks · on Nov 9, 2019

The Hack HHVM JIT is typically more performant than standard PHP 7. However, that is more of an apples to oranges comparison of JIT vs Non-JIT than Lang to Lang comparison.

That said, PHP 7 has squeezed most of the performance from the language and the low hanging fruit is gone.

However, PHP 8 looks to bring a standard JIT to PHP [1]. I'd guess that once that happens Hack / HHVM may no longer have any advantage.

[1] https://hub.packtpub.com/php-8-and-7-4-to-come-with-just-in-...

judge2020 · on Nov 9, 2019

Just to add (you didn't say otherwise), PHP on HHVM is EoL'd https://hhvm.com/blog/2018/09/12/end-of-php-support-future-o....

tyingq · on Nov 9, 2019

"PHP on HHVM is EoL'd"

True that you shouldn't expect large existing PHP projects to run. But, the syntax is close enough that for net new projects, it's essentially PHP.

tyingq · on Nov 9, 2019

Hack supports async in a way that's production ready: https://docs.hhvm.com/hack/asynchronous-operations/some-basi...

The built in webserver, Proxygen, is also usable in production and supports TLS, http/2, etc.

Supermancho · on Nov 10, 2019

This is one bad feature among many good decisions, but bad nonetheless. The language will have yet-another-way to do type-erasure/autoboxing that will have to be tracked down, in the interest of removing boilerplate when multiple types are desired to be coerced. Type safety is recognized as valuable by the PHP core voters on one hand, and waved away by the other.

Imagine the fun!

    <?php

    function callit(float $numI) {
        $numB = $numI || ($numI/2);
        echo gettype($numB)."\n";
        $numF = ($numI/2);
        echo gettype($numF)."\n";
        return someType($numB);
    }

    function someType(int|float $numX):int {
        echo gettype($numX)."\n";
        return $numX;
    }

    $ret = callit(2);
    echo gettype($ret)."\n";

robotron · on Nov 10, 2019

That's why you do this instead:

<?php declare(strict_types=1);

Your code would flag an error in the IDE and throw a TypeError exception.

6gvONxR4sf7o · on Nov 9, 2019

This is cool! I wish tools like postgres had them too.

weberc2 · on Nov 9, 2019

Do any relational databases have them? I’ve never understood why not. They go through the trouble of providing a mechanism to formally model your data schema, but if your data involves things that can be one thing or another, they totally punt and ask you to find a way to hack it on top via ORMs or similar. Why not model everything in Postgres such that users can issue requests against their actual data model and Postgres can make it fast, instead of building a shim layer between the data model surface and Postgres (shim layer = ORM, in case it wasn’t obvious) that can only generate queries that are likely more difficult for Postgres to optimize for lack of type information?

normaljoe · on Nov 9, 2019

I don't understand this comment from the standpoint that DBs, including PostgreSQL, my fav, are not duck typed like higher-level languages. They have to store the data in binary that must conform to something that can be both validated and indexed. That is to say: how do you index something such as an int against a float? And how do you validate a constraint that the value must be between 1 and 10? This is why DBs can be hard and you have things like collations even among the same encoding for strings. Let DBs be good at data.

I don't see the need for abstraction either especially with Postgres which has a very good type casting system. I can insert a float into an integer column without any trouble and can return the same. Postgres even has a great syntactical sugar of :: for casting such that value::int or value::numeric just works.

Finally if this is a true requirement Postgres fully supports domains which are custom data types. They are not difficult to deal with and could provide for a syntax which would handle that. You might need a little work but I can have a domain which would include the int type and float type as a single data type. This still requires some plumbing to create a true union however it's not that far off. The downside however is again the DB has to store in binary so a domain like that would have two values stored for a single value which would be less than optimal solution.

The true value for PHP programmers here is that we can get closer to type validation in a duck-typed system. This provides multiple-type validation of scalars passed as parameters to a function. It is less interesting to Java, Python, or Ruby developers, where everything is an object except in a few cases. Scalars are still widely used in PHP, and this feature lets you avoid sending "banana" as a parameter to a function that in the past would just cast it to 1. If you think banana == 1 then you are going to have logic bugs.

yumh · on Nov 9, 2019

I think that's because the proper way to accomplish something like that in SQL is via specification. I do agree with you, though: postgres added enums (another feature that's not strictly needed in SQL, but nice to have) but not something like that.

It's possible to emulate union types in SQL with triggers (i.e. check that the fields A and B cannot be NULL at the same time, but it does not work if NULL is a possible value.)

An aside: I wish some relational database adopted sum types: I haven't thought about the implications, but doing 'create table foo ( bar Maybe integer );' and then 'select Some bar ...' would be cool (and maybe a cleaner way to work with NULL.)

weberc2 · on Nov 9, 2019

> An aside: I wish some relational database adopted sum types: I haven't thought about the implications, but doing 'create table foo ( bar Maybe integer );' and then 'select Some bar ...' would be cool (and maybe a cleaner way to work with NULL.)

That's not an aside, that's my whole argument--RDBMSes should support sum types for all of the same reasons that we should use RDBMSes in the first place: developers describe a data model and work against that while the database handles storage and retrieval. Postgres enums and NULL are just special cases of sum types.

coldtea · on Nov 9, 2019

>Do any relational databases have them? I’ve never understood why not

For one, because they need a fixed binary representation for the type, to persist it on disk. In a programming language you do things in memory, so you don't have that issue...

Still, you could have had union types, or even coerce everything as string, as SQLite does, but it would be bad for performance as they'd need an alternate representation.

weberc2 · on Nov 9, 2019

> For one, because they need a fixed binary representation for the type, to persist it on disk. In a programming language you do things in memory, so you don't have that issue...

Memory and consequently programming languages also require a fixed binary representation. How you represent data is orthogonal to its storage medium--you can write application memory to disk and read it back in, no problem (e.g., swap).

> Still, you could have had union types, or even coerce everything as string, as SQLite does, but it would be bad for performance as they'd need an alternate representation.

The problem doesn't go away by moving support out of the database and into the application; it only makes it worse insofar as the application is limited in its optimizations. At the end of the day, the real world has sum types, applications use them, and they are encoded into databases--they simply aren't encoded _well_ and the database isn't giving you any correctness guarantees as it does for product types (i.e., structs, records, etc).

coldtea · on Nov 10, 2019

>Memory and consequently programming languages also require a fixed binary representation

Already covered that.

Programming languages can save their data as unions or structs, and take the minimal hit to switch on the type.

DB's persisting data on disk can do the same but will take a much bigger hit.

>How you represent data is orthogonal to its storage medium--you can write application memory to disk and read it back in, no problem (e.g., swap).

The costs are not orthogonal to the storage medium however.

>The problem doesn't go away by moving support out of the database and into the application

On the DBs side, it does go away. The DB only has to guarantee what it says it supports (only store one specific type in a column). So for the DB implementors, that's a great invariant for their implementation ease and performance.

weberc2 · on Nov 10, 2019

> Already covered that.

Where? I didn’t see it.

> Programming languages can save their data as unions or structs, and take the minimal hit to switch on the type. DB's persisting data on disk can do the same but will take a much bigger hit.

Yeah, of course. Disk is more expensive across the board. Same applies for storing ints, but databases don’t punt on that. And anyway, the sum types still exist in the schema, they are just implicit, ad hoc spectacles built on the fly by the user. So you’re still dealing with performance issues, but they’re worse.

> On the DBs side, it does go away. The DB only has to guarantee what it says it supports (only store one specific type in a column). So for the DB implementors, that's a great invariant for their implementation ease and performance.

This applies to every feature for every tool. You don’t have to solve the problem if you just put it on your users.

Of course, tools have charters, and the relational axiom is that users shouldn’t have to manage their own storage and retrieval layer, but rather they should declare a data model and interface with it and the RDBMS would make search and retrieval fast and correct. Sum types are necessary in data modeling, so it fits clearly and neatly into the charter.

coldtea · on Nov 10, 2019

>Where? I didn’t see it.

"For one, because they need a fixed binary representation for the type, to persist it on disk. In a programming language you do things in memory, so you don't have that issue..."

The crucial difference I'm pointing to is "persist on disk" vs "do it in memory", not the "binary representation". Both running programs and DBs have one, but one absolutely needs to be persisted on disk, whereas live program memory doesn't.

>This applies to every feature for every tool. You don’t have to solve the problem if you just put it on your users.

There's also the fact that it might not be a problem, just an easy cop-out by the user.

In which case it's better to force your users into the more formal and rigid structure, and have them rethink their model, than turn the DB into an "anything goes anywhere" store.

weberc2 · on Nov 10, 2019

> The crucial difference I point to is "persist on disk" vs "do it in memory", not the "binary representation". Both running programs and DBs have one, but one absolutely needs to be persisted on disk, whereas live program memory doesn't.

Right, I agree, and my point was it doesn’t matter. Disk vs memory is a red herring. The same principles apply to both and the fact that disk is slower applies as much to product types as it does to sum types. In fact, sum types are represented as product types, but the system enforces invariants about the structure.

> There's also the fact that it might not be a problem, just an easy cop-out by the user.

That’s a nope from me. Tools exist to solve problems. If a tool purports to solve a problem but only does it halfway, it warrants criticism or observation.

> In which case it's better to force your users into the more formal and rigid structure, and have them rethink their model, than turn the DB into an "anything goes anywhere" store.

RDBMSs are literally forcing their users into a less formal structure. You can’t rethink your model and make them go away (they are fundamental data modeling primitives), you can only find ways to hack product types to represent them, but you have to do all the work to make them fast and you probably just have to give up on verifiable safety altogether.

And how do you get from “sum types” to “anything goes data store”? Are you sure you understand the debate?
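To make the "sum types represented as product types" point concrete, here is a minimal Python sketch. The `PaymentRow` class and its field names are hypothetical, invented only for illustration. It models one row with a discriminator column plus one nullable column per variant, and the invariant has to be written and maintained by hand. In an RDBMS the analogous hand-written guard would typically be a CHECK constraint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PaymentRow:
    """A product type (one 'row') standing in for the sum type Card | BankTransfer.

    Hypothetical example: the names are not taken from the thread.
    """
    kind: str                          # discriminator: "card" or "bank"
    card_number: Optional[str] = None  # only meaningful when kind == "card"
    iban: Optional[str] = None         # only meaningful when kind == "bank"

    def __post_init__(self) -> None:
        # The invariant a built-in sum type would enforce for free:
        # exactly the fields of the active variant are populated.
        if self.kind == "card":
            ok = self.card_number is not None and self.iban is None
        elif self.kind == "bank":
            ok = self.iban is not None and self.card_number is None
        else:
            ok = False
        if not ok:
            raise ValueError(f"row violates the sum-type invariant: {self!r}")
```

Constructing `PaymentRow(kind="card", card_number="4111", iban="DE00")` raises `ValueError`, but only because someone remembered to write the check.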

6gvONxR4sf7o · on Nov 10, 2019

>The DB only has to guarantee what it says it supports (only store one specific type in a column).

If you mean primitive types specifically, this already is far from the case, and it's great. One "type" is great, but types can be composed. Allowing types beyond primitive types has already been a blessing for me. Getting columns of type ARRAY[some other type] or MAP[type, type] is incredibly convenient. I don't feel so strongly about the JSON types that are entering into every db, but they're certainly supported and widely used.

6gvONxR4sf7o · on Nov 10, 2019

Why would the binary representation matter distinctly for storage on disk versus storage in RAM? If you need a fixed width representation, why not do unions like C does?
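As a rough illustration of a fixed-width union, the sketch below uses Python's standard `struct` module to encode either an integer or a float into the same 9 bytes. The layout (a 1-byte tag followed by an 8-byte payload) and the names `encode`, `decode`, `TAG_INT` and `TAG_FLOAT` are assumptions made for this example. Note that a plain C union carries no tag, so the tag byte here is the extra piece a tagged union needs.

```python
import struct

# Hypothetical layout: 1-byte tag + 8-byte payload, little-endian, no padding.
# Every record is exactly RECORD_SIZE bytes, whichever variant it holds.
TAG_INT, TAG_FLOAT = 0, 1
RECORD_SIZE = struct.calcsize("<Bq")  # 9 bytes


def encode(value):
    """Encode an int or a float into one fixed-width record."""
    if isinstance(value, int):
        return struct.pack("<Bq", TAG_INT, value)    # signed 64-bit integer
    if isinstance(value, float):
        return struct.pack("<Bd", TAG_FLOAT, value)  # IEEE 754 double
    raise TypeError(f"unsupported variant: {type(value).__name__}")


def decode(record):
    """Read the tag byte, then interpret the payload accordingly."""
    tag = record[0]
    if tag == TAG_INT:
        return struct.unpack_from("<q", record, 1)[0]
    if tag == TAG_FLOAT:
        return struct.unpack_from("<d", record, 1)[0]
    raise ValueError(f"unknown tag: {tag}")
```

Both `encode(42)` and `encode(1.5)` produce records of the same length, so the width is fixed on disk as well as in memory. The cost is that the payload is always as wide as the largest variant.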

sedeki · on Nov 9, 2019

It was long ago I worked with PHP, before PHP 5. Is typing enforced with PHP these days? Or is it more like type hints, as in Python?

pilif · on Nov 9, 2019

For classes it's enforced since PHP 5. For scalars you get to choose between enforcement or type coercion for callers on a per-file basis.

However, as with other dynamically typed and interpreted languages, all of this is happening at run-time, so one of the biggest benefits of strong typing (type checking at compile time) doesn't apply.

beberlei · on Nov 9, 2019

Sorry to interrupt here, but this is wrong. It is not enforced to use types for anything (arguments, return values or properties) if you don't want to. Using them is completely optional.

Edit: Ah, I believe there is a misunderstanding between posters here. The usage of types is not enforced, but when you use them their correctness is enforced at runtime.

weberc2 · on Nov 9, 2019

You can statically type check a dynamically typed, interpreted language. Python via Mypy and TypeScript being two examples.
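The distinction drawn above (hints that only a static checker reads versus types whose correctness is enforced at runtime) can be sketched in Python. The function names `double` and `double_checked` are made up for this example, and the second function only imitates runtime enforcement with a manual check; it is not how PHP implements it.

```python
from typing import Union

Number = Union[int, float]


def double(x: Number) -> Number:
    # The annotation is only a hint: the interpreter does not check it.
    return x * 2


# Runs without error even though a str is not a Number; a static checker
# such as Mypy would be expected to flag this call before the program runs.
unchecked = double("ab")


def double_checked(x: Number) -> Number:
    # Manual runtime enforcement, loosely analogous to a language that
    # verifies declared parameter types on every call.
    if not isinstance(x, (int, float)):
        raise TypeError(f"expected int or float, got {type(x).__name__}")
    return x * 2
```

With `double`, the mistake only surfaces if a separate checker is run. With `double_checked`, it surfaces when the call executes, which is still later than a compile-time check would catch it.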

j_jochem · on Nov 9, 2019

Unfortunately, PHP doesn't ship a tool for this (and it’s hard due to the way autoloading works). So PHP's typing is slightly less useful than e.g. TypeScript.

Master_Odin · on Nov 9, 2019

phpstan, phan, and psalm all exist and can provide these sorts of static analysis checks.

j_jochem · on Nov 10, 2019

They do static analyses that go beyond type checking. I have not tried phan and psalm, but the last time I used phpstan its type analysis threw false positives because it was relying on incorrect assumptions. So my impression is that phpstan lacks the rigor of a true compiler’s type-checking phase.

In my opinion, the PHP distribution shipping an official type checker which does nothing other than verify type correctness based on the information in the code would be much more useful. Kind of like `php -l`.

normaljoe · on Nov 9, 2019

Tooling is not the issue here, nor is autoloading.

ISO C and C++ don't provide tooling, nor should any language. Tooling is better handled by third parties and implementers. A language should be focused on the application and execution of the language. Unfortunately for PHP, the language and implementation are tightly coupled, as are Java and Swift.

A well constructed modern PHP project using composer has the tooling needed to statically validate. Personally I use PHPStorm but I also use IntelliJ for Java and Android. There are others out there as well.

Autoloading used correctly is no more than Java using a package line or namespace.

I find no problem with TypeScript and others. I however have a 10+ million line code base of PHP that predates TypeScript and others and needs to get a viable transition path. This gets things closer with real type errors at runtime. That is way better than my mainframe COBOL counterparts who have no path forward at the level of modern code.

j_jochem · on Nov 10, 2019

Tooling inside IDEs is somewhat useful. But being able to just run a compiler-like CLI tool to tell you if there are type errors in your program is much more useful still, since you can run it in pre-commit hooks and on the CI. As far as I know, a tool which can do this _without_ requiring extensive configuration and without throwing false positives does not exist yet for PHP.

As for the "language shouldn't provide tooling" argument: You picked C / C++ as a positive example for this. Those are standardized languages which evolve at a glacial pace. For most of their use cases, this is a good thing. But I'd say PHP's faster evolution over the last decade was the right thing for that language. Other modern language projects seem to follow a strategy of a single standard implementation with extensive tooling pretty successfully (e.g. Go, Rust, Swift, ...).

cutler · on Nov 10, 2019

10+ million lines of PHP? What kind of hardware is required to power such an elephant?

beberlei · on Nov 9, 2019

There are third-party tools such as phan, phpstan and psalm which add this functionality to PHP. All three of them are very good and actively developed.

_ugfj · on Nov 9, 2019

Objects are enforced, and a simple declaration can make PHP enforce the scalars as well. https://3v4l.org/6UU2v https://3v4l.org/q1PAq In PHP 7.4, due in a few weeks, covariant return types / contravariant arguments will arrive. https://stitcher.io/blog/new-in-php-74#improved-type-varianc...

sedeki · on Nov 9, 2019

It seems as if declare(strict_types) works for scalars as well (first example)?

However, if I remove that declare from your latter example, only one output is shown.

Edit: Never mind, I get it.

baby · on Nov 9, 2019

Now please do this for Golang :)

astrodev · on Nov 9, 2019

Before the usual avalanche of posts criticising some PHP 4 features, the union types look like a nice way to formalise something that is commonly used and very useful.