BLUF: PHP-Styler will turn this ...

namespace App\Report;use App\Db\{Connection,Result}; function
buildUserReport(Connection $db,string $region,int $limit):Result{
return $db->table('users')->select('id','display_name','email_address',
'last_login_at','current_region')->where('status','=','active')->where(
'region','=',$region)->whereIn('role',['administrator','editor',
'contributing_author','subscriber'])->orderBy('last_login_at','desc')->
limit($limit)->get();}

... into this:

namespace App\Report;

use App\Db\Connection;
use App\Db\Result;

function buildUserReport(Connection $db, string $region, int $limit) : Result
{
    return $db->table('users')
        ->select(
            'id',
            'display_name',
            'email_address',
            'last_login_at',
            'current_region',
        )
        ->where('status', '=', 'active')
        ->where('region', '=', $region)
        ->whereIn(
            'role', ['administrator', 'editor', 'contributing_author', 'subscriber'],
        )
        ->orderBy('last_login_at', 'desc')
        ->limit($limit)
        ->get();
}

Try reformatting your own code at the php-styler.com demo site.


I.

PHP CS Fixer, PHP_CodeSniffer/PHPCBF, ECS, and PHP_Beautifier are code fixers. They detect rule violations in your existing code and patch them in place. You turn on the rules you care about, and they reshape the parts of your source that break those rules.

PHP-Styler, on the other hand, is a complete code re-formatter. It parses PHP source files into tokens, applies configurable formatting rules and styles, and reconstructs the code with consistent horizontal spacing, vertical spacing, and automatic line splitting. It discards your existing layout entirely and arranges each element of the source code one by one. That puts it in the same family as Prettier for JavaScript, Black for Python, dart_style for Dart, and gofmt for Go.

This nets some benefits and some drawbacks. Among others:

  • Line-length-aware reflow. Long function calls, arrays, and fluent chains are split across lines automatically. (Fixers generally do not split lines for you.)

  • Deterministic pipeline. The same code always produces the same output regardless of its prior layout, and the tool is idempotent: running it again on its own output changes nothing.

  • However, it does not preserve hand-aligned columns, and it compresses runs of blank lines to one.
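
To make the line-length-aware reflow in the first bullet concrete, here is a minimal sketch of the idea. It is not PHP-Styler's actual algorithm; the function name formatCall, its parameters, and the 80-column default are all assumptions made up for illustration. It keeps a call on one line when that fits, and otherwise puts one argument per line with a trailing comma.

```php
<?php
// Hypothetical helper, for illustration only (not part of PHP-Styler).
// Keep a call on one line if it fits in $maxLen columns; otherwise
// split it into one argument per line, each with a trailing comma.
function formatCall(string $name, array $args, int $indent = 0, int $maxLen = 80): string
{
    $pad = str_repeat(' ', $indent);
    $oneLine = $pad . $name . '(' . implode(', ', $args) . ')';

    if (strlen($oneLine) <= $maxLen) {
        return $oneLine;
    }

    // Too long: opening line, indented arguments, closing paren.
    $inner = str_repeat(' ', $indent + 4);
    $lines = [$pad . $name . '('];
    foreach ($args as $arg) {
        $lines[] = $inner . $arg . ',';
    }
    $lines[] = $pad . ')';

    return implode("\n", $lines);
}

// A short call fits within 80 columns and stays on one line ...
echo formatCall('->where', ["'status'", "'='", "'active'"], 8), "\n";

// ... while a long one exceeds 80 columns and is split per argument.
echo formatCall(
    '->select',
    ["'id'", "'display_name'", "'email_address'", "'last_login_at'", "'current_region'"],
    8
), "\n";
```

A fixer would leave the long call alone unless a specific rule flagged it; a reformatter makes this fit-or-split decision for every element it emits.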

As a side note, parallel execution is built in. The --workers=auto flag uses proc_open to spread files across child processes, speeding up the processing of large codebases.
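
For a rough picture of what spreading files across child processes with proc_open can look like, here is a small sketch. It is an assumption-laden illustration, not PHP-Styler's implementation: the function names partitionFiles and runWorkers, the round-robin split, and the use of php -r for the child are all invented for the example.

```php
<?php
// Hypothetical sketch (not PHP-Styler's code): deal the files
// round-robin into at most $workers buckets.
function partitionFiles(array $files, int $workers): array
{
    $buckets = [];
    foreach (array_values($files) as $i => $file) {
        $buckets[$i % max(1, $workers)][] = $file;
    }
    return $buckets;
}

// Start one child PHP process per bucket, then collect each child's
// stdout. $childCode is the PHP code every child runs; the child sees
// its share of the files in $argv, starting at index 1.
function runWorkers(array $buckets, string $childCode): array
{
    $procs = [];
    $stdout = [];

    // Launch all children first so they run concurrently.
    foreach ($buckets as $k => $bucket) {
        $command = array_merge([PHP_BINARY, '-r', $childCode, '--'], $bucket);
        $proc = proc_open($command, [1 => ['pipe', 'w']], $pipes);
        if ($proc === false) {
            throw new RuntimeException('Could not start worker process.');
        }
        $procs[$k] = $proc;
        $stdout[$k] = $pipes[1];
    }

    // Then read each child's output and wait for it to exit.
    $results = [];
    foreach ($procs as $k => $proc) {
        $results[$k] = stream_get_contents($stdout[$k]);
        fclose($stdout[$k]);
        proc_close($proc);
    }
    return $results;
}

// Each child here only echoes the file names it was handed; a real
// worker would format those files instead.
$buckets = partitionFiles(['a.php', 'b.php', 'c.php'], 2);
$results = runWorkers($buckets, 'echo implode(",", array_slice($argv, 1));');
print_r($results);
```

Because formatting one file never depends on another, the work divides cleanly and the children need no coordination beyond "here is your list of files."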

II.

In the 0.16 release from more than two years ago, PHP-Styler styled code based on the PHP-Parser AST. That approach went a long way, but it brought two persistent problems along with it. First, it could lose or misplace comments in several contexts -- inside argument lists, at the end of switch cases, on concatenation lines, and as the sole content of blocks. Second, interpolated strings and heredocs were reconstructed from AST nodes, which meant literal newlines inside double-quoted strings turned into \n escape sequences on the way out. (In fairness, the PHP-Parser docs said that the AST itself was not suitable for code formatting and styling, but early success encouraged me to keep going; it was only later that these behaviors became apparent.)

After taking the excellent Sandi Metz "99 Bottles of OOP" course, I began to wonder if PHP-Styler could build up a custom set of tokens based not on an AST, but instead on the code elements to be formatted. Those tokens would be more polymorphic than not, and carry most (if not all) of the parsing and dispatching logic themselves.

That approach has been hugely successful. For one, comments and interpolated strings were finally honored properly. But it also ended up creating a catalog of more than 500 highly specific token classes. The level of detail is at "this is the opening brace of an if statement" -- with a separate token for each of the different kinds of opening and closing braces, brackets, parentheses, etc.

The resolution is very finely grained, and the parsing process is specific to each token, including whether or not it should dispatch to another token for parsing. For example, when the parser encounters T_AS, the TAs token can look at the surrounding context to determine if it is a foreach ... as or a use ClassName as or a use TraitName as, and so on. Thus, the 500-odd token classes.
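
To illustrate the kind of context check described for T_AS, here is a toy version built on PHP's own PhpToken::tokenize() (PHP 8.0+). It is only a sketch under loud assumptions: the function names and the returned labels are hypothetical, the backward scan is a crude heuristic, and none of it is taken from PHP-Styler's TAs class.

```php
<?php
// Hypothetical heuristic, for illustration only: decide what kind of
// "as" sits at index $i by scanning backward for the keyword that
// introduced it.
function classifyAs(array $tokens, int $i): string
{
    $sawBrace = false;
    for ($j = $i - 1; $j >= 0; $j--) {
        $token = $tokens[$j];
        if ($token->is(T_FOREACH)) {
            return 'foreach';
        }
        if ($token->is('{')) {
            // "use T { foo as bar; }" has a brace between "use" and "as".
            $sawBrace = true;
        }
        if ($token->is(T_USE)) {
            return $sawBrace ? 'trait-alias' : 'import-alias';
        }
    }
    return 'unknown';
}

// Classify every T_AS in a piece of source code, in order.
function classifyAsKeywords(string $code): array
{
    $tokens = PhpToken::tokenize($code);
    $kinds = [];
    foreach ($tokens as $i => $token) {
        if ($token->is(T_AS)) {
            $kinds[] = classifyAs($tokens, $i);
        }
    }
    return $kinds;
}

$source = <<<'PHP'
<?php
use App\Db\Connection as Conn;
class C { use T { foo as bar; } }
foreach ($xs as $x) {}
PHP;

print_r(classifyAsKeywords($source));
```

A heuristic this simple is easy to fool (a closure with its own use clause inside the foreach expression, for example), which hints at why a real implementation wants dedicated token classes rather than one generic "as."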

And when you look at the tests, you can see exactly what the source code parsed as. For example, this source code ...

<?php
abstract class Foo
{
    abstract function bar();
}

... gets parsed into these styler token classes:

[
    TPhpOpeningTag::class,
    TAbstract::class,
    TClass::class,
    TClassName::class,
    TClassOpeningBrace::class,
    TAbstract::class,
    TFunction::class,
    TFunctionName::class,
    TParamsOpeningParen::class,
    TParamsClosingParen::class,
    TAbstractMethodEndSemicolon::class,
    TClassClosingBrace::class,
]

That level of detail means all the styling for each token can live with that token: space before or after, line break before or after, blank line before or after, etc. That in turn means the styling can be as specific or as generic as you like. A colon is not just a colon; it is a "ternary colon" or a "return type colon" or an "alternative-if colon" and so on. You can set very specific styling for each one, which gives tremendous flexibility -- not all of which is needed, of course.
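
As a simplified picture of styling that lives with the token, consider the sketch below. Everything in it is hypothetical: the class names (Token, Word, TernaryColon, ReturnTypeColon, AltIfColon), the two spacing methods, and the render function are invented for this example and are not PHP-Styler's real classes or API.

```php
<?php
// Hypothetical base class: each token carries its own spacing rules.
abstract class Token
{
    public function __construct(public string $text)
    {
    }

    public function spaceBefore(): bool
    {
        return false;
    }

    public function spaceAfter(): bool
    {
        return false;
    }
}

// A stand-in for any token with no spacing opinion of its own.
final class Word extends Token
{
}

// "$a ? $b : $c" -- a space on both sides.
final class TernaryColon extends Token
{
    public function spaceBefore(): bool { return true; }
    public function spaceAfter(): bool { return true; }
}

// "function f() : Result" -- also spaced here, but independently tunable.
final class ReturnTypeColon extends Token
{
    public function spaceBefore(): bool { return true; }
    public function spaceAfter(): bool { return true; }
}

// "if ($x):" -- inherits the no-space defaults.
final class AltIfColon extends Token
{
}

// The renderer knows nothing about colons; it only asks each token.
function render(array $tokens): string
{
    $out = '';
    foreach ($tokens as $token) {
        if ($token->spaceBefore() && $out !== '' && ! str_ends_with($out, ' ')) {
            $out .= ' ';
        }
        $out .= $token->text;
        if ($token->spaceAfter()) {
            $out .= ' ';
        }
    }
    return rtrim($out);
}

echo render([new Word('$a'), new TernaryColon(':'), new Word('$b')]), "\n";
echo render([new Word('if ($x)'), new AltIfColon(':')]), "\n";
```

The same ":" character renders differently depending only on which class represents it, so changing the style of one kind of colon never touches the others.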

In the end, the tokens did not carry all of the parsing logic. The Parser class itself still needs to coordinate a lot more activity than I'd like, but it's a lot less coordination than in the Printer+Styler classes of the 0.16 version. And the old Visitor pattern has been removed as entirely unnecessary.

III.

There's a ton more, much more than I can put in a blog post.

If you want a complete code reformatter, not just a fixer, try out PHP-Styler, either at the php-styler.com demo site or by installing it with composer require --dev pmjones/php-styler.


Are you stuck with a legacy PHP application? You should buy my book because it gives you a step-by-step guide to improving your codebase, all while keeping it running the whole time.