Skip to content

gh-2706: Switch to PHPStan docblock parser#2759

Open
bart-jaskulski wants to merge 29 commits intophpactor:masterfrom
bart-jaskulski:gh-2706
Open

gh-2706: Switch to PHPStan docblock parser#2759
bart-jaskulski wants to merge 29 commits intophpactor:masterfrom
bart-jaskulski:gh-2706

Conversation

@bart-jaskulski
Copy link
Copy Markdown
Contributor

This is a draft for incorporating PHPStan docblock parser with the premise of better stability and performance.

{
$properties = [];
// merge read/write virtual properties
foreach ($this->node->getPropertyTagValues() as $propertyTag) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here only @property tag is listed, without @property-read and @property-write, but I'm not sure if Phpactor has support for those

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phpactor supports them AFAIK but only in the sense that it interprets them as @property

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added @property-read and @property-write to parsed property types


$type = $this->convert($callableNode->returnType);

if (((string) $callableNode->identifier) === 'Closure') {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be unnecessary, would require a test case

use Prophecy\Argument;
use Prophecy\PhpUnit\ProphecyTrait;

class PHPStanDocblockParserFactoryTest extends IntegrationTestCase
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nearly a raw copy of DocblockParserFactoryTest


yield 'conditional type' => [
'/** @return ($foo is true ? string : int) */',
TypeFactory::parenthesized(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, PHPStan return it correctly, but without parenthesis

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConditionalTypes are now wrapped in ParenthesizedType by default with PHPStan parser, but this seems fake to me

@dantleech
Copy link
Copy Markdown
Collaborator

some before/after benchmarks:

  • ./bin/phpactor worse:analyse lib/WorseReflection
  • ./vendor/bin/phpbench run lib/WorseReflection --variant=PhpUnitReflectClassBench (maybe see if any othre benchmarks are sensitive to this change)

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

Besides DocblockParserFactory, there's also use of raw Phpactor\DocblockParser\DocblockParser in Phpactor\CodeTransform\Adapter\DocblockParser\ParserDocblockUpdater. This also might be replaced with phpstan docblock parser, but as here is crafting docblock comments for actions like Add implements tags, etc and it may be a bit more complex, as phpstan/docblock-parser has good support for printing whole docblock, but not so great for manufacturing them.

What would be your take on that?

And additionally, what are the best ways to test the Phpactor after those changes, specifically, where this phpdoc parsing is used? Apart from phpunit tests, ofc

@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Oct 11, 2024

This also might be replaced with phpstan docblock parser

we can leave that as is for now and focus on worse-reflection which is where most of the value comes in (analysis). We can then look at if/how we can replace DocblcokParser elsewhere (one of it's design principles was that it be "lossless" so you can reconsitute the docblock from the AST - but PHPStan may also fulfil that requirement now - but not sure).

the best way to test phpactor is to run the tests? or? but let's run some benchmarks and see how it compares...

use PHPStan\PhpDocParser\Parser\TokenIterator;
use PHPStan\PhpDocParser\Parser\TypeParser;

class PHPStanDocblockParserFactory implements DocBlockFactory
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would now be a bridge to PHPStan, so the namespace for this code should be ...\Bridge\PHPStan\

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to separate namespace, yet for now I left the (redundant) PHPStan prefix in class names (easier during testing and switching with original parser)

if (trim($docblock) === '') {
return new PlainDocblock();
}

Copy link
Copy Markdown
Collaborator

@dantleech dantleech Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that you will need to filter for "supported tags" by regex before proceeding to parse as with the previous factory (maybe it can be done with a decorator). Otherwise we parse every single docblock whether we extract info from it or not which will have a pretty massive performance impact.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By supported you mean those, which has according Type classes in worse reflection? I've ditched this, because I thought those are tags supported by doc parser

Copy link
Copy Markdown
Collaborator

@dantleech dantleech Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean:

    const SUPPORTED_TAGS = [
        'property',
        'var',
        'param',
        'return',
        'method',
        'type',
        'deprecated',
        'extends',
        'implements',
        'template',
        'template-covariant',
        'template-extends',
        'mixin',
        'throws',
        'assert',
    ];

are the tags that are analysed, if there is a docblock that has none of those then we don't need to parse it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

I've been looking at benchmarks from CI:

commit from master: https://github.com/phpactor/phpactor/actions/runs/11244200976/job/31261739177#step:5:84
this PR: https://github.com/phpactor/phpactor/actions/runs/11289848501/job/31400507757?pr=2759#step:5:84

There's a slight improvement, but negligible, may be only a variance between tests

@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Oct 11, 2024

yeah, i'd run them locally in anycase, you can get wildly different results in GL runners.

FWIW this was my original attempt to use phpstan from > 6 years ago ... it was much simpler then 😄 :

phpactor/worse-reflection#28

(at the time it was replacing a reg-based solution).

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

bart-jaskulski commented Oct 11, 2024

I did some benchmarking on my (low-end) laptop (intel core i7-1065G7, integrated graphics, 8GB ram, PHP 8.3). Below is the median for both branches from 3 runs:

Benchmark master gh-2706 diff
PhpUnitReflectClassBench: test_case 15.628 15.689 0.39%
PhpUnitReflectClassBench: methods_and_properties 54.07 52.95 -2.07%
PhpUnitReflectClassBench: method_frames 2091.34 1967.71 -5.91%
SelfReflectClassBench: methods_and_properties 0.77 0.80 3.89%
SelfReflectClassBench: frames 22.04 20.9 -5.17%
YiiBench: benchMembers 92.720 61.806 -33.34%
ReflectionStubsBench: test_classes_and_methods 4.971 4.903 -1.36%
ReflectPropertyBench: property 1.38 1.35 -2.17%
ReflectPropertyBench: property_return_type 2.62 2.61 -0.38%
ReflectMethodBench: method 1.50 1.15 -23.33%
ReflectMethodBench: method_return_type 2.68 2.64 -1.49%
ReflectMethodBench: inferred_return_type 1.92 1.87 -2.60%
DiagnosticsBench: lots_of_missing_methods 61.176 59.833 -2.19%
DiagnosticsBench: lots_of_new_generic 23.622 23.405 -0.91%
DiagnosticsBench: lots_of_new_objects 21.250 20.421 -3.90%
DiagnosticsBench: method_chain.test 26.188 26.474 1.09%
DiagnosticsBench: phpstan.test 2010 1973 -1.8%
AnalyserBench: benchAnalyse 59.801 63.963 6.95%

EDIT: updated values for gh-2706 branch after fixing the measurement

@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Oct 11, 2024

hm ok, well it's not the end of the world necessarily if we can make up the performance elsewhere (phpactor can be so much faster than it is), we can also make this a strategy and merge with it being disabled by default for now, I think there is a test for the factory/type converter which could maybe be abstracted to run on both implementations? In anycase it would be very useful to have this as a second strategy.

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

Please, review the benchmark table again, as I have mistakenly benchmarked the branches without actual changes 🤦

I've edited the table to avoid confusion, and now it includes the values where tested code is actually using PHPStanDocblockParserFactory (benchmarks before still used original parser, what makes it even more bizzare, that results were slower).

Anyway, this again shows more potential.

Please, note that at current stage I've hardcoded PHPStan parser, yet closer to the actual merge, I will introduce strategy you mentioned (after solving failing test cases)

@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Oct 14, 2024

Yes, it's tricky :) in any case adding the filtering would have helped, but looks encouraging!

$docblock,
$matches
)) {
return new PlainDocblock($docblock);
Copy link
Copy Markdown
Collaborator

@dantleech dantleech Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should extract this "filter" to e.g. Phpactor\WorseReflection\Core\DocBloc so that it's used by both implementations (assuming we will run them side-by-side via. a strategy initially)... or add a decorator (tbh though I think I prefer a simple static method).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good 👍 . I was thinking about the same, but didn't want to introduce that prematurely

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've settled with decorator because I did not fully understand how could I handle this with just static method, but if you have any suggestion, please do.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decorator is fine 👍

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

I'd like to ask for help about specific cases and guidance in solving the issues.

  1. https://github.com/phpactor/phpactor/actions/runs/11327207256/job/31497789474?pr=2759#step:5:400
    PHPStan doesn't wrap types in parenthesizes. Is it acceptable?

  2. https://github.com/phpactor/phpactor/actions/runs/11327207256/job/31497789474?pr=2759#step:5:394
    Invalid virtual properties (missing type or property name) are ignored in PHPStan parser. Invalid in current implementation.

  3. https://github.com/phpactor/phpactor/actions/runs/11327207256/job/31497789474?pr=2759#step:5:372
    This test seems a bit odd to me -- how (string|int)|int is even valid typing? Doesn't it reduce to string|int even without any assertions?

  4. https://github.com/phpactor/phpactor/actions/runs/11327207256/job/31497789474?pr=2759#step:5:326
    https://github.com/phpactor/phpactor/actions/runs/11327207256/job/31497789474?pr=2759#step:5:348
    I've got some issues about generics, but only in those two cases. TBH, I don't know what is the difference between native parser and phpstan implementation, but I surely miss something. Any guidance appreciated.

@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Oct 14, 2024

PHPStan doesn't wrap types in parenthesizes. Is it acceptable?

yes

Invalid virtual properties (missing type or property name) are ignored in PHPStan parser. Invalid in current implementation.

fine to ignore invalid ones i guess

This test seems a bit odd to me -- how (string|int)|int is even valid typing? Doesn't it reduce to string|int even without any assertions?

it's fine if PHPStan reduces the type

I've got some issues about generics, but only in those two cases. TBH, I don't know what is the difference between native parser and phpstan implementation, but I surely miss something. Any guidance appreciated.

diff --git a/lib/WorseReflection/Tests/Inference/generics/class_implements_multiple1.test b/lib/WorseReflection/Tests/Inference/generics/class_implements_multiple1.test
index e07e6e36..c8238f5d 100644
--- a/lib/WorseReflection/Tests/Inference/generics/class_implements_multiple1.test
+++ b/lib/WorseReflection/Tests/Inference/generics/class_implements_multiple1.test
@@ -23,7 +23,8 @@ interface Second {
 }
 
 /**
- * @implements Second<Baz>, First<Boo>
+ * @implements Second<Baz>
+ * @implements First<Boo>
  */
 class Foo implements Second, First
 {

for the second (narrowing) with @psalm-assert it looks like we're only looking for @phpstan-assert - I didn't look beyond that but maybe it helps?

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

FYI, I'll get back to this PR next week

@bart-jaskulski bart-jaskulski force-pushed the gh-2706 branch 2 times, most recently from f76cd90 to 00bbe1e Compare November 7, 2024 07:34
For now, withdraw from attempts to separate with newline changes between
different tags. The most naive approach to check for previous and
current tag, and render empty line if change is detected fails short,
when there are tags, we don't want to render – those will result in
excessive empty lines.

Signed-off-by: Bart Jaskulski <[email protected]>
Signed-off-by: Bart Jaskulski <[email protected]>
Signed-off-by: Bart Jaskulski <[email protected]>
return;
}

if ($docblockReturn->isDefined()) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are MissingReturnTypeProvider and DocblockMissingReturnTypeProvider. It seems, even on master branch declaring return type only in docblock leaves with false-positive diagnostic about missing return type. I've added this check, but it may be better to merge both classes -- despite the change I think it is out of the scope of this PR, but without this new test case is failing. Anyway, up to you if this should be split into own PR.

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

As phpstan/phpdoc-parser v2 just dropped, I've upgraded the library.

Besides, when using this branch on daily basis I've noticed two issues, which are now covered with test case and appropriate fix (missing return and unused import for array types).

Signed-off-by: Bart Jaskulski <[email protected]>
@dantleech
Copy link
Copy Markdown
Collaborator

dantleech commented Nov 13, 2024

Nice, so what is left from your perspective? I guess we still need to run the tests with both implementations (something I could look at).

I had a quick look into this and noticed that PHPStan has an option to record the line and offset in docblocks, although TBH I haven't figured out what they relate to or how we can use them. But presumably we can fully replace the original parser with that option in the long run (althoug the API isn't as nice)

@bart-jaskulski
Copy link
Copy Markdown
Contributor Author

bart-jaskulski commented Nov 13, 2024

Nice, so what is left from your perspective? I guess we still need to run the tests with both implementations (something I could look at).

Yeah, running tests for both implementation would be crucial -- any help appreciated. Besides coming up with some sort of consensus about discrepancies in tests, like rendered doc formatting or changes which are not reflected in native parser. I've annotated all such cases in this PR comments; it would also be obvious when running the test suite against the original parser. ATM there are 4 test cases which fail on original parser.

presumably we can fully replace the original parser with that option in the long run (althoug the API isn't as nice)

API for creating phpdoc tags in phpstan is indeed more object-oriented (:boom:) and burdensome to actually craft anything, but it would be possible to make such replacement, following the example from their readme with Printer: https://github.com/phpstan/phpdoc-parser?tab=readme-ov-file#format-preserving-printer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants