Merge pull request #9 from kynx/php83
Add PHP 8.3 support, update dependencies
kynx authored Oct 21, 2024
2 parents 12f42ee + c55f022 commit 684ab1e
Showing 21 changed files with 1,019 additions and 921 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
/vendor/
/.phpcs-cache
/.phpunit.result.cache
/.phpunit.cache/
9 changes: 9 additions & 0 deletions .markdownlint.json
@@ -0,0 +1,9 @@
{
"default": true,
"MD013": false,
"MD014": false,
"MD024": false,
"MD028": false,
"MD031": { "list_items": false },
"MD034": false
}
52 changes: 28 additions & 24 deletions README.md
@@ -4,15 +4,15 @@

Utilities for generating PHP code.


## Normalizers

The normalizers generate readable PHP labels (class names, namespaces, property names, etc) from valid UTF-8 strings,
The normalizers generate readable PHP labels (class names, namespaces, property names, etc) from valid UTF-8 strings,
[transliterating] them to ASCII and spelling out any invalid characters.

### Usage:
### Usage

The following code (forgive the Japanese - a certain translation tool tells me it means "Pet Store"):

```php
<?php

@@ -24,11 +24,13 @@ echo $namespace;
```

outputs:
```

```text
Petto\Shoppu
```

and:

```php
<?php

@@ -40,47 +42,48 @@ echo $property;
```

outputs:
```

```text
twoDollarBill
```

See the [tests] for more examples.

### Why?

You must **never** run code generated from untrusted user input. But there are a few cases where you do want to
You must **never** run code generated from untrusted user input. But there are a few cases where you do want to
_output_ code generated from (mostly) trusted input.

In my case, I need to generate classes and properties from an OpenAPI specification. There are no hard-and-fast rules
on the characters present, just a vague "it is RECOMMENDED to follow common programming naming conventions". Whatever
they are.
on the characters present, just a vague "it is RECOMMENDED to follow common programming naming conventions". Whatever
they are.

### How?

Each normalizer uses `ext-intl`'s [Transliterator] to turn the UTF-8 string into Latin-ASCII. Where a character has no
equivalent in ASCII (the "€" symbol is a good example), it uses the [Unicode name] of the character to spell it out (to
`Euro`, after some minor clean-up). For ASCII characters that are not valid in a PHP label, it provides its own spell
Each normalizer uses `ext-intl`'s [Transliterator] to turn the UTF-8 string into Latin-ASCII. Where a character has no
equivalent in ASCII (the "€" symbol is a good example), it uses the [Unicode name] of the character to spell it out (to
`Euro`, after some minor clean-up). For ASCII characters that are not valid in a PHP label, it provides its own spell
outs. For instance, a backtick "&#96;" becomes `Backtick`.

Initial digits are also spelt out: "123foo" becomes `OneTwoThreeFoo`. Finally reserved words are suffixed with a
user-supplied string so they don't mess things up. In the first usage example above, if we normalized "class" it would
Initial digits are also spelt out: "123foo" becomes `OneTwoThreeFoo`. Finally reserved words are suffixed with a
user-supplied string so they don't mess things up. In the first usage example above, if we normalized "class" it would
become `ClassController`.
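
The last two steps (spelling out initial digits and suffixing reserved words) can be sketched roughly as below. This is only an illustration: `sketchLabel()` and its tiny reserved-word list are hypothetical and are not the library's API, and the sketch assumes the input has already been transliterated to ASCII.

```php
<?php

// Hypothetical helper, NOT part of the library: it only illustrates the
// "spell out initial digits" and "suffix reserved words" steps.
function sketchLabel(string $ascii, string $reservedSuffix): string
{
    $words = ['Zero', 'One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight', 'Nine'];

    // Split the input into its run of leading ASCII digits and the remainder.
    preg_match('/^(\d*)(.*)$/s', $ascii, $matches);
    [, $leading, $rest] = $matches;

    // Spell out each leading digit, one word per digit.
    $spelled = '';
    if ($leading !== '') {
        foreach (str_split($leading) as $digit) {
            $spelled .= $words[(int) $digit];
        }
    }

    $label = $spelled . ucfirst($rest);

    // Assumption: a small illustrative subset of PHP's reserved words.
    $reserved = ['class', 'function', 'interface', 'namespace', 'return'];
    if (in_array(strtolower($label), $reserved, true)) {
        $label .= $reservedSuffix;
    }

    return $label;
}

echo sketchLabel('123foo', 'Controller'), "\n"; // OneTwoThreeFoo
echo sketchLabel('class', 'Controller'), "\n";  // ClassController
```

The real normalizers also handle transliteration, separators and case conversion, which this sketch leaves out.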

The results may not be pretty. If for some mad reason your input contains ` ͖` - put your glasses on! - the label will
contain `CombiningRightArrowheadAndUpArrowheadBelow`. But it _is_ valid PHP, and stands a chance of being as unique as
The results may not be pretty. If for some mad reason your input contains `͖` - put your glasses on! - the label will
contain `CombiningRightArrowheadAndUpArrowheadBelow`. But it _is_ valid PHP, and stands a chance of being as unique as
the original. Which brings me to...


## Unique labelers

The normalization process reduces around a million Unicode code points down to just 162 ASCII characters. Then it
mangles the label further by stripping separators, reducing whitespace and turning it into camelCase, snake_case or
The normalization process reduces around a million Unicode code points down to just 162 ASCII characters. Then it
mangles the label further by stripping separators, reducing whitespace and turning it into camelCase, snake_case or
whatever your programming preference. It's gonna be lossy - nothing we can do about that.

The unique labelers' job is to add back lost uniqueness, using a `UniqueStrategyInterface` to decorate any non-unique
class names in the list it is given.

To guarantee uniqueness within a set of class name labels, use the `UniqueClassLabeler`:

```php
<?php

@@ -96,7 +99,8 @@ var_dump($unique);
```

outputs:
```

```text
array(3) {
'Déjà vu' =>
string(7) "DejaVu1"
@@ -107,10 +111,11 @@ array(3) {
}
```

There are labelers for each of the normalizers: `UniqueClassLabeler`, `UniqueConstantLabeler`, `UniquePropertyLabeler`
and `UniqueVariableLabeler`. Along with the `NumberSuffix` implementation of `UniqueStrategyInterface`, we provide a
There are labelers for each of the normalizers: `UniqueClassLabeler`, `UniqueConstantLabeler`, `UniquePropertyLabeler`
and `UniqueVariableLabeler`. Along with the `NumberSuffix` implementation of `UniqueStrategyInterface`, we provide a
`SpellOutOrdinalPrefix` strategy. Using that instead of `NumberSuffix` above would output:
```

```text
array(3) {
'Déjà vu' =>
string(11) "FirstDejaVu"
@@ -123,8 +128,7 @@ array(3) {

Kinda cute, but a bit verbose for my taste.
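
For comparison, the number-suffix idea can be sketched in a few lines. `suffixDuplicates()` is a hypothetical stand-in: the real `UniqueStrategyInterface` and `NumberSuffix` signatures are not shown here, and the `deja-vu` and `Pet shop` inputs are made up for the example.

```php
<?php

/**
 * Hypothetical sketch of a number-suffix strategy, NOT the library's code.
 *
 * @param array<string, string> $normalized original label => normalized label
 * @return array<string, string> original label => unique label
 */
function suffixDuplicates(array $normalized): array
{
    // Count how often each normalized label occurs.
    $counts = array_count_values($normalized);

    $seen   = [];
    $unique = [];
    foreach ($normalized as $original => $label) {
        if ($counts[$label] === 1) {
            // Already unique: leave it untouched.
            $unique[$original] = $label;
            continue;
        }

        // Duplicate: append a running number, starting at 1.
        $seen[$label]      = ($seen[$label] ?? 0) + 1;
        $unique[$original] = $label . $seen[$label];
    }

    return $unique;
}

$unique = suffixDuplicates([
    'Déjà vu'  => 'DejaVu',
    'deja-vu'  => 'DejaVu',
    'Pet shop' => 'PetShop',
]);
var_dump($unique);
```

Note that this naive version does not check whether a suffixed label such as `DejaVu1` collides with a label that was already in the list; a real strategy would need to deal with that.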


[transliterating]: https://unicode-org.github.io/icu/userguide/transforms/general/#script-transliteration
[tests]: ./test/AbstractNormalizerTest.php
[Transliterator]: https://www.php.net/manual/en/class.transliterator.php
[Unicode name]: https://unicode.org/charts/charindex.html
[Unicode name]: https://unicode.org/charts/charindex.html
9 changes: 5 additions & 4 deletions composer.json
@@ -22,13 +22,14 @@
"sort-packages": true
},
"require": {
"php": "~8.1 || ~8.2",
"php": "~8.2.0 || ~8.3.0",
"ext-intl": "*"
},
"require-dev": {
"laminas/laminas-coding-standard": "^2.4",
"phpunit/phpunit": "^9.5",
"vimeo/psalm": "^4.27"
"laminas/laminas-coding-standard": "^3.0",
"phpunit/phpunit": "^10.5.37",
"psalm/plugin-phpunit": "^0.19.0",
"vimeo/psalm": "^5.26"
},
"autoload": {
"psr-4": {