The PHP error you're likely seeing in PHP 8.2 (or higher) is a **deprecation notice**, most probably one of these two:
- `htmlentities(): Passing null to parameter #1 ($string) of type string is deprecated`
(passing `null` to a non-nullable parameter of a built-in function has been deprecated since PHP 8.1)
- or, less commonly, the same notice for parameter #4 (`$double_encode`), which is why the fourth parameter is worth checking first.
### The Problem
Your line:
```php
$string = strtolower(htmlentities($string, ENT_NOQUOTES, $charset, false));
```
In PHP 8.1 and later the signature of `htmlentities()` is as follows (PHP 8.0 takes the same parameters but defaults `$flags` to `ENT_COMPAT`):
```php
htmlentities(
    string $string,
    int $flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401,
    ?string $encoding = null,
    bool $double_encode = true
): string
```
- The fourth parameter is **`$double_encode`**, a **boolean**.
- You're passing `false`, which is fine.
- When you write `htmlentities(..., $charset, false)`, PHP interprets:
  - 3rd argument → `$charset` (string)
  - 4th argument → `false` (bool)

This matches the signature, so the fourth argument causes **no deprecation** here. The notice only appears when a non-nullable argument (most often `$string`) is `null` at runtime.
### What usually causes the error in code like yours
People very often write one of these by mistake:
```php
htmlentities($str, ENT_NOQUOTES, $charset, 'something') // ← 4th is a string → silently coerced to true (TypeError under strict_types), not deprecated
htmlentities($str, ENT_NOQUOTES, $charset, null) // ← 4th is null → deprecated since 8.1
```
Or the fourth argument gets lost while upgrading older code, which changes the behavior without raising any notice:
```php
htmlentities($str, ENT_NOQUOTES, $charset, false) // correct
// but sometimes accidentally:
htmlentities($str, ENT_NOQUOTES, $charset) // no 4th argument → defaults to true
```
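To check which argument actually triggers the notice, one option is to capture deprecations with a temporary error handler. The sketch below is only an illustration: `$title` is a hypothetical variable standing in for whatever value reaches `htmlentities()` in your code, and the `?? ''` guard is one possible fix, assuming an empty string is an acceptable substitute for `null`.

```php
<?php
// Record deprecation messages instead of letting PHP print or log them.
$deprecations = [];
set_error_handler(
    function (int $errno, string $errstr) use (&$deprecations): bool {
        $deprecations[] = $errstr;
        return true; // handled, skip PHP's default handler
    },
    E_DEPRECATED
);

$charset = 'UTF-8';
$title = null; // hypothetical value, e.g. an empty database column

// Fine: every argument has the expected type.
htmlentities('déjà vu', ENT_NOQUOTES, $charset, false);
// PHP 8.1+: null for parameter #1 ($string) raises the deprecation.
htmlentities($title, ENT_NOQUOTES, $charset, false);
// Guarded: null becomes '' before the call, so no notice is raised.
htmlentities($title ?? '', ENT_NOQUOTES, $charset, false);

restore_error_handler();
print_r($deprecations);
```

On PHP 8.1 or later only the second call should add a message to `$deprecations`.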
### Recommended modern & clean fix (PHP 8.1+ / 8.2+ compatible)
```php
function slug(string $string, string $charset = 'utf-8'): string
{
    // Convert to HTML entities; false keeps existing entities from being encoded twice, as in your original call
    $string = htmlentities($string, ENT_NOQUOTES | ENT_SUBSTITUTE | ENT_HTML401, $charset, false);
    $string = strtolower($string);
    // Reduce accented-letter entities to their base letter (é → e, ñ → n, š → s, etc.)
    $string = preg_replace('~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~i', '\1', $string);
    // Handle some ligatures if needed (æ → ae, œ → oe)
    $string = preg_replace('~&([a-z]{2})(?:lig);~i', '\1', $string);
    // Remove all remaining entities (€ → '')
    $string = preg_replace('~&[^;]+;~', '', $string);
    // Replace unwanted characters with hyphen
    $string = preg_replace('~[\s!*\'();:@&=+$,/?%#[\]]+~', '-', $string);
    // Optional: remove multiple consecutive hyphens and trim
    $string = preg_replace('~-+~', '-', $string);
    $string = trim($string, '-');
    return $string;
}
```
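The entity trick is easier to follow in isolation. The sketch below uses a hypothetical helper, `strip_accent_entities()` (not part of the original snippet), to show the two steps: `htmlentities()` turns an accented letter into a named entity whose first character is the base letter, and the regex keeps only that character. Letters with no HTML 4.01 entity, such as `ą`, pass through unchanged, which is a limitation of this approach.

```php
<?php
// Hypothetical helper for illustration only; assumes UTF-8 input.
function strip_accent_entities(string $text, string $charset = 'UTF-8'): string
{
    // Step 1: accented letters become named entities, e.g. "é" -> "&eacute;".
    $encoded = htmlentities($text, ENT_NOQUOTES, $charset);
    // Step 2: keep only the base letter captured right after the ampersand.
    return preg_replace(
        '~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~i',
        '$1',
        $encoded
    );
}

echo htmlentities('É', ENT_NOQUOTES, 'UTF-8'), "\n"; // &Eacute;
echo strip_accent_entities('Crème brûlée'), "\n";    // Creme brulee
echo strip_accent_entities('ą'), "\n";               // ą (no HTML 4.01 entity, left as-is)
```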
### Alternative: transliteration with `intl` (recommended)
Use the **transliterator** from PHP's `intl` extension, which handles far more languages and scripts than the entity approach:
```php
function slug(string $string): string
{
    // Optional: normalize first (very helpful for combining diacritics)
    $string = normalizer_normalize($string, Normalizer::FORM_C);
    // Transliterate almost any script to ASCII (č → c, ą → a, ß → ss, etc.)
    $string = transliterator_transliterate('Any-Latin; Latin-ASCII', $string);
    $string = strtolower($string);
    // Keep only alphanumeric + hyphen
    $string = preg_replace('~[^a-z0-9-]+~', '-', $string);
    // Clean multiple / leading / trailing hyphens
    $string = preg_replace('~-+~', '-', $string);
    $string = trim($string, '-');
    // May be '' if nothing usable remained; substitute a default here if you need one.
    // (The previous `?: ''` would also have turned the valid slug "0" into ''.)
    return $string;
}
```
Requires the `intl` extension, which is bundled with PHP but not enabled everywhere; check with `extension_loaded('intl')`.
This version handles far more characters correctly and needs almost no regex magic.
Choose whichever style fits your project. The second one is usually the better choice when `intl` is available.
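If the code has to run on hosts where `intl` may be missing, the two approaches can be combined. The following is a minimal sketch under that assumption; `slug_with_fallback()` is a hypothetical name, and the fallback branch only covers letters that have an HTML 4.01 entity.

```php
<?php
// Hypothetical wrapper: prefers intl, falls back to the entity trick.
function slug_with_fallback(string $text): string
{
    if (function_exists('transliterator_transliterate')) {
        // intl is available: transliterate to ASCII.
        $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
        if ($ascii !== false) {
            $text = $ascii;
        }
    } else {
        // No intl: reduce accented-letter entities to their base letter.
        $text = preg_replace(
            '~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~i',
            '$1',
            htmlentities($text, ENT_NOQUOTES, 'UTF-8')
        );
    }
    // Lowercase, then collapse everything else into single hyphens.
    $text = preg_replace('~[^a-z0-9]+~', '-', strtolower($text));
    return trim($text, '-');
}

echo slug_with_fallback('Crème Brûlée!'), "\n"; // creme-brulee
```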
Here are several more concise (and modern) versions of your slug function, ordered from "still very readable" to "maximally compact".
### Version 1 – Clean & readable (recommended 2025+ style)
```php
function slug(string $str, string $charset = 'utf-8'): string
{
    // Lowercase after encoding, so uppercase accented letters (É → &Eacute; → &eacute;) are handled too
    $str = strtolower(htmlentities($str, ENT_NOQUOTES, $charset));
    $str = preg_replace('~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~i', '$1', $str);
    $str = preg_replace('~&[^;]+;~', '', $str);
    $str = preg_replace('~[^a-z0-9]+~', '-', $str);
    return trim($str, '-');
}
```
→ Merged `strtolower` into the `htmlentities` line, dropped the ligature step (æ and œ are now removed rather than expanded), used a simpler character class, kept the final trim
### Version 2 – Even shorter (good balance)
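```php
function slug(string $s, string $cs = 'utf-8'): string
{
    $s = strtolower(htmlentities($s, ENT_NOQUOTES, $cs));
    $s = preg_replace(
        ['~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~', '~&[^;]+;~', '~[^a-z0-9]+~'],
        ['$1', '', '-'],
        $s
    );
    return trim($s, '-');
}
```
→ Single `preg_replace` with array pattern/replacement. The suffix list stays explicit, because a catch-all such as `&([a-z])[a-z]+;` would also turn `&amp;` into `a`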
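When `preg_replace` receives arrays, it applies the pattern/replacement pairs in order, each one to the output of the previous step, so the order of the array matters. A small sketch with a shortened, illustrative suffix list and a sample string that is already entity-encoded:

```php
<?php
// Each pattern runs on the output of the previous one, left to right.
$patterns     = ['~&([a-z])(?:acute|grave|uml);~', '~&[^;]+;~', '~[^a-z0-9]+~'];
$replacements = ['$1', '', '-'];
$input        = 'caf&eacute; &amp; th&eacute;';

// Intended order: entity -> base letter, drop other entities, then hyphenate.
$result = preg_replace($patterns, $replacements, $input);
echo $result, "\n"; // cafe-the

// Reversed order: the hyphenation step runs first and destroys the entities.
$reversed = preg_replace(array_reverse($patterns), array_reverse($replacements), $input);
echo $reversed, "\n"; // caf-eacute-amp-th-eacute-
```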
### Version 3 – Very compact (one-liner style, still understandable)
```php
function slug(string $s, string $cs = 'utf-8'): string {
    return trim(preg_replace('~[^a-z0-9]+~', '-',
        preg_replace('~&[^;]+;~', '',
            preg_replace('~&([a-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);~', '$1',
                strtolower(htmlentities($s, ENT_NOQUOTES, $cs))
            )
        )
    ), '-');
}
```
### Version 4 – Most concise realistic version (still maintains logic)
```php
function slug(string $s, string $cs = 'utf-8'): string {
    return trim(preg_replace('~[^a-z0-9]+~', '-', preg_replace('~&[^;]+;~', '', htmlentities(strtolower($s), ENT_NOQUOTES, $cs))), '-');
}
```
→ Removed the diacritic-specific replacement. Accented letters are now dropped rather than transliterated (`café` → `caf`), so this only suits input that is mostly plain ASCII
### Best modern alternative (strongly recommended if you have intl extension)
```php
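function slug(string $s): string {
    // $s is typed as string, so no null guard is needed here
    $s = transliterator_transliterate('Any-Latin; Latin-ASCII', $s);
    // The cast turns a failed transliteration (false) into ''
    $s = preg_replace('~[^a-z0-9]+~', '-', strtolower((string) $s));
    return trim($s, '-');
}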
```
Even shorter, more correct for most languages, no htmlentities needed.
Pick version 1 or the intl version depending on whether you want to depend only on core PHP or are fine with requiring the `intl` extension (bundled with PHP, but not enabled everywhere).
Which style fits your project best?
- Must be very short?
- Prefer maximum readability?
- Want best Unicode support?