More table formatting wonkiness by danielbachhuber · Pull Request #66 · wp-cli/php-cli-tools

danielbachhuber · 2014-09-09T14:56:38Z

Thar be bugs

danielbachhuber · 2014-09-08T03:56:15Z

@szepeviktor any insight into what the root of these character encoding issues is?

szepeviktor · 2014-09-08T11:51:08Z

<?php echo mb_strlen("日本語", "UTF-8");
3

szepeviktor · 2014-09-08T11:53:00Z

Strange

<?php echo mb_detect_encoding("日本語");
UTF-8

php-cli-tools/lib/cli/cli.php

Line 164 in 0fa663f

$length = mb_strlen( $str, mb_detect_encoding( $str ) );

szepeviktor · 2014-09-08T12:10:13Z

Got it.
Technically it works fine: "日本語" really is 3 characters long, but it takes up 6 columns of width!
You can see it in the row above: "It-al-ia" = 6

szepeviktor · 2014-09-08T12:11:55Z

Please use mb_strwidth()
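
To make the difference concrete, here is a minimal sketch comparing the two functions on the strings mentioned above. It assumes the mbstring extension is loaded and that the file is saved as UTF-8.

```php
<?php
// Character count vs. display width for the strings discussed above.
// Assumes the mbstring extension is available and this file is UTF-8.
$samples = array( 'Italia', '日本語' );

foreach ( $samples as $str ) {
	$chars = mb_strlen( $str, 'UTF-8' );   // number of characters (code points)
	$cols  = mb_strwidth( $str, 'UTF-8' ); // columns; East Asian wide characters count as 2
	printf( "%s: %d characters, %d columns\n", $str, $chars, $cols );
}
// Italia: 6 characters, 6 columns
// 日本語: 3 characters, 6 columns
```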

szepeviktor · 2014-09-08T12:12:19Z

And $length = preg_match_all( '/.{1}/us', $str ); instead of iconv.

Because `safe_strlen()` gives us the string length for output, we need the true length to determine how much we should pad the string
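
A rough sketch of that idea, not the library's actual `safe_strpad()`: `str_pad()` counts bytes, so the target length has to be widened by the gap between the byte length and the displayed width. The helper name `pad_to_columns()` is hypothetical, and the sketch assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper, not the library's safe_strpad(): pad $str with
// spaces so that it occupies $cols terminal columns.
function pad_to_columns( $str, $cols ) {
	$bytes = strlen( $str );               // what str_pad() counts
	$shown = mb_strwidth( $str, 'UTF-8' ); // what the terminal displays
	// str_pad() works in bytes, so widen the target by the byte/column gap.
	return str_pad( $str, $cols + ( $bytes - $shown ) );
}

echo '|' . pad_to_columns( 'Italia', 10 ) . "|\n";
echo '|' . pad_to_columns( '日本語', 10 ) . "|\n";
```

Both rows should end up 10 columns wide between the bars, even though the second string is 9 bytes and 3 characters long.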

danielbachhuber · 2014-09-09T15:36:01Z

It's actually a problem with safe_strpad(). I've worked out a fix, but it still doesn't work with Hebrew and Burmese:

szepeviktor · 2014-09-09T22:19:45Z

Burmese has multiple signs in one position, which is unclear to me.
This Hebrew writing has two (separate) accents (actually vowels) under the letters.
I think this is why those strings are measured as two characters longer than their actual width, which makes the padding too narrow.
PHP i18n seems shallow. Java i18n too.

szepeviktor · 2014-09-09T22:23:57Z

You can strip out the Hebrew vowels before padding the string: http://blog.shaftek.org/2005/06/03/removing-vowels-from-hebrew-unicode-text/
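
One possible way to do this, as a sketch only: drop every nonspacing combining mark (Unicode category `Mn`, which includes the Hebrew vowel points) before measuring. This is broader than a Hebrew-only range and is an assumption, not the approach from the linked post or the merged commit. `width_without_marks()` is a hypothetical name, and the `"\u{...}"` escapes need PHP 7 or later.

```php
<?php
// Hypothetical helper: measure width after removing nonspacing combining
// marks (category Mn), which includes Hebrew vowel points.
function width_without_marks( $str ) {
	$stripped = preg_replace( '/\p{Mn}/u', '', $str );
	if ( null === $stripped ) {
		$stripped = $str; // invalid UTF-8: fall back to the raw string
	}
	return mb_strwidth( $stripped, 'UTF-8' );
}

// "shalom" with vowel points:
// shin, qamats, shin dot, lamed, vav, holam, final mem.
$word = "\u{05E9}\u{05B8}\u{05C1}\u{05DC}\u{05D5}\u{05B9}\u{05DD}";

echo mb_strlen( $word, 'UTF-8' ), "\n";  // 7 code points
echo width_without_marks( $word ), "\n"; // 4 base letters
```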

szepeviktor · 2014-09-09T22:31:16Z

What font do you use on your terminal?

danielbachhuber · 2014-09-09T22:45:55Z

> You can strip out Hebrew vowels

Ugh. Could we do similar detection for Burmese?

> What font do you use on your terminal?

Source Code Pro.

szepeviktor · 2014-09-10T02:02:29Z

Please do not support the Burmese language.
I'll write the code when you find me a Burmese wp-cli user.

Hebrew writing has two separate accents / vowels under letters. In testing, all fonts properly handle this

danielbachhuber · 2014-09-10T12:11:26Z

> I'll write the code when you find me a Burmese wp-cli user.

It's a deal :) Thanks for your help with this.

szepeviktor · 2014-09-10T12:12:56Z

And $length = preg_match_all( '/.{1}/us', $str ); instead of iconv.

danielbachhuber · 2014-09-10T12:14:57Z

This is why we're using iconv(). If we used preg_match_all(), we'd have to roll our own error notice. Is there something I'm missing?


szepeviktor · 2014-09-10T12:24:10Z

> if non-ascii encoding is present

In what case should preg_match_all return an error?
/.{1}/us does length measurement only.

szepeviktor · 2014-09-10T12:27:00Z

preg_match_all returns false on non-Unicode input (with the /u modifier, the subject must be valid UTF-8).

php -r 'var_export( preg_match_all( "/.{1}/us", "'$(echo -n óra|recode utf8..latin2)'") );'
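
The same check can be sketched in plain PHP without `recode`, using a hard-coded Latin-2 byte for "ó". `count_utf8_chars()` is a hypothetical wrapper for illustration, not a function from php-cli-tools.

```php
<?php
// Count characters with the /u modifier; invalid UTF-8 yields false.
// count_utf8_chars() is a hypothetical name used only for this sketch.
function count_utf8_chars( $str ) {
	// $matches is passed for compatibility with PHP versions before 5.4.
	return preg_match_all( '/.{1}/us', $str, $matches );
}

var_export( count_utf8_chars( '日本語' ) ); // 3
echo "\n";
var_export( count_utf8_chars( "\xF3ra" ) ); // false: "óra" encoded as Latin-2
echo "\n";
```

A caller that wants a notice for bad input can test for `false` and inspect `preg_last_error()`, which is the "roll our own error notice" step mentioned above.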

danielbachhuber · 2014-09-10T12:32:09Z

Ok. I think this is good enough for now. We can revisit later as needed.

danielbachhuber added bug scope:table labels Sep 8, 2014

Tests for characters that take up double-width

1770752

danielbachhuber added this to the next milestone Sep 9, 2014

danielbachhuber added 2 commits September 9, 2014 08:28

safe_strpad() should pad the string according to display needs

40b886d

Because `safe_strlen()` gives us the string length for output, we need the true length to determine how much we should pad the string

Value is no longer used

f2bb69d

Strip Hebrew vowel characters from real length calculation

52d77b1

Hebrew writing has two separate accents / vowels under letters. In testing, all fonts properly handle this

danielbachhuber added a commit that referenced this pull request Sep 10, 2014

Merge pull request #66 from wp-cli/fix-66

c35014e

More table formatting wonkiness

danielbachhuber merged commit c35014e into master Sep 10, 2014

danielbachhuber deleted the fix-66 branch September 10, 2014 12:22

2 participants