Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure translatable strings, and string domains are not dynamic #136

Merged
merged 13 commits into from
Mar 23, 2016
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions WordPress-Extra/ruleset.xml
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
<rule ref="WordPress.WP.PreparedSQL"/>
<rule ref="WordPress.Variables.GlobalVariables"/>
<rule ref="WordPress.PHP.StrictComparisons" />
<rule ref="WordPress.WP.I18n" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go into WordPress-Extra or WordPress-Core? I'm just concerned about use with actual core, and how a lot of warnings will start appearing since no text domain is supplied.


<!-- https://vip.wordpress.com/documentation/code-review-what-we-look-for/#using-in_array-without-strict-parameter -->
<rule ref="WordPress.PHP.StrictInArray" />
Expand Down
256 changes: 256 additions & 0 deletions WordPress/Sniffs/WP/I18nSniff.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,256 @@
<?php
/**
* WordPress_Sniffs_WP_I18nSniff
*
* Makes sure internationalization functions are used properly
*
* @category PHP
* @package PHP_CodeSniffer
* @author Shady Sharaf <[email protected]>
*/
class WordPress_Sniffs_WP_I18nSniff extends WordPress_Sniff {

public $i18n_functions = array(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to flip this array and have the function names as keys (this is standard in most other sniffs)? Then you can use isset() instead of in_array(). And perhaps more importantly, you could let the nature of each function be specified by the values, rather than hardcoding this below. So you could have:

array(
    '__' => 'simple',
    '_x' => 'context',
    '_n' => 'plural',
    '_nx' => 'plural_context',
)

ect.

Because right now if someone ever wanted to sub-class this and add their own custom translation functions (bad idea, I know), they'd have to duplicate all of the code rather than just add them to this list, because how each function is handled has to be hard-coded below.

Anyway, I just prefer isset() to in_array(). 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points! See 9ebd183.

'translate',
'translate_with_gettext_context',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

translate() and translate_with_gettext_context() are low-API functions, they shouldn't be used by plugins IMO.

'__',
'esc_attr__',
'esc_html__',
'_e',
'esc_attr_e',
'esc_html_e',
'_x',
'_ex',
'esc_attr_x',
'esc_html_x',
'_n',
'_nx',
'_n_noop',
'_nx_noop',
);

/**
* Returns an array of tokens this test wants to listen for.
*
* @return array
*/
public function register() {
return array(
T_STRING,
);
}

/**
* Processes this test, when one of its tokens is encountered.
*
* @param PHP_CodeSniffer_File $phpcs_file The file being scanned.
* @param int $stack_ptr The position of the current token
* in the stack passed in $tokens.
*
* @return void
*/
public function process( PHP_CodeSniffer_File $phpcs_file, $stack_ptr ) {
$tokens = $phpcs_file->getTokens();
$token = $tokens[ $stack_ptr ];

if ( '_' === $token['content'] ) {
$phpcs_file->addError( 'Found single-underscore "_()" function when double-underscore expected.', $stack_ptr, 'SingleUnderscoreGetTextFunction' );
}

if ( ! in_array( $token['content'], $this->i18n_functions, true ) ) {
return;
}

if ( in_array( $token['content'], array( 'translate', 'translate_with_gettext_context' ), true ) ) {
$phpcs_file->addWarning( 'Use of the "%s()" function is reserved for low-level API usage.', $stack_ptr, 'LowLevelTranslationFunction', array( $token['content'] ) );
}

$translation_function = $token['content'];

$func_open_paren_token = $phpcs_file->findNext( T_WHITESPACE, $stack_ptr + 1, null, true );
if ( ! $func_open_paren_token || T_OPEN_PARENTHESIS !== $tokens[ $func_open_paren_token ]['code'] ) {
return;
}

$arguments_tokens = array();
$argument_tokens = array();

// Look at arguments.
for ( $i = $func_open_paren_token + 1; $i < $tokens[ $func_open_paren_token ]['parenthesis_closer'] - 1; $i += 1 ) {
$this_token = $tokens[ $i ];
$this_token['token_index'] = $i;
if ( in_array( $this_token['code'], array( T_WHITESPACE, T_COMMENT ), true ) ) {
continue;
}
if ( T_COMMA === $this_token['code'] ) {
$arguments_tokens[] = $argument_tokens;
$argument_tokens = array();
continue;
}
$argument_tokens[] = $this_token;

// Include everything up to and including the parenthesis_closer if this token has one.
if ( ! empty( $this_token['parenthesis_closer'] ) ) {
for ( $j = $i + 1; $j <= $this_token['parenthesis_closer']; $j += 1 ) {
$tokens[ $j ]['token_index'] = $j;
$argument_tokens[] = $tokens[ $j ];
}
$i = $this_token['parenthesis_closer'];
}
}
if ( ! empty( $argument_tokens ) ) {
$arguments_tokens[] = $argument_tokens;
}
unset( $argument_tokens );

$argument_assertions = array();
if ( in_array( $translation_function, array( '__', 'esc_attr__', 'esc_html__', '_e', 'esc_attr_e', 'esc_html_e', 'translate' ) ) ) {
$argument_assertions[] = array( '$text', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$domain', 'check_string_domain_tokens' );
} else if ( in_array( $translation_function, array( '_x', '_ex', 'esc_attr_x', 'esc_html_x', 'translate_with_gettext_context' ) ) ) {
$argument_assertions[] = array( '$text', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$context', 'check_literal_string_context_tokens' );
$argument_assertions[] = array( '$domain', 'check_string_domain_tokens' );
} else if ( in_array( $translation_function, array( '_n', '_n_noop' ) ) ) {
$argument_assertions[] = array( '$single', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$plural', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$number', 'check_number_tokens' );
$argument_assertions[] = array( '$domain', 'check_string_domain_tokens' );
} else if ( in_array( $translation_function, array( '_nx', '_nx_noop' ) ) ) {
$argument_assertions[] = array( '$single', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$plural', 'check_literal_string_text_tokens' );
$argument_assertions[] = array( '$number', 'check_number_tokens' );
$argument_assertions[] = array( '$context', 'check_literal_string_context_tokens' );
$argument_assertions[] = array( '$domain', 'check_string_domain_tokens' );
}

$argument_stack_ptr = $func_open_paren_token;
foreach ( $argument_assertions as $argument_assertion ) {
$argument_tokens = array_shift( $arguments_tokens );
if ( ! $argument_tokens ) {
$argument_tokens = array();
} else {
$argument_stack_ptr = $argument_tokens[0]['token_index'];
}
$method_name = $argument_assertion[1];
$arg_name = $argument_assertion[0];
call_user_func( array( $this, $method_name ), $phpcs_file, $arg_name, $argument_stack_ptr, $argument_tokens );
}
}

/**
* Check if supplied tokens represent a translation text string literal.
*
* @param PHP_CodeSniffer_File $phpcs_file The file being scanned.
* @param string $arg_name
* @param int $stack_ptr
* @param array $tokens
* @return bool
*/
protected function check_literal_string_text_tokens( $phpcs_file, $arg_name = '$text', $stack_ptr, $tokens = array() ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an unfortunate amount of duplication between check_literal_string_text_tokens, check_literal_string_context_tokens, and check_string_domain_tokens. However, an immediate refactoring wasn't apparent which would allow custom messages as well as error codes for each argument type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an unfortunate amount of duplication between check_literal_string_text_tokens, check_literal_string_context_tokens, and check_string_domain_tokens. However, an immediate refactoring wasn't apparent which would allow custom messages as well as error codes for each argument type.

Aren't the messages and error codes identical other than the arg names though? ("translatable text", "context", "text domain") I think it could be refactored to a single method that would just accept those strings and build the error messages and codes from them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, some are errors and some are warnings.

It could be refactored to try to abstract these out more by passing in params to common methods if you think it is worthwhile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JDGrimes here you go: a47029e

if ( 0 === count( $tokens ) ) {
$phpcs_file->addError( "Missing translatable text (%s arg).", $stack_ptr, 'MissingText', array( $arg_name ) );
return false;
}
if ( count( $tokens ) > 1 ) {
$contents = '';
foreach ( $tokens as $token ) {
$contents .= $token['content'];
}
$phpcs_file->addError( 'Translatable text (%s arg) must be a single string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralText', array( $arg_name, $contents ) );
return false;
}
if ( T_CONSTANT_ENCAPSED_STRING === $tokens[0]['code'] ) {
return true;
}
if ( T_DOUBLE_QUOTED_STRING === $tokens[0]['code'] ) {
$interpolated_variables = $this->get_interpolated_variables( $tokens[0]['content'] );
foreach ( $interpolated_variables as $interpolated_variable ) {
$phpcs_file->addError( 'Translatable text (%s arg) must not contain interpolated variables. Found "$%s".', $stack_ptr, 'InterpolatedVariableText', array( $arg_name, $interpolated_variable ) );
}
return false;
}
$phpcs_file->addError( 'Trnslatable text (%s arg) should be single a string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralText', array( $arg_name, $tokens[0]['content'] ) );
return false;
}

/**
* The $number argument can be anything, so this is a no-op.
*
* @return bool
*/
protected function check_number_tokens() {
return true;
}

/**
* Check if supplied tokens are a literal string context.
*
* @param PHP_CodeSniffer_File $phpcs_file The file being scanned.
* @param int $stack_ptr
* @param array $tokens
* @return bool
*/
protected function check_literal_string_context_tokens( $phpcs_file, $arg_name = '$context', $stack_ptr, $tokens = array() ) {
if ( 0 === count( $tokens ) ) {
$phpcs_file->addError( "Missing context (%s arg).", $stack_ptr, 'MissingContext', array( $arg_name ) );
return false;
}
if ( count( $tokens ) > 1 ) {
$contents = '';
foreach ( $tokens as $token ) {
$contents .= $token['content'];
}
$phpcs_file->addError( 'Context (%s arg) should be a single string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralContext', array( $arg_name, $contents ) );
return false;
}
if ( T_CONSTANT_ENCAPSED_STRING === $tokens[0]['code'] ) {
return true;
}
if ( T_DOUBLE_QUOTED_STRING === $tokens[0]['code'] ) {
$interpolated_variables = $this->get_interpolated_variables( $tokens[0]['content'] );
foreach ( $interpolated_variables as $interpolated_variable ) {
$phpcs_file->addError( 'Context strings (%s arg) should not contain interpolated variables. Found "$%s".', $stack_ptr, 'InterpolatedVariableContext', array( $arg_name, $interpolated_variable ) );
}
return false;
}
$phpcs_file->addError( 'Context (%s arg) should be single a string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralContext', array( $arg_name, $tokens[0]['content'] ) );
return false;
}

/**
* Check if supplied tokens are a valid text domain.
*
* @param PHP_CodeSniffer_File $phpcs_file The file being scanned.
* @param string $arg_name
* @param int $stack_ptr
* @param array $tokens
* @return bool
*/
protected function check_string_domain_tokens( $phpcs_file, $arg_name = '$domain', $stack_ptr, $tokens = array() ) {
if ( 0 === count( $tokens ) ) {
$phpcs_file->addWarning( "Missing text domain (%s arg).", $stack_ptr, 'MissingTextDomain', array( $arg_name ) );
return false;
}
if ( count( $tokens ) > 1 ) {
$contents = '';
foreach ( $tokens as $token ) {
$contents .= $token['content'];
}
$phpcs_file->addWarning( 'Text domain (%s arg) should be a single string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralTextDomain', array( $arg_name, $contents ) );
return false;
}
if ( T_CONSTANT_ENCAPSED_STRING === $tokens[0]['code'] ) {
return true;
}
if ( T_DOUBLE_QUOTED_STRING === $tokens[0]['code'] ) {
$interpolated_variables = $this->get_interpolated_variables( $tokens[0]['content'] );
foreach ( $interpolated_variables as $interpolated_variable ) {
$phpcs_file->addWarning( 'Text domain strings (%s arg) should not contain interpolated variables. Found "$%s".', $stack_ptr, 'InterpolatedVariableTextDomain', array( $arg_name, $interpolated_variable ) );
}
return false;
}
$phpcs_file->addWarning( 'Text domain (%s arg) should be single a string literal, not "%s".', $stack_ptr, 'NonSingularStringLiteralTextDomain', array( $arg_name, $tokens[0]['content'] ) );
return false;
}
}
72 changes: 72 additions & 0 deletions WordPress/Tests/WP/I18nUnitTest.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
<?php

__( "hell$varo", 'domain' ); // Bad, shouldn't use a string with variables

__( "hell\$varo", 'domain' ); // OK, Variable is not interpolated.
__( "hell\\$varo", 'domain' ); // Bad, is interpolated.
__( "hell\\\$varo", 'domain' ); // OK, variable is escaped.

__( $var, 'domain' ); // Bad, shouldn't use variables

__( 'string', SOMETHING ); // Bad, shouldn't use CONSTANTS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more unit tests:

__( 'string' . $var, 'domain' ); // Bad, shouldn't use variable for string

__( $var . 'string', 'domain' ); // Bad, shouldn't use variable for string

__( SOMETHING, 'domain' ); // Bad, shouldn't use constant for string

__( 'string' . SOMETHING, 'domain' ); // Bad, shouldn't use constant for string

__( SOMETHING . 'string', 'domain' ); // Bad, shouldn't use variable for string

__( 'string', $domain ); // Bad, shouldn't use variable for domain

__( 'string', 'my' . $domain ); // Bad, shouldn't use variable for domain

__( 'string', $domain . 'domain' ); // Bad, shouldn't use variable for domain

__( 'string', 'domain' ); // Good

_x( 'string', 'context', 'domain' ); // Good

_x( 'string', $var, 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', 'context' . $var, 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', $var . 'context', 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', SOMETHING, 'domain' ); // Bad, shouldn't use constant for context

_x( 'string', SOMETHING . 'context', 'domain' ); // Bad, shouldn't use constant for context

_x( 'string', 'context' . SOMETHING, 'domain' ); // Bad, shouldn't use constant for context


__( 'string' . $var, 'domain' ); // Bad, shouldn't use variable for string

__( $var . 'string', 'domain' ); // Bad, shouldn't use variable for string

__( SOMETHING, 'domain' ); // Bad, shouldn't use constant for string

__( 'string' . SOMETHING, 'domain' ); // Bad, shouldn't use constant for string

__( SOMETHING . 'string', 'domain' ); // Bad, shouldn't use variable for string

__( 'string', $domain ); // Bad, shouldn't use variable for domain

__( 'string', 'my' . $domain ); // Bad, shouldn't use variable for domain

__( 'string', $domain . 'domain' ); // Bad, shouldn't use variable for domain

__( 'string', 'domain' ); // Good

_x( 'string', 'context', 'domain' ); // Good

_x( 'string', $var, 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', 'context' . $var, 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', $var . 'context', 'domain' ); // Bad, shouldn't use variable for context

_x( 'string', SOMETHING, 'domain' ); // Bad, shouldn't use constant for context

_x( 'string', SOMETHING . 'context', 'domain' ); // Bad, shouldn't use constant for context

_x( 'string', 'context' . SOMETHING, 'domain' ); // Bad, shouldn't use constant for context

_n( 'I have a cat.', 'I have cats.', $number ); // Bad, no text domain.
_n( 'I have a cat.', 'I have cats.', $number, 'my-slug' ); // OK.
_n( 'I have a cat.', 'I have cats.', $number, "illegal $string" ); // Bad.
_n( 'I have a cat.', 'I have cats.', $number, SOMETHING ); // Bad.

_n_noop( 'I have a cat.', 'I have cats.', $number ); // Bad, no text domain.
_n_noop( 'I have a cat.', 'I have cats.', $number, 'my-slug' ); // OK.
_n_noop( 'I have a cat.', 'I have cats.', $number, "illegal $string" ); // Bad.
_n_noop( 'I have a cat.', 'I have cats.', $number, SOMETHING ); // Bad.

_nx( 'I have a cat.', 'I have cats.', $number, 'Not really.' ); // Bad, no text domain.
_nx( 'I have a cat.', 'I have cats.', $number, $context ); // Bad.
_nx( 'I have a cat.', 'I have cats.', $number, 'Not really.', 'my-slug' ); // OK.
_nx( 'I have a cat.', 'I have cats.', $number, $context, 'my-slug' ); // Bad.
_nx( 'I have a cat.', 'I have cats.', $number, 'Not really.', "illegal $string" ); // Bad.
_nx( 'I have a cat.', 'I have cats.', $number, 'Not really.', SOMETHING ); // Bad.

_nx_noop( 'I have a cat.', 'I have cats.', $number, 'Not really.' ); // Bad, no text domain.
_nx_noop( 'I have a cat.', 'I have cats.', $number, $context ); // Bad, no text domain, variable context.
_nx_noop( 'I have a cat.', 'I have cats.', $number, 'Not really.', 'my-slug' ); // OK.
_nx_noop( 'I have a cat.', 'I have cats.', $number, $context, 'my-slug' ); // Bad.
_nx_noop( 'I have a cat.', 'I have cats.', $number, 'Not really.', "illegal $string" ); // Bad.
_nx_noop( 'I have a cat.', 'I have cats.', $number, 'Not really.', SOMETHING ); // Bad.

translate( 'foo', 'bar' ); // Bad, low-level API function.
translate_with_gettext_context( 'foo', 'bar', 'baz' ); // Bad, low-level API function.

_( 'foo', 'bar' );
Loading