New Yahoo! Homepage

There is a reason I’ve not posted anything for a while, and have had to be careful about what I talk about when I do.

Last week we turned on the opt-in for the new Yahoo! Homepage: http://uk.yahoo.com/trynew

My personal project was writing the Facebook App, which does seem to have received a fair amount of press coverage.

However, for the last 18 months we’ve been working on a complete rewrite of the homepage.

This has included releasing four countries (France, United Kingdom, United States and India) all using the same codebase and the same developers.

Personally I’ve been involved in setting up testing frameworks, continuous integration and code quality.

I’ll try to publish further updates later; specifically, if possible, information on the Facebook App.

Posted in Development, Yahoo!

The CodeSniffer Series

Recently I wrote a series of posts about PHP CodeSniffer, the PHP PEAR project.

This is a brief reminder of the series and a list of all the articles.

These articles are the originals and are being used as a base for a YDN article to be published shortly.

  1. Introduction to CodeSniffer
  2. CodeSniffer Output
  3. Writing an example CodeSniffer Standard
  4. How does CodeSniffer work?
  5. Writing an example CodeSniffer Sniff
Posted in Development, PHP, Yahoo!

CodeSniffer Part 5: Writing an example CodeSniffer Sniff

There has been a significant delay between the original posts and this final post in the initial CodeSniffer series, for which I apologise. The delay is mainly due to a recent increase in my workload, and to the happy news that my wife is pregnant, which has brought a brief flurry of nesting activity at home.

In the last post we looked at the internals of CodeSniffer and how the processing of files is done. This time we will look at how an individual Sniff is constructed.

Sniff Basics

  • Each Sniff is a PHP class.
  • Each class consists of a minimum of two methods:
    • A register method, which returns an array of the tokens that the Sniff is interested in.
    • A process method, which is executed each time one of those tokens is encountered.
  • Each Sniff class lives inside a Standard.
    • Sniffs live in a “Sniffs” subfolder under the Standard.
    • Within “Sniffs”, each Sniff lives in a subfolder that groups themed Sniffs.
  • Each Sniff class must adhere to the naming convention.

Starting a Sniff

Let’s take a look at an example Sniff: one designed to check that our files have no extraneous whitespace outside of the PHP tags, which could cause a PHP error should we subsequently send headers to the browser.

First of all we create a class and decide on a name for it.

As CodeSniffer uses an autoloader that derives the class’ file and path from the class name, we have to stick to a specific naming convention.

class KingKludge_Sniffs_PHP_NoExtraneousWhiteSpaceSniff
implements PHP_CodeSniffer_Sniff
{}

equates to a path of:

{CodeSniffer}/Standards/KingKludge/Sniffs/PHP/NoExtraneousWhiteSpaceSniff.php

where {CodeSniffer} is your CodeSniffer install path.
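To make the mapping concrete, here is a minimal sketch of the underscore-to-path idea. The function name sniffClassToPath is made up for illustration and this is not CodeSniffer’s actual autoloader; it only assumes, as described above, that each underscore-separated part of the class name becomes a path segment.

```php
<?php
// Hypothetical helper for illustration only; not part of CodeSniffer.
// Assumes each underscore in the class name marks a directory boundary.
function sniffClassToPath($className)
{
	return str_replace('_', '/', $className) . '.php';
}

// Prints the path of our example Sniff, relative to the Standards folder.
echo sniffClassToPath('KingKludge_Sniffs_PHP_NoExtraneousWhiteSpaceSniff'), "\n";
```

This also shows why a mistyped part of the class name matters: the resulting path simply points at a file that does not exist.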

Secondly, inside our new class we must define our register method:

public function register()
{
	return array(
		T_OPEN_TAG,
		T_CLOSE_TAG,
		);
}

As we want to check for whitespace before or after our PHP declarations, we state that this sniff is interested in PHP open and close tags.

Thirdly, we must define our process method:

public function process(PHP_CodeSniffer_File $phpcsFile, $stackPtr)
{

}

As you can see, our process method is passed some parameters.

These are $phpcsFile, a PHP_CodeSniffer_File object containing the file being processed, and $stackPtr, an integer pointing to CodeSniffer’s current place in that file’s list of tokens.

View the empty standard file on GitHub

Now on to the actual code in our Sniff.

As our proposed Sniff is going to look for whitespace being output before our page is ready to be rendered, we are interested in whitespace that occurs before the opening PHP tag or after the closing PHP tag.

We are only interested in whitespace; any HTML or other markup is okay.

So we first check whether our pointer is at the very beginning of the file. If it is, nothing can precede the opening tag and we drop out of the process method.

$tokens = $phpcsFile->getTokens();
if (T_OPEN_TAG === $tokens[$stackPtr]['code'])
{
	if (0 === $stackPtr)
	{
		return;
	}
}

The getTokens() method returns all of the tokens for the current file as an array indexed by position, where each token is itself an associative array with keys such as 'code' and 'content'.

The rest should be self-explanatory.

We can also do a similar check for a PHP close tag:

if (T_CLOSE_TAG === $tokens[$stackPtr]['code'])
{
	$content = $tokens[$stackPtr]['content'];
	if (
	    false === isset($tokens[($stackPtr + 1)]) &&
	    trim($content) === $content
	   )
	{
		return;
	}
}
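If you want to experiment with these two checks without a CodeSniffer install, the following is a rough stand-alone sketch that runs the same logic over PHP’s native token_get_all() output. The function name tagNeedsNoFurtherChecks is hypothetical, and native tokens are array(code, content, line) rather than CodeSniffer’s associative arrays, so treat it as an approximation of the Sniff rather than the Sniff itself.

```php
<?php
// Hypothetical stand-alone version of the two checks above, for
// experimenting without CodeSniffer. $tokens comes from token_get_all().
function tagNeedsNoFurtherChecks(array $tokens, $stackPtr)
{
	$token = $tokens[$stackPtr];
	if (false === is_array($token))
	{
		// Single-character tokens such as ";" are plain strings, never tags.
		return false;
	}

	list($code, $content) = $token;

	// An open tag as the very first token has nothing before it.
	if (T_OPEN_TAG === $code)
	{
		return (0 === $stackPtr);
	}

	// A close tag is fine when it is the last token and carries no
	// trailing whitespace (the tokenizer folds one newline into the tag).
	if (T_CLOSE_TAG === $code)
	{
		return (false === isset($tokens[$stackPtr + 1]) && trim($content) === $content);
	}

	return false;
}

$tokens = token_get_all(" <?php echo 1; ?>\n");
var_dump(tagNeedsNoFurtherChecks($tokens, 1));
var_dump(tagNeedsNoFurtherChecks($tokens, count($tokens) - 1));
```

In the sample string the open tag is preceded by a space and the close tag is followed by a newline, so neither tag qualifies for the early return.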

View the basic standard file on GitHub

Posted in Development, PHP, Software

CodeSniffer Part 4: How does CodeSniffer Work

In the previous article we looked at writing our own CodeSniffer standard based on pre-existing rules or sniffs.

This article will try to cover in depth how CodeSniffer actually works, to give some insight ahead of the next proposed article: writing a sniff from scratch.

The PHP Tokenizer

CodeSniffer works by extending the PHP tokenizer function.

Given the following section of code:

<?php

function DoSomething(array $foo) {
    print_r($foo);
}

?>   

and running it through PHP’s native tokenizer we get the following output

PHP Tokenized version of foo.php


PHP’s tokenizer only identifies a limited subset of PHP syntax as listed here.

All other potential tokens are either identified as a string, signified by the integer value 307 (the constant T_STRING) in element [0] of the sub-array, or simply returned as the string value of the token, as with entries 17, 18 and 20 in the sample output above.

To map the above integer values to PHP’s string constant names you can use the PHP function token_name().

For example:

$ php -r "print(token_name(369));"
T_CLOSE_TAG
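As a small sketch of the same idea applied to a whole file, the script below runs the sample code through token_get_all() and uses token_name() to list each token by name. The function name describeTokens is my own. Note that the integer values behind the constants differ between PHP versions, which is a good reason to rely on the names rather than the numbers.

```php
<?php
// Sketch: list every token of the sample as a readable name or literal.
$source = <<<'CODE'
<?php

function DoSomething(array $foo) {
    print_r($foo);
}

?>
CODE;

function describeTokens($source)
{
	$names = array();
	foreach (token_get_all($source) as $token)
	{
		// Array tokens are array(code, content, line); anything else is
		// a single character such as "(" or ";" returned as a string.
		$names[] = is_array($token) ? token_name($token[0]) : $token;
	}
	return $names;
}

print_r(describeTokens($source));
```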

Once the PHP tokenizer has run, a lot of code is still lumped together as T_STRING or not tokenized at all, so CodeSniffer takes these simple tokens and expands them further.

CodeSniffer introduces new constants such as T_TRUE, T_FALSE, T_NULL, T_PARENT, T_OPEN_CURLY_BRACKET and so on.

This gives CodeSniffer considerable scope to be able to handle much finer detail of the PHP syntax.
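The kind of refinement involved can be sketched roughly as follows. This is an illustration of the idea only: the function name refineTokenName and its two small lookup tables are assumptions of mine, and CodeSniffer’s own tokenizer is considerably more thorough.

```php
<?php
// Illustrative only: a tiny, made-up subset of the refinement described
// above. The lookup tables are assumptions, not CodeSniffer's own.
function refineTokenName($token)
{
	$keywords = array(
		'true'   => 'T_TRUE',
		'false'  => 'T_FALSE',
		'null'   => 'T_NULL',
		'parent' => 'T_PARENT',
	);
	$symbols = array('{' => 'T_OPEN_CURLY_BRACKET');

	if (false === is_array($token))
	{
		// Untokenized single characters get a name where we know one.
		return isset($symbols[$token]) ? $symbols[$token] : $token;
	}

	$word = strtolower($token[1]);
	if (T_STRING === $token[0] && isset($keywords[$word]))
	{
		// A generic T_STRING that is really a well-known keyword.
		return $keywords[$word];
	}

	return token_name($token[0]);
}

foreach (token_get_all('<?php if (true) { }') as $token)
{
	echo refineTokenName($token), "\n";
}
```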

The PHP CodeSniffer Tokenizer

When CodeSniffer first loads, the standard in use is determined from the command line or from the stored config. The standard is then loaded and all of that standard’s sniffs are loaded.

CodeSniffer calls the register() method of each sniff and builds a hash of all the tokens and the sniff classes interested in them.

Then CodeSniffer starts looking for the files to check. If a directory is specified, CodeSniffer iterates through the directories looking for files with the correct extension, and each file found is processed in turn.

If a list of files or a single file is specified, then the above step is skipped and CodeSniffer starts parsing the file(s) as defined in the parameters.

Once CodeSniffer has tokenized the file under analysis into one (rather large) multidimensional array of language syntax tokens, the rest is quite simple.

CodeSniffer breaks each file under examination down and does a series of context checks before processing the tokens and calling all the registered sniffs.

These checks are:

  1. Bracket Map: checking braces
  2. Scope Map: checking for class, function and conditional statement scopes
  3. Level Map: checking for class, function and conditional statement levels
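To give a feel for the first of these, here is a rough sketch of how a bracket map could be built over PHP’s native tokens with a simple stack. The function name buildBracketMap is hypothetical and the sketch only pairs curly braces; it makes no claim about what CodeSniffer’s own map records.

```php
<?php
// Hypothetical sketch of a bracket map: pairs each "{" with its "}".
// Braces used for string interpolation are not handled here.
function buildBracketMap($source)
{
	$map  = array();
	$open = array(); // stack of positions of unmatched "{"

	foreach (token_get_all($source) as $stackPtr => $token)
	{
		if ('{' === $token)
		{
			$open[] = $stackPtr;
		}
		else if ('}' === $token && count($open) > 0)
		{
			// The most recently opened brace is the one being closed.
			$map[array_pop($open)] = $stackPtr;
		}
	}

	return $map;
}

print_r(buildBracketMap('<?php function f() { if (1) { } }'));
```

Each key in the result is the position of an opening brace and each value the position of its closer, which is the sort of lookup a sniff needs when it wants to jump to the end of a block.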

If we look at the CodeSniffer Tokenized version of foo.php we can see the levels of our sample script above.

During the initialisation phase, each Sniff in the standard registered the tokens it is interested in handling.

CodeSniffer then runs through each token in the file from beginning to end and calls the process() method of every sniff that registered an interest in that token.
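The register-then-dispatch flow can be sketched in a few lines. This is a simplified illustration built on PHP’s native tokenizer: CountingSniff and runSniffs are made-up names, and a real sniff implements PHP_CodeSniffer_Sniff and receives a PHP_CodeSniffer_File rather than a bare token array.

```php
<?php
// Hypothetical stand-in for a sniff; it just records where it was called.
class CountingSniff
{
	public $seen = array();

	public function register()
	{
		return array(T_OPEN_TAG, T_CLOSE_TAG);
	}

	public function process(array $tokens, $stackPtr)
	{
		$this->seen[] = $stackPtr;
	}
}

// Simplified sketch of the dispatch loop, not CodeSniffer's own code.
function runSniffs(array $sniffs, $source)
{
	// Initialisation phase: map each token code to the interested sniffs.
	$listeners = array();
	foreach ($sniffs as $sniff)
	{
		foreach ($sniff->register() as $code)
		{
			$listeners[$code][] = $sniff;
		}
	}

	// Walk the file once, calling process() for every registered token.
	$tokens = token_get_all($source);
	foreach ($tokens as $stackPtr => $token)
	{
		if (false === is_array($token) || false === isset($listeners[$token[0]]))
		{
			continue;
		}
		foreach ($listeners[$token[0]] as $sniff)
		{
			$sniff->process($tokens, $stackPtr);
		}
	}
}

$sniff = new CountingSniff();
runSniffs(array($sniff), "<?php echo 1; ?>");
print_r($sniff->seen);
```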

Finally all of the errors and warnings generated by those sniffs are organised into the desired report type and displayed.

Now that we have an insight into how CodeSniffer works, in the next and final post in this series on CodeSniffer we’ll look at writing a new Sniff.

Posted in Development, PHP, Software

CodeSniffer Part 3: Writing an example CodeSniffer Standard

Last time we looked at some sample CodeSniffer reports and in the source report we saw that the Zend standard was also reporting errors from other standards.

$ phpcs --report=source --standard=Zend ./_rr

PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
--------------------------------------------------------------------------------
STANDARD    CATEGORY            SNIFF                                      COUNT
--------------------------------------------------------------------------------
Zend        Files               Line length                                92
Generic     White space         Disallow tab indent                        65
Zend        Naming conventions  Valid variable name                        59
PEAR        White space         Scope closing brace                        33
PEAR        Functions           Function call signature                    26
PEAR        Control structures  Control signature                          12
PEAR        Files               Line endings                               11
PEAR        Functions           Function call argument spacing             5
Generic     PHP                 Disallow short open tag                    1
Zend        Files               Closing tag                                1
--------------------------------------------------------------------------------
A TOTAL OF 305 SNIFF VIOLATION(S) WERE FOUND IN 10 SOURCE(S)
--------------------------------------------------------------------------------

Each CodeSniffer standard is composed of rules, or sniffs. A standard can contain sniffs from other standards.

This allows us to quickly and easily create our own custom standard by leveraging those rules that are already contained in existing standards.

To find out what we have available to use, we can look for all the files that end in Sniff.php within the Standards folder. We can see we have 167 sniffs available to use already (with version 1.2.0a1 anyway).

Full list of available Sniffs

$ cd {your pear path}/PHP/CodeSniffer/Standards
$ find ./ -name "*Sniff.php" -and -not -name "Abstract*"

Click here to see the full list of available Sniffs.

Some of these Sniffs are self-explanatory; some are less so, and need a cursory glance at the code to see what they are checking for.

Luckily most of the classes are well documented (as the PHPCS standard dictates!) so looking at the class phpdoc comment usually tells you what the sniff is looking for.

Something to note is that you will find some of the sniffs are mutually exclusive:

e.g.

./Generic/Sniffs/Formatting/NoSpaceAfterCastSniff.php
./Generic/Sniffs/Formatting/SpaceAfterCastSniff.php
./Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php
./Generic/Sniffs/Functions/OpeningFunctionBraceKernighanRitchieSniff.php

Writing the standard

So to start with we need to come up with a name for the standard.

So for this example I’m going to use KingKludge.

Now I prefer to develop in my home dir, but to get this standard recognised you will need to have it available in {your pear path}/PHP/CodeSniffer/Standards/.

So I symlink my standard into the pear path, but you could work directly in the pear folder.

$ mkdir KingKludge
$ ln -s "$(pwd)/KingKludge" {your pear path}/PHP/CodeSniffer/Standards/KingKludge

Then we need to create the standard file, which follows the naming convention {standard}CodingStandard.php, so we end up with the following file.

KingKludgeCodingStandard.php

The underscores in the class name are important as the autoload function splits the path at underscores, so if you mistype one of the path parts you will get a fatal error when the class attempts to load.

Assuming that you have your class in the correct folder, you can run:

$ phpcs -i

And you should see your new standard is listed as available for use.

As a timesaving tip, to save having to type the standard name every time you run CodeSniffer, run the following command.

$ phpcs --config-set default_standard KingKludge

Now, unless we override it with the --standard switch, CodeSniffer will default to using the KingKludge standard.

Adding some rules

So now we have a new standard but it doesn’t do anything yet, as it has no rules to apply.

We have two methods in the class to define the rules or sniffs: getIncludedSniffs() and getExcludedSniffs(). Unsurprisingly, these methods should return an array of the sniffs to include or exclude.

If we decided we wanted to make life easy for ourselves and base our standard on someone else’s standard, we can quite simply include all of their standard and then swap out the sniffs we don’t want for others we do.

For example, taking the Squiz coding standard.

public function getIncludedSniffs()
{
  return array(
    'Squiz',
  );
}

Now say we want to swap the BSD style bracing rule for K&R style braces, and we don’t want the CodeAnalyzer rule to run either.

So we add the K&R braces Sniff to our include and add the CodeAnalyzer and BSD braces sniffs to the exclude.

public function getIncludedSniffs()
{
  return array(
    'Squiz',
    'Generic/Sniffs/Functions/OpeningFunctionBraceKernighanRitchieSniff.php',
  );
}

public function getExcludedSniffs()
{
  return array(
    'Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php',
    'Zend/Sniffs/Debug/CodeAnalyzerSniff.php',
  );
}

That seems easy enough!

So now you can see how easily you can create your own standards by picking as few or as many sniffs as you want from existing standards.
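Conceptually, the final rule set is everything that was included minus everything that was excluded. The sketch below illustrates that idea only: resolveSniffs is a hypothetical function, and the $standards map is a made-up stand-in that lists just the two Squiz-included sniffs mentioned above, not the real contents of the Squiz standard or CodeSniffer’s actual resolution code.

```php
<?php
// Hypothetical illustration of include/exclude resolution.
// $standards maps a standard name to the sniff files it contains.
function resolveSniffs(array $included, array $excluded, array $standards)
{
	$files = array();
	foreach ($included as $entry)
	{
		// A bare standard name expands to every sniff in that standard;
		// anything else is treated as a single sniff file.
		$expanded = isset($standards[$entry]) ? $standards[$entry] : array($entry);
		$files    = array_merge($files, $expanded);
	}

	return array_values(array_diff(array_unique($files), $excluded));
}

// Made-up, deliberately tiny listing for the sake of the example.
$standards = array(
	'Squiz' => array(
		'Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php',
		'Zend/Sniffs/Debug/CodeAnalyzerSniff.php',
	),
);

print_r(resolveSniffs(
	array('Squiz', 'Generic/Sniffs/Functions/OpeningFunctionBraceKernighanRitchieSniff.php'),
	array(
		'Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php',
		'Zend/Sniffs/Debug/CodeAnalyzerSniff.php',
	),
	$standards
));
```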

Next time, understanding the internals of CodeSniffer and how it works.

Posted in Development, PHP, Software