How to validate a page number in PHP

Recently I came across an issue that was harder than it seemed. I use my own framework, which is based around Symfony’s HttpFoundation.

My controllers look like this:

use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class myControllerClass
{
    public function myController(Request $request)
    {
        return new Response("My page");
    }
}

Now let’s say I want to show a list of categories on my website. I have a lot of categories, so I need to split them across multiple pages. I have a few requirements:

  • The page number is a simple $_GET parameter: http://example.com/categories/?page=2
  • I don’t want any duplicate content issues, so there is only one way to display a given page
  • I want a strict check of the ‘page’ argument; I don’t want to introduce any security holes
  • Page 1 should be displayed if no argument is given (i.e. http://example.com/categories/)
  • Any invalid entry will throw an InvalidArgumentException

To sum up, we’ll get the ‘page’ query string argument, and default to 1:

$page = $request->query->get('page', 1);

Now let’s try to validate this value:

is_numeric()

is_numeric() seems like a perfect fit for the job. Let’s run a few tests:

is_numeric("1"); // => bool(true)
is_numeric("2"); // => bool(true)
is_numeric("1e7"); // => bool(true)
is_numeric("2.5"); // => bool(true)

So unless you want exponential and half pages, it doesn’t work.

is_int()

Since our query string parameter comes as a string, is_int() won’t work either:

is_int("1"); // => bool(false)

ctype_digit()

I really thought this one would work, until I realized page 1 was broken. When I get the ‘page’ parameter with $page = $request->query->get('page', 1); I can get two variable types. If ‘page’ is defined, $page will be a string; if ‘page’ is not defined, it will default to 1, which is an int. And ctype_digit() treats a small integer as the ASCII code of a single character, not as a string of digits:

ctype_digit(1); // => bool(false)
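
The type mismatch is easy to reproduce without Symfony. The sketch below uses a hypothetical getQueryValue() helper as a stand-in for $request->query->get(), assuming it returns the raw string when the key exists and the untouched default otherwise:

```php
<?php
// Hypothetical stand-in for $request->query->get(): returns the raw query
// value when the key exists, otherwise the default, untouched.
function getQueryValue(array $query, string $key, $default = null)
{
    return array_key_exists($key, $query) ? $query[$key] : $default;
}

// ?page=2 was given: the value comes from the query string, so it is a string.
$withPage = getQueryValue(['page' => '2'], 'page', 1);

// No 'page' argument: we get our own default back, which is an int.
$withoutPage = getQueryValue([], 'page', 1);

var_dump(gettype($withPage));    // string(6) "string"
var_dump(gettype($withoutPage)); // string(7) "integer"
```

So the same line of code hands two different types to whatever validation comes next.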

ParameterBag::getInt()

So I realized I was mixing types. Let’s force the page number to an integer by using the getInt() method of the query ParameterBag.

$page = $request->query->getInt('page', 1);

What actually happens is that it casts the string to an integer, following the same rules as intval(). Invalid page values are silently converted to integers, for example:

$request = Request::create('/categories/', 'GET', ['page' => '3andveryinvalidvalue']);
$page = $request->query->getInt('page', 1); // => int(3)

So there is no way we’ll be able to throw an exception, since we don’t know that the value is invalid.
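
To see why the invalid value goes unnoticed, here is the same kind of cast in plain PHP. This only illustrates PHP's integer casting rules; it is not the actual getInt() source:

```php
<?php
// Plain PHP casts, no Symfony involved: the leading digits are kept
// and everything after them is silently dropped.
$valid   = (int) '3';                    // int(3)
$garbage = (int) '3andveryinvalidvalue'; // int(3)
$words   = (int) 'notanumber';           // int(0), no leading digits at all

// After the cast, the valid and the invalid input are indistinguishable.
var_dump($valid === $garbage); // bool(true)
```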

What actually works

Let’s go back to the original code, and try using regular expressions:

$page = $request->query->get('page', 1);
// \z anchors at the very end of the string; a plain $ would also accept a trailing newline
if (!preg_match('/^[0-9]+\z/', $page)) {
  throw new InvalidArgumentException();
}

It works! Since preg_match() only works on strings, it will automatically cast an integer to a string (as long as strict_types is not enabled).
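
One thing a digits-only pattern still lets through is "0" and values with leading zeros ("01", "001"), which conflicts with the requirement that there is only one way to display a given page. A stricter variant could look like the sketch below. parsePageNumber() is a hypothetical helper, and the nine-digit cap is an assumption made here to keep the result safely inside the integer range:

```php
<?php
// Hypothetical helper (not part of Symfony): returns the page number as an int,
// or throws if the raw value is not a canonical positive integer.
function parsePageNumber($raw): int
{
    // Only strings and ints are acceptable inputs; arrays (?page[]=1) are rejected.
    if (!is_string($raw) && !is_int($raw)) {
        throw new InvalidArgumentException('Invalid page number');
    }

    // [1-9] first: no "0" and no leading zeros.
    // {0,8}: at most nine digits in total (assumed cap).
    // \z: end of string, so a trailing newline is rejected too.
    if (!preg_match('/^[1-9][0-9]{0,8}\z/', (string) $raw)) {
        throw new InvalidArgumentException('Invalid page number');
    }

    return (int) $raw;
}

var_dump(parsePageNumber('2')); // int(2)
var_dump(parsePageNumber(1));   // int(1)
```

Because the value has already been validated, the final cast to int is safe and the rest of the controller can work with a real integer.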

Another solution is to use ctype_digit(), but we need to make sure that our default parameter is a string.

$page = $request->query->get('page', "1");
if (!ctype_digit($page)) {
  throw new InvalidArgumentException();
}
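
To sum up, here is a small plain-PHP sketch that runs the inputs tested earlier through the three string checks side by side. The $inputs list and the $results layout are only illustrative:

```php
<?php
// Candidate 'page' values, as they would arrive from the query string.
$inputs = ['1', '2', '1e7', '2.5', '3andveryinvalidvalue'];

$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        'is_numeric'  => is_numeric($input),
        'ctype_digit' => ctype_digit($input),
        'preg_match'  => preg_match('/^[0-9]+\z/', $input) === 1,
    ];
}

// Print one line per input with the verdict of each check.
foreach ($results as $input => $checks) {
    printf("%-22s %s\n", $input, json_encode($checks));
}
```

is_numeric() accepts "1e7" and "2.5", while ctype_digit() and the regular expression only accept the plain digit strings.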