How to validate a page number in PHP

Recently I came across an issue that was harder than it seemed. I use my own framework, which is based around Symfony’s HttpFoundation.

My controllers look like this:

use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class myControllerClass
{
    public function myController(Request $request)
    {
        return new Response("My page");
    }
}

Now let’s say I want to show a list of categories on my website. I have a lot of categories, so I need to split them across multiple pages. I have a few requirements:

  • The page number is a simple $_GET parameter: http://example.com/categories/?page=2
  • I don’t want any duplicate content issues, so there is only one way to display a given page
  • I want a strict check of the ‘page’ argument; I don’t want to introduce any security holes
  • Page 1 should be displayed if no argument is given (i.e. http://example.com/categories/)
  • Any invalid entry will throw an InvalidArgumentException

To sum up, we’ll get the ‘page’ query string argument, and default to 1:

$page = $request->query->get('page', 1);

Now let’s try to validate this value:

is_numeric()

is_numeric() seems like a perfect fit for the job. Let’s run a few tests:

is_numeric("1"); // => bool(true)
is_numeric("2"); // => bool(true)
is_numeric("1e7"); // => bool(true)
is_numeric("2.5"); // => bool(true)

So unless you want exponential and half pages, it doesn’t work.

is_int()

Since our query string parameter comes as a string, is_int() won’t work either:

is_int("1"); // => bool(false)

ctype_digit()

I really thought this one would work, until I realized page 1 was broken. When I read the ‘page’ parameter with $page = $request->query->get('page', 1); I can get two different types. If ‘page’ is defined, $page is a string; if ‘page’ is not defined, it defaults to 1, which is an int. And ctype_digit() only recognizes strings of digits:

ctype_digit(1); // => bool(false)
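
The type mismatch is easy to reproduce without Symfony. In this small sketch, getParam() is a hypothetical stand-in for $request->query->get(), backed by a plain array; it is only meant to show the two types, not how Symfony is implemented:

```php
<?php
// Hypothetical stand-in for $request->query->get($key, $default).
function getParam(array $query, string $key, $default)
{
    return $query[$key] ?? $default;
}

$withPage    = getParam(['page' => '2'], 'page', 1); // value comes from the URL
$withoutPage = getParam([], 'page', 1);              // value comes from the default

var_dump(gettype($withPage));    // string(6) "string"
var_dump(gettype($withoutPage)); // string(7) "integer"

// Casting to string first gives ctype_digit() the type it expects.
var_dump(ctype_digit((string) $withoutPage)); // bool(true)
```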

ParameterBag::getInt()

So I realized I was mixing types. Let’s force the page number to an integer with the getInt() method of the query bag ($request->query is a ParameterBag).

$page = $request->query->getInt('page', 1);

What actually happens is that it casts the string to an integer, following the same rules as intval(). Invalid page values are silently converted to integers, for example:

$request = Request::create('/categories/', 'GET', ['page' => '3andveryinvalidvalue']);
$page = $request->query->getInt('page', 1); // => int(3)

So there is no way to throw an exception, since we can no longer tell that the value was invalid.
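
The cast rules can be checked in plain PHP, without Symfony. The sketch below also shows one way to detect that information was lost: cast to int, convert back to a string, and compare with the input. castsCleanly() is a hypothetical helper name, not a Symfony or PHP function:

```php
<?php
// Hypothetical helper: true when casting to int and back returns the same string.
function castsCleanly(string $raw): bool
{
    return (string) (int) $raw === $raw;
}

var_dump((int) '3andveryinvalidvalue'); // int(3): leading digits kept, rest dropped
var_dump((int) 'abc');                  // int(0): no leading digits at all

var_dump(castsCleanly('3'));                    // bool(true)
var_dump(castsCleanly('3andveryinvalidvalue')); // bool(false)
```

This round trip still accepts 0 and negative numbers, so on its own it is not a complete page-number check.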

What actually works

Let’s go back to the original code, and try using regular expressions:

$page = $request->query->get('page', 1);
// \z instead of $: $ would also accept a trailing newline ("2\n")
if (!preg_match('/^[0-9]+\z/', $page)) {
  throw new InvalidArgumentException();
}

It works! Since preg_match() only works on strings, it automatically casts an integer to a string.
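
One detail about the end anchor is easy to miss: in PCRE, $ also matches just before a trailing newline, whereas \z matches only at the very end of the string. The sketch below compares the two on a value such as ?page=2%0A; the variable names are only illustrative:

```php
<?php
// A digit followed by a newline, as sent by ?page=2%0A.
$withNewline = "2\n";

$dollar = preg_match('/^[0-9]+$/', $withNewline);  // $ also matches before a final "\n"
$strict = preg_match('/^[0-9]+\z/', $withNewline); // \z matches only at the very end

var_dump($dollar); // int(1)
var_dump($strict); // int(0)
```

For a strict check, \z (or the D modifier, as in '/^[0-9]+$/D') is the safer choice.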

Another solution is to use ctype_digit(), but we need to make sure that our default parameter is a string:

$page = $request->query->get('page', "1");
if (!ctype_digit($page)) {
  throw new InvalidArgumentException();
}
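
Both checks still accept values such as 0 or 01, which conflicts with the “only one way to display a given page” requirement, since ?page=01 and ?page=1 would show the same content. Here is a minimal sketch of a stricter variant, assuming pages start at 1; parsePageNumber() is a hypothetical helper name, not part of Symfony:

```php
<?php
// Hypothetical helper: turns the raw 'page' value into a positive int or throws.
function parsePageNumber(?string $raw): int
{
    // A missing parameter means page 1.
    if ($raw === null) {
        return 1;
    }

    // Digits only, no leading zero, nothing after the last digit.
    if (preg_match('/\A[1-9][0-9]*\z/', $raw) !== 1) {
        throw new InvalidArgumentException('Invalid page number');
    }

    $page = (int) $raw;

    // Reject digit strings too large for an int: the cast cannot represent them.
    if ((string) $page !== $raw) {
        throw new InvalidArgumentException('Page number out of range');
    }

    return $page;
}

var_dump(parsePageNumber(null)); // int(1)
var_dump(parsePageNumber('12')); // int(12)
```

In a controller it could be called as parsePageNumber($request->query->get('page')), relying on get() returning null when the parameter is absent.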

How to get nice stats from Fastly using Logentries

If you’ve never heard of Fastly, go check it out: https://www.fastly.com

Quickly, what is Fastly? It’s a CDN powered by Varnish. It means that they have a bunch of Varnish servers scattered across the globe, each node being able to cache and deliver your web pages. Why is it useful? Let’s assume that my site is hosted in Montreal, Canada. Now consider a user coming from Sweden. When they request a page from my website, their browser sends a request to Montreal, my server processes the request, does whatever database work is required, renders the page, and sends it back to Sweden. This might take a few hundred milliseconds.

Now what happens with Fastly? Check out the Fastly Network Map: https://www.fastly.com/network

The initial request will be sent to the Stockholm node, which will request the page from our server and cache it locally. Once the page is cached, the Stockholm node can deliver it without the round trip to Canada and without any processing by my server. It’s a lot faster.

OK, so where does Logentries come into play? Logentries is a log management service that has a free plan and is supported by Fastly.

I will assume that you know the basics of Fastly and Logentries. Here’s what you need to do:

  1. If you haven’t already, send an email to Fastly support and ask them to enable custom VCL
  2. Create a new Log set in Logentries, take note of the token, and install the Fastly Community Pack
  3. Create a new Logging endpoint in Fastly (the name is important, we’ll need it later)
  4. Grab the VCL provided by Logentries: https://community.logentries.com/wp-content/uploads/2015/04/le-fastly-pack-sample-VCL.vcl, replace ACCOUNT_KEY with your Fastly service ID, and make sure that the log name (LE) matches the one from step 3
  5. Upload the VCL and publish the configuration

Now you should receive nicely formatted logs in Logentries. The problem is that you receive every log twice, in two different formats. Why is that? When you create a new Logging endpoint, it sends logs to Logentries in the default Fastly format, and our modified VCL now sends a custom log format as well. To fix this annoying issue, we need to silence the default log. This can be done by adding a condition that’s always false to the logging endpoint.

Voilà! Now you have awesome (and unique) logs and analytics in your Logentries account.