PHP Input Filtering
PHP is an easy language to pick up, and developers are comfortable working with it. It provides functions to sanitize and validate outside user input. These functions are in the PHP filter extension.
This extension is enabled by default as of PHP version 5.2.0. It can be configured explicitly in the PHP configuration file. Outside users can send their input in many ways. For example, they can post input via an HTML form, send API parameters via REST clients, and more.
These inputs have to be sanitized and validated before processing them. We have seen the example of doing client-side validation.
The PHP filter extension provides functions and filter constants to validate different types of input. The filter_list() function returns an array of the filters supported by this extension.
These filters remove unexpected data from the user input and validate its format after sanitization. In this post, we are going to see the list of filter functions and their use. I have also added example code for validating a username and email using the PHP filter functions. The name and email data are posted via an HTML form.
PHP Filter Functions
The following filter functions are available in the PHP filter extension.
filter_has_var()
This function checks whether a variable of the specified input type exists.
filter_has_var(int $input_type, string $var_name): bool
The possible values of $input_type are INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV. The $var_name is the name of the variable to check.
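As a minimal sketch of the check described above, the snippet below tests for a POST field before doing anything else with it. The field name userEmail is only an assumption for illustration.

```php
<?php
// Hypothetical field name; adjust it to match your own form.
$field = 'userEmail';

// filter_has_var() looks at the data received with the request.
if (filter_has_var(INPUT_POST, $field)) {
    echo "The request contains a POST field named {$field}.\n";
} else {
    echo "No POST field named {$field} was sent.\n";
}
```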
filter_id()
This function returns the id of the filter with the specified $filter_name.
filter_id(string $filter_name);
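One way to see how filter_id() and filter_list() fit together is to build a name-to-id map, as in the sketch below. The variable name $filterIds is illustrative only.

```php
<?php
// Build a name => id map of every filter the extension reports.
$filterIds = [];
foreach (filter_list() as $filterName) {
    $filterIds[$filterName] = filter_id($filterName);
}

// The id returned for a name matches the corresponding FILTER_* constant.
var_dump($filterIds['validate_email'] === FILTER_VALIDATE_EMAIL);

// An unknown filter name yields false instead of an id.
var_dump(filter_id('no_such_filter'));
```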
filter_input_array()
It accepts an input type and an array of filter definitions, and validates several input variables in one call.
filter_input_array ( int $type, mixed $definition, bool $add_empty = TRUE )
The $definition and $add_empty arguments are optional. $add_empty is TRUE by default, so keys missing from the input are added to the result as NULL.
filter_input()
The PHP filter_input function validates a single input variable with one filter id, instead of the array of definitions used by filter_input_array.
filter_input ( int $type, string $variable_name, int $filter = FILTER_DEFAULT, mixed $options );
filter_list()
As mentioned in the introduction, this function returns the list of all filters supported by this extension.
filter_var_array()
This function accepts an array of input data, an array of filter definitions, and the add_empty flag, and validates the input array against those definitions.
filter_var_array ( array $data , mixed $definition , bool $add_empty = TRUE )
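The effect of the add_empty flag is easiest to see with a small example. The sketch below uses made-up sample data and a definition that names one key the data does not contain.

```php
<?php
// Sample data standing in for a submitted form (hypothetical values).
$data = [
    'name' => 'Ann',
    'age'  => '42',
];

// Each key maps to a filter id, or to an array with 'filter' and 'options'.
$definition = [
    'name'  => FILTER_UNSAFE_RAW,
    'age'   => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 150],
    ],
    'email' => FILTER_VALIDATE_EMAIL, // not present in $data
];

// Default ($add_empty = true): the missing 'email' key is added as null.
$withEmpty = filter_var_array($data, $definition);

// $add_empty = false: keys missing from $data are left out of the result.
$withoutEmpty = filter_var_array($data, $definition, false);

var_dump($withEmpty, $withoutEmpty);
```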
filter_var()
It takes a single value and a filter id, and returns the filtered value.
filter_var ( $variable , int $filter = FILTER_DEFAULT , mixed $options )
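A short sketch of filter_var() with two common validation filters follows. The sample values are invented for illustration.

```php
<?php
// A valid address is returned unchanged; an invalid one yields false.
$good = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$bad  = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);

// Validation filters also convert types: the string '42' becomes int 42.
$number = filter_var('42', FILTER_VALIDATE_INT);

var_dump($good, $bad, $number);
```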
PHP Example: Validating HTML Form Posts Using Filter Functions
In this example, I have created an HTML form to let the user enter their name and email address. On submitting this form, the input data are posted to the PHP file. In the PHP code, the $_POST data are sanitized and validated using the PHP filter function filter_var.
HTML Code to Display the Form and Client-Side Validation Script
The code below shows the HTML for displaying the form with the user name and email address fields. On submitting this form, the fnSubscribe() JavaScript function is called to do the not-empty check for the form fields on the client side.
function fnSubscribe() {
    if (document.frmSubscription.userName.value == "") {
        return false;
    }
    if (document.frmSubscription.userEmail.value == "") {
        return false;
    }
    return true;
}
PHP Filter Sanitization and Validation Code
The following PHP code applies sanitization and validation filters to the form post data. After filtering the user input posted via the HTML form, I have created the INSERT query to add the user data to the database.
I used the FILTER_SANITIZE_STRING and FILTER_SANITIZE_EMAIL filters to sanitize the username and email data respectively. Then I used FILTER_VALIDATE_EMAIL to check that the email data is in a valid format. Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.
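A minimal sketch of the sanitize-then-validate step described above might look like this. The function name filterSubscription is hypothetical, the field names userName and userEmail are taken from the client-side script, and FILTER_SANITIZE_FULL_SPECIAL_CHARS is swapped in for the deprecated FILTER_SANITIZE_STRING as an assumption. The sketch stops before the INSERT query.

```php
<?php
/**
 * Sanitize and validate the posted name and email.
 * Returns the cleaned values, or null when either field is unusable.
 */
function filterSubscription(array $post): ?array
{
    // Accept only string input; anything else is treated as empty.
    $rawName  = is_string($post['userName'] ?? null) ? trim($post['userName']) : '';
    $rawEmail = is_string($post['userEmail'] ?? null) ? $post['userEmail'] : '';

    // Assumption: HTML-encode the name instead of using the deprecated
    // FILTER_SANITIZE_STRING filter.
    $name = filter_var($rawName, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

    // Strip characters that cannot appear in an address, then check the format.
    $email = filter_var($rawEmail, FILTER_SANITIZE_EMAIL);
    $email = filter_var($email, FILTER_VALIDATE_EMAIL);

    if ($name === '' || $name === false || $email === false) {
        return null;
    }
    return ['name' => $name, 'email' => $email];
}

// Usage with hypothetical form data.
var_dump(filterSubscription(['userName' => ' Ann ', 'userEmail' => ' ann@example.com ']));
```

Passing the posted array in as a parameter, rather than reading $_POST inside the function, keeps the sketch easy to try from the command line.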
filter_input
Parameters
type: One of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV.
var_name: Name of a variable to get.
filter: The ID of the filter to apply. The Types of filters manual page lists the available filters. If omitted, FILTER_DEFAULT will be used, which is equivalent to FILTER_UNSAFE_RAW. This will result in no filtering taking place by default.
options: Associative array of options or bitwise disjunction of flags. If filter accepts options, flags can be provided in "flags" field of array.
Return Values
Value of the requested variable on success, false if the filter fails, or null if the var_name variable is not set. If the flag FILTER_NULL_ON_FAILURE is used, it returns false if the variable is not set and null if the filter fails.
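Because the function has three distinct outcomes, callers usually need strict comparisons to tell them apart. The helper below is a hypothetical sketch (describeFilterResult is not part of PHP). It assumes FILTER_NULL_ON_FAILURE is not used and does not suit FILTER_VALIDATE_BOOLEAN, where false can be a legitimate value.

```php
<?php
/**
 * Turn a filter_input() result into a readable status.
 * Assumes the call did not use FILTER_NULL_ON_FAILURE, which swaps
 * the meaning of null and false.
 */
function describeFilterResult($result): string
{
    if ($result === null) {
        return 'not set';
    }
    if ($result === false) {
        return 'invalid';
    }
    return 'ok';
}

// Hypothetical query parameter named "page".
$page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT);
echo describeFilterResult($page), "\n";
```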
See Also
- filter_var() — Filters a variable with a specified filter
- filter_input_array() — Gets external variables and optionally filters them
- filter_var_array() — Gets multiple variables and optionally filters them
- Types of filters
User Contributed Notes 15 notes
This function provides us with an extremely simple solution for type filtering.
Without this function:
<?php
if (!isset($_GET['a'])) {
    $a = null;
} elseif (!is_string($_GET['a'])) {
    $a = false;
} else {
    $a = $_GET['a'];
}
$b = isset($_GET['b']) && is_string($_GET['b']) ? $_GET['b'] : '';
?>
With this function:
<?php
$a = filter_input(INPUT_GET, 'a');
$b = (string) filter_input(INPUT_GET, 'b');
?>
Yes, FILTER_REQUIRE_SCALAR seems to be set as a default option.
It’s very helpful for eliminating E_NOTICE, E_WARNING and E_ERROR.
This fact should be documented.
If your $_POST contains an array value:
<?php
$_POST = array(
    'var' => array('more', 'than', 'one', 'values')
);
?>
you should use the FILTER_REQUIRE_ARRAY option:
<?php
var_dump(filter_input(INPUT_POST, 'var', FILTER_DEFAULT, FILTER_REQUIRE_ARRAY));
?>
Otherwise it returns false.
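The same behavior can be tried from the command line with filter_var(), which takes the value directly. This is only a sketch, with $values standing in for $_POST['var'] from the note above.

```php
<?php
// Stand-in for $_POST['var'].
$values = ['more', 'than', 'one', 'values'];

// Without an array flag, an array input fails the scalar-only filter.
$scalarOnly = filter_var($values, FILTER_DEFAULT);

// With FILTER_REQUIRE_ARRAY, the filter is applied to each element.
$asArray = filter_var($values, FILTER_DEFAULT, FILTER_REQUIRE_ARRAY);

var_dump($scalarOnly, $asArray);
```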
FastCGI seems to cause strange side-effects with unexpected null values when using INPUT_SERVER and INPUT_ENV with this function. You can use this code to see if it affects your server:
<?php
var_dump($_SERVER);
foreach (array_keys($_SERVER) as $b) {
    var_dump($b, filter_input(INPUT_SERVER, $b));
}
echo "\n";
var_dump($_ENV);
foreach (array_keys($_ENV) as $b) {
    var_dump($b, filter_input(INPUT_ENV, $b));
}
?>
If you want to be on the safe side, using the superglobal $_SERVER and $_ENV variables will always work. You can still use the filter_* functions for Get/Post/Cookie without a problem, which is the important part!
Note that this function doesn’t (or at least doesn’t seem to) actually filter based on the current values of $_GET etc. Instead, it seems to filter based off the original values.
$_GET['search'] = 'foo'; // This has no effect on the filter_input
Here is an example of how to work with the options parameter. Notice the 'options' key inside the options parameter!
<?php
$options = array('options' => array('default' => 5, 'min_range' => 0, 'max_range' => 9));
$priority = filter_input(INPUT_GET, 'priority', FILTER_VALIDATE_INT, $options);
?>
$priority will be 5 if the priority parameter isn't set or is outside the given range.
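To try the nested 'options' array without a web request, the same array can be passed to filter_var(). The sample inputs below are made up.

```php
<?php
$options = ['options' => ['default' => 5, 'min_range' => 0, 'max_range' => 9]];

// A value inside the range is returned as an integer.
$inRange = filter_var('7', FILTER_VALIDATE_INT, $options);

// Out-of-range and non-numeric values fall back to the default.
$outOfRange = filter_var('30', FILTER_VALIDATE_INT, $options);
$notANumber = filter_var('asdf', FILTER_VALIDATE_INT, $options);

var_dump($inRange, $outOfRange, $notANumber);
```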
To use a class method for a callback function, as usual, provide an array with an instance of the class and the method name.
Example:
<?php
class myValidator
{
    public function username($value)
    {
        // return username or boolean false
    }
}

$myValidator = new myValidator;
$options = array('options' => array($myValidator, 'username'));
$username = filter_input(INPUT_GET, 'username', FILTER_CALLBACK, $options);
var_dump($username);
?>
I wouldn't recommend using this function to prepare data for storage in a database. It's best not to encode data when storing it; it's better to store it raw and convert it at the time of need.
One main reason for this is because if you have a short CHAR(16) field and the text contains encoded characters (quotes, ampersand) you can easily take a 12 character entry which obviously fits, but because of encoding it no longer fits.
Also, while not as common, if you need to use this data in another place, such as a non webpage (perhaps in a desktop app, or to a cell phone SMS or to a pager) the HTML encoded data will appear raw, and now you have to decode the data.
In summary, the best way to architect your system is to store data raw and encode it only at the moment you need to. So in your PHP, after running a SQL query, instead of merely doing an echo $row['title'] you need to run htmlentities() on your echos, or better yet, use an abstract function.
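A minimal sketch of such an "abstract function" might look like this. The name escapeHtml is hypothetical, and htmlspecialchars() is swapped in for htmlentities() as an assumption, since it encodes only the characters that are special in HTML.

```php
<?php
/** Encode a raw value for HTML output at the moment it is echoed. */
function escapeHtml(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// The stored value stays raw; only the displayed copy is encoded.
$title = 'Tom & "Jerry"';
echo escapeHtml($title), "\n";

// The encoded form is longer than the raw one, which is why it may
// no longer fit a short fixed-width column.
var_dump(strlen(escapeHtml($title)) > strlen($title));
```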
The beauty of using this instead of directly using filter_var( $_GET['search'] ) is that you don't need to check if( isset( $_GET['search'] ) ), because if you pass that to filter_var and the key is not set, it will result in a warning. This function simplifies this and will return the relevant result to you (as per your options set) if the key has not been set in the user input.
If the type of filter you are using also supports a 'default' argument, then this function will also fill your missing input key with that value, again saving you effort.
In fastcgi sapi implementations, filter_input(INPUT_SERVER) can return empty results.
In my case (8.1.9 64bit php-cgi) it was caused by auto_globals_jit enabled . When disabled (in php.ini on php startup), filter_input(INPUT_SERVER) works correctly.
php-fpm sapi isn’t affected.
Contrary to what is stated in the comments here on how to use the options for filters, there is no range option or default. In fact, there are hardly any options at all. It is not mentioned anywhere in the manual, and the code provided in that comment does nothing with php-5.4.4.
<?php
get(GET, 'p', FILTER_VALIDATE_INT, array('options' => array('default' => 5, 'min_range' => 0, 'max_range' => 9)));
// ?p=30 => 30
// ?p="123" => 123
// ?p=-23 => -23
// ?p=asdf => null
?>
Note how to set up a default filter for filter_var_array.
When I used filter_var_array without listing every array index in the definition, the unlisted values were filtered with some filter and came out broken. Using this tip corrected everything:
<?php
// create_function() was removed in PHP 8.0; a closure does the same job.
$def = array_map(function () { return array('filter' => FILTER_UNSAFE_RAW); }, $input);
?>
Discovered interesting behavior when modifying super-globals directly.
$_GET['p'] = 1;
filter_input(INPUT_GET, 'p'); // value is NULL
It's worth noting that the names for variables in filter input obey the same rules as variable naming in PHP (must start with an underscore or letter). We were allowing users to build custom forms but hashing the names to prevent them from putting arbitrary content into the DOM. It turns out the hash function occasionally produced entirely numeric values for the field name, which doesn't work with filter_input but worked fine if you read directly from $_GET, $_POST, or $_REQUEST. A workaround is to always prefix an underscore to the field name.
filter_input() does not seem to support multiple values for a single variable name.
Here is the code comparing the behavior of the bare $_GET superglobal vs filter_input(INPUT_GET, ...):
<?php
print("Bare \$_GET:\n");
var_dump($_GET);
print("filter_input():\n");
var_dump(filter_input(INPUT_GET, "var"));
?>
When calling: /.../script.php?var=123 (there is only one value for variable 'var')
Output is:
Bare $_GET:
array(1) {
  ["var"]=>
  string(3) "123"
}
filter_input():
string(3) "123"
When calling: /.../script.php?var[]=123&var[]=999 (there are two values for variable 'var')
Output is:
Bare $_GET:
array(1) {
  ["var"]=>
  array(2) {
    [0]=>
    string(3) "123"
    [1]=>
    string(3) "999"
  }
}
filter_input():
bool(false)
As expected, $_GET[‘var’] became an array. But filter_input() seems to be unable to process multiple values and returns false.
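The array flags appear to cover this case. The sketch below uses filter_var() with stand-in values so it can run without a web request; the same flags can be passed to filter_input() as its fourth argument.

```php
<?php
// Stand-ins for ?var[]=123&var[]=999 and ?var=123.
$many = ['123', '999'];
$one  = '123';

// FILTER_REQUIRE_ARRAY validates each element of an array input.
$ints = filter_var($many, FILTER_VALIDATE_INT, FILTER_REQUIRE_ARRAY);

// FILTER_FORCE_ARRAY accepts either shape and always returns an array.
$forcedMany = filter_var($many, FILTER_VALIDATE_INT, FILTER_FORCE_ARRAY);
$forcedOne  = filter_var($one, FILTER_VALIDATE_INT, FILTER_FORCE_ARRAY);

var_dump($ints, $forcedMany, $forcedOne);
```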
1. The description of the options parameter is misleading. In order to pass the options (e.g. default, min_range and max_range) you must pass an associative array with a key called "options", which itself is an associative array containing option name => option value pairs.
2. The return values section does not mention that if you specify the "default" option then the function will return the specified default value instead of returning FALSE or NULL (when filter fails or variable is absent).