writing command line scripts in php: part 1; args, preflighting and more
php has a lot of strengths as a web language, but it is also perfectly serviceable for general-purpose scripting. in this series of posts we’re going to go over building command line scripts in php. this is part one.
why even do this?
- reuse of existing code: if you have a php web application and you wish to write a command line interface to leverage some or all of its functionality, it makes much more sense to use php for your script and re-use your existing code, rather than rewrite everything from scratch in, say, python.
- available skills: if you or your team is long on php skills and short on python or bash, it often makes sense to just play to your strengths.
- the language itself: php is actually a pretty powerful language for command scripting. not only does it provide all the standard features, but it also gives us access to things like handling signals, forking processes and even semaphores if we need it.
assumptions
we’re going to be looking at writing command line scripts for unix-like operating systems. all the examples here were built on ubuntu 20.04 using php 7.4.9.
the flyover
in this installment, we’re going to go over:
- making our php script runnable on the command line
- preflighting our environment to ensure our script runs right
- parsing common command line argument structures, and finally
- giving our script a nice name that ps recognizes
first, make it runnable
traditionally, when running a php script on the command line, you invoke the php command and pass the script as an argument, ie php /path/to/my/script.php . this works, but is ugly.
to fix this, we’ll be putting a shebang at the top of our php script. the shebang is a specially-formatted first line that informs the operating system which interpreter to use when executing the script. if you’ve done any shell scripting, you’re probably familiar with #!/bin/bash . that’s a shebang.
let’s open our sample script ourfancyscript.php in the editor of our choice and add:
#!/usr/bin/env php
one thing that we notice about this shebang is that it calls env . normally, in shell scripting, we reference the direct path to bash with #!/bin/bash . but, here, we’ll be calling env so that the operating system searches the user’s $PATH to find the php interpreter. this is important since, while bash is almost always in /bin , we have no guarantee that we know where the php interpreter lives. using env in our shebang helps increase the portability of our script across different systems. that’s a good thing.
now that we have our shebang, we’ll set our script to be executable. we’ll use the standard permissions set for this:
chmod 755 ourfancyscript.php
once we have the execution permissions set, we can run the script simply by calling it:
./ourfancyscript.php
of course, the script does nothing, but it actually does do that nothing. so that’s progress.
preflight
when we write php for a web application, we generally have a good idea of the environment it will run in. we provisioned the web server, after all.
command line scripts, however, are a different story. we have no control over the environment. it’s someone else’s computer.
with that in mind, it’s a good idea to always start your script with a call to a ‘preflight’ function that confirms the system has everything necessary to run the script. if any of our preflight tests do not pass, we can halt the script with an appropriate error instead of just barging ahead and making a mess.
common things we can check for in our preflight include:
minimum php version: of course we’re writing for the lowest possible php version we can (portability and all!), but we should still check that we meet the minimum version. keep in mind that a three year-old amazon linux ec2 runs php 5.3 by default!
checking the php version is as straightforward as a call to the built-in phpversion() command.
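as a minimal sketch of that check, php’s built-in version_compare() can compare the running version against a minimum for us. the helper name meets_minimum_php and the 5.6.0 threshold are assumptions made up for illustration; the second parameter exists only so the comparison can be tried against other version strings.

```php
<?php

/**
 * return true when $running is at least $minimum.
 * meets_minimum_php() is a hypothetical helper name, not a php built-in.
 */
function meets_minimum_php($minimum, $running = PHP_VERSION)
{
    // version_compare() understands strings like '7.4.9' or '5.6.0'
    // and compares them piece by piece, so 5.10 counts as newer than 5.6
    return version_compare($running, $minimum, '>=');
}

if (!meets_minimum_php('5.6.0')) {
    // write the error to stderr and leave with a non-zero exit status
    fwrite(STDERR, "minimum php required is 5.6. exiting\n");
    exit(1);
}
```

phpversion() returns the same string that the PHP_VERSION constant holds, so either can be fed to the comparison.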
necessary extensions are loaded: if our script calls for an extension, we should make sure that it’s actually loaded before we start. we can do this with the built-in command extension_loaded(). so, for instance, if we want to confirm php has ‘imagick’ available, we could test that in our ‘preflight’ function like so:
if (!extension_loaded('imagick')) {
    die('imagick extension required. exiting.');
}
file access: we may want to read files or write to files or directories. it’s a good idea to confirm that the user running our script has permissions to do that before starting. php has a number of built-in commands to accomplish this:
- file_exists to confirm if the file exists
- is_dir to determine if the file is a directory
- is_writable() to test if the user has write access to the file or directory
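the three checks above can be chained into one small helper. this is only a sketch: check_writable_dir is a hypothetical name, and sys_get_temp_dir() is used purely so the example runs on any machine.

```php
<?php

/**
 * check that $path is an existing directory we can write to.
 * returns an error message, or null when everything looks fine.
 * check_writable_dir() is a hypothetical helper, not a php built-in.
 */
function check_writable_dir($path)
{
    if (!file_exists($path)) {
        return "$path does not exist";
    }
    if (!is_dir($path)) {
        return "$path is not a directory";
    }
    if (!is_writable($path)) {
        return "$path is not writable by the current user";
    }
    return null;
}

// sys_get_temp_dir() is an assumption so the example runs anywhere
$error = check_writable_dir(sys_get_temp_dir());
if ($error !== null) {
    fwrite(STDERR, $error . ". exiting.\n");
    exit(1);
}
```

returning a message instead of calling die() inside the helper keeps the decision about how to fail in one place.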
let’s put all of that together into a short sample preflight function:
#!/usr/bin/env php
<?php

/**
 * Confirm system can run script
 */
function preflight()
{
    // build a two-digit number from the major and minor version, e.g. 7.4 becomes 74
    $phpversion_array = explode('.', phpversion());
    if ((int)($phpversion_array[0].$phpversion_array[1]) < 56) {
        die('minimum php required is 5.6. exiting');
    }
    if (!extension_loaded('posix')) {
        die('posix required. exiting');
    }
    if (!is_writable('/tmp')) {
        die('must be able to write to /tmp to continue. exiting.');
    }
    // look up the running user's home directory instead of relying on ~/
    if (!file_exists(posix_getpwuid(posix_getuid())['dir'].'/.aws/credentials')) {
        die('an aws credentials file is required. exiting');
    }
}
in the first if block of this function we check the php version is at least 5.6. we do a little clumsy casting here to accomplish this as we only care about the major and minor numbers.
next, we confirm that the posix extension is loaded. posix is basically a set of standard ways to interface with the host operating system. definitely something we will want for our script.
we then do a fast confirmation that we can write to the /tmp directory and, finally, determine that the user has an aws credentials file.
one thing to note is in the last if block, we used a couple of those posix commands to get the running user’s home directory. the ‘~/’ construction will not work here. instead, we get the user’s id number with posix_getuid() and then pass that to posix_getpwuid() to get an array of information about the user, including their home directory, from the /etc/passwd file.
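here is a minimal sketch of that home directory lookup pulled out into its own function. current_user_home is a hypothetical name, and the fallback to the HOME environment variable is an assumption of this sketch, not something the script above does.

```php
<?php

/**
 * best-effort lookup of the running user's home directory.
 * current_user_home() is a hypothetical helper name.
 */
function current_user_home()
{
    // preferred: ask the system's user database via the posix extension
    if (function_exists('posix_getpwuid') && function_exists('posix_getuid')) {
        $info = posix_getpwuid(posix_getuid());
        if (is_array($info) && !empty($info['dir'])) {
            return $info['dir'];
        }
    }
    // fallback (an assumption of this sketch): the HOME environment variable
    $home = getenv('HOME');
    return ($home !== false && $home !== '') ? $home : null;
}

$home = current_user_home();
// build the credentials path only when a home directory was found
$credentials = $home === null ? null : $home . '/.aws/credentials';
```

posix_getpwuid() returns false when the lookup fails, which is why the result is checked before the 'dir' key is read.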
parse command line arguments
handling command line arguments and switches is something most scripts need to do, so we’re going to write a short function that takes the arguments passed to our script and processes them into an array that we can reference later when determining what functionality to provide.
this function handles four basic types of arguments:
switches
these are single letter arguments that are preceded by a dash, think the -a argument to ls to show hidden files.
long switches
these are the same as switches except. longer. an example would be curl accepting --silent as a synonym for -s . long switches are preceded by two dashes.
assignments
this is for passing data into our script. assignment arguments take two dashes and use an equal sign to indicate the value, ie. --outfile=/path/to/file or mysql’s horrifying --password=mynothiddenpassword .
positional arguments
these are arguments without any preceding dashes; their usage is determined entirely by their position. think the linux ‘move’ command mv /path/to/origin /path/to/destination . there are two positional arguments here, and we know which value is assigned to the origin and which to the destination by the order they are written in.
with that in mind, we can add this function to our command line scripts to parse arguments for us:
#!/usr/bin/env php
<?php

/**
 * Parses command line args and returns array of args and their values
 *
 * @param Array $args The array from $argv
 * @return Array
 */
function parseargs($args)
{
    $parsed_args = [];
    // drop the script name
    $args = array_slice($args, 1);
    for ($i = 0; $i < count($args); $i++) {
        // count the dashes in the first two characters; substr() keeps
        // one-character arguments from tripping substr_count()'s length check
        switch (substr_count(substr($args[$i], 0, 2), "-")) {
            case 1:
                // switches: -a, or bundled like -vv; repeats are counted
                foreach (str_split(ltrim($args[$i], "-")) as $a) {
                    $parsed_args[$a] = isset($parsed_args[$a]) ? $parsed_args[$a] + 1 : 1;
                }
                break;
            case 2:
                // long switches (--silent) and assignments (--outfile=/path)
                $parsed_args[ltrim(preg_replace("/=.*/", '', $args[$i]), '-')] = strpos($args[$i], '=') !== false ? substr($args[$i], strpos($args[$i], '=') + 1) : 1;
                break;
            default:
                $parsed_args['positional'][] = $args[$i];
        }
    }
    return $parsed_args;
}
we can then call that function in our script with
$our_parsed_args = parseargs($argv);
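to show what we get back, here is a sketch of reading values out of that array. the array literal is written out by hand so the example runs on its own; it is what parseargs() should produce for the sample command in the comment. arg_value, and the option names checked here, are hypothetical and only for illustration.

```php
<?php

// what parseargs() would return for:
//   ./ourfancyscript.php -vv --silent --outfile=/tmp/out.txt input.txt
$our_parsed_args = [
    'v'          => 2,
    'silent'     => 1,
    'outfile'    => '/tmp/out.txt',
    'positional' => ['input.txt'],
];

/**
 * return the value of the first name found in the parsed args, else $default.
 * arg_value() is a hypothetical helper, not part of the script above.
 */
function arg_value(array $parsed, array $names, $default = null)
{
    foreach ($names as $name) {
        if (isset($parsed[$name])) {
            return $parsed[$name];
        }
    }
    return $default;
}

// accept either the short or the long spelling of each option
$silent    = (bool) arg_value($our_parsed_args, ['s', 'silent'], false);
$verbosity = (int) arg_value($our_parsed_args, ['v', 'verbose'], 0);
$outfile   = arg_value($our_parsed_args, ['o', 'outfile'], '/dev/stdout');
$inputs    = arg_value($our_parsed_args, ['positional'], []);
```

because repeated switches are counted, -vv arrives as a verbosity of 2.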
Passing arguments to php script
To experiment on performance of pass-by-reference and pass-by-value, I used this script. Conclusions are below.
#!/usr/bin/php
<?php
function sum($array, $max) {    // For Reference, use: "&$array"
    $sum = 0;
    for ($i = 0; $i < 2; $i++) {
        #$array[$i]++;          // Uncomment this line to modify the array within the function.
        $sum += $array[$i];
    }
    return ($sum);
}

$max = 1E7;                     // 10 M data points.
$data = range(0, $max, 1);

$start = microtime(true);
for ($x = 0; $x < 100; $x++) {
    $sum = sum($data, $max);
}
$end = microtime(true);
echo "Time: " . ($end - $start) . " s\n";

/* Run times:
#   PASS BY     MODIFIED?   Time
-   -------     ---------   ----
1   value       no          56 us
2   reference   no          58 us
3   value       yes         129 s
4   reference   yes         66 us

1. PHP is already smart about zero-copy / copy-on-write. A function call does NOT copy the data unless it needs to; the data is
only copied on write. That’s why #1 and #2 take similar times, whereas #3 takes 2 million times longer than #4.
[You never need to use &$array to ask the compiler to do a zero-copy optimisation; it can work that out for itself.]
2. You do use &$array to tell the compiler "it is OK for the function to over-write my argument in place, I don’t need the original
any more." This can make a huge difference to performance when we have large amounts of memory to copy.
(This is the only way it is done in C, arrays are always passed as pointers)
3. The other use of & is as a way to specify where data should be *returned*. (e.g. as used by exec() ).
(This is a C-like way of passing pointers for outputs, whereas PHP functions normally return complex types, or multiple answers
in an array)
4. Sometimes, pass by reference could be at the choice of the caller, NOT the function definition. PHP doesn’t allow it, but it
would be meaningful for the caller to decide to pass data in as a reference. i.e. "I’m done with the variable, it’s OK to stomp
on it in memory".
*/
?>
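The semantics behind conclusions 2 and 3 can be sketched in a few lines. This is an illustration only, not a benchmark; bump_copy and bump_in_place are hypothetical names.

```php
<?php

// By value: the function works on its own copy once it writes to the array.
function bump_copy(array $values)
{
    foreach ($values as $i => $v) {
        $values[$i] = $v + 1;
    }
    return $values;
}

// By reference: the caller's array is modified in place, nothing is returned.
function bump_in_place(array &$values)
{
    foreach ($values as $i => $v) {
        $values[$i] = $v + 1;
    }
}

$data = [1, 2, 3];
$copy = bump_copy($data);   // $data is still [1, 2, 3]
bump_in_place($data);       // $data is now [2, 3, 4]

// & as an output parameter: preg_match() fills $matches for us.
preg_match('/(\d+)\.(\d+)/', '7.4.9', $matches);
// $matches is ['7.4', '7', '4']
```

Note that the & lives in the function definition. The call site looks the same in both cases.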