The Dude abides.

Posted
20 August 2007

Tagged
Free Software
Grokking Material
PHP
Technology

Identify your files with confidence!

In a web application, allowing users to upload files is always a risky maneuver. One way I have heard of to reduce the chance that a user uploads invalid data is to verify the file type. This is of dubious usefulness as a security measure, since a sufficiently crafty attacker will do crazy things like embed PHP code in an uploaded image file and own your box before you can say boo. Verifying file types is more useful when the user makes an honest mistake and uploads the wrong file, so the software can display a carefully crafted, gentle yet firm warning to upload the right one.

Anyway, RFC 1867 says that your browser should send a Content-Type field indicating the type of the uploaded content, but which programmer worth his/her salt would trust that field, given how badly browsers have abused it? Instead, the pecl/fileinfo package can come in handy. It’s basically an analogue of the trusty file command on *nix systems: it looks at the contents of the file itself rather than believing what the client says.

pecl/fileinfo is easy enough to use:

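   $filename = "odfrocks.odt";
   // FILEINFO_MIME asks for a MIME type instead of a textual description
   $finfo = finfo_open(FILEINFO_MIME);
   // Print the type detected from the file's contents
   echo finfo_file($finfo, $filename);
   finfo_close($finfo);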

WACT’s MIME type validation rule actually checks $_FILES["filename"]["type"], which is just whatever the browser claimed, so we patched that to use pecl/fileinfo instead (doing this is left as a trivial exercise for the reader). It costs a minor performance hit, but it is very useful nonetheless, as we had broken browsers returning wrong Content-Type information and messing up the all-important user experience.
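For anyone who would rather not start the exercise from a blank page, here is a rough sketch of the idea. To be clear, this is not WACT's actual rule code: detect_mime_type() and check_upload() are hypothetical helper names I made up for illustration, and the sketch assumes a reasonably recent PHP where the fileinfo extension is bundled, using its object-oriented interface and the FILEINFO_MIME_TYPE flag (which returns just the type, without the charset that FILEINFO_MIME may append).

```php
<?php
// Hypothetical helper: returns the MIME type detected from the file's
// contents, or null if the file cannot be read or no type is detected.
function detect_mime_type(string $path): ?string
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }
    // Object-oriented interface of the same fileinfo extension.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type = $finfo->file($path);
    return $type === false ? null : $type;
}

// Hypothetical validator: takes one entry of $_FILES and a list of
// allowed MIME types. Returns null when the upload looks fine, or a
// gentle yet firm message otherwise. Note that the browser-supplied
// $upload['type'] is deliberately never consulted.
function check_upload(array $upload, array $allowedTypes): ?string
{
    if (($upload['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return "The upload did not complete. Please try again.";
    }
    // In a real request handler you would probably also want to call
    // is_uploaded_file($upload['tmp_name']) before touching the file.
    $type = detect_mime_type($upload['tmp_name']);
    if ($type === null || !in_array($type, $allowedTypes, true)) {
        return "That does not look like the right kind of file. "
             . "Please upload one of: " . implode(", ", $allowedTypes);
    }
    return null;
}
```

In a form handler this might be called as check_upload($_FILES["filename"], array("application/pdf")), with the returned message (if any) shown back to the user. As noted at the top of this post, treat it as a usability check for honest mistakes, not as a security boundary.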

In any case, this method works well enough, except in cases where the magic file itself is b0rken. I noticed this in April 2007 on Fedora Core 6, where the magic file had the wrong values for Microsoft Office documents and was reporting what were clearly Microsoft Word files as Microsoft Installer files. So I filed a bug on the Red Hat Bugzilla and lo and behold, four months later, the bug is fixed! :)

