PHP clusters simplified with AWS EFS

One of the main challenges I used to have with hosting my site and applications on AWS was file synchronization between the instances running my application. Whether it was a PHP cluster that needs media files available on all nodes, or a simple admin UI that doesn’t run on the front-end servers and must sync uploaded media, it was always a real challenge to get the level of service, availability, and redundancy I’m used to from other AWS services when dealing with file sharing.

AWS has been working for a long time to solve this problem, and they’ve been rolling out a new service to production called EFS (Elastic File System). EFS is a new file storage service, available on AWS for EC2 instances. It is accessed over the NFSv4 protocol and grows and shrinks as you add and remove files. Once mounted, EFS can be used transparently, like a regular file system on the server.

PHP clusters challenge

Managing PHP clusters can be a difficult task, especially when you store data in files that must be shared between cluster nodes. In this scenario, the options used to be syncing files periodically between servers, or setting up and managing some sort of file server (which itself has to be clustered if you need an HA solution).

Let’s take, for example, a very popular web application written in PHP: WordPress. Although WordPress saves most of its data in a database, there’s also static content that’s saved locally in files. In addition to static files that can be uploaded by users, there are plugins and themes that are part of the local file system as well.

If we set up a cluster of two or more PHP servers that run one WordPress application, there is a problem with keeping these files in sync. There are several ways to solve this and we’re not going to look into all of them (although this is a good idea for a separate post), but let’s see how EFS solves this problem easily and in a scalable way.

EFS – The out of the box solution

EFS is a file storage service that can be mounted as a regular NFS filesystem. On Linux, such a filesystem can be mounted on multiple servers simultaneously. So, for our example WordPress application, we could mount that filesystem on every node and store the /wp-content folder on it (the folder WordPress uses to store static content, themes, plugins, etc.).

As stated earlier, EFS is available using the standard NFSv4 protocol. This means that mounting EFS on Linux is done using a simple mount command. The only prerequisite is installing the NFS client package, which is available in almost any Linux distribution.
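As a rough illustration of that mount step, here is a minimal sketch. The file system ID, region, and mount point are hypothetical placeholders, and the DNS name pattern is an assumption you should check against the value shown for your own file system. The script only prints the command so it can be reviewed before being run with root privileges.

```shell
#!/bin/sh
# Placeholders (assumptions, not values from this post): replace with your own.
EFS_ID="fs-12345678"        # hypothetical file system ID
AWS_REGION="us-west-2"      # hypothetical region
MOUNT_POINT="/mnt/efs"      # hypothetical mount point; it must already exist

# Assumed DNS name pattern; confirm the real name for your file system.
EFS_HOST="${EFS_ID}.efs.${AWS_REGION}.amazonaws.com"

# Prerequisite: the NFS client package, typically nfs-common on Debian/Ubuntu
# or nfs-utils on RHEL-style distributions.

# Print the mount command instead of running it, so it can be reviewed first.
efs_mount_command() {
    printf 'mount -t nfs4 %s:/ %s\n' "$EFS_HOST" "$MOUNT_POINT"
}

efs_mount_command
```

To actually mount, run the printed command with sudo on each cluster node. Mount options such as read/write sizes and timeouts are left out here and are worth tuning for production use.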

Since mounting is done using the standard mount command, we can automate it, so that it survives instance restarts, by adding the relevant entry to our /etc/fstab file.
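A sketch of what such an /etc/fstab entry could look like follows. The host name and mount point are hypothetical placeholders, and the options shown are a conservative assumption rather than a tuned recommendation. The script prints the line rather than editing /etc/fstab directly.

```shell
#!/bin/sh
# Placeholders (assumptions, not values from this post): replace with your own.
EFS_HOST="fs-12345678.efs.us-west-2.amazonaws.com"   # hypothetical EFS DNS name
MOUNT_POINT="/mnt/efs"                               # hypothetical mount point

# fstab fields: device, mount point, type, options, dump, fsck pass.
# _netdev marks this as a network filesystem, so mounting waits for the network.
efs_fstab_entry() {
    printf '%s:/ %s nfs4 defaults,_netdev 0 0\n' "$EFS_HOST" "$MOUNT_POINT"
}

efs_fstab_entry
```

After reviewing the printed line, it can be appended with `efs_fstab_entry | sudo tee -a /etc/fstab`, and `sudo mount -a` will then mount it without a reboot, which is a quick way to confirm the entry is valid.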

Performance

During the EFS preview, we were able to test the performance of EFS and compare the following scenarios:

  1. Single PHP server, serving WordPress with all its content from a “local” disk
  2. Single PHP server, with static content offloaded to S3
  3. Cluster of two PHP servers, with static content offloaded to S3
  4. Cluster of two PHP servers, with static content stored on EFS and mounted on each instance

In our tests, we used Zend Server 8.5 with PHP 5.6 and used its deployment mechanism to simplify the creation and management of PHP clusters, as well as application deployment (especially on a cluster).

Our test results showed that a cluster of two Zend Server nodes with static content stored on EFS performed better than the alternative of storing files on S3. They also showed that under a simulated load of 50 concurrent users, the results were much more consistent: response times didn’t vary as much as with the other setups.

Conclusion

EFS is the best solution AWS has to offer to share files between instances and nodes of clusters. It scales very well and provides a lot of value for PHP web applications. This is the “missing piece” we were looking for when building PHP clusters.

Follow this blog for additional posts on how to use EFS with your PHP apps and how to automate and scale your application using Zend Server on AWS.

    Boaz Ziniman

    Sr. Director, Cloud Strategy at Rogue Wave Software Inc.
      Dima Zbarski

      Cloud Integration Engineer at Zend Technologies
      • James Dunmore

        I can’t find the link now, but S3 is worse than Apache at serving content. Amazon don’t really care about that, because for load/performance it should be fronted with CloudFront.

        Can you re-run these tests with CF as the end point, not S3?

        (very interesting article though)

        Thanks.

        • Boaz Ziniman

          Hi James,

          We saw exactly what you described, but I don’t think adding CF to the mix is a fair comparison. I’m 100% sure it will be faster, but more expensive as well. You can use any CDN with EFS as well, and it will give you an additional performance boost, same as CF.

          • James Dunmore

            That’s not true on two counts:

            Data out of cloudfront is not significantly more than S3 (and S3 to cloudfront is free) – actually CF is cheaper at scale.

            Secondly, performance of S3 to CF > EFS, via apache/nginx doing the grunt to a CDN. Easily, without even mentioning that S3 to CF takes advantage of Amazon’s peer networking and looking at the additional loading that your webserver then has to take (not to mention blocking requests – let your webserver concentrate on doing content, not static assets)

            As I said, I think EFS is great, and solves a lot of problems where previously we’ve had to set up our own NFS, especially for legacy applications. But if you are designing something new, or you can use something like WordPress that has plugins to use S3, then it’s a complete no-brainer to offload to object storage.

            My issue is the conclusion of your article is misleading “EFS is the best solution AWS has to offer to share files between instances and nodes of clusters”

            Simplest – maybe/probably – cost effective, possible – but it’s not the best.