How to: Stage-Environment, the basics

This will be a long post, so I’ll make it a “meta” post with a lot of what I have heard and think about staging environments and what could be done better. A few posts on how we do it will follow.

Bare metal

We use Docker locally, but not on our servers, neither for the stage nor for the production systems. A few ideas on that at the end of this post.

Version Control System (VCS)

Let’s start with my dogmatic baseline: use a VCS. I’m a big fan of “doesn’t matter which, everything is good”, and although that holds in general, the truth for VCSs is that everything is based on git. Even “worse”, there are only a handful of hosting services to choose from:

  • GitHub – owned by Microsoft
  • GitLab – publicly traded since 2021 (Google is invested in it, don’t ask me how big)
  • Bitbucket – owned by Atlassian

There are a ton of other services I have never heard of or dealt with, and another ton of open source alternatives if you prefer to self-host.

We use Bitbucket by Atlassian, and although it integrates well with our Jira and gets better every day, I’m not sure I would go down the Jira route again if I started today. BUT back then, and still today, we know a lot about the Atlassian products, and I’m a big fan of “use what you know” (but don’t forget to learn new things).

By the way, this is my default answer when asked “I have this project idea, which language/framework/tech is the best?”: “The one you already know and know how to tame.” There is no need to use the shiny new tech, the awesome caching system or message queues on the first attempt, because all of that is only needed once you scale. Learning it while implementing a new idea is one thing too many, if you ask me. If your idea/project/business takes off, then you need to find time and resources to scale it. And then we are talking about refactoring and maybe migrating the system to a new tech/language/framework.

Pipelines

To create and update an environment we run pipelines. Pipelines can do a ton of other things, like run (static) code analysis, run all kinds of tests (unit, integration, smoke, frontend, …), query all your services for security, documentation, AI, etc.

But today we concentrate on stage environments. Depending on the project, you might want to do all kinds of things: create new vHosts, databases, Redis instances, RabbitMQ queues and supervisord configuration, and update your PHP(?) code or your database.

But maybe you are writing “only” a plugin, not a complete project, and want a testing environment that is always up to date; then there is no need for all the tech around it. We recently implemented a Shopware 6 plugin for Montonio and needed a testing environment. So I set up a Shopware 6 instance manually, installed the plugin manually and made sure it is updated automatically whenever we push something to the develop branch. The pipeline looked like this:

pipelines:
  branches:
    develop:
          - step:
              name: Deploy Plugin to Staging
              image: atlassian/default-image:latest
              deployment: Staging
              script:
                - |
                  ssh web-user@montonio.example.com << EOF
                    cd montonio.example.com/shopware/ &&
                    rm -rf /var/www/share/montonio.example.com/shopware/var/cache &&
                    php8.2 /usr/local/bin/composer update montonio/shopware6 &&
                    php8.2 bin/console system:update:prepare &&
                    php8.2 bin/console plugin:update MontonioPayment &&
                    php8.2 bin/console system:update:finish &&
                    bin/build-storefront.sh &&
                    bin/build-administration.sh &&
                    php8.2 bin/console cache:clear &&
                    cluster-control login --pa_token="$MAXCLUSTER_TOKEN" &&
                    cluster-control apache:restart c-2127 srv-a &&
                    curl --fail https://montonio.example.com &&
                    curl -X POST -H 'Content-type: application/json' --data '{"text":"Branch develop was successfully deployed to https://montonio.example.com"}' $SLACK_WEBHOOK
                  EOF
              after-script:
                - ssh web-user@montonio.example.com "cluster-control logout"

This script doesn’t do much and is even written in bash. Everything on a Linux machine is done in a shell, so in the end what runs are shell commands. To make this easier for PHP developers we normally add a “wrapper”: Deployer. Deployer is a “deployment tool”, which essentially means it runs one or multiple tasks on one or multiple servers, one after another or in parallel. So it helps you run the right bash script at the right moment on the right server.

Back to the script above and a little explanation. As you might know, we are hosting all our customers at maxcluster. One of the things we very much like is their cluster-control CLI tool to do more or less everything.

Everything in the script is run via SSH. In case you don’t know, the ssh command returns the exit code of the command it runs. That is important because the pipeline should be red and fail if something is off, and be green and successful if everything worked. Therefore all commands are chained together with &&, so the next command only runs if the previous one succeeded. This is called short-circuit evaluation, and by the way PHP does the same: in a chain of &&, as soon as one part is false the whole expression is false, so there is no need to evaluate/run the remaining commands.
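
To see this behaviour without a server, here is a minimal sketch. It assumes that a local child bash reading its commands from a heredoc is a fair stand-in for the ssh call in the pipeline; run_chain is a made-up helper name, not part of our setup.

```shell
#!/usr/bin/env bash
# Stand-in for `ssh user@host << EOF ... EOF`: a child shell that reads its
# commands from stdin and exits with the status of the last command it ran.
run_chain() {
  bash <<'EOF'
echo "step 1" &&
false &&
echo "step 3"
EOF
}

# Capture both the output and the exit status of the chain.
if output=$(run_chain); then
  status=0
else
  status=$?
fi

echo "output: $output"   # only "step 1" was printed
echo "status: $status"   # non-zero, so a pipeline step would turn red
```

The second command fails, so the third one never runs and the failure travels up to the caller as the exit status. That is exactly what makes the pipeline step fail.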

What do we do?

  • change to the shopware folder
  • delete the cache folder, I don’t know why
  • update our plugin
  • Run all the commands for Shopware to update everything properly
  • build all the JS, which includes theme:compile
  • clear the cache again, just to be sure
  • and now we log in to cluster-control to restart Apache – because of the opcache, and because I wanted a simple solution. Samuel Gordalina’s cachetool is the better solution, especially on production systems!
  • Then we check whether the homepage returns a 200 – no real test, but it helps if we fucked up hard
  • And if everything ran successfully, we ping our Slack
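
A side note on the homepage check: as far as I know, curl --fail only treats HTTP codes of 400 and above as errors, so a redirect would still pass. If you want to insist on a real 200, a small helper like the following sketch could do it. smoke_check is a hypothetical name and not part of the pipeline above.

```shell
#!/usr/bin/env bash
# smoke_check URL: succeed only if the URL answers with exactly HTTP 200.
smoke_check() {
  local url=$1 code
  # -s: no progress output, -o /dev/null: discard the body,
  # -w '%{http_code}': print only the HTTP status code
  code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || return 1
  [ "$code" = "200" ]
}
```

In the chain it would replace the curl --fail line, for example: smoke_check https://montonio.example.com && …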

A simple script for the use case.

Environment

This way we build an environment we can test on. It is important that the environment resembles the production system as closely as possible. We want the same PHP, MySQL/MariaDB, RabbitMQ, Redis and Elasticsearch/OpenSearch versions as in production. The minimum is the same minor version. If you are lucky you even have the same hosting environment.
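
The “same minor version” rule is easy to check in a pipeline. A minimal sketch: the helper names minor_of and same_minor are made up for this example, and where the production version comes from (here a hypothetical PROD_PHP_VERSION variable) is up to you.

```shell
#!/usr/bin/env bash
# minor_of VERSION: print the "major.minor" part, e.g. 8.2.10 becomes 8.2
minor_of() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# same_minor A B: succeed if both versions share major and minor version.
same_minor() {
  [ "$(minor_of "$1")" = "$(minor_of "$2")" ]
}

# Possible use on the stage server (PROD_PHP_VERSION is an assumption):
#   same_minor "$(php8.2 -r 'echo PHP_VERSION;')" "$PROD_PHP_VERSION" || exit 1
```

Patch versions may differ, so 8.2.10 on stage and 8.2.3 on production pass, while 8.2 against 8.3 fails.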

One of the most sophisticated deployments and testing strategies I know of was presented by Fabrizio Branca back in 2013, with the deployment of the Angry Birds shop.

And one last thing I learned: if you can, make sure that your stage environment is not on the same machine and, if possible, not in the same network as your production machine. At the beginning of my career we misconfigured our test server, so it wrote into the production Redis. The consequence was that every couple of weeks (we didn’t test and deploy often) the production system started to behave weirdly. We cleared the cache and everything went back to normal. But it took months to make the connection and find the wrong configuration.
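
If you cannot separate the networks, a cheap guard in the deployment can at least catch the obvious mistake. This is only a sketch, not what we run: the host names and the function name assert_not_production are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical list of production hosts that stage must never talk to.
PRODUCTION_HOSTS="redis.prod.example.com db.prod.example.com"

# assert_not_production HOST: fail if HOST is on the production list.
assert_not_production() {
  local host=$1 prod
  for prod in $PRODUCTION_HOSTS; do
    if [ "$host" = "$prod" ]; then
      echo "refusing to deploy: stage config points at production host $host" >&2
      return 1
    fi
  done
  return 0
}
```

Called with the Redis host from the stage configuration and chained with && like everything else, it would stop the deployment before anything is written to the wrong place.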
