A number of our services were until recently running on Ubuntu 14 (Trusty Tahr), for which long term support (LTS) was coming to an end. Trusty Tahr was released in 2014 and has since served us quite effectively, so effectively, in fact, that we very nearly forgot about it. That is, until one morning an announcement from Ubuntu, declaring that LTS for good ol’ Trusty was coming to an end, demanded attention. Our engineering team, being the enthusiasts that we are, popped the hood, blew the dust off and diligently set upon the task of taking stock, discussing and carding up requirements for the great migration.
Recently the Elements team needed to make a reasonably large change to the codebase: migrating over 300 files which imported lodash to instead import from
To automate this change we chose to write a codemod for jscodeshift, a tool by Facebook. The power of jscodeshift is that it parses your code into an Abstract Syntax Tree (AST) before transforming it, allowing you to write codemods that are smarter than regular expression based codemods.
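To make the AST-versus-regex point concrete, here is a minimal sketch of the same idea. jscodeshift itself works on JavaScript; this illustration swaps in Python's standard-library `ast` module so it runs without any dependencies. The module names `oldlib` and `newlib` and the function names are hypothetical, chosen only for the example.

```python
import ast
import re

# Sample input: one real import, plus a string literal that merely
# looks like an import. Both module names are hypothetical.
SOURCE = '''from oldlib import chunk
message = "from oldlib import chunk"
'''


class RewriteImports(ast.NodeTransformer):
    """Rename the module in `from oldlib import ...` statements only."""

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.ImportFrom:
        if node.module == "oldlib":
            node.module = "newlib"
        return node


def codemod_ast(source: str) -> str:
    """Parse to an AST, rewrite import nodes, and print the code back out."""
    tree = RewriteImports().visit(ast.parse(source))
    # ast.unparse (Python 3.9+) drops comments and original formatting,
    # which is a limitation of this sketch.
    return ast.unparse(tree)


def codemod_regex(source: str) -> str:
    """Naive text substitution: it cannot tell code from string contents."""
    return re.sub(r"from oldlib import", "from newlib import", source)


if __name__ == "__main__":
    print(codemod_ast(SOURCE))
    print(codemod_regex(SOURCE))
```

The AST version changes only the import statement and leaves the string literal alone, whereas the regular expression rewrites both, because it sees text rather than syntax.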
One of our development teams highlighted that their build was taking too long to run. We obtained a nearly threefold speed improvement, for the most part by using newer AWS instance types and allocating fewer Buildkite agents per CPU.
Unknown to our users, we recently migrated edge network providers. This involved some particularly interesting problems that we needed to solve in order to migrate without impacting availability or integrity of our services.
An ELK stack is a combination of three components, Elasticsearch, Logstash and Kibana, which together form a log aggregation system. It is one of the most popular ways to make log files from various services and servers easily visible and searchable. While there are many great SaaS solutions available, many companies still choose to build their own.
While working on the new Envato Market Shopfront app, the team agreed to always keep all the dependencies in the project up to date. Sometimes this was just a straightforward patch or minor version upgrade, but sometimes it involved breaking changes that needed a whole lot of thought. The upgrade to react-router v4 happened to be a good example.
To expose our internal services to the outside world, we use what is known as an API Gateway. This is a central point of contact for the outside world to access the services Envato Market uses behind the scenes. Taking this approach allows authors to leverage the information and functionality Envato provides on its marketplaces within their own applications without duplicating or managing it themselves. It also benefits customers who want to programmatically interact with Envato Market for their purchases instead of using a web browser.
You may have recently heard reports or seen news about a security bug called “Cloudbleed” affecting sites served by Cloudflare. Envato delivers some websites using services provided by Cloudflare; however, Cloudflare have confirmed that none of our websites are directly affected by this security bug. Cloudflare published a detailed explanation of what the bug is and how it came to be, which you can read on their blog.
UPDATE: Since the original publication of this post, Cloudflare have released a follow-up blog post with information they have learned in their investigations. The second article focuses more on explaining the real-world impact of the bug, rather than the technical details.
On Wednesday 19 October, Envato Market sites suffered a prolonged incident and were intermittently unavailable for over eight hours. The incident began at 01:56 AEDT (Tuesday, 18 October 2016, 14:56 UTC) and ended at 10:22 AEDT (Tuesday, 18 October 2016, 23:22 UTC). During this time, users would have seen our “Maintenance” page intermittently and therefore would not have been able to interact with the sites. The issue was caused by an inaccessible directory on a shared filesystem, which in turn was caused by a volume filling to capacity. The incident duration was 8 hours 26 minutes; total downtime of the sites was 2 hours 56 minutes.
We’re sorry this happened. During the periods of downtime, the site was completely unavailable. Users couldn’t find or purchase items, and authors couldn’t add or manage their items. We’ve let our users down and let ourselves down too. We aim higher than this and are working to ensure it doesn’t happen again.
In the spirit of our “Tell it like it is” company value, we are sharing the details of this incident with the public.
In a previous post, Envato Market: To The Cloud!, we discussed why we moved the Envato Market websites to Amazon Web Services (AWS) and a little bit about how we did it. In this post we’ll explore more of the technologies we used, why we chose them and the pros and cons we’ve found along the way.
To begin with, there are a few key aspects of our design that we feel helped modernise the Market infrastructure and allowed us to take advantage of running in a cloud environment.
- Where possible, everything should be an artefact
  - Source code for the Market site
  - System packages (services and libraries)
- Everything is defined by code
  - Amazon Machine Images (AMIs) are built from code that lives in source control
  - Infrastructure is built entirely using code that lives in source control
  - The Market site is bundled into a tarball using scripts
- Performance and resiliency testing
  - Form hypotheses about our infrastructure and then define mechanisms to prove them
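As a rough illustration of the "bundled into a tarball using scripts" idea, here is a minimal sketch of a script that packs a source directory into a versioned artefact. It is not the script used for the Market site: the function name `build_artefact`, the `site` prefix and the hash-based naming scheme are all assumptions made for the example.

```python
import hashlib
import tarfile
from pathlib import Path


def build_artefact(source_dir: Path, output_dir: Path, name: str = "site") -> Path:
    """Bundle source_dir into a gzipped tarball named after its contents.

    Hypothetical naming scheme: <name>-<first 12 hex chars of SHA-256>.tar.gz
    """
    output_dir.mkdir(parents=True, exist_ok=True)

    # Sort the file list so the hash and archive order do not depend
    # on the order in which the filesystem happens to list files.
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())

    # Hash relative paths and file contents so the artefact name changes
    # whenever the code changes, and stays the same when it does not.
    digest = hashlib.sha256()
    for path in files:
        digest.update(path.relative_to(source_dir).as_posix().encode())
        digest.update(path.read_bytes())

    tarball = output_dir / f"{name}-{digest.hexdigest()[:12]}.tar.gz"
    with tarfile.open(tarball, "w:gz") as archive:
        for path in files:
            archive.add(path, arcname=path.relative_to(source_dir).as_posix())
    return tarball
```

Only the file name is derived from the contents here. The archive bytes still include timestamps, so two builds of identical code produce the same name but not necessarily byte-identical tarballs.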
We made a few technical decisions to achieve these goals. Here we’ll lay out those decisions and why they worked for us, as well as some caveats we discovered along the way, but first.