
Optimising software engineering at scale

ASOS are one of the world's largest online retailers, with over £2.5bn in revenues. As the team grew to over 4,000 employees, their CTO, Bob Strudwick, turned to endjin to help optimise their product delivery.

People first

It's a real challenge to understand the reasons why a development process might not be as productive as the business expects.

The problems are rarely technological (however often developers complain that "the build is slow!").

So we always start by talking to the people - from senior stakeholders to the most junior developers.

We used our machine learning tools to analyse the product development lifecycle and it swiftly became apparent that the real problem was that the team was doing too much!

Dependencies will get you

Dependencies between teams in different time zones caused knock-on effects, deployment windows were missed, and whole release cycles of critical functionality could be delayed by an issue in a minor feature, in another team.

We helped identify the process bottlenecks and developed a scheduling methodology that could deal with dozens of teams delivering concurrently.

Automate the automatable

Once the methodology was right, we could then look at the detail. What was time consuming, manual, and error prone in the DevOps process? Those things are prime candidates for automation.

ASOS has a huge Azure estate across multiple technologies and environments. The continuous deployment process through various release tiers, regions and versions means that something is always in flux. And in any sufficiently large, dynamic system, you have to take transient failure as a given, and engineer for resilience.
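One common way to engineer for transient failure is to retry the failing step with exponential backoff. The sketch below is illustrative only and is not taken from the ASOS solution: `TransientError`, `retry` and `make_flaky_step` are hypothetical names, and the delay values are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a timeout)."""


def retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying TransientError with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # The base delay doubles each attempt; jitter spreads out
            # retries that would otherwise all fire at the same moment.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


def make_flaky_step(failures):
    """Hypothetical deployment step that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def step():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("resource temporarily unavailable")
        return "deployed after %d calls" % state["calls"]

    return step


if __name__ == "__main__":
    print(retry(make_flaky_step(2), base_delay=0.01))
```

Only errors explicitly marked as transient are retried here; anything else propagates immediately, so a genuine defect is not hidden behind repeated attempts.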

Designing the UX for the deployment troubleshooting tool

We designed and developed a solution that harnessed the power of Microsoft's Reactive Extensions (Rx) and semantic logging. It enabled the team to track deployment failures through their estate, visualize the "tracer bullet" through their logging, and trace problems to their root cause in the failed system with just a couple of clicks.
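The "tracer bullet" idea can be sketched in a few lines, assuming each structured log event carries a shared correlation ID. This is a simplified illustration in plain Python rather than Rx, and it is not the tool described above: the event fields (`correlation_id`, `seq`, `system`, `status`) and the sample data are assumptions made for the example.

```python
# Hypothetical structured ("semantic") log events from several systems.
# Every field name and value here is invented for illustration.
EVENTS = [
    {"correlation_id": "dep-1", "seq": 1, "system": "build", "status": "ok"},
    {"correlation_id": "dep-2", "seq": 1, "system": "build", "status": "ok"},
    {"correlation_id": "dep-1", "seq": 3, "system": "web-eu", "status": "failed"},
    {"correlation_id": "dep-1", "seq": 2, "system": "package", "status": "ok"},
    {"correlation_id": "dep-2", "seq": 2, "system": "package", "status": "ok"},
]


def trace(events, correlation_id):
    """Return one deployment's events in order: the 'tracer bullet' path."""
    matching = [e for e in events if e["correlation_id"] == correlation_id]
    return sorted(matching, key=lambda e: e["seq"])


def first_failure(events, correlation_id):
    """Return the earliest failed event for a deployment, or None if it succeeded."""
    for event in trace(events, correlation_id):
        if event["status"] == "failed":
            return event
    return None


if __name__ == "__main__":
    for event in trace(EVENTS, "dep-1"):
        print(event["seq"], event["system"], event["status"])
    print("first failure:", first_failure(EVENTS, "dep-1"))
```

Because the events are structured records rather than free text, following a deployment across systems becomes a filter and a sort instead of a search through log files.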

Always be learning

More importantly, it integrated a feedback loop into their helpdesk system, so common errors could be identified, recorded, and solutions built in to both manual and automated runbooks. The system could learn from its past mistakes.
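A feedback loop of this kind can be sketched as counting recurring error signatures and looking up a recorded fix. The example is a minimal illustration under stated assumptions, not the helpdesk integration itself: the ticket fields, the error signatures and the runbook entries are all hypothetical.

```python
from collections import Counter


def common_errors(tickets, threshold=2):
    """Return error signatures seen at least `threshold` times, most frequent first."""
    counts = Counter(ticket["error"] for ticket in tickets)
    return [error for error, n in counts.most_common() if n >= threshold]


def suggest(error, runbook):
    """Look up a recorded fix; unknown errors are flagged so a fix can be recorded."""
    return runbook.get(error, "no known fix: escalate and record the resolution")


# Hypothetical helpdesk tickets and runbook entries, invented for illustration.
TICKETS = [
    {"id": 1, "error": "certificate expired"},
    {"id": 2, "error": "slot swap timeout"},
    {"id": 3, "error": "certificate expired"},
    {"id": 4, "error": "certificate expired"},
    {"id": 5, "error": "slot swap timeout"},
    {"id": 6, "error": "disk quota exceeded"},
]
RUNBOOK = {"certificate expired": "rotate the certificate and redeploy"}

if __name__ == "__main__":
    for error in common_errors(TICKETS):
        print(error, "->", suggest(error, RUNBOOK))
```

Each recurring error with no runbook entry is a candidate for a new entry, which is how the loop closes: resolving it once adds the fix for every later occurrence.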

Feedback loops

We left ASOS with process and tooling improvements that reduced common diagnostic times from hours to seconds, and more than quadrupled the number of releases to production they could perform each year. That was a great result in itself.

But we also left them with a "feedback loops" mentality that they could apply to new technology and processes as they evolved, readying them for the Data Science and Machine Learning projects that were part of their evolving data strategy.

We help organizations of all sizes from start-ups to global enterprises across financial services, media & comms, retail & consumer goods, and professional services.