Whose package is it anyway? Why it's important to minimise dependencies in your solutions...
(TL;DR This month I learnt a lot about the importance of package management, maintenance and the minimisation of dependencies.)
It's been a little while since my last blog; I've been pretty bogged down in a world of packages, .NET frameworks and testing. And it all started with what I thought would be a simple update to a site's HTML…
Let me spin you a tale…
About a month ago I was asked to update a website to include the new Google Jobs Postings structured data. This is a pretty useful new(ish) feature from Google which scrapes sites for job postings and collects them all together in a handy summary card at the top of your search when you type in e.g. "Software Jobs Manchester". (I know that at least some of the people, if they're good friends at any rate, who will end up reading this are in the middle of job hunting – if that's you, definitely go check it out, makes things much simpler!)
Right so… That involved basically creating a data structure, converting it to JSON, and embedding that into the HTML of the site… How hard could that be, right??
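For context, the job data ends up embedded in the page as a JSON-LD script tag using schema.org's JobPosting type. A minimal sketch looks something like this (the field values here are made up for illustration, and real postings will want more properties than this):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "JobPosting",
  "title": "Software Developer",
  "datePosted": "2018-08-01",
  "validThrough": "2018-09-30T00:00",
  "hiringOrganization": {
    "@type": "Organization",
    "name": "Example Ltd"
  },
  "jobLocation": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "Manchester",
      "addressCountry": "GB"
    }
  },
  "description": "<p>We are looking for a developer to join our team…</p>"
}
</script>
```

Google's crawler picks this up when it scrapes the page, which is what makes the posting eligible to appear in those summary cards.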
The site itself is built using a Content Management System we've built ourselves, so doing this involved adding the data structure to the CMS, publishing, and updating the package in the solution for the site. This was the pebble that led to the avalanche. When I pulled through the packages, I left all the default settings for the package update. Including… "Dependency Behaviour: Highest". So, it not only updated the CMS packages but updated EVERY SINGLE PACKAGE THEY DEPENDED ON TO THE LATEST VERSION. Until this point, I had never really thought about how NuGet handled dependencies; believe me, that is no longer the case. After realising what had happened, I undid the changes and updated the packages with the dependency behaviour set to "Highest Patch" instead. Everything stopped screaming at me, and I got it all up and running with the job postings data displaying correctly on the site.
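For older-style .NET projects there is also a way to guard against this at the package level: packages.config supports an allowedVersions attribute, which constrains the range an update is allowed to pick regardless of what settings you forget to change in the dialog. A sketch (the package name here is illustrative, not our actual CMS package):

```xml
<?xml version="1.0" encoding="utf-8"?>
<packages>
  <!-- Pin updates to the 2.x range: 2.1 or above, but below 3.0 -->
  <package id="Example.Cms.Core" version="2.1.0" targetFramework="net461"
           allowedVersions="[2.1,3.0)" />
</packages>
```

With that in place, an over-eager "Highest" update can only wander as far as the range allows.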
However, a few days later, we had a realisation: the WebJobs weren't running. It didn't take long to establish that it must have been something package-related that caused the break, as we hadn't touched anything in this part of the solution. The projects use our composition framework, and essentially we had some installers missing. So, at this point, I was updating packages in both the CMS solution and the website itself, to get them all consistent between the two in the hope that this would sort everything out. No such luck. A large part of me wanted to undo all the changes just to see something start working again. However, this would have meant that no package could ever be updated in either of the two solutions, which is hardly sustainable long term.
Unhelpfully, this was the point where I went on holiday for a week, and poor Howard was left to sort out the mess. After many, many package updates, updating the .NET framework, collecting together dependencies and consolidating everything under the sun, the solutions were almost unrecognisable by the time I returned. Now, I can't claim that I was involved in much of the heavy lifting here; I was purely an onlooker while unbelievable amounts of package-related magic happened, but I very much gained an appreciation for how important proper management and documentation are.
Now, I came in after all of this, to an updated solution, with its dependencies all collected into a manageable place, and with two jobs:
- Get the WebJobs running again
- Implement some testing
The first of these was relatively painless, with only a couple of serialization and configuration issues. After a month of red error messages and package mismatches, the first time I had a successful run-through I nearly cried.
The second I am still working on, but it has been quite an interesting project:
Selenium is a tool for testing websites. It allows you to automate tests on different browsers, e.g. "Can I visit the homepage?", "When I register am I then logged in?". This means that unanticipated consequences of changes can often be caught before going live (something which I very much appreciate the value of at this point).
We have developed a little tool for running Selenium tests on different browsers, locally and remotely using SauceLabs. You point it at a site (either locally hosted or deployed) and write some tests using SpecFlow to test that site.
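To give a flavour of what those tests look like, here is a rough sketch of a SpecFlow step binding driving Selenium (the class name, step wording and URL are illustrative, not our actual tooling; this assumes the Selenium WebDriver and NUnit packages are installed):

```csharp
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using TechTalk.SpecFlow;

[Binding]
public class HomepageSteps
{
    private IWebDriver _driver;

    [BeforeScenario]
    public void StartBrowser()
    {
        // Requires the matching ChromeDriver executable to be available
        _driver = new ChromeDriver();
    }

    [Given(@"I am on the homepage")]
    public void GivenIAmOnTheHomepage()
    {
        _driver.Navigate().GoToUrl("https://example.com/");
    }

    [Then(@"the page should have a title")]
    public void ThenThePageShouldHaveATitle()
    {
        Assert.IsFalse(string.IsNullOrEmpty(_driver.Title));
    }

    [AfterScenario]
    public void StopBrowser()
    {
        _driver?.Quit();
    }
}
```

The Gherkin feature file then reads almost like the plain-English questions above: "Given I am on the homepage, then the page should have a title."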
There were a couple of things that caught me out while updating the tooling:
- Each browser has a separate web driver, and these are not installed when you add the Selenium NuGet package to the solution. This is also something to be aware of as time goes on, because as browser versions are updated, the old drivers may stop working and will need to be updated again.
- Different browsers also have different requirements; IE, for example, does not allow you to "AcceptInsecureCertificates".
But, after a few fiddly updates (luckily not package based ones), it is very satisfying to see the site tests working as they should!
So… After that rather rambling story… What are the take-aways?
Essentially, I now very much appreciate the value of having as few dependencies on outside frameworks as possible. If as much as possible is under your own control, there is far less room for synchronisation errors (chains of "this package depends on this version of that package, which depends on this version of another package" are always going to be an issue, but they're easier to follow if you control the source code).
Of course, some level of dependency on outside code is unavoidable, but I am also now a huge advocate for calling out to self-contained API-based services rather than incorporating frameworks into your solution. If everything is self-contained, then updating one aspect is far less likely to have unintended knock-on effects.
And, finally, check your settings before doing a package update. All I can say is thank god for Git, without which I may never have returned from the treacherous sea of version numbers that was the first week of August.