Creating a high performance Data Team: lessons learned from the field
SQLBits 2023
There are many factors that influence the ability of data and analytics teams to achieve their full potential. With most organisations seeking to become data driven, how can they build a data and analytics team that is able to meet increasing demands in an agile, cost effective and sustainable way?
In this session, Barry Smart will explore the factors that lead to success (and the pitfalls to avoid). He will use real world examples to bring them to life, based on his experience delivering data intensive, high business impact projects over the last 25 years.
Transcript
Hello and good afternoon everyone. My name is Barry Smart. I work for endjin. We're a fully remote boutique technology consultancy based here in the UK with a global customer base. We specialize in helping small teams to deliver impactful cloud native apps, data and analytics solutions.
I'm going to talk to you today about some of the ingredients required to create a high performance data team, based on my nearly 30 years of experience in the technology sector.
I started out as a software engineer in the Energy sector, I evolved into an architect, then a business transformation manager. I was promoted to IT Director, and then switched sector to help digitally transform a 100-year-old Financial Services firm and was promoted to CTO as a result. I then decided to change gears and follow my passion; I studied for a Masters in AI & Data Science, and now I'm Director of Data & AI at endjin.
I've been a team member, a team lead, a departmental manager, an organizational leader, and now I help mentor customers to create high-performance data teams which transform their organisations, and I'm going to share my insights with you.
Data is a socio-technical endeavour
First I wanted to talk about data as a socio-technical endeavour. In other words you have to master the cultural, organisational and human aspects as well as the technology in order to be successful.
This simple model fits any data project. The data team on the left, who create a "data product" to meet the needs of users on the right. Value flows from left to right. Also important are the feedback loops that enable the data team to evolve the data product in response to the evolving needs of users. There's often a lot of uncertainty, especially if you're doing cutting edge, innovative, transformational data products where things are under constant evolution. Where the flow of value or the feedback loop breaks down, significant issues will occur.
A major consideration in any socio-technical system is Conway's Law. This powerful, so-called homomorphic force means that organizational lines of communication will prevail over the software architecture that you're trying to deliver. Conway's Law can be leveraged for good, but if ignored it can also have significant negative implications for data teams. Let me try and bring that to life with a story.
I learned about Conway's Law the hard way when I was involved in delivering a new digital proposition direct to consumers. We had a strong vision for the architecture: adopting cloud platform as a service, putting a series of data pipelines in place to wrap up the core intellectual property and then enabling end users to interact with it through a web app. We had designed a modern target architecture, but what we had failed to consider was how the organisation was set up to deliver it.
When I overlay the many teams and stakeholders involved, you can see how it immediately gets more complicated.
For example, we had an independent test team. This meant time consuming conversations to hand over the software to be tested and also time consuming conversations to describe any issues that had been found.
Another example was the deployment team, who became a bottleneck for getting new releases into production. Releases were a major ceremony and were fraught with problems. As a result we were only able to release every quarter rather than our objective of every week.
That's just two examples of where the organizational design was working against the target software architecture.
As a result, we ended up with a sub-optimal architecture. And the product owner spent a lot of time facing internally, trying to overcome inefficiencies, barriers and blockers to get functionality into the hands of the end users. One consequence was that we lost sight of the end users; the feedback loops were never established.
In retrospect I believe that we could have achieved twice as much with half the people. Indeed, after a few years, having spent a significant amount of money on the project and only managing to onboard a small number of customers, the organisation made the tough decision to retire the product.
The moral of the story is: ignore the negative effects of Conway's Law at your peril!
Four Traits of a High Performance Data Team
High performance data teams are highly aware that they are operating within a socio-technical system. I wanted to speak briefly about 4 of the traits that enable them to wield these socio-technical forces for good.
Trait 1 - Small Multi-Disciplinary Team
The first trait is that high performance data teams tend to be small and multidisciplinary in nature. The magic number is no more than 9. You may have heard of Amazon's "two pizza team" principle, which is backed up by academic research, most famously by anthropologist Robin Dunbar. A small team enables a flat structure and allows the team to be self-organizing. They are also empowered to get things done and adopt agile practices. By multi-disciplinary we mean they have all of the skills necessary to discover, build and own data products through their end to end lifecycle.
So what are those skills? Much of the advice on the web about building data teams will list the roles on this slide. There is no doubt that you may need some of these skills. But the fundamental problem I have with filling your team with specialists is that you are at risk of creating silos within the team, and triggering the negative aspects of Conway's Law.
With a small team you can't afford to have people that are narrow technical specialists. You need much broader skills, and for the skills across the team to overlap. You want people to be able to wear multiple hats, that are motivated by the team's success, who are interested in the business domain you operate in, who are inquisitive and are willing to constantly learn. We like to say hire for attitude over aptitude.
So how do organisations build high performance data teams from scratch? To illustrate this I wanted to tell another story, about a medium sized organisation we have been working with for just over 2 years. They had set a goal to become more data driven. Their first hire was a team lead. Her official title is "Head of Data & Analytics" but we prefer to think of her as "Chief Data Evangelist" because that's exactly what she does: she's great at inspiring her team, but she is also highly capable at engaging outwards into the wider organisation, influencing right up to Executive level.
On day one, she inherited a DBA who had been working in isolation to maintain a legacy data warehouse solution that the firm had built. This warehouse was struggling to scale and therefore needed to be retired. This is where we came in. We engaged with the team for the first three months of its life, augmenting it with an endjin squad to help them implement a new cloud native data & analytics platform, deliver the first few data products on this platform and develop skills in the team along the way.
During the course of delivering that first data product, she worked closely with a domain expert who showed a high degree of data literacy. That domain expert became team member number 3, bringing all that fantastic domain knowledge as well as the enthusiasm to pick up new skills that would allow her to pivot into a data career. For team member 4 she tapped into the organization's graduate recruitment program, bringing in a Maths graduate.
Finally, about a year into forming the team, for team member 5, she identified that they needed to increase their overall engineering maturity so she hired an experienced software engineer who wanted to pivot into data. 2 years on they have gone through the forming, storming, norming cycle and are now a high performance data team. They are well on the way to helping the organisation to become data driven, with numerous high value data products in use across the organisation. This example shows you don't need to hire stars. With a vision and a bit of patience you can grow your own talent.
Trait 2 - Product Mindset
The second trait of a high performance data team is that they adopt a product mindset. In doing so they will Discover, Build and Own data products through their entire lifecycle. To succeed with a product mindset they need to be outward looking. In other words, they're not an order taker sitting there waiting to get work allocated to them. They are out there actively engaging with the business looking for opportunities to digitally transform the organisation using data.
It's also about balancing total cost of ownership against the value that's going to be delivered. Being prepared to pivot or fail fast in the Discovery phase if that balance is tipping in the wrong direction. We hear that over 70% of digital transformations fail, so better to fail early when the costs that have been sunk are low, extract useful learning and move on to the next idea.
They are therefore operating as an innovation engine, helping the organisation to cycle through ideas rapidly to discover the data products that will have a genuine transformational impact. You will notice that I've used the term "data product" throughout this presentation; this is to reinforce product thinking. It is also one of the four principles of the Data Mesh architecture that is gaining a lot of attention in the data industry.
Trait 3 - Engineering Maturity
The third trait is engineering maturity. High performance data teams embrace the principles, processes and tools that mainstream engineering teams use. This is to achieve a single objective: the rapid and safe delivery of value. By rapid we mean taking humans out of the loop through automation and adopting tools that boost productivity. By safe, we mean removing risk and reducing costs. The team are aiming to build, test and release new features in a repeatable, reliable, autonomous manner with a high degree of confidence and with little or no ceremony.
For example, considering non-functional requirements up front allows you to build master data management and programmatic governance into your solution from the outset, rather than trying to bolt them on later in the product life cycle when it is more expensive and risky to do so.
The data tech landscape has matured significantly in the last 5 years. Cloud native platforms such as Azure Synapse Analytics now provide all of the capabilities you need to adopt good engineering practice.
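As a rough illustration of what "safe" delivery with humans out of the loop can look like, here is a minimal sketch of an automated data quality check that could run on every build. It is not taken from the talk: the `Sale` record, its fields and the `validate_sales` function are hypothetical names chosen for the example, and the rules it applies are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    # Hypothetical record shape, assumed for illustration only.
    order_id: str
    revenue: float
    cost_of_goods_sold: float


def validate_sales(rows: list[Sale]) -> list[str]:
    """Return a list of problems found; an empty list means the batch passes."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for row in rows:
        # Rule 1 (assumed): order ids must be unique within a batch.
        if row.order_id in seen_ids:
            problems.append(f"duplicate order_id: {row.order_id}")
        seen_ids.add(row.order_id)
        # Rule 2 (assumed): monetary amounts must not be negative.
        if row.revenue < 0:
            problems.append(f"negative revenue: {row.order_id}")
        if row.cost_of_goods_sold < 0:
            problems.append(f"negative cost: {row.order_id}")
    return problems
```

A check like this could be wired into a pipeline so that a release is blocked whenever the returned list is not empty, which removes one manual handover from the path to production.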
Trait 4 - Organisational Readiness
The final trait I wanted to talk about is the readiness of the wider organisation to adopt data products. Data teams don't exist in isolation. They need to interact with and influence other parts of the organisation to succeed. Firstly they establish productive partnerships with technology peers in a way where those interactions don't inhibit the flow of value. This is typically achieved by encouraging other teams to interact either as:
- an Enabling team, engaging for short periods to bring in specialist expertise that the team needs to unblock some kind of impediment; or
- a Platform team, providing well documented IP, platforms or services that can be consumed through an API or command line script.
Note: in many organisations these technology peers will be external parties.
Secondly, interaction with the users of their data products is critical in order to create those feedback loops, to build empathy and understand their evolving needs. Key to this is being able to establish a common language. For example, one organisation we worked with had 20 different definitions of Gross Margin. Until we helped the data team to resolve this, it was difficult for them to move on to develop the data product.
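One way to make a common language stick is to capture the agreed definition once, in code that every data product reuses. The sketch below is illustrative only: it assumes the agreed definition is the widely used ratio (revenue minus cost of goods sold) divided by revenue, which may not be the definition that organisation settled on, and the function name `gross_margin` is hypothetical.

```python
def gross_margin(revenue: float, cost_of_goods_sold: float) -> float:
    """Gross margin as a fraction of revenue.

    Assumed definition: (revenue - cost of goods sold) / revenue.
    """
    if revenue <= 0:
        # The ratio is undefined or misleading without positive revenue.
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_goods_sold) / revenue


# Revenue of 200 with costs of 150 gives a margin of one quarter.
print(gross_margin(200.0, 150.0))  # prints 0.25
```

Keeping the definition in one shared place means a change to it is made once and reviewed once, rather than being reinterpreted in each report.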
Another common issue we have with users is misaligned incentives that can act as a barrier to adoption. To quote Upton Sinclair, "It is difficult to get a person to understand something, when their salary depends upon them not understanding it." This is where the relationship between the data team and the leadership of the organisation becomes important in order to make the necessary operating model changes that are required to remove these types of barriers.
High performance data teams are also able to influence upwards. Challenging the status quo, encouraging leadership to embrace data opportunities, helping them understand they can achieve their goals by integrating data into the wider strategy of the organisation.
Summing Up
In summary:
- Data is a socio-technical affair.
- Have you mapped out how value flows in your organisations?
- Are there opportunities to optimise it? Do you have feedback loops in place? What are they telling you?
- Are you wielding Conway's Law and the homomorphic force for good?
Think about applying some of the traits we described today:
- Small multi-disciplinary teams, that are self organising, adopt agile practices and have all of the skills needed to build and own data products.
- Adopting a product mindset, acting as an innovation engine, looking outwards into the organisation to understand user needs and discover high value data products. Being prepared to fail fast when ideas are not going to achieve high value relative to their TCO.
- Engineering maturity: what practices and tools can you adopt from mainstream software engineering to achieve rapid and safe delivery of value?
- Finally, are you helping the wider organisation to become data driven, influencing through your relationships with technology peers, users and the leadership of the organisation?
I'm aware that I've covered a lot of ground in the last 20 minutes. For deeper insight, I'd recommend a visit to our website, where we have a range of blogs covering all of the topics I've talked about today. That completes the talk, thanks for listening.
I'd love to get your feedback, the QR code here will take you to the link to do just that! I'm attending the rest of the conference so please feel free to catch me any time if you have questions. Perhaps we have time for a few now?