Nineteenth Blog Birthday: Fabric, The Hedgehog And The Fox

Every year, on the anniversary of the first-ever post on this blog, I write a post reflecting on what has happened to me professionally in the past year and this year the subject is obvious: Fabric. Of course I knew about, and indeed had been playing with, Fabric for a long time before even the private previews but it was only with the public preview and then GA that I felt the real impact on my job. I now work on the Fabric Customer Advisory Team at Microsoft rather than the Power BI Customer Advisory Team, for example, and a lot more than my job title has changed. Instead of helping customers with, and collecting feedback on, just Power BI I now have to know about the whole of Fabric – which is a challenge. But how much can anyone know about the whole of Fabric? Is it all too much for one person?

This is already a problem with Power BI of course. Back when Power BI first appeared I felt reasonably confident that I knew a fair amount about everything to do with it. Soon, though, I gave up on trying to know much about data visualisation and now when I see the amazing things people do with Deneb I’m full of admiration but I accept this is something I’ll never have time to learn. I think I’m better than average at DAX but I’ll never be anywhere near as good as Marco and Alberto. Even with Power Query and M my knowledge of creating custom connectors is limited. And that’s all ok, I know what I need to know to do my job and luckily I’m not the kind of person who feels overwhelmed by the constant flow of updates although I know a lot of people do (see this great article by Kurt Buhler on ways of dealing with this). If I don’t know everything about Power BI today then I can say confidently that I’ll never know everything about Fabric in the future. No-one can. You need a team. The question, then, is: how will job roles evolve within the Fabric community? Who will end up doing what, and how will we organise Fabric development?

There are two ways you can approach Fabric development, and the two ways remind me of Isaiah Berlin’s famous essay on “The Hedgehog and the Fox” (who says arts degrees are useless?). Unless you’re a fan of Tolstoy you don’t need to read it all; the essay turns on a line from the Greek poet Archilochus which says: “The fox knows many things, but the hedgehog knows one big thing” and explores how thinkers and writers, indeed all human beings, can be classified as either ‘foxes’ or ‘hedgehogs’. I think this classification can be useful when thinking about Fabric development too.

I’m definitely a hedgehog: I like to go deep rather than wide. Things were easier in the old days when the Microsoft BI stack was made of discrete products (Analysis Services, Integration Services, Reporting Services, SQL Server) and you could specialise in just one or two – the dividing lines were clear and pre-drawn and everyone working in BI had to be a hedgehog. It’s tempting to take the same approach with Fabric and encourage people to specialise in individual workloads, to become Fabric Spark or Fabric Warehouse professionals for example. It’s unsurprising that people from an IT background are natural hedgehogs but I also think this risks repeating the mistakes of the past: going back to the old days again, when I was an Analysis Services consultant I regularly saw teams who were focused on building their data warehouse in SQL Server but who never gave a second thought to what was meant to happen downstream, so the cubes, reports and spreadsheets suffered accordingly and so the solution as a whole was seen as a failure by the business. While it might seem as though a team of hedgehogs will give you a team of experts in all the areas you need, it can often result in a team of people focused on building specific parts of a solution rather than a functioning whole.

The alternative is to be a fox and I think most Power BI users who come from the business or are data analysts are foxes: they have to know a bit about everything in order to build an end-to-end solution because they work on their own and have to get stuff done. Self-service BI is inherently foxy. While a single Fabric fox can build a whole solution, though, and is more likely to deliver value to the business, the quality of the constituent parts may be variable depending on how well the developer knows each workload. A team of foxes won’t solve that problem either: two people who half-understand DAX aren’t equivalent to one person who is a DAX expert. Too much self-service development delivers short-term business value which is cancelled out in the long-term under the weight of the technical debt incurred.

I don’t think you choose to be a fox or a hedgehog; your personality, learning style and job circumstances push you in one direction or the other. In some organisations there will only be foxes or only be hedgehogs with all the attendant consequences. However, the fact that Fabric brings together everything you need for BI development in a single, tightly-integrated platform, and the fact that its functionality and price mean it’s suitable for both self-service and enterprise BI development, means that there’s an opportunity to combine hedgehogs and foxes in a new, more effective way.

The magic formula for Fabric must be similar to what the most successful Power BI organisations are already doing today: the hedgehogs in IT build the foundations, provide the core data, model it, and then the foxes in the business take these components and turn them into solutions. This will require a big change in the way the hedgehogs work and think: no more BI projects, no more building end-to-end. It should be rare for a hedgehog to ever build a report for the business. Instead the aim is to build a platform. Equally the foxes will have to change the way they work too: no more direct access to data sources, no more building end-to-end. It should be rare for a fox to ever bring external data into Fabric. Instead the aim is to build on the platform that has been provided by the hedgehogs. If the hedgehogs have done their job properly then the foxes will be happy – although that’s a big “if”. Finding the right organisational structures to bring hedgehogs and foxes together will be a challenge; it will be even more difficult to overcome the mutual distrust that has always existed between IT and the business.

So, to go back to the original question, who will do what in Fabric? On the hedgehog side there will need to be a platform architect and various specialists: in data ingestion, Spark and/or in Fabric Warehouse, in real-time analytics, in designing core semantic models and DAX. On the fox side the required skills will be basic data transformation using Dataflows, basic semantic modelling/DAX using OneLake, and advanced report design and data visualisation. This seems to me like a division of labour that results in each team member having an area of Fabric they have a chance of learning and mastering and one that will deliver the best outcomes.

2 thoughts on “Nineteenth Blog Birthday: Fabric, The Hedgehog And The Fox”

  1. Hi Chris, Thanks for the great article! I am currently attempting – though you are probably right on this – the futile task of knowing the “whole of Fabric”… I am in the early stages of the process but already can see how seemingly disparate parts move in unison and complement one another to create a unique platform that covers a broad range of Analytical Workloads. If Microsoft now were to add the ability to create Azure SQL Databases for transactional workloads that auto embed DML into a Power Apps like UI, there would be really no need to leave Fabric at all 😉 There is just one thing I wanted to add to hedgehogs and foxes: it seems to me, and I think I am beginning to see a general trend here, that we ought to add another dimension to the organization level where hedgehogs will reside in the future – the Azure datacenter (for ease I will refer to these as Org_Hedgehog and AZ_Hedgehog). In particular, I have noted this with Spark in Fabric in two ways: 1) There is the gentle steering of Power BI Analysts towards Delta under the guise of performance 2) The corollary of this for Org_Hedgehogs, which will be apparent to anyone familiar with debugging Spark: it is rather complex, and the degree of expertise required would create the need for some kind of “Super Hedgehog” within the organization, which does not make much monetary sense. Yet, in Fabric you can easily configure the number of nodes, libraries and Runtimes which auto-scale and get started in your notebook without really knowing much about the underlying architecture of Spark at all. Sprinkle in some V-Order & Optimize write, even better!
As long as Microsoft does a decent job at maintenance / optimization and keeps rolling out those updates (I think you allude to this with Power BI) I am afraid that there won’t be much left to do for Org_Hedgehogs that hasn’t already been done better through the implementation of Spark Engines in Fabric… I just wanted to clarify here that I am not saying that this has been done in the best possible way, there is always room for improvement, just that the skill and computational cost of setting up a local Spark Cluster will certainly outweigh the benefits to each individual organization on its own. Other considerations may create a need for this kind of architecture but for something that is effectively an invisible background process in Fabric perhaps a new kind of hybrid Fox / Hedgehog || Engineer / Analyst role may be emerging.

  2. The hard part about being a hedgehog is that tools like Fabric don’t actually allow you to dig very deep. It is supposed to be an “easy” platform and giving a surface area for digging is the antithesis of what Fabric is all about.

    I’m using Spark in Synapse at the moment, which is a precursor to the Fabric stuff. Digging is impossible. It requires me to contact several layers of people before I can see the underlying errors and logs. Even then, if Microsoft is trying to hide something, they might not let you see the errors and logs. (E.g. I’m suffering with an incredibly large number of socket exceptions related to Microsoft’s private networking = managed vnets and “private endpoints”. Microsoft is deliberately hiding the relevant exception details and there is nothing I can do to dig into these chronic failures on my own.)
