Happy families are all alike; every unhappy family is unhappy in its own way. – Leo Tolstoy
It is funny – if I look back a year ago, I had two very big software projects (since I can’t say what they were, we’ll call one Y and the other X). Y went really well, and by all accounts is a big success! What about X? Not so much. It is slowly turning out that X is going to have a much more lasting effect on my design and thought processes than Y. I mean, I haven’t worked directly on X in months – and it is still teaching me lessons. Proving not only that I have a lot to learn – but also that your mistakes can teach you a lot more than your successes.
I recently gave a presentation on a number of lessons I learned from the project (a couple of those ideas I’m hoping to sit down and actually write out for posterity – and hopefully as a reminder so I don’t repeat the mistakes). But just yesterday, I got another lesson out of X! It really is the gift that keeps on giving.
First, let me say that my development world is fairly constrained. Although I have worked on some seriously scaling infrastructure – by and large my focus isn’t on those kinds of problems. I always end up focusing on really complex domains. I largely credit one book with really having an impact – Domain Driven Design. I first read it back in 2004. It does what a great book of this kind should do – it introduced me to great new ideas and also confirmed things I was already doing (plus it gave me a language to finally describe them).
I make this point because it means I spend a lot more time thinking about what is represented in a given system (and how it is represented) than I do about how many transactions the database can handle. Because of that – the details of the domain are seriously important.
Actual vs Logical
Let’s talk about something that is totally outside of the domain I work in. How about cars?
I have one – it has the best feature of all the cars on the road – it is paid off! (I highly recommend that as a feature).
I have a VW Beetle.
For most circumstances – that is plenty of information. Meaning – I have a car. If you want to know if I can go some place – that is probably enough information.
When it comes time to go to lunch – we have to figure out who is going to drive. My car can fit 4 people. That detail is important depending on how many people want to go.
If I want to find out how much my car is worth on Kelley Blue Book I have to add the details that it is a 2000 GLS Turbo (plus a lot of other details – 5 speed, seat warmers, etc).
The point of this is – if you asked me about my car – what you want to know about it is driven by the problem you are trying to solve by asking the question. If you ask:
Can you get to work?
Yes, I have a car. (Totally sufficient answer)
Yes, I have a 2000 VW New Beetle GLS Turbo Hatchback 2D with a 5 speed manual and seat warmers. (Is a bit nutty)
I’ll say that this is a scale from the logical to the actual. The logical end is a sort of encapsulation of details that provides very limited information; the actual end has a ton of details.
|Logical|Actual|
|---|---|
|Large Coke|24oz Wax Coated Paper Cup – Coke with 35% ice, lid – no straw|
|31D|31D on an American 777-200 Ver. 2 – which means it is an awesome seat|
|Some Chips|Corn, Vegetable Oil (Contains One or More of the Following: Corn, Soybean or Sunflower Oil), Buttermilk Solids, Salt, Tomato Powder, Partially Hydrogenated Soybean Oil, Corn Syrup Solids, Corn Starch, Whey, Onion Powder, Garlic Powder, Monosodium Glutamate, Cheddar Cheese (Cultured Milk, Salt, Enzymes), Nonfat Milk Solids, Sugar, Dextrose, Malic Acid, Sodium Caseinate, Sodium Acetate, Artificial Color (Including Red 40, Blue 1, Yellow 5), Spice, Natural and Artificial Flavor, Sodium Citrate, Disodium Inosinate, and Disodium Guanylate.|
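To make the scale concrete, here is a minimal Python sketch using the car example. The `Car` class and the three helper functions are names I made up for illustration; the point is only that one detailed record can answer each question at the level of detail that question needs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Car:
    # The "actual": every detail that was collected (hypothetical fields).
    year: int
    make: str
    model: str
    trim: str
    transmission: str
    seats: int
    extras: Tuple[str, ...] = ()


def can_get_to_work(car: Optional[Car]) -> bool:
    # The most logical view: all that matters is that a car exists.
    return car is not None


def fits_lunch_group(car: Car, people: int) -> bool:
    # A little more detail: only the seat count matters.
    return people <= car.seats


def valuation_description(car: Car) -> str:
    # Close to the actual: a price estimate needs nearly every field.
    parts = [str(car.year), car.make, car.model, car.trim,
             car.transmission, *car.extras]
    return " ".join(parts)


beetle = Car(2000, "VW", "New Beetle", "GLS Turbo",
             "5 speed manual", 4, ("seat warmers",))
```

Note the direction: each logical answer can be derived from the actual record, but nothing in `can_get_to_work` would let you rebuild the record.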
I realized one of the problems with Project X was exactly this. Namely that it collected enough information to drown an actuary. (Amazingly there is an entire site devoted to jokes about actuaries – www.actuarialjokes.com).
Why did it collect so much detail?
Data Pack Rat
I’m a 5th generation pack rat in the real world (I’ve been through some counseling and my wife has helped a lot), but I’m ten times worse in the digital realm. I collect data – of all sorts. This inevitably bleeds over to the things I work on.
Solving The Hard Problems
That was not the real reason – even if it is a convenient answer. The real reason is that the really tough problems in the domain for X required the data. Sure the day to day stuff only needed to know the basics – but the hard problems are hard precisely because they depend on knowing the details involved. This is the same reason that Kelley Blue Book needs so much detail about my car – it is hard to give an accurate estimate if you don’t know what you are estimating.
CSI & The Infinite Zoom
This is the other part of the reason why getting the actual can be so important. On CSI, they strive for what seems like scientific accuracy (one can only assume, as a layperson). That being said – there is one situation where all the rules are thrown out. They love to zoom into pictures infinitely – even when they are digital. The worst case was the time they used footage from a security camera to get a picture of someone’s eye – and then looked at the reflection to catch the killer. Going to have to get one of these futuristic security cameras – since every one I’ve ever seen is seriously blurry.
What is the point of that little rant? Simple, if you don’t have the resolution you can’t just make it up. The same rules apply. If you only collect the make and model of cars (like they did to get my parking tag) you can’t really say how much it would be to replace that vehicle. If that is your goal – you’re screwed. My solution is typically to think about all the data we are going to need and work backward.
This doesn’t always work out – Project X being an example.
The Thin Vertical Slice
There is another complication in this whole process. We do a lot of Thin Vertical Slicing. It is incredibly useful since it means we get better feedback from the users and it keeps us from building a lot of stuff we don’t end up needing. The downside shows up when you are dealing with a lot of data that has to be collected (e.g. a worldwide audit of something). You have to decide: do you collect only what you need now, or do you grab as much as you can while you are there? And once you grab it, what do you do with it? If you track it but don’t include it in the tool you roll out, how do you keep the data up to date? If you ignore it, you’re going to be back to audit the rest eventually (or maybe you won’t? YAGAI (You Aren’t Gonna Audit It), a subset of YAGNI).
Matching The User
This is the real crux of the issue. Both Project X & Y obsessed over the actual. Project Y was a success – X was not. Why not? At the end of the day, a simple answer – matching the domain data to the user.
In the case of Project Y, the users of the system were also obsessed with the actual data. The tool let them manage the data down to the smallest detail. Which made them more accurate and more effective. They felt no sense of overload because they were steeped in the details as part of the job. The tool just made it easier to switch to a new task. Awesome!
In the case of Project X, the users of the system didn’t care as much about the actual data. They just needed a couple of logical collections to make their decisions. The tool forced them to wade through so much data to get something done. Which meant when you were doing something very very complicated it was great – but most of the time things weren’t complicated – which means it sucked.
The lesson here is that we could have had both. We could have had the detail underneath (because we actually needed it to solve problems), but exposed less of it unless you wanted to dive in. We actually moved this way at the end of the project (somewhat unconsciously) – but by that time – other issues with the system were in the way of success.
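A rough Python sketch of "detail underneath, less of it exposed" might look like the following. Everything here (`Record`, `summary`, `view`, the `expand` flag) is hypothetical and not taken from Project X; it only illustrates keeping the full data while showing a logical summary by default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical domain record; the full detail stays stored underneath.
    name: str
    status: str
    details: dict


def summary(record):
    # Default view: just the logical fields needed for routine decisions.
    return {"name": record.name, "status": record.status}


def view(record, expand=False):
    # Show the summary unless the user explicitly asks to dive in.
    if not expand:
        return summary(record)
    return {**summary(record), "details": dict(record.details)}


site = Record("Site 12", "needs review", {"racks": 40, "last_audit": "unknown"})
everyday = view(site)                # what most users see most of the time
deep_dive = view(site, expand=True)  # the hard problems still get the detail
```

The data model does not shrink at all; only the default presentation does, which is the difference between what the hard problems need and what the everyday user needs.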
Final Thoughts: So basically – keep in mind the problem you are trying to solve and the data required to do it, but overwhelm your users with that data at your own risk!
p.s. There is probably a corollary to this idea relating to integrating applications via services. I haven’t finished this idea – but it’s probably along these lines (almost stolen verbatim from a friend). Namely – that when you are consuming services – focus on what you need – and nothing more. Meaning:
- Don’t validate data from the service you don’t care about, since you don’t want to throw out a response because it includes stuff that doesn’t affect you
- Don’t ask for more data than you need – since you’re wasting resources and confusing the issue on what data you actually care about
Both things are part of the process of keeping the services from becoming too intertwined, but still manage to loop back to this idea of getting the detail level right.
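The first point can be sketched in Python as below. The function name `read_order` and the fields `id` and `status` are invented for the example; the assumption is a JSON response where this consumer only cares about two fields. The second point depends on what the service lets you request, so it is not shown.

```python
import json

# Hypothetical: the only fields this consumer actually uses.
REQUIRED_FIELDS = {"id": int, "status": str}


def read_order(payload: str) -> dict:
    data = json.loads(payload)
    result = {}
    for name, expected_type in REQUIRED_FIELDS.items():
        # Validate only what we depend on.
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        if not isinstance(value, expected_type):
            raise ValueError(f"bad type for field: {name}")
        result[name] = value
    # Everything else in the response is ignored, not validated.
    return result


response = '{"id": 7, "status": "shipped", "warehouse": {"zone": null}}'
order = read_order(response)
```

Here the odd `warehouse` value never causes a rejection, because the consumer never looks at it; a change to that part of the service cannot break this client.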