Rating:  Summary: Integration Of Inmon & Kimball Thinking Review: "Mastering Data Warehouse Design" is an excellent book to help readers understand how to take maximum advantage of the strengths of the diverse approaches associated with Bill Inmon and Ralph Kimball. The main reason I bought a copy of this book, even before it arrived in bookstores, was that I was leading a team to figure out how to merge Inmon and Kimball views for data modelling standards. We had already developed a DW architecture using Inmon's approach, with its associated relational/ERD method, but believed that it lacked rigour in the area of data marts. We also reviewed Kimball's books, and acknowledged the strengths of his dimensional modelling approach, but were concerned that it lacked rigour for the diversity of analytical requirements in the manufacturing environment, e.g. data exploration/mining on a massive scale. We were struggling to figure out how to combine the best of both - and then we discovered the imminent release of "Mastering Data Warehouse Design". After checking the Table of Contents on the publisher's web site, we had the book couriered directly from the publisher's warehouse because it would not be available in local bookstores fast enough to meet our work schedule. Chapter 1 has an impressive 'sound bite' version of Inmon's DW architecture thinking, but extended to include broader Business Intelligence concepts. Chapter 2 does a commendable job of explaining a tiered approach to data models, e.g. subject area model, business model, operational system model, DW model. At first, this chapter was confusing because we had just finished a rigorous definition of data modelling standards, using more conventional terminology, e.g. logical/entity model, physical/table model. So the book's terminology didn't seem to fit in with our thinking.
But after re-reading it, we realized that it added value in forcing us to look at the whole issue of modelling from a deliverables or outcomes perspective, rather than a modelling process perspective. Chapter 4 discusses how to develop a DW data model. The content outlines the sequence of steps involved in developing a DW data model, and it's rare that I've been able to find coverage of the topic as good as I found in this chapter. Chapters 5-11 cover topics like keys and modelling time/hierarchies/transactions, with some solid content on how to model for ongoing business change and how to maintain the tiered models. However, I'm not fully conversant with some of these topics, so am not in a good position to evaluate their content. Chapter 12 has a very good discussion on how to deal with a proliferation of legacy data marts, and strategies for migrating to a central DW that feeds a variety of data marts. It also introduces Chapter 13, which has a classic discussion comparing the relational and dimensional modelling approaches - including the best discussion I've ever seen on the strengths and weaknesses of each approach. While our team didn't buy into all this chapter's points, the clear logical explanation of strengths and weaknesses helped facilitate a consensus between the two groups aligned with the Inmon/relational and Kimball/dimensional approaches. The consensus solution, mostly based on Chapter 13's content, would have been difficult to achieve without this book, i.e. Chapter 13's content alone was worth much more than the price of the book. So if you're struggling with the merits of the Inmon and Kimball architecture/modelling approaches, this book is a valuable resource to help take advantage of the best of both.
Rating:  Summary: A hint toward a real design Review: Describing data warehouse issues in a way that balances business orientation against technical/scientific principles and problems is a very difficult task. Although this book does not fully accomplish that goal, it represents a very interesting step toward it. Meaningful digressions (for example, the use and classification of data models in a company environment) alternate with prolix explanations in which significant hints get lost (e.g. the time dimension in calendar modeling). I think the authors would have been better off treating the material at a somewhat more advanced and principle-founded level (for example, referring to complexity formalism is usually considered a must when dealing with binary trees as an indexing technique), but I accept that other readers may not agree: in Italy such books are considered technical, in spite of their generality, but foreign professionals I have met consider them suitable, even if not mandatory, for business consultants. Using more formal approaches would likely have discouraged many potential readers. Besides, I appreciated the clear statement about the technological convenience of the relational versus the dimensional technology: though the latter is undeniably more advanced, in my experience the conditions that make it preferable are very seldom met.
Rating:  Summary: What's the message? Review: A few random comments.... *The back cover says it "addresses head-on" the issues from Ralph's famous letter. I'm familiar with that letter. Either I skimmed over a couple pages too fast - and those pages had some "answer" buried in them, or, they did not really, fully, address many of the issues Ralph wrote about. *I kept getting confused - sometimes the book acted like it loved a synergy and partnership between the normalized and the dimensional approaches. Other times it seemed to slam the dimensional approach as not working in many areas. In particular, I was shocked at the paragraph in the center of page 386. I've had no problem, using what may appear to be unrelated star schema data, in doing significant analysis and data mining. *The paragraph on page 394, under "Flexibility", says I can't do sophisticated or advanced analytics from my star schemas. I have. What am I (or, the authors) missing? *Chapter 6 - Modeling the Calendar... I feel for anyone new to this arena trying to decipher the information. I have no problems with my date or time dimensions and I can explain them to my students in a lot less time than it took me to read that chapter! *Chapter 7 - Modeling Hierarchies... Seemed a little long. I should not comment on it - when I finished reading it, I realized I had been sleeping through most of it. *I found the chart on page 100 a little scary - do they really mix the facts in a fact table? The chart shows sales and sales objectives in the same fact table. Is this just a "logical" star? Or, is their basic understanding of the dimensional model in need of an upgrade? *Not enough real-world "how-to" examples. *Again, either I skimmed a few pages, or, they refer to "we'll address this in a later chapter" a few times and never did. *I don't know the authors - did not have any preconceived opinions about them.
Now, I felt like, as a team, they did not always agree on what to write, so they compromised - picked middle ground and sent inconsistent messages. I finished the book with a very unclear picture of what message they were sending. *Too much extraneous data in many of the examples - tough to weed out the needed from the excess... *I won't argue with the overall concept of a staging area/data warehouse/data mart philosophy. I do take exception to the inference that I cannot be successful if I don't follow it. I've implemented using that model and variations of the approach, as well as taking real-time transactional data directly into a star. Final thought. In my experience, anyone taking this book as an "absolute" will spend more time on I/T "stuff" than the users I know will want to put up with. Fifty-some years ago, Aristotle Onassis said the secret of business is in knowing something that no one else knows. That is no longer a reality. This is called the "information age" for a reason. The winners are those that realize they know no more than their competition, but do more, faster, with what they have. In my version of the "real world", executives want results, NOW. I did not feel the authors ever had to deal with "urgency".
Rating:  Summary: A top class encyclopedia! Review: An excellent compendium of Data Warehousing, Modeling, and Management processes. It is a detailed practical guide for IT implementers and a terrific framework for architects to optimize productivity. Its reality-based discussion of trade-offs in fast-changing market, enterprise, and customer conditions will come in handy in everyday decisions. Even though I live in the DW-BI world, I found kernels of truth that are often compromised in favor of saving time, and appreciated a refreshing view of the pros and cons of doing it right, whether from the start or while upgrading / integrating. You will find yourself going back to many sections to share with your staff. A great read and a terrific DW-BI reference. It sits on my reference shelf with many dog-eared pages and underlined sections.
Rating:  Summary: All-in-one Guide, Textbook, and Handy Reference Review: Both experienced DW-BI professionals and those new to the field will find this work useful. It shows how to cure the ills of past systems and streamline going forward. It packs in not just technical methodologies and structure, but also practical guidelines for real-time implementation. You will be surprised and delighted by how much can be improved even in a well-run shop.
Rating:  Summary: Not worth the money Review: I've been to seminars by Inmon, Kimball and Imhoff, as well as read many of their books. Kimball, on the one hand, is generally clear and concise on the subject and obviously understands not only DW design and implementation concepts, but how they relate to various businesses and how the business really uses the data. He's also a fairly humble man in person.
Both Inmon and Imhoff, on the other hand, are rather self-aggrandizing (Inmon once waltzed into one of his keynote speeches dressed like a boxer to the theme from Rocky!), and both seem to have based their careers around bashing Kimball. In their desperation to present an alternative to Kimball's methodology and carve out their own niche, they've presented mostly incoherent, illogical and unusable ideas, sometimes laced with anti-Kimball baggage. I get the feeling Inmon is kind of like James Martin was back in the 1980s, churning out countless cookie-cutter style books of dubious quality.
I've designed a number of dimensional data warehouses and data marts that actually work years later using the Kimball approach, but honestly, every book I've read by Inmon and/or Imhoff has left me wondering who in the world actually uses their approach (if you can call it that) to build real-world data warehouses.
If you want to have a complete library and money is no object, by all means, read everyone's ideas on data warehousing and compare and contrast for yourself (I did - I must own fifty books on the subject - but I rely on only about 5-6 books in my day-to-day work as a DW architect - the rest are just taking up shelf space and reminding me how nice it is to be able to read reviews at places like Amazon before you buy). If money is an object and/or you are just starting out in the field and trying to learn the basics of DW design, do yourself a big favor and get the three excellent Kimball books (The Data Warehouse Toolkit, The Data Warehouse Lifecycle Toolkit and The Data Warehouse ETL Toolkit). The Adamson/Venerable book, Data Warehouse Design Solutions, is a very useful adjunct for additional examples of real-world dimensional designs.
Rating:  Summary: Misinformation And Missing The Mark Review: If you want to build a Corporate Information Factory (CIF) I suppose this book is better than many of the previous attempts at teaching how to accomplish that goal. However, like many of the previous Inmon/Imhoff books, there is too much theory (unfocused at that) and not nearly enough practical/tactical content. If you are on the CIF bandwagon however, you will find the book very helpful as compared to most of the previous books on the topic. But that begs the question. Many a CIF or enterprise-wide project has been launched... yet most are cancelled long before reaching the finish line. This is reality. In the REAL world we have REAL deadlines and REAL budgets imposed by REAL business executives who have REAL problems to solve and it involves... oh by the way... REAL MONEY! We have to deliver NOW! Well, ok, maybe not quite that fast, but you get the idea. The hard part is getting the data! Or is it? Using simple tools and a powerfully designed, highly detailed dimensional database, we have, for example, clients pulling their own data sets ready for import into statistical and mining packages. They think they have died and gone to heaven! Foist a third normal form (3NF) design on them and their eyes roll... "Now, which of the available join paths is the right one for this business question?" and "Why is it taking so long for the query?" and "Will you pull the data for me?" Now we hear... "Instead of spending 80% or 90% of my time getting the data prepared, I spend 5% or 10% of my time doing that... so I have that much more time to actually think about the business." We have seen clients' ability to understand and drive their business expand beyond their own wildest imagination in very short order. It shows on their bottom line and they are very happy with that! 
The whole point of BI - beyond all the data capture and cleaning and integrating and turning "data into knowledge", and making it easy for the user without dumbing it down, and all that stuff - the point of BI can be distilled down to one word: "Publish!" Booksellers don't hand you a photocopy of a handwritten manuscript. They do a lot of work with the "raw data" - typesetting and page numbers and table of contents and indexing and so on - and turn it into something accessible and usable... something we call a book. That's the point of BI. This book doesn't get it. Too many CIF or "enterprise" projects have imploded under their own weight to slavishly duplicate the same mistakes. Too many dimensional systems have succeeded with huge return on investment to relegate the ideas to a dark corner. If we stop the religious discussions (Mac vs. Windows, or the "Inmonites vs. the Kimballites") and get to see how truly successful Business Intelligence (BI) systems work, we find the emphasis must be on using proper theory (not arguing it) and applying techniques that work NOW. More often than not, can you say "Dimensional!" Yes, CIF and all that has its place... but not nearly to the degree that this book would have you believe. The most successful clients have been the ones who bypassed all the "modeling wars" and used the data bus architecture of conformed dimensions. They didn't pick and choose a modeling idea or two; they actually studied Kimball and did it the right way. Dr. Codd, while addressing this question one day, asked me this question: "Would you run an OLTP system against a dimensional model?" My obvious answer was: "Of course not." "Why then," he asked, "do so many people try to do the opposite?" The biggest "problem" with the dimensional approach is that people who do not truly understand it try to pick and choose techniques from it and graft those into their current ways... and fail... and bash it. Or, they don't understand it at all.
Uh, sorry, it isn't the technique that is the problem. The book purports to "answer" a message reply that Ralph Kimball posted on a discussion board some time ago. It does not. One can be certain that Ralph Kimball did not give permission to use his name on or in the book, as is done. Instead, the book does a very poor job of showing how to design and use dimensionally designed databases as a part of a larger architecture, illustrates a complete lack of understanding of the underlying principles, and then criticizes and limits the technique and its application. This does a terrible disservice to the reader... especially a reader who is trying to decide how to meet a real business need and is new to BI. I dislike speaking impolitely like this, but the truth is more important in this context. Also, on the back cover, they state that Ralph Kimball's "letter" was a challenge. It was not. It was merely a listing of many of the crucial issues in a useful BI environment addressed to an individual who had asked legitimate questions about BI. As for addressing these issues "head-on", the book does not do this at all. Does this matter? Of course it does. Real people buy this book and are led down a path that rarely leads to success. I realize that much of this review is not directly about specific details of the book. The details in the book are inconsistent, often unfocused, and sometimes downright misleading. The larger issue, and thus the focus of this review, is that the entire book is based on a premise that the CIF is "The Way" and that dependent dimensional data marts are grudgingly "ok". This is not the reality that many of us see in the business and education worlds.