Andyblg's Blog

November 18, 2011

OLAP Versus Big Data

Filed under: OLAP — andyblg @ 21:37

A very interesting discussion, “OLAP Versus Big Data”, from datawarehouse.ittoolbox.com, and a very interesting answer from Peter J:

“It doesn’t have to be Big data vs BI or OLAP.
Big data has become the hot trend and most people have no clue what it
really is or how to actually use it.
Great they can get more data and not do anything with it. It’s the whole
grab as much data and it’ll ‘automagically’ just work for you. Tons of
companies market GUI front ends that don’t do anything for a company if the
data is crap. Sell and market whatever you want, if the data stinks, it
just doesn’t matter. And right now big data is more about the technology
than the business.

Yeah yahoo, google, facebook, linkedin, groupon etc can throw out their big
data usage, their massive work on big data, but those companies rely on
unstructured data, streaming data, near real time metrics, etc. They all
are more or less advertising companies. Without advertising, those
companies make no money. Banks and wall street have a need for big data
because they need tons of data, lots of unstructured data, and they need it
yesterday.

A lot of big data is really just unstructured data. How to get the logs,
the web streams, the emails, the social media status updates, maybe image
and video info and find useful ways to help your business or organization.
It can be a few terabytes or a few petabytes. It’s more about structured
or barely structured over tons of data.

OLAP and BI systems are mostly structured, single versions of truth. Do
it the Kimball way, Inmon, or what not doesn’t matter because you take
different systems, different data, normalize it or denormalize it, wrap it
up into star schemas, and run calculations and aggregations on it and then
build cubes or reports based off of that. It’s all the same in regards to
having one large EDW or a ton of silo data marts. it’s mostly structured
data.

So it’s not about doing OLAP or Big Data, it’s figuring a way to integrate
Big Data, mostly unstructured data, with your BI and MDM systems.
Integrate it all together and it becomes useful. Building one big data
system, another BI system, another MDM system, a bunch of OLTP systems, and
so on just means you’re still doing what’s always been done. And that’s
having a ton of different systems that really don’t talk to one another or
relate to one another.

The technology doesn’t matter. It’s about how to use that data to get
some kind of return(more $$$, savings, better data, reports, charts, etc)
for an organization or business. Starting a big data project just because
it’s hot and you need to do it won’t change the facts. If the data still
stinks, it’ll just be another system nobody really uses. At least not use
it well. There is a big reason why, even after all these years, most people
still use Excel spreadsheets to do half their analytics and reporting
needs. More technology isn’t going to help a secretary or mid level
manager who could care less how you get the data.”
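
The integration point in the quote can be made concrete with a small sketch. It is only an illustration under my own assumptions: the log format, the customer dimension and the function name are invented here and do not come from the discussion. It parses semi-structured log lines and joins them to a structured dimension, so that the "big data" side ends up as an ordinary BI-style aggregate.

```python
import re
from collections import Counter

# Hypothetical structured dimension, as a BI or MDM system might hold it.
CUSTOMER_DIM = {
    "c1": {"name": "Acme", "segment": "Enterprise"},
    "c2": {"name": "Bolt", "segment": "SMB"},
}

# Hypothetical semi-structured web log lines (format invented for this sketch).
LOG_LINES = [
    "2011-11-18 21:30:01 user=c1 GET /pricing",
    "2011-11-18 21:30:05 user=c2 GET /home",
    "2011-11-18 21:31:12 user=c1 GET /checkout",
    "garbage line that does not parse",
    "2011-11-18 21:32:40 user=c9 GET /home",
]

LINE_RE = re.compile(r"user=(?P<user>\w+) GET (?P<path>\S+)")


def page_views_by_segment(lines, customer_dim):
    """Count page views per customer segment, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # bad data is dropped here; a real pipeline would log it
        customer = customer_dim.get(match.group("user"))
        # Users missing from the dimension are kept visible, not silently lost.
        segment = customer["segment"] if customer else "Unknown"
        counts[segment] += 1
    return dict(counts)


result = page_views_by_segment(LOG_LINES, CUSTOMER_DIM)
print(result)
```

The "Unknown" bucket is a design choice of the sketch: it shows how much of the raw data does not match the master data, which is exactly the data-quality problem the quote complains about.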

Ranet OLAP

Filed under: OLAP — andyblg @ 20:54

“What is Ranet OLAP?

Components library Ranet.UILibrary.OLAP is intended for creating full-featured business intelligence applications (RIA, Rich Internet Applications) based on the Microsoft SQL Server Analysis Services platform.
Components library Ranet.UILibrary.OLAP is implemented based on the most recent Web-technologies: Microsoft .NET and Microsoft Silverlight.
It can be used for displaying data from Microsoft SQL Server Analysis Services 2005/2008 using most popular Web-browsers: Mozilla Firefox, Google Chrome, Internet Explorer.”

MDX Parser, Builder, DOM and OLAP visual controls with Writeback for Silverlight

November 9, 2011

DWH Basics By Dylan Wan (remark)

Filed under: DW — andyblg @ 10:52

Data Warehouse Basics By Dylan Wan on Feb 15, 2008

September 4, 2011

Big Data Now

Filed under: Literature — andyblg @ 21:57

Big Data Now: Current Perspectives from O’Reilly Radar
It is free to download.

May 4, 2011

Thoughts about “Enterprise Data Modeling: 7 Mistakes You Can’t Afford to Make”

Filed under: Data management — andyblg @ 22:23

My notes on the interesting article “Enterprise Data Modeling: 7 Mistakes You Can’t Afford to Make”.

About “Mistake 1: Forgetting that an enterprise architecture is a living framework”
An enterprise architecture is a living framework, but that is not 100% true.
It is a “living framework” only in the sense of additions and minimal changes to the main part of the schema (model).
If an application already uses the schema (ER schema), changing the application will be very expensive in both time and cost.
In my view it is not correct to talk about the “life” of the (logical) model; it is more correct to speak of extending it, that is, of developing the model.
In my opinion, developing the schema (the logical one in particular) is very important and deserves the largest investment in the design phase, because later changes are overhead.
I think we should not talk about “an enterprise architecture”; we need to talk about data models and the management of these models within the enterprise.

About “Mistake 2: Keeping data models invisible”
It is not entirely clear what is meant by the term “definitions”.
“A model without definitions is just a diagram that could be interpreted in many ways…”. That is not true.
If you have an application that uses an ER model (for example, HR), how can the model be interpreted in many ways? Not entirely clear =/
What does “invisible” mean? Invisible to whom? To the members of the project? That cannot be true, because then they would not be able to develop applications =/

About “Mistake 3: Assuming that business users can’t understand or review models”
Who are the “business users”?
Suppose we have person (1), who develops an application, and person (2), who develops the database schema. Who is who?
If it is person (1) =), how will he develop an application without understanding the database objects he works with (not all objects are important to him)?
If it is person (2) =), how will he develop the database schema? :-0

About “Mistake 4: Thinking that data models are only about databases”
Data models are not only about databases. That is true.
See, for example, the definitions of “Data model” on the wiki.

About “Mistake 5: Throwing models “over the wall””
“A data modeler is the mediator between business requirements and physical implementations.”
Nice remark.

About “Mistake 6: Forgetting about the sizzle”
Not entirely clear.
In my opinion, the sizzle (color, guides for non-modelers) is not as important as the completeness and rationality of the model.

About “Mistake 7: Thinking of them as “your” models”
Correct note.
“That means sharing them openly, providing access to those who want it,”.
Totally agree.

Thanks to Karen Lopez and Kamille Nixon.

March 21, 2011

Useful SQL Server DB Tools

Filed under: Common — andyblg @ 00:56

Interesting article on sqlservercentral.com about DB tools.

List of tools:

  1. SSMS Addins – This is available on codeplex.  One of the features that intrigues me is the ability to script out the data from the table.
  2. OpenDBDiff - This one seems to have promise.  Similar in function to Visual Studio 2010 or RedGate SQL compare tools, this one is free and compares the schema and objects.  If you are on a tight budget and need to be able to compare two databases, this just might be the tool to try.
  3. SQL Monitor – A tool to monitor various things in SQL Server like jobs and executing queries.  Kind of low-level, but I figured I would test this app out and see if it had some merit.
  4. SQL nexus – This is a tool to help evaluate performance issues with SQL Server.  You can evaluate wait stats along with PSSDiag files.
  5. SQL Powershell Extensions – I recently learned of this tool on Codeplex.  This is a high priority item for me to download and test.  This tool helps to create “intuitive functions around the SMO objects.”
  6. PowerShellPack - Download from Microsoft to enhance the powershell experience.
  7. Data Dictionary – This software is on my list to evaluate.  It is mostly out of curiosity because I have something in place to create data dictionaries already.  This tool allows you to update the extended properties from a GUI.
  8. US Census Data – I think this one is intriguing as a sample data set.
  9. SQL Schema Source Control – This is an SVN plugin
  10. ScriptDB4SVn - Another SVN Plugin to get your database projects into source control.
  11. SQL Source Control (RedGate) – Do you detect a theme going on now?  This is a commercial product to integrate into SVN or TFS.  It integrates into SSMS and has received many great reviews.  I have seen it in use and it is a good product.
  12. SQL Diagnostic Manager (Idera) – I used this tool a lot a few years back.  The tool has gotten better since.  I need to get another license for it and try it again.
  13. Confio Ignite – I was a part of a focus group testing this tool.  I was highly impressed by the tool.  Ignite allows you to gather waitstats and other diagnostic information to monitor the health of the server.  I would highly recommend this tool.
  14. TOAD (Quest Software) – I used this tool a few years ago and liked it.  This tool is useful for determining quickly the alternatives to writing your query in a few different ways and to view the performance impact of those changes.
  15. DBA Bundle and Developer Bundle (RedGate) – Alternatively, you could look for the Toolbelt by RedGate.  The Bundles are chock full of high value great tools to do the job.
  16. SQL Scripts Manager – This is a collection of Scripts from various contributors that has been made available for free by our friends at RedGate.
  17. Dr. DMV – Glenn Alan Berry has some awesome scripts for use on your 2005 and 2008 servers.  These scripts utilize greatly the DMVs in SQL Server.
  18. DBA Dashboard – This is a set of reports put together to help you identify resource usage and the source of that resource consumption.
  19. SQLPing3 – Security type tool to help you discover SQL Servers on the network.
  20. Discovery Wizard for SQL Server (Quest Software) – A tool to help discover SQL Instances on the network.
  21. SQLCentric - By Robert Pearl, this tool is a web based monitoring and alerting tool for your SQL Servers.
  22. Power Architect – I used this tool largely for helping to document some data models.  This is a reasonably priced tool and it works quite well.
  23. SQLIO - This one is from our friends at Microsoft and I think the name explains it.
  24. SQLIOSim - Another tool from Microsoft that I think the name explains it.
  25. IOMeter – Another IO tool
  26. GeekBench - This tool will quickly measure processor and memory and provide some benchmarks.
  27. Plan Explorer (SQLSentry) – I find this tool extremely useful.  The execution plans are much easier to read in this tool than in SSMS.  I use both to compare and contrast and am able to more quickly ascertain the pain points of a query.  The readability of Plan Explorer is great and the additional features really help augment your abilities to query tune based on Execution Plans.
  28. SSMS Tools Pack is an add-in for Microsoft SQL Server Management Studio (SSMS) 2005, 2008, 2008 R2, 2011 (Denali) CTP1 and their respective Express versions.

February 27, 2011

Magic Quadrant for DW DBMS (January 2011)

Filed under: DW — andyblg @ 21:54

Source of Magic Quadrant for Data Warehouse Database Management Systems, 28 January 2011

December 26, 2010

DW paradigm remark

Filed under: DW — andyblg @ 01:45

Bill Inmon’s paradigm: Data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from the data warehouse. In the data warehouse, information is stored in 3rd normal form.

Ralph Kimball’s paradigm: Data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.

There is no right or wrong between these two ideas, as they represent different data warehousing philosophies. In reality, the data warehouses in most enterprises are closer to Ralph Kimball’s idea. This is because most data warehouses started out as a departmental effort, and hence they originated as a data mart. Only when more data marts are built later do they evolve into a data warehouse.
@source
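
To make the dimensional (Kimball) side of this remark concrete, here is a minimal star schema sketch using Python's built-in sqlite3 module. The table names, columns and rows are hypothetical and chosen only for illustration; they are not taken from the source. An Inmon-style warehouse would hold the same facts in 3rd normal form and feed marts like this one from it.

```python
import sqlite3

# In-memory database; the whole schema below is invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table holds measures plus foreign keys to the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware"), (2, "Manual", "Books")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20101201, 2010, 12), (20110101, 2011, 1)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 20101201, 100.0), (1, 20110101, 50.0), (2, 20110101, 20.0)])

# A typical dimensional query: join the fact to its dimensions, then aggregate.
rows = conn.execute("""
SELECT p.category, d.year, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category, d.year
""").fetchall()

for category, year, total in rows:
    print(category, year, total)
```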

December 11, 2010

Analysis Services BI Semantic Model (Denali)

Filed under: Analysis Solution — andyblg @ 19:38

At the PASS 2010 conference in Seattle, Microsoft announced its plans for Business Intelligence in SQL Server “Denali”. Note the new Business Intelligence Semantic Model (BISM) in Analysis Services.

Quotation from the TechNet blog (full article: Analysis Services – Roadmap for SQL Server “Denali” and Beyond):

The purpose of this article is to elaborate on the BI Semantic Model and how it compares to the existing BI models in the Microsoft BI stack, specifically the UDM (OLAP models) and SMDL (report models). Very simply put, the BI Semantic Model is a relational (tables and relationships) model with BI artifacts such as hierarchies and KPIs. It unifies the capabilities of SMDL models with many of the sophisticated BI semantics from the UDM. However it does not replace the UDM.

The BI Semantic Model can be viewed as a 3-layer model:

  • The Data Model layer that is exposed to client applications. Even though the BISM model is fundamentally relational, it can expose itself using both relational as well as multidimensional interfaces. OLAP-aware client applications such as Excel can consume the multidimensional interface and send MDX queries to the model. On the other hand, a free-form reporting application such as Crescent can use the relational interface and send DAX queries.
  • The Business Logic layer that encapsulates the intelligence in the model. The business logic is created by the model author using DAX (Data Analysis Expressions) or MDX (Multidimensional Expressions). DAX is an expression language based on Excel formulas that was introduced in PowerPivot and built on relational concepts. It does not offer the power and flexibility that MDX does, but it is simpler to use and requires minimal tuning. There will always be sophisticated BI applications that need the power of MDX calculations and we envision that the BI Semantic Model will offer the choice of MDX as well, but this will likely come in a release after Denali.
  • The Data Access layer that integrates data from various sources – relational databases, business applications, flat files, OData feeds, etc. There are two options for data access – cached and realtime. The cached mode pulls in data from all the sources and stores in the VertiPaq in-memory column store. VertiPaq is a breakthrough technology that encapsulates state-of-art data compression algorithms along with a sophisticated multi-threaded query processor that is optimized for the latest multi-core chipsets, thereby delivering blazing fast performance with no need for indexing, aggregates or tuning. The realtime mode, on the other hand, is a completely passthrough mode that pushes the query processing and business logic evaluation down to the data source, thereby exploiting the capabilities of the source system and avoiding the need to copy the data into the VertiPaq store. Obviously there is a tradeoff between these two modes (the high performance of VertiPaq versus the latency and overhead associated with copying the data into VertiPaq) and that choice is left to the model author
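
The tradeoff between the cached and realtime modes described in the last bullet can be sketched in a few lines. This is a toy model under my own assumptions: the class names are hypothetical, and nothing here reflects how Analysis Services or VertiPaq is actually implemented. It only shows that a cached copy answers without touching the source but can go stale, while a passthrough model is always current but pays for a source query every time.

```python
class Source:
    """Hypothetical data source that counts how often it is hit."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.hits = 0

    def scan(self):
        """Return a full copy of the data (used to fill a cache)."""
        self.hits += 1
        return list(self.rows)

    def query_total(self):
        """Evaluate the aggregate at the source (pushed-down processing)."""
        self.hits += 1
        return sum(self.rows)


class CachedModel:
    """Copies the data once, then answers from its own snapshot."""

    def __init__(self, source):
        self._snapshot = source.scan()

    def total(self):
        return sum(self._snapshot)


class PassthroughModel:
    """Keeps no copy; every question is forwarded to the source."""

    def __init__(self, source):
        self._source = source

    def total(self):
        return self._source.query_total()


source = Source([10, 20])
cached = CachedModel(source)
live = PassthroughModel(source)

source.rows.append(30)  # new data arrives after the cache was loaded

cached_total = cached.total()  # answered from the stale snapshot
live_total = live.total()      # answered by the source, includes the new row
print(cached_total, live_total, source.hits)
```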

December 7, 2010

From basic analytics to advanced analytics

Filed under: Analysis Solution — andyblg @ 23:07

In his post “Advance Your Analytics Strategies”, James Kobielus explains advanced analytics and its overlap with BI and data warehousing.

Many authors describe BI as historical analysis. Maybe that is true, or maybe it is only partly true.

BI can be considered a common term for analytic tools that use different methods and algorithms.

past: BI with statistical algorithms
present: Real-time BI
future: ?

Which tool is suitable for the analysis of the future?

The future is predictable only with some probability, so BI tools that analyze it require probabilistic methods and/or models. This may be BI based on predictive modeling :)

So we can talk about BI tools as implementations of approaches to analytics.
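
As a rough illustration of the point that the future is predictable only with probability, here is a minimal sketch of a forecast that reports both a point estimate and a spread. It fits a straight-line trend by least squares, which is just one simple choice of model and not something taken from the Kobielus post; the function name and the sample numbers are made up.

```python
import math


def linear_forecast(values, steps_ahead=1):
    """Fit y = intercept + slope * x by least squares over x = 0..n-1.

    Returns (point_forecast, residual_std). The residual standard deviation
    is only a crude measure of uncertainty; it ignores the uncertainty in
    the fitted slope and intercept themselves.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, values)]
    # n - 2 degrees of freedom because two parameters were estimated.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    point = intercept + slope * (n - 1 + steps_ahead)
    return point, sigma


# Hypothetical monthly sales figures.
monthly_sales = [100.0, 112.0, 119.0, 131.0, 138.0]
point, sigma = linear_forecast(monthly_sales)

# Report a rough band rather than a single number.
print(f"next month: about {point:.1f}, rough band {point - 2 * sigma:.1f} to {point + 2 * sigma:.1f}")
```

The answer is a range, not a fact: a BI tool built on this kind of model has to present the uncertainty along with the number.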
