Hi,
Please share your thoughts on the use of storage area network (SAN) storage for a SQL Server 2008 R2 production Enterprise Business Intelligence server.
I have read quite a lot on this subject, but it is difficult to find any hard and fast guidance.
I understand that the general advice is to have a dedicated Business Intelligence server with dedicated resources, including storage. However, this of course depends on scale, so for the purposes of this particular question let's assume the following.
The solution has approximately 50 dimensions and 20 fact tables. The largest dimension has 6 million records, and the largest fact table receives 6 million new records per day (this may increase to 10 million records per day). Including history, the largest fact table has more than 2 billion records, giving a data warehouse physical size of 1 terabyte using page compression.
There will be up to 30 users.
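As a rough sanity check on the figures above, the arithmetic can be sketched in Python. This is only an illustration: the function names are my own, the 4-hour load window is a hypothetical value that is not part of the scenario, and the bytes-per-row figure is an upper bound because it attributes the whole 1 terabyte warehouse to the largest fact table.

```python
# Figures taken from the scenario described above.
ROWS_PER_DAY_NOW = 6_000_000
ROWS_PER_DAY_PEAK = 10_000_000
FACT_ROWS_TOTAL = 2_000_000_000
WAREHOUSE_BYTES = 1024 ** 4  # 1 terabyte, page compressed (whole warehouse)

# Hypothetical value, NOT from the scenario: nightly ETL load window.
ASSUMED_LOAD_WINDOW_HOURS = 4


def implied_history_days(total_rows, rows_per_day):
    """Days of history implied by the row count at a constant daily rate."""
    return total_rows / rows_per_day


def upper_bound_bytes_per_row(warehouse_bytes, total_rows):
    """Upper bound on compressed bytes per fact row.

    Treats the entire warehouse as if it were the largest fact table,
    so the real figure for that table is lower.
    """
    return warehouse_bytes / total_rows


def required_rows_per_second(rows_per_day, load_window_hours):
    """Sustained insert rate needed to load one day's rows in the window."""
    return rows_per_day / (load_window_hours * 3600)


if __name__ == "__main__":
    days = implied_history_days(FACT_ROWS_TOTAL, ROWS_PER_DAY_NOW)
    row_bytes = upper_bound_bytes_per_row(WAREHOUSE_BYTES, FACT_ROWS_TOTAL)
    rate = required_rows_per_second(ROWS_PER_DAY_PEAK, ASSUMED_LOAD_WINDOW_HOURS)
    print(f"Implied history: {days:.0f} days")
    print(f"Upper bound per row: {row_bytes:.0f} bytes")
    print(f"Rows per second at peak: {rate:.0f}")
```

At 6 million records per day, 2 billion records corresponds to roughly 333 days of history, which is the kind of figure I would want to confirm before sizing the storage.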
The background reading I have carried out includes the following:
Microsoft Fast Track Data Warehouse: http://www.microsoft.com/sqlserver/en/us/solutions-technologies/data-warehousing/fast-track.aspx
SQL Server 2008 White Paper: Analysis Services Performance Guide: http://www.microsoft.com/en-us/download/details.aspx?id=17303
Analysis Services Operation Guide: http://msdn.microsoft.com/en-us/library/hh226085.aspx
Introducing Microsoft SQL Server 2008 R2 eBook Free Download (especially chapter 6): http://blog.sqlauthority.com/2010/04/18/sqlauthority-news-free-ebook-download-introducing-microsoft-sql-server-2008-r2/
I understand there is a business case for having 3 separate, dedicated physical servers for the Data Warehouse, Analysis Services and Reporting Services. However, due to common budget constraints, with Enterprise licenses being quite expensive and licensed on a per server basis, I am proposing a single server for this scenario.
The storage setup I would propose is:
RAID 10: OS and Windows installation
RAID 10: Transaction Log
RAID 10: TempDB
RAID 5: Data Warehouse
RAID 5: Cube
RAID 5: Online backup
RAM: 256 GB
CPU: 2 x 12 cores
With the option of upgrading:
the storage at a later date, e.g. moving the Cube to RAID 10.
the RAM to 1 terabyte.
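To make the RAID 5 versus RAID 10 trade-off in this layout concrete, here is a minimal sketch of the usual rule-of-thumb write penalty calculation (2 back-end I/Os per write for RAID 10, 4 for RAID 5). The spindle count, per-spindle IOPS and write fraction used in the example are hypothetical values chosen for illustration only, and the model deliberately ignores controller and SAN cache, so treat the output as a rough comparison rather than a prediction.

```python
# Rule-of-thumb back-end I/Os generated by one front-end write.
RAID_WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4}


def effective_iops(spindles, iops_per_spindle, write_fraction, raid_level):
    """Estimate front-end IOPS for an array under a mixed read/write load.

    Simplified model: each read costs 1 back-end I/O and each write costs
    the RAID write penalty. Cache effects are ignored.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = spindles * iops_per_spindle
    penalty = RAID_WRITE_PENALTY[raid_level]
    cost_per_io = (1.0 - write_fraction) + write_fraction * penalty
    return raw_iops / cost_per_io


if __name__ == "__main__":
    # Hypothetical array: 8 spindles at 150 IOPS each, 50% writes.
    for level in ("RAID 10", "RAID 5"):
        print(level, round(effective_iops(8, 150, 0.5, level)))
```

Under this model, the heavier the write load during cube processing or ETL, the larger the gap between RAID 5 and RAID 10, which is why I have listed moving the Cube to RAID 10 as a later upgrade option.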
I understand that with a SAN you can group one or more physical spindles into a SAN aggregate and present this aggregate to the BI server as a logical drive. If you have experience of doing this, did you notice any loss of performance?
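One simple way to compare a SAN-presented logical drive with a local drive is to time a sequential read on each. The sketch below is a rough illustration only: the function name and default sizes are my own, and because the test file has just been written, the operating system file cache will usually inflate the result unless the file is much larger than available memory. It is not a substitute for a dedicated I/O benchmarking tool.

```python
import os
import tempfile
import time


def sequential_read_mb_per_s(directory, size_mb=64, block_kb=512):
    """Write a temporary file in `directory`, read it back, return MB/s.

    Caveat: reads may be served from the OS file cache, so small files
    measure memory speed rather than the underlying storage.
    """
    block_bytes = block_kb * 1024
    block = os.urandom(block_bytes)
    blocks = (size_mb * 1024) // block_kb

    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Write the test file and force it to stable storage.
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())

        # Time an unbuffered sequential read of the whole file.
        total_bytes = 0
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(block_bytes)
                if not chunk:
                    break
                total_bytes += len(chunk)
        elapsed = max(time.perf_counter() - start, 1e-9)
        return (total_bytes / (1024 * 1024)) / elapsed
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Point this at a folder on the drive under test.
    print(f"{sequential_read_mb_per_s(tempfile.gettempdir()):.1f} MB/s")
```

Running it once against a folder on the SAN logical drive and once against a folder on local disk, with the same sizes, gives a like-for-like first impression.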
I notice from other threads in this forum that SAN is considered a valid technology for Business Intelligence servers:
http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/thread/e1624327-f5c4-4dc9-a9fc-bfddc61b62b8/
I understand that from a network infrastructure management perspective there are many reasons to favour SAN; for example, it is easier to implement a detailed disaster recovery plan using SAN.
I am also aware that another of my clients uses SAN with no known performance issues, where they receive several hundred thousand new records per day into their BI server. However, my scenario involves a daily ETL of between 6 million and 10 million new records.
So, to summarise my question: what are the advantages and disadvantages, in terms of performance optimisation and performance diagnostics, of using a storage area network (SAN) as opposed to a dedicated physical Business Intelligence storage solution?
Kind Regards,
Kieran.
Kieran Patrick Wood MCTS BI, MCC, PGD SoftDev (Open), MBCS http://www.innovativebusinessintelligence.com/ http://uk.linkedin.com/in/kieranpatrickwood