Hi
I have several DB's, and only one datawarehouse. I need to make sure that I'm not replicating dimensions. And then i need to map the references in the sources, to the references from the fact tables to the dimension table.
I use SQLServer2K
ThanksNot sure on your question, I got two possible answers.
1) if you are trying to use multiple facts or dim tables from multiple data sources you can create a cube based on a standard SQL view.
2) You typically use a surrogate key and a natural key for each dimension, the fact table should be able to reference your dim table by its natural key.
If neither addresses what you are asking, let me know.
HTH|||Table Worker on source1
ID name
---- ----------------
1 John
2 Paul
3 George
Table Worker on source2
ID name
---- ----------------
1 Paul
2 John
3 Micheal
what sort of primary key should I use in de Dimension table?
And having a table with sales referencing the worker by it's ID, how can I get the dimension table primary key?
Thanks|||You typically use a natural and surrogate key, surrogate key would be arbitrary identity field while your natural key is what your OLTP system uses to reference the row. DO you mean how to get the natural key in your fact table? This you do by just specifying to pull the primary key of the transaction for all your dims in your ETL process (process used to pull your data from the OLTP store)
HTH|||I'm making a tool to do the ETL process... and my problem is that the same member of one dimension may by present in multiple OLPT's, naturally with different natural keys.
What key sould I use in the DW as natural key?
When loading the fact table, what sould how can I get the surrogate key?
Thanks|||a Surrogate key could be just a sequential identity column, some people don't use them but they can be helpful in case there are changes in your natural key among other things.
I see what you are saying in multiple natural keys, would like an example of what you are seeing but I'll try to take a stab at what you mean.
Let's say I have 4 applications each with customer addresses. Each of thsoe applications would typically have their own data mart consisting of subject oriented cubes. You could either have multiple dimensions for each customer address (one for each application) or you could store within your dimension table all natural keys for that row, Something like this:
Dim_cust_address
Columns - sur_key (only one if you are storing all natural keys) - nat_key1 - nat_key2 - nat_key3 - nat_key4 - any other dimension properties (street, house_nbr, prefix,suffix,etc)
Now for your subject oriented data marts you would only need to store one natural key per fact table. (nat_key1, nat_key2, etc.)
Then when building your cubes you can either use a view to link the dims to the different facts or make shared dimensions and access that that way. I'll admit I haven't done something like this but it should work OK.
HTH
BTW: The best book for these type of questions and much more is Ralph Kimball's Data Warehouse Lifecycle toolkit.
Wednesday, March 7, 2012
Multiple Sources, One Dimension
Labels:
database,
datawarehouse,
dimension,
dimensions,
hii,
map,
microsoft,
multiple,
mysql,
oracle,
replicating,
server,
sources,
sql
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment