Type 3: Very limited
history (finite number of changes) is maintained (generally current and
previous). The original
record
is modified to reflect the change.
Customer Key
|
Customer ID
|
Name
|
State
|
Prev_State
|
Last_Eff_Dt
|
101
|
2345
|
John
|
California
|
Illinois
|
01-Jan-2003
|
In this approach the number of histories that are to be maintained
has
to be determined at the very
beginning of
the
warehouse or dimension design.
Based
on
that, if there has to 2 histories/changes
allowed (1 current + 2 history) then you would
add
2 additional
columns (current, prev1, prev2) in
the
dimension table so that each change is tracked
in one column.
Major 4 Steps in Dimensional Design Process:
1. Select
the business process to model - Example for
business processes include raw materials
purchasing, orders, shipments, invoicing, inventory, and general ledger.
2. Declare the grain of the business process -
Declaring the grain means specifying exactly
what an individual fact
table row represents. Examples for grain
include…
a. An
individual boarding
pass to get on
a flight
b. A daily snapshot of
the
inventory levels for each product in a warehouse
c. A monthly snapshot for each bank account
3. Choose the dimensions that apply to each fact table row - Examples of common dimensions
include date, product, customer, transaction
type, and
status.
4. Identify the numeric facts that will
populate
each fact table row - Examples include quantity
ordered, dollar cost amount etc…
Approaches for loading Dimension and Fact tables:
• Dimensions are loaded following their corresponding
SCD
type whether the Dimension
adopts Type 1 or 2 or 3 strategy.
Below picture depicts a
relatively complex
data warehouse environment (ETL + Reporting):
Topics you may
be further interested in:
Types of Dimensions:
Conformed Dimension:
A conformed dimension is a
dimension that means the same thing
with every possible fact table to
which it
can be joined.
Generally this means that a conformed dimension is identically the same dimension
in each
data mart. Examples of obvious conformed dimensions include customer, product, and
location, deal
(promotion), and calendar (time).
Date dimension
is regarded as mandatory dimension in
any
business process (fact table/star
schema) and it would generally be a conformed
dimension
shared by several
fact
tables/star schemas.
Degenerated Dimension:
The natural key of the parent (from OLTP source) is generally left over
as an orphan in the design
process. It also
means it is not mandatory that ORDER_ID (or
SALE_ID, INVENTORY_ID) which
is a primary key in thetransaction table at the source (also called
natural key) to load into fact table in
warehouse.
However, though it is
not
mandatory, ORDER_ID is also loaded into the fact table along
with all other dimension related
foreign
keys and
measures into ORDERS fact table and here we call
ORDER_ID as degenerate (or empty) dimension.
In other simple words, degenerated
dimension
is a
dimension key, such as a transaction
number, invoice number, ticket number, or bill-of-lading number, that has no
attributes and hence does not
join to an actual
dimension
table and so are stored
directly in fact
table.
Here Order Number
is neither a dimension attribute (represented
by
FKs) nor
it
is a measure
(represented
by
fact).
Factless Fact Table:
Transactions often
are
modeled as a fact table containing
a series of keys, each
representing a
participating
dimension in the transaction.
These fact tables sometimes have no obvious numeric
facts associated with
them (like quantity sold or ordered, billed
amount, payed amount etc...) and hence are called factless fact tables. See the data model below...
The only peculiarity
in this example is that we don’t have a numeric fact tied
to this
registration data.
As
such, analyses of this
data
will be based largely on counts (not sum, average, min, max etc...)
Even though
this fact table does not contain any measures, it can answer few interesting questions like...
• Which students registered for which courses?
• How many declared
engineering
majors are taking
an out-of-major finance course?
• How many students have registered for a
given faculty member’s courses during the last three years?
• How many students have registered for more than one course from a
given faculty member?
Few Reporting Terms:
Drill across: It is used to analyse data
across multiple hierarchies.
In reports you do this by requesting similarly
labeled data from two or more fact tables.
This almost always involves separate queries which are merged together in
a second pass by matching
row headers.
Drill down:
By drilling
down
you
would go from summary data to detailed
data within the same hierarchy.
In reports, you
do
this by adding or replacing
a row header to break down
the
rows of the
answer set more finely.
Drill up (or Roll up): By drilling down you would
go
from detailed data to summary data within the
same
hierarchy. In reports you
do
this by removing
or
replacing a row header in a report to summarize the rows of the answer set.
All the Best ... !!!
I feel this is the best way of providing some useful and more aspects of Informatica in order to make it more usable for the IT world.
ReplyDeleteinformatica read rest api