SYSTEM ANALYSIS AND DESIGN Data Flow Diagrams (DFD)
Unit 1: Data Flow Diagrams (DFD)
1. What is a Data Flow Diagram?
A Data Flow Diagram (DFD) is a graphical representation that illustrates the flow of data through a system. It shows:
- Processes that transform data
- Data stores where data is held
- External entities that send or receive data
- Data flows connecting them
Key distinction from flowcharts: Flowcharts show control flow (sequence of steps). DFDs show data flow – what data moves, where it goes, and how it is transformed.
1.1 Why Use DFDs?
| Benefit | Explanation |
|---|---|
| Clarity | Non-technical users understand the system’s data movement. |
| Communication | Common language between analysts, users, and developers. |
| Scope definition | Shows system boundary (what’s inside vs. outside). |
| Requirements verification | Identifies missing inputs, outputs, or storage. |
| Modular design | Supports top-down decomposition (leveling). |
| Documentation | Part of SRS and design specifications. |
2. DFD Symbols (Notations)
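There are two main notations: Yourdon & DeMarco and Gane & Sarson. This unit uses the Yourdon & DeMarco style (common in academic Systems Analysis and Design courses). The two notations express the same four concepts and differ only in the shapes of the symbols.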
2.1 The Four Basic Symbols
| Symbol Name | Yourdon Notation | Gane & Sarson | Meaning |
|---|---|---|---|
| Process | Circle (bubble) | Rectangle with rounded corners | Transforms input data into output data. Has a name (verb phrase) and a number. |
| Data Store | Two parallel lines (open at both ends) | Rectangle open on the right side | Repository of data (file, database, table). Has a name (noun); Gane & Sarson also gives it an identifier such as D1. |
| External Entity | Rectangle (square) | Rectangle (sometimes shaded) | Person, organization, or system outside the system boundary that sends or receives data. |
| Data Flow | Arrow | Arrow | A packet of data moving from one symbol to another. Labeled with data name. |
2.2 Example Symbols (Yourdon style - text representation)
Process: ( Calculate Tax )
Data Store: ========== Customer_File
External: [ Customer ]
Data Flow: ----> (with label, e.g., "Order Details")
In actual diagrams, processes are numbered (1, 2, 1.1, 2.3.2, etc.) to show decomposition.
3. Rules for Drawing DFDs
3.1 Process Rules
- Every process must have at least one input data flow and one output data flow.
- A process cannot have only inputs (it would be a black hole) or only outputs (a miracle).
- Process names should be verb phrases (e.g., “Calculate Discount”, “Validate Customer”).
- Processes should transform data – not just pass it through.
3.2 Data Flow Rules
- Every data flow must have a process at one end or both: it connects a process to another process, a data store, or an external entity.
- A data flow cannot go directly from one external entity to another (no system involvement).
- A data flow cannot go directly from one data store to another (must go through a process).
- Data flows should be labeled with nouns that describe the data (e.g., “Invoice”, “Customer Details”).
- A data flow can split into multiple flows (fan-out) if the data is routed to different destinations.
3.3 Data Store Rules
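- Every data store must be connected to at least one process. Across the whole set of diagrams it normally has both an incoming (write) and an outgoing (read) data flow; a store that is only written or only read usually signals a missing flow.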
- Data stores are internal to the system (unlike external entities).
- Do not show data flows between data stores directly.
3.4 External Entity Rules
- External entities are outside the system boundary.
- They can be sources (origin of data) or sinks (destination of data).
- An external entity cannot directly read or write a data store – must go through a process.
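The rules in sections 3.1 to 3.4 are mechanical enough to check automatically. The following is a minimal sketch, assuming a diagram is stored as a dictionary of node kinds plus a list of labeled flows; the `Flow` class, the `check_dfd` function, and the sample node names are illustrative, not part of any DFD tool.

```python
from dataclasses import dataclass

# Node kinds used in this sketch (illustrative names).
PROCESS, STORE, ENTITY = "process", "store", "entity"


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    label: str


def check_dfd(nodes: dict[str, str], flows: list[Flow]) -> list[str]:
    """Return the rule violations found in one diagram.

    nodes maps each node name to its kind; flows lists the labeled arrows.
    """
    problems = []
    for flow in flows:
        kinds = (nodes[flow.source], nodes[flow.target])
        # Sections 3.2 to 3.4: at least one end of every flow is a process.
        if PROCESS not in kinds:
            problems.append(
                f"illegal flow {flow.source} -> {flow.target} "
                f"({kinds[0]} to {kinds[1]})"
            )
        if not flow.label.strip():
            problems.append(f"unlabeled flow {flow.source} -> {flow.target}")
    for name, kind in nodes.items():
        has_input = any(f.target == name for f in flows)
        has_output = any(f.source == name for f in flows)
        if kind == PROCESS and has_input and not has_output:
            problems.append(f"black hole: {name} has inputs but no outputs")
        elif kind == PROCESS and has_output and not has_input:
            problems.append(f"miracle: {name} has outputs but no inputs")
        elif not has_input and not has_output:
            problems.append(f"unconnected {kind}: {name}")
    return problems


# A deliberately faulty diagram (hypothetical names).
nodes = {
    "Customer": ENTITY,
    "Validate Customer": PROCESS,
    "Customer File": STORE,
    "Archive File": STORE,
}
flows = [
    Flow("Customer", "Validate Customer", "Customer Details"),
    Flow("Customer File", "Archive File", "Old Records"),
]
for problem in check_dfd(nodes, flows):
    print(problem)
```

The sample diagram triggers two findings: the store-to-store flow and the process that receives data but never emits any.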
4. Levels of DFD (Top-Down Decomposition)
DFDs are built in levels – from a high-level overview to detailed views. This is called leveling or exploding.
4.1 Level 0: Context Diagram (Highest Level)
- Single process representing the entire system (labeled with system name, number 0).
- All external entities that interact with the system.
- Major data flows between entities and the system.
Purpose: Shows system boundary and scope. Used to discuss with management and users.
Example: Context Diagram for a Library System
[ Librarian ] ---( Book Details )---> ( 0. Library System )
[ Librarian ] <---( Borrowing Record )--- ( 0 )
[ Member ] ---( Borrow Request )---> ( 0 )
[ Member ] <---( Due Date Receipt )--- ( 0 )
[ Member ] <---( Overdue Notice )--- ( 0 )
[ System Clock ] ---( Date )---> ( 0 ) (for automatic notices)
4.2 Level 1 Diagram (Major Processes)
- Decompose the context diagram process (0) into 3–7 major processes (numbered 1, 2, 3, …).
- Show internal data stores (files/databases) that were hidden at context level.
- Show data flows between processes, data stores, and external entities.
Purpose: Shows main functional areas and how they interact.
4.3 Level 2, Level 3, etc. (Detailed Processes)
- Each major process from Level 1 can be exploded into its own lower-level diagram (e.g., process 1 becomes diagram 1 with processes 1.1, 1.2, …).
- Continue until each process performs a single function (e.g., “Calculate Tax” not “Process Order”).
Purpose: Provides enough detail for programmers to implement.
4.4 Balancing
Balancing ensures that the inputs and outputs of a parent process match the inputs and outputs of its child diagram (when exploded).
Data flows entering and leaving a parent process must appear on the child diagram.
Example balancing check:
If process 3 “Calculate Salary” has inputs “Employee Hours” and “Pay Rate”, and output “Gross Pay”, then the child diagram (processes 3.1, 3.2, etc.) must collectively have the same net inputs and outputs.
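The balancing check can be sketched in a few lines. This is a minimal illustration, assuming each flow is a `(source, target, label)` tuple and that anything that is not a child process counts as outside the child diagram; the function names and the internal flow "Base Pay" are hypothetical.

```python
def net_flows(child_flows, child_processes):
    """Return (inputs, outputs) that cross the child diagram's boundary.

    Flows between two child processes are internal and are ignored.
    """
    inputs = {
        label for src, dst, label in child_flows
        if src not in child_processes and dst in child_processes
    }
    outputs = {
        label for src, dst, label in child_flows
        if src in child_processes and dst not in child_processes
    }
    return inputs, outputs


def is_balanced(parent_inputs, parent_outputs, child_flows, child_processes):
    """True when the child's net flows equal the parent's flows."""
    inputs, outputs = net_flows(child_flows, child_processes)
    return inputs == set(parent_inputs) and outputs == set(parent_outputs)


# Child diagram of process 3 "Calculate Salary" (subprocess split is assumed).
child_flows = [
    ("outside", "3.1", "Employee Hours"),
    ("outside", "3.1", "Pay Rate"),
    ("3.1", "3.2", "Base Pay"),       # internal flow, hypothetical
    ("3.2", "outside", "Gross Pay"),
]
balanced = is_balanced(
    ["Employee Hours", "Pay Rate"], ["Gross Pay"], child_flows, {"3.1", "3.2"}
)
print(balanced)
```

A data store that appears only on the child diagram would be treated as outside by this sketch, so a fuller version would need a list of stores that are local to the child.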
5. Step-by-Step Example: Order Processing System
Let’s build a DFD for a simple online order system.
Step 1: Identify External Entities
- Customer – places orders; receives order confirmations with invoices, rejection notices, and tracking information.
- Warehouse – receives picking lists; sends shipment data.
- Management – receives sales reports.
Step 2: Context Diagram (Level 0)
[ Customer ] ---( Order )---> ( 0. Order System )
[ Customer ] <---( Order Confirmation + Invoice )--- ( 0 )
[ Customer ] <---( Rejection Notice )--- ( 0 )
[ Customer ] <---( Tracking Info )--- ( 0 )
[ Warehouse ] <---( Picking List )--- ( 0 )
[ Warehouse ] ---( Shipment Data )---> ( 0 )
[ Management ] <---( Sales Summary )--- ( 0 )
Step 3: Identify Major Processes (Level 1)
- Verify Order – check customer, inventory.
- Calculate Total – apply discounts, tax, shipping.
- Process Payment – charge customer.
- Generate Picking List – send to warehouse.
- Record Shipment – update order status.
- Produce Reports – for management.
Step 4: Identify Data Stores
- Customer File (customer details, credit limit)
- Product File (price, stock on hand)
- Order File (order header, lines, status)
- Shipment File (tracking numbers, dates)
Step 5: Draw Level 1 DFD (text representation)
External entities: [Customer], [Warehouse], [Management]
Data stores: =Customer File=, =Product File=, =Order File=, =Shipment File=
[Customer] --Order--> (1. Verify Order)
=Customer File= --Customer Details--> (1)
=Product File= --Price and Stock--> (1)
(1) --Valid Order--> (2. Calculate Total)
(1) --Rejection Notice--> [Customer]
(2) --Order with Total--> (3. Process Payment)
(3) --Payment Confirmation--> (2)
(3) --Order Confirmation + Invoice--> [Customer]
(2) --Order to Fulfill--> (4. Generate Picking List)
(4) --Picking List--> [Warehouse]
[Warehouse] --Shipment Data--> (5. Record Shipment)
(5) --Shipment Record--> =Shipment File=
(5) --Shipment Update--> =Order File=
(5) --Tracking Info--> [Customer]
(2) --Completed Order Record--> =Order File=
(3) --Payment Record--> =Order File=
(4) --Picking List Copy--> =Order File=
=Order File= --Sales Data--> (6. Produce Reports)
(6) --Sales Summary--> [Management]
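Because the text representation follows a fixed pattern, it can be turned into data for further checking. The following is a rough sketch, assuming exactly the notation used in Step 5: `[Name]` for external entities, `=Name=` for data stores, `(N)` or `(N. Name)` for Level 1 processes, and `--Label-->` for flows. The function names are illustrative.

```python
import re

# One flow line: <source> --<label>--> <target>
FLOW = re.compile(r"^(?P<src>.+?)\s*--(?P<label>.+?)-->\s*(?P<dst>.+)$")


def parse_node(text):
    """Classify one endpoint written in the Step 5 text notation."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        return ("entity", text[1:-1].strip())
    if text.startswith("=") and text.endswith("="):
        return ("store", text.strip("=").strip())
    if text.startswith("(") and text.endswith(")"):
        # "(2. Calculate Total)" and "(2)" both refer to process 2.
        # Only whole-number (Level 1) process numbers are handled here.
        return ("process", text[1:-1].split(".", 1)[0].strip())
    raise ValueError(f"unrecognized node: {text!r}")


def parse_flow(line):
    """Return (source, label, target) for one flow line."""
    match = FLOW.match(line.strip())
    if match is None:
        raise ValueError(f"not a flow line: {line!r}")
    return (
        parse_node(match["src"]),
        match["label"].strip(),
        parse_node(match["dst"]),
    )


sample = [
    "(1) --Valid Order--> (2. Calculate Total)",
    "(4) --Picking List--> [Warehouse]",
    "(5) --Shipment Update--> =Order File=",
]
parsed = [parse_flow(line) for line in sample]
for source, label, target in parsed:
    print(source, label, target)
```

Once the flows are parsed, simple set operations can answer questions such as which declared data stores never appear in any flow.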
Step 6: Decompose a Process (Level 2 for Process 2 “Calculate Total”)
Process 2 explosion:
2.1: Calculate Subtotal (price * quantity)
2.2: Apply Discount (based on customer loyalty)
2.3: Calculate Tax
2.4: Add Shipping Cost
2.5: Calculate Grand Total
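The five subprocesses map directly onto a short calculation. The sketch below is one possible reading, assuming a percentage loyalty discount, tax charged on the discounted subtotal, and a flat untaxed shipping fee; the rates, the fee, and the function names are hypothetical values chosen only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(amount):
    """Round a Decimal amount to two places."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


def calculate_total(lines, loyal_customer,
                    discount_rate=Decimal("0.10"),  # hypothetical rate
                    tax_rate=Decimal("0.08"),       # hypothetical rate
                    shipping=Decimal("5.00")):      # hypothetical flat fee
    """lines is an iterable of (unit_price, quantity) pairs."""
    # 2.1 Calculate Subtotal
    subtotal = sum((price * qty for price, qty in lines), Decimal("0"))
    # 2.2 Apply Discount
    discount = to_cents(subtotal * discount_rate) if loyal_customer else Decimal("0.00")
    # 2.3 Calculate Tax (assumed to apply after the discount)
    tax = to_cents((subtotal - discount) * tax_rate)
    # 2.4 Add Shipping Cost and 2.5 Calculate Grand Total
    grand_total = subtotal - discount + tax + shipping
    return {
        "subtotal": subtotal,
        "discount": discount,
        "tax": tax,
        "shipping": shipping,
        "grand_total": grand_total,
    }


result = calculate_total(
    [(Decimal("20.00"), 2), (Decimal("10.00"), 1)], loyal_customer=True
)
print(result["grand_total"])  # 53.60
```

With these assumed values the subtotal is 50.00, the discount 5.00, the tax 3.60, and the grand total 53.60.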
6. Physical vs. Logical DFDs
| Type | Focus | Example |
|---|---|---|
| Logical DFD | What the system does, regardless of implementation (technology-free). | “Validate customer” – no mention of screen or database. |
| Physical DFD | How the system does it (specific technologies, manual steps, devices). | “Enter customer ID on web form”, “Query SQL database”. |
Logical DFDs are typically drawn during the analysis phase; physical DFDs are drawn during the design phase.
7. Common Mistakes and How to Avoid Them
| Mistake | Why It’s Wrong | Correction |
|---|---|---|
| Data flow between two external entities | No system involvement. | Remove the flow, or route it through a process if the system actually handles that data. |
| Data flow between two data stores | No transformation. | Insert a process. |
| Process with only inputs (black hole) | Data disappears. | Add an output flow. |
| Process with only outputs (miracle) | Data appears from nowhere. | Add an input flow. |
| Data flow without label | Unclear what data is moving. | Always label with a noun. |
| Too many processes on one diagram | Cluttered, unreadable. | Limit to 7 ± 2; decompose further. |
8. DFD in the Context of SAD
8.1 Where DFDs Fit in the Process
Fact Finding → Requirements → DFDs → ER Diagrams → System Design
DFDs are created during analysis after fact finding, and before database design.
8.2 DFDs Complement Other Models
- Use Case Diagram: Each use case may map to a major process in Level 1 DFD.
- Activity Diagram: Shows detailed logic inside a process.
- ER Diagram: Data stores in the DFD usually correspond to one or more entities in the ERD.
8.3 DFDs and Data Dictionary
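Every data flow and data store on a DFD should be defined in a data dictionary. In the notation below, + means “and” and { } encloses a repeating group. Example:
- Order: Customer_ID + {Product_Code + Quantity} + Order_Date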
- Invoice: Order_ID + Customer_Name + Address + {Line_Item} + Total_Amount
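A data dictionary entry translates naturally into a record type, with each repeating group becoming a list. The sketch below models only the Order entry; the class names and the field types (`str`, `int`, `date`) are assumptions, since the dictionary entry names the fields but not their types.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class OrderLine:
    """One occurrence of the repeating group in Order (types assumed)."""
    product_code: str
    quantity: int


@dataclass
class Order:
    """Order = Customer_ID + repeating order lines + Order_Date."""
    customer_id: str
    order_date: date
    lines: list[OrderLine] = field(default_factory=list)


order = Order(
    customer_id="C001",               # sample values, purely illustrative
    order_date=date(2024, 1, 15),
    lines=[OrderLine("P-10", 2), OrderLine("P-22", 1)],
)
print(len(order.lines), sum(line.quantity for line in order.lines))
```

Keeping the repeating group as its own type makes it easy to reuse when the same structure appears in another data flow.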
9. Extended Example: Hospital Outpatient Registration System
9.1 Context Diagram (Level 0)
[ Patient ] ---( Registration Info )---> ( 0. Registration )
[ Patient ] <---( Card/Receipt )--- ( 0 )
[ Insurance ] <---( Claim Request )--- ( 0 )
[ Insurance ] ---( Approval )---> ( 0 )
[ Billing ] <---( Registration Record )--- ( 0 )
[ Doctor Office ] <---( Arrival Notice )--- ( 0 )
9.2 Level 1 DFD (Major Processes)
- Capture Patient Info
- Verify Insurance
- Calculate Co-pay
- Print Appointment Card
- Notify Doctor Office
- Send Registration Record to Billing
10. Drawing DFDs – Practical Tips
10.1 Tools
- Lucidchart, diagrams.net (Draw.io), Visio.
10.2 Process Numbering Convention
- Level 0: 0
- Level 1: 1, 2, 3
- Level 2: 1.1, 1.2, 2.1...
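The numbering convention encodes both the level of a process and its parent, so both can be derived from the number alone. A minimal sketch follows; the helper names are illustrative.

```python
def level_of(number: str) -> int:
    """Return the DFD level implied by a process number."""
    if number == "0":
        return 0
    return len(number.split("."))


def parent_of(number: str) -> str:
    """Return the number of the process that this one decomposes."""
    if number == "0":
        raise ValueError("the context process has no parent")
    parts = number.split(".")
    return "0" if len(parts) == 1 else ".".join(parts[:-1])


for number in ["0", "3", "1.1", "2.3.2"]:
    print(number, "is at level", level_of(number))
```

For example, process 2.3.2 sits at Level 3 and belongs to the child diagram of process 2.3.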
10.3 Naming Conventions
- Process: verb + noun (“Calculate Tax”)
- Data flow: noun (“Invoice”)
- Data store: noun, often plural or suffixed with “File” (“Customers”, “Customer File”)
- External entity: noun (“Bank”)
11. Summary Table – DFD Concepts
| Concept | Definition |
|---|---|
| Context Diagram | One process representing the system and external entities. |
| Level 0 | Same as context diagram in this unit. Some textbooks instead use “Level 0” or “Diagram 0” for the first decomposition. |
| Level 1 | Decomposes system into 3–7 major processes. |
| Balancing | Parent and child diagrams have matching inputs/outputs. |
| Logical DFD | Independent of technology. |
| Physical DFD | Shows implementation details. |
12. Key Takeaways
- DFDs focus on data flow, not control flow.
- Start with a context diagram to set system boundaries.
- Decompose step by step and balance rigorously.
- Label every element clearly with meaningful names.
