Unit 1: Process Organization and Integration

1. Introduction

A real‑world information system consists of many processes working together.

Process organization refers to how these processes are structured, grouped, and managed – hierarchically, modularly, or in workflows.

Process integration refers to how processes communicate, share data, and synchronize their actions to achieve a common goal.

Key idea: A well‑organized and well‑integrated system is easier to understand, build, test, maintain, and scale.

2. Process Organization

Process organization answers: How do we arrange the many processes identified in DFDs into a manageable structure?

2.1 Hierarchical Decomposition (Top‑Down)

This is the natural extension of DFD leveling.

Top level: The entire system as a single process (Context diagram).
Lower levels: Processes decomposed into sub‑processes.
Leaf processes: Processes that are not decomposed further (primitive processes).

Example: Order Processing System

Level 0: Process 0 – Order System
Level 1: Process 1 – Verify Order, Process 2 – Calculate Total, Process 3 – Process Payment, Process 4 – Generate Picking List, Process 5 – Produce Reports
Level 2 (for Process 2): 2.1 Calculate Subtotal, 2.2 Apply Discount, 2.3 Calculate Tax, 2.4 Add Shipping, 2.5 Calculate Grand Total
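The decomposition above forms a tree, and the primitive processes are its leaves. The following is a minimal sketch of that idea; the `Process` class and its `leaves` method are hypothetical names introduced for illustration, and Processes 1, 3, 4, and 5 are treated as primitive only because the example does not decompose them.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One DFD process; `children` holds its decomposition (empty for a primitive)."""
    number: str
    name: str
    children: list["Process"] = field(default_factory=list)

    def leaves(self):
        """Yield the primitive (leaf) processes at or below this node, depth-first."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


order_system = Process("0", "Order System", [
    Process("1", "Verify Order"),
    Process("2", "Calculate Total", [
        Process("2.1", "Calculate Subtotal"),
        Process("2.2", "Apply Discount"),
        Process("2.3", "Calculate Tax"),
        Process("2.4", "Add Shipping"),
        Process("2.5", "Calculate Grand Total"),
    ]),
    Process("3", "Process Payment"),
    Process("4", "Generate Picking List"),
    Process("5", "Produce Reports"),
])

# Only leaf processes need a detailed specification (see Section 2.4).
leaf_numbers = [p.number for p in order_system.leaves()]
print(leaf_numbers)
```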

Benefits:

Manages complexity (divide and conquer).
Allows different teams to work on different branches.
Matches natural business hierarchies.

Figure: Hierarchical Process Decomposition (Top‑Down)

2.2 Modularity

A module is a self‑contained, reusable unit of functionality with well‑defined inputs and outputs.
In process terms, a module corresponds to a primitive process or a small group of cohesive processes.

Principles of Modular Design (from structured design):

Principle	Explanation
High cohesion	A module should perform one and only one logical task (e.g., “Calculate Tax”, not “Calculate Tax and Print Invoice”).
Low coupling	Modules should depend on each other as little as possible – communicate only through necessary data flows.
Information hiding	Internal details of a module are hidden; only the interface is exposed.

Figure: Principles of Modular Design – Cohesion vs Coupling

Example of Good Modularity (Order System):

Module A: ValidateCustomer – inputs: CustomerID; outputs: Valid/Invalid, CustomerTier.
Module B: CalculateDiscount – inputs: CustomerTier, Subtotal; outputs: DiscountAmount.
Module C: ApplyTax – inputs: Subtotal, TaxCode; outputs: TaxAmount.

These modules can be developed, tested, and reused independently.
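As a rough illustration, the three modules could be written as small functions whose parameters and return values mirror the inputs and outputs listed above. The lookup tables (customer IDs, tax codes, tax rates) are invented for this sketch; the 10% and 5% discount rates are borrowed from the Apply Discount specification in Section 2.4.

```python
# Hypothetical reference data; a real system would read these from data stores.
CUSTOMER_TIERS = {"C001": "Gold", "C002": "Silver", "C003": "Standard"}
DISCOUNT_RATES = {"Gold": 0.10, "Silver": 0.05}
TAX_RATES = {"STD": 0.08, "EXEMPT": 0.0}  # assumed tax codes and rates


def validate_customer(customer_id):
    """Module A: return (is_valid, customer_tier); tier is None if invalid."""
    tier = CUSTOMER_TIERS.get(customer_id)
    return (tier is not None, tier)


def calculate_discount(customer_tier, subtotal):
    """Module B: return the discount amount for the given tier."""
    return round(subtotal * DISCOUNT_RATES.get(customer_tier, 0.0), 2)


def apply_tax(subtotal, tax_code):
    """Module C: return the tax amount for the given tax code."""
    return round(subtotal * TAX_RATES[tax_code], 2)


# Each module can be exercised on its own, with no knowledge of the others.
is_valid, tier = validate_customer("C001")
discount = calculate_discount(tier, 200.0)
tax = apply_tax(200.0 - discount, "STD")
```

The only thing connecting the modules is the data passed between them (CustomerTier, Subtotal), which is what low coupling looks like in practice.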

Example of Poor Modularity (low cohesion):

Module X: ProcessOrder – validates customer, calculates tax, updates inventory, prints invoice, emails customer. (Too many unrelated tasks → hard to test and maintain.)

2.3 Process Flow / Workflow Organization

Instead of a pure hierarchy, processes can be organized according to the flow of work:

Sequential – Process A → Process B → Process C (e.g., Order → Payment → Shipping).
Parallel – Process A splits into B and C that run simultaneously (e.g., Update Inventory and Notify Warehouse run in parallel).
Conditional – Process A decides which of B or C to execute (e.g., if payment method = credit card go to Process X, else go to Process Y).
Iterative – Process A repeats until a condition is met (e.g., request more data until all fields valid).

Representation: These flow patterns are often shown in Activity Diagrams (UML) or Flowcharts, complementing DFDs.
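The four flow patterns map onto familiar control structures. The sketch below is illustrative only: `step`, `route_payment`, and `collect_until_valid` are hypothetical helpers, and a thread pool stands in for whatever mechanism actually runs processes in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

log = []  # records which steps ran


def step(name):
    """Stand-in for a real process: record that it ran and return its name."""
    log.append(name)
    return name


# Sequential: each process starts after the previous one finishes.
for name in ("Order", "Payment", "Shipping"):
    step(name)

# Parallel: both processes are submitted at once and may overlap in time.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(step, n) for n in ("Update Inventory", "Notify Warehouse")]
    parallel_results = [f.result() for f in futures]


# Conditional: a decision selects exactly one of two processes.
def route_payment(method):
    return "Process X" if method == "credit card" else "Process Y"


# Iterative: repeat until every field of a submission is filled in.
def collect_until_valid(submissions):
    """Return (first valid submission, number of attempts needed)."""
    for attempt, form in enumerate(submissions, start=1):
        if all(form.values()):
            return form, attempt
    raise ValueError("no valid submission received")
```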

2.4 Process Specification (Structured English / Pseudocode)

Each primitive process must be specified in a way that can be coded. Structured English is a common technique.

Example – Process 2.2 “Apply Discount”

PROCEDURE ApplyDiscount (Subtotal, CustomerTier, HasPromoCode)
    Discount = 0
    IF CustomerTier = "Gold" THEN
        Discount = Subtotal * 0.10
    ELSE IF CustomerTier = "Silver" THEN
        Discount = Subtotal * 0.05
    ENDIF
    IF HasPromoCode = TRUE THEN
        Discount = Discount + 10.00   // fixed $10 off
    ENDIF
    IF Discount > Subtotal THEN
        Discount = Subtotal   // cannot discount more than total
    ENDIF
    RETURN Discount
END PROCEDURE
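To show how directly such a specification can be coded, here is one possible Python translation of the procedure above. The function and constant names are this sketch's own, and `Decimal` is used as an assumption that amounts are currency values where binary floating-point rounding is unwelcome.

```python
from decimal import Decimal

TIER_RATES = {"Gold": Decimal("0.10"), "Silver": Decimal("0.05")}
PROMO_AMOUNT = Decimal("10.00")  # fixed $10 off


def apply_discount(subtotal, customer_tier, has_promo_code):
    """Return the discount for an order, never more than the subtotal."""
    # Tiers other than Gold and Silver get no percentage discount.
    discount = subtotal * TIER_RATES.get(customer_tier, Decimal("0"))
    if has_promo_code:
        discount += PROMO_AMOUNT
    # Cap the discount so the order total cannot go negative.
    return min(discount, subtotal)


print(apply_discount(Decimal("200.00"), "Gold", True))
```

Each branch of the Structured English maps to one line or one `if`, which makes the code easy to check against the specification.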

3. Process Integration

Integration answers: How do these separate processes work together as one system?

3.1 Types of Integration

Type	Description	Example
Data integration	Processes share data through common data stores or messages.	Order process writes to Order_File; Shipping process reads from Order_File.
Control integration	One process triggers or coordinates another.	After Payment completes, it sends a “payment confirmed” signal to Shipping.
Interface integration	Processes expose APIs or user interfaces for other processes or external systems.	Inventory process provides a REST API for Sales process to check stock.
Presentation integration	Multiple processes appear as one unified interface to the user.	A single dashboard showing order status, payment status, and shipping status from three different back‑end processes.

3.2 Integration Patterns

Pattern 1: Shared Database (Most Common)

All processes read/write to a common database (or data warehouse).

Processes are decoupled from one another's interfaces but coupled through the shared schema – each needs to know the database schema, not the other processes.
Advantages: Simple, transactional integrity, easy to query.
Disadvantages: Database becomes a bottleneck; schema changes affect all processes.
Example: Order system, Payment system, Shipping system all use the same OrderDB.
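A minimal sketch of this pattern, using an in-memory SQLite database as a stand-in for OrderDB. The table layout, the status values, and the two function names are assumptions made for illustration, and the payment step is left out to keep the example short.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for the shared OrderDB
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, status TEXT)")


def place_order(conn, item):
    """Order process: write a new order row and return its id."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO orders (item, status) VALUES (?, 'PLACED')", (item,)
        )
    return cur.lastrowid


def ship_placed_orders(conn):
    """Shipping process: read placed orders and mark them as shipped."""
    with conn:
        rows = conn.execute(
            "SELECT id FROM orders WHERE status = 'PLACED' ORDER BY id"
        ).fetchall()
        conn.executemany("UPDATE orders SET status = 'SHIPPED' WHERE id = ?", rows)
    return [row[0] for row in rows]


first_id = place_order(db, "notebook")
shipped_ids = ship_placed_orders(db)
```

Neither function calls the other; the `orders` table is their only point of contact, which is also why a change to that table affects both.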

Pattern 2: Message‑Based Integration (Asynchronous)

Processes communicate via messages placed on a queue or topic (e.g., RabbitMQ, Kafka, MSMQ).

The sender does not wait for an immediate reply (decoupled in time).
Advantages: High scalability, fault tolerance, ability to replay messages.
Disadvantages: Complexity; eventual consistency.
Example: Order Process publishes “OrderPlaced” event; Payment Process listens and processes it; Shipping Process listens for “PaymentConfirmed”.
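The time decoupling can be sketched with Python's standard `queue` and `threading` modules. This is an in-process stand-in, not a real broker such as RabbitMQ or Kafka, and the message fields and `payment_worker` function are hypothetical.

```python
import queue
import threading

order_queue = queue.Queue()  # in-process stand-in for a broker queue
processed = []               # events emitted by the consumer


def payment_worker():
    """Payment Process: consume OrderPlaced messages until told to stop."""
    while True:
        message = order_queue.get()
        if message is None:  # sentinel value: shut the worker down
            order_queue.task_done()
            break
        processed.append(("PaymentConfirmed", message["order_id"]))
        order_queue.task_done()


worker = threading.Thread(target=payment_worker)
worker.start()

# Order Process: put() returns as soon as the message is queued,
# without waiting for the payment to be handled.
for order_id in (101, 102):
    order_queue.put({"event": "OrderPlaced", "order_id": order_id})

order_queue.put(None)  # ask the worker to stop
order_queue.join()     # wait (for this demo only) until the queue is drained
worker.join()
```

With a real broker the queue would also outlive either process, so the sender could keep publishing while the consumer is down.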

Pattern 3: API‑Based (Synchronous / Request‑Response)

One process calls another process directly over a network (HTTP/REST, gRPC, SOAP).

The caller waits for a response.
Advantages: Simple, real‑time, widely understood.
Disadvantages: Tight temporal coupling (caller fails if callee is down); can create cascading failures.
Example: Checkout process calls PaymentGateway.charge() API and waits for success/failure.
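The temporal coupling is easiest to see in code. In this sketch a plain method call stands in for the network request; `PaymentGateway`, `GatewayUnavailable`, and `checkout` are hypothetical names, and a real call over HTTP or gRPC would also need a timeout.

```python
class GatewayUnavailable(Exception):
    """Raised when the callee cannot be reached."""


class PaymentGateway:
    """Hypothetical callee; `available` simulates whether the service is up."""

    def __init__(self, available=True):
        self.available = available

    def charge(self, order_id, amount):
        if not self.available:
            raise GatewayUnavailable("payment gateway is down")
        return {"order_id": order_id, "status": "success", "amount": amount}


def checkout(gateway, order_id, amount):
    """Caller: blocks on charge() and cannot finish without a reply."""
    try:
        reply = gateway.charge(order_id, amount)
    except GatewayUnavailable:
        # Without this handler the failure would propagate to checkout's
        # own callers, which is how cascading failures start.
        return "PAYMENT_UNAVAILABLE"
    return "CONFIRMED" if reply["status"] == "success" else "DECLINED"


print(checkout(PaymentGateway(), 1, 25.0))
print(checkout(PaymentGateway(available=False), 2, 25.0))
```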

Pattern 4: Batch File Transfer (Legacy / High‑Volume)

Processes exchange data via files (CSV, XML, EDI) placed in shared directories or FTP servers.

Typically scheduled (e.g., nightly).
Advantages: Works with legacy systems; high volume; no real‑time requirements.
Disadvantages: Latency (not real‑time); file locking and error handling can be tricky.
Example: Nightly batch: Sales system exports orders to a CSV; Warehouse system imports it at 2 AM.
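A small sketch of the export and import halves of such a batch, using the standard `csv` module. The column names and file name are assumptions, and a temporary directory stands in for the shared directory or FTP server; scheduling is left out.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["order_id", "sku", "quantity"]  # assumed file layout


def export_orders(orders, path):
    """Sales system: write the day's orders to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(orders)


def import_orders(path):
    """Warehouse system: read the file back, converting quantity to int."""
    with open(path, newline="", encoding="utf-8") as f:
        return [{**row, "quantity": int(row["quantity"])} for row in csv.DictReader(f)]


with tempfile.TemporaryDirectory() as shared_dir:
    batch_file = Path(shared_dir) / "orders.csv"
    export_orders([{"order_id": "A1", "sku": "PEN", "quantity": 3}], batch_file)
    imported = import_orders(batch_file)
```

CSV carries no type information, so the importer has to restore types itself; that is one reason the file format belongs in a documented integration contract.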

Pattern 5: Enterprise Service Bus (ESB) / Mediator

A central middleware routes, transforms, and orchestrates messages between processes.

Processes only need to know the ESB, not each other.
Advantages: Centralized governance, transformation capabilities, monitoring.
Disadvantages: Single point of failure; can become complex.
Example: An ESB receives a “NewOrder” message, transforms it to the format needed by Inventory, Payment, and Shipping, and routes accordingly.
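The routing and transformation role of the mediator can be sketched in a few lines. `ServiceBus` here is a toy class written for this example, not the API of any ESB product, and the message fields are invented.

```python
class ServiceBus:
    """Toy mediator: routes each message type to registered handlers,
    applying a per-destination transformation first."""

    def __init__(self):
        self.routes = {}  # message type -> list of (transform, handler)

    def register(self, message_type, transform, handler):
        self.routes.setdefault(message_type, []).append((transform, handler))

    def send(self, message_type, payload):
        for transform, handler in self.routes.get(message_type, []):
            handler(transform(payload))


inventory_inbox, payment_inbox = [], []  # stand-ins for the receiving processes
bus = ServiceBus()

# Inventory wants SKU and quantity; Payment wants the amount in cents.
bus.register(
    "NewOrder",
    lambda o: {"sku": o["sku"], "qty": o["quantity"]},
    inventory_inbox.append,
)
bus.register(
    "NewOrder",
    lambda o: {"order_id": o["id"], "amount_cents": round(o["total"] * 100)},
    payment_inbox.append,
)

# The sender knows only the bus and its own message format.
bus.send("NewOrder", {"id": 7, "sku": "PEN", "quantity": 2, "total": 19.99})
```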

3.3 Integration Levels (From Loose to Tight)

Level	Description	Coupling	Example
None	No communication	–	Standalone spreadsheet
Data only	Share data store	Loose	Shared database
Message	Asynchronous events	Loose	Order placed event
API (sync)	Request‑response	Medium	REST API
Orchestrated	One process controls others	Tight	Workflow engine calling services
Shared memory / in‑process	Direct function calls or shared state within one program	Tightest	Two modules calling each other’s functions

Design guideline: Prefer looser coupling unless performance or transaction integrity demands tighter integration.

Figure: Process Integration Patterns and Coupling Levels

4. Process Coordination and Orchestration

When multiple processes must work together to achieve a business goal (e.g., “fulfill order”), we need coordination.

4.1 Orchestration vs. Choreography

Concept	Description	Example
Orchestration	A central coordinator (e.g., workflow engine) tells each process what to do and when.	An “Order Fulfillment” service calls Payment, then calls Shipping, then calls Notification.
Choreography	No central controller. Processes react to events they observe, following agreed rules.	Payment publishes “Paid” event; Shipping subscribes to that event and acts without being told by a controller.

Figure: Orchestration vs Choreography Coordination Styles

When to use which?

Orchestration – when you need strict control, complex branching, error compensation, or auditing.
Choreography – when you want loose coupling, high scalability, and processes are independent.

4.2 Example: Order to Cash Orchestration

Orchestrated flow (central coordinator – OrderService):

1. OrderService receives order.
2. OrderService calls PaymentService.charge().
3. If success, OrderService calls InventoryService.reserve().
4. If success, OrderService calls ShippingService.createShipment().
5. If any step fails, OrderService calls compensation (e.g., PaymentService.refund()).
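The five steps can be sketched as a single coordinating function. `StubService` is a hypothetical test double that records calls and fails on request, and `InventoryService.release` is an assumed compensation name; the source names only `PaymentService.refund()`.

```python
class StubService:
    """Hypothetical stand-in for the remote services: logs each call
    and reports failure for any operation listed in `fail_on`."""

    def __init__(self, log, fail_on=()):
        self.log = log
        self.fail_on = set(fail_on)

    def call(self, operation):
        self.log.append(operation)
        return operation not in self.fail_on


def fulfil_order(services):
    """OrderService as central coordinator: it decides every next step
    and triggers compensation for the steps that already succeeded."""
    if not services.call("PaymentService.charge"):
        return "FAILED"  # nothing to undo yet
    if not services.call("InventoryService.reserve"):
        services.call("PaymentService.refund")
        return "FAILED"
    if not services.call("ShippingService.createShipment"):
        services.call("InventoryService.release")  # assumed compensation
        services.call("PaymentService.refund")
        return "FAILED"
    return "COMPLETED"


calls = []
outcome = fulfil_order(StubService(calls, fail_on={"InventoryService.reserve"}))
```

All of the flow logic lives in `fulfil_order`, which makes the process easy to audit but ties the coordinator to every participating service.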

Choreographed flow (event‑driven):

OrderService publishes OrderCreated event.
PaymentService listens, processes payment, publishes PaymentCompleted (or PaymentFailed).
InventoryService listens to PaymentCompleted, reserves stock, publishes StockReserved.
ShippingService listens to StockReserved, creates shipment, publishes ShipmentCreated.
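For comparison, the choreographed flow can be sketched with a tiny publish/subscribe bus. `EventBus` is a toy class for this example that dispatches synchronously in one process; a real broker would deliver events asynchronously. Only the success path is shown.

```python
from collections import defaultdict


class EventBus:
    """Toy publish/subscribe bus; `history` records the order of events."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.history = []

    def subscribe(self, event_name, handler):
        self.subscribers[event_name].append(handler)

    def publish(self, event_name, order_id):
        self.history.append(event_name)
        for handler in self.subscribers[event_name]:
            handler(order_id)


bus = EventBus()

# Each service knows only the event it reacts to and the event it emits.
bus.subscribe("OrderCreated", lambda oid: bus.publish("PaymentCompleted", oid))   # PaymentService
bus.subscribe("PaymentCompleted", lambda oid: bus.publish("StockReserved", oid))  # InventoryService
bus.subscribe("StockReserved", lambda oid: bus.publish("ShipmentCreated", oid))   # ShippingService

bus.publish("OrderCreated", 42)  # OrderService starts the chain
print(bus.history)
```

No function here describes the whole flow; it emerges from the subscriptions, which is why choreographed processes are harder to trace but easier to extend.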

4.3 Transaction Management Across Processes

When multiple processes update shared data, we need consistency.

ACID transactions – Practical only within a single database. Spanning several systems requires a distributed protocol such as two‑phase commit, which is rarely practical across independent services.
Sagas – A sequence of local transactions, each with a compensating action. If a step fails, the saga runs compensations backwards.

Example Saga for Order:

Create Order (pending).
Process Payment (if fails → cancel Order).
Reserve Inventory (if fails → refund Payment, cancel Order).
Ship Order (if fails → unreserve Inventory, refund Payment, cancel Order).

Sagas are a widely used pattern for keeping data consistent across microservices.
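A generic saga runner can be sketched as a loop over (name, action, compensation) triples. `run_saga` and the in-memory `state` dictionary are hypothetical; in a real system each action and compensation would be a local transaction in a different service.

```python
def run_saga(steps):
    """Run each step in order. If one fails, run the compensations of the
    steps that already completed, in reverse order. Returns (ok, log)."""
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception:
            log.append(f"failed: {name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated: {done_name}")
            return False, log
        log.append(f"done: {name}")
        completed.append((name, compensation))
    return True, log


state = {"order": None, "paid": False, "reserved": False}


def fail_shipping():
    raise RuntimeError("carrier unavailable")  # simulated failure


steps = [
    ("Create Order", lambda: state.update(order="PENDING"), lambda: state.update(order="CANCELLED")),
    ("Process Payment", lambda: state.update(paid=True), lambda: state.update(paid=False)),
    ("Reserve Inventory", lambda: state.update(reserved=True), lambda: state.update(reserved=False)),
    ("Ship Order", fail_shipping, lambda: None),
]

ok, saga_log = run_saga(steps)
```

With shipping failing, the compensations run as in the list above: unreserve inventory, refund payment, cancel order.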

Figure: Saga Pattern for Distributed Transaction Management

5. Process Organization & Integration in DFD Context

How does this relate to the DFDs you have already learned?

DFD processes are the units we organize and integrate.
Data flows between processes represent the integration mechanism (shared data store, message, API call).
Decomposition levels show process organization hierarchy.
Physical DFDs can explicitly show integration technologies (e.g., a data flow labeled “HTTP POST”, or a data store labeled “Kafka topic”).

Example – Physical DFD with Integration Details:

[Web Browser] --HTTP POST /order--> (1. Order Entry)
(1) --SQL Insert--> =Order DB=
(1) --Publish to Queue--> [Order Queue]
[Order Queue] --Consume Message--> (2. Payment Processor)
(2) --API Call to Bank--> [Bank System]

6. Best Practices for Process Organization

Practice	Explanation
Single Responsibility	Each process does one well‑defined task.
Minimize interfaces	Fewer data flows between processes reduces complexity.
Standardize naming	Use verb‑noun for processes; noun for data flows.
Use levels of detail	Don’t show 50 processes on one diagram – decompose.
Group related processes	Put processes that work on the same data or same business function together (e.g., all payment‑related processes in one package).
Design for change	Organize so that a change in one business rule affects few processes.

7. Best Practices for Process Integration

Prefer asynchronous messaging over synchronous calls when possible (better resilience).
Design idempotent processes – same message received twice should not cause duplicate effects.
Use versioned APIs – allow old processes to continue working while new versions are introduced.
Monitor integration points – track message queues, API response times, error rates.
Document integration contracts – for each data flow, specify format, semantics, error handling, and expected behavior.
Implement dead‑letter queues for failed messages to avoid data loss.
Test integration in isolation – using contract testing (e.g., Pact) or API mocks.
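Two of these practices, idempotent processing and dead‑letter queues, can be sketched together. `IdempotentConsumer` is a hypothetical class; it keeps seen message IDs in memory, whereas a real consumer would need durable storage, and it assumes the handler either completes fully or has no effect.

```python
class IdempotentConsumer:
    """Processes each message ID at most once; messages that keep failing
    are moved to a dead-letter list instead of being lost or retried forever."""

    def __init__(self, handler, max_attempts=3):
        self.handler = handler
        self.max_attempts = max_attempts
        self.seen_ids = set()
        self.dead_letters = []

    def receive(self, message):
        if message["id"] in self.seen_ids:
            return "duplicate"  # already handled: do nothing
        for _ in range(self.max_attempts):
            try:
                self.handler(message)
            except Exception:
                continue  # try again, up to max_attempts
            self.seen_ids.add(message["id"])
            return "processed"
        self.dead_letters.append(message)
        return "dead-lettered"


charges = []


def charge(message):
    if message["amount"] <= 0:
        raise ValueError("invalid amount")
    charges.append(message["amount"])


consumer = IdempotentConsumer(charge)
results = [
    consumer.receive({"id": "m1", "amount": 50}),
    consumer.receive({"id": "m1", "amount": 50}),  # redelivery of the same message
    consumer.receive({"id": "m2", "amount": -5}),  # can never succeed
]
```

The redelivered message does not charge the customer twice, and the message that cannot be processed stays available for inspection.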

8. Real‑World Example: E‑commerce System Integration

Let’s design process organization and integration for an online store.

8.1 Process Organization (Hierarchy + Modularity)

Level 1 Processes:

Order Management – accepts orders, checks inventory, manages status.
Payment Processing – charges customer, handles refunds.
Inventory Management – updates stock, reserves items.
Shipping Management – creates labels, tracks deliveries.
Customer Notification – sends emails/SMS.

Level 2 Decomposition (Payment Processing):
2.1 Validate Credit Card
2.2 Call Payment Gateway
2.3 Record Transaction
2.4 Handle Failure (retry or decline)

8.2 Integration Design (Hybrid Approach)

Integration Point	Pattern	Technology
Order → Inventory	Synchronous API (check stock)	REST / gRPC
Order → Payment	Asynchronous message	RabbitMQ (OrderPlaced event)
Payment → Order	Asynchronous message	RabbitMQ (PaymentConfirmed / Failed)
Order → Shipping	Asynchronous message	RabbitMQ (OrderConfirmedForShipping)
Shipping → Order	Asynchronous message	RabbitMQ (ShipmentCreated)
All processes → Notification	Event listener	Listens to all relevant events
Shared data	Database (for order, payment, shipment records)	PostgreSQL (separate schemas per service but same DB for simplicity)

8.3 Saga Coordination (Choreography)

Order service creates order with status “PENDING”.
Order service publishes OrderCreated event.
Payment service listens, processes payment, publishes PaymentSucceeded or PaymentFailed.
If PaymentSucceeded, Inventory service reserves stock and publishes StockReserved or StockReservationFailed.
If StockReserved, Shipping service creates shipment, publishes ShipmentCreated.
Order service listens to PaymentFailed and StockReservationFailed, updates the order status to “FAILED”, and publishes OrderFailed.
Payment service listens to StockReservationFailed and refunds the payment.
Notification service listens to all final events (OrderFailed, ShipmentCreated) and emails customer.

8.4 Benefits Realized

Each process can be developed by a separate team.
A payment failure does not stall other processes – it surfaces as a PaymentFailed event that the Order service handles.
Shipping can be delayed without breaking order entry.
New process (e.g., Fraud Detection) can subscribe to OrderCreated without changing existing code.

9. Process Integration Pitfalls

Pitfall	Consequence	Solution
Too many synchronous calls	Cascade failures, poor performance	Use async messaging where possible
Shared database direct access	Tight coupling, schema changes break everything	Use APIs or events; each process owns its data
No idempotency	Duplicate messages cause double payments or orders	Assign unique message IDs; check before processing
No dead‑letter queue	Failed messages disappear or block the queue	Configure DLQ; monitor and reprocess
Ignoring eventual consistency	Users see inconsistent data	Design UI to handle “processing” states, or use sagas
Monolithic process	One process does everything – hard to change	Decompose into cohesive modules

10. Tools and Techniques for Process Integration

Tool / Technique	Purpose
Message brokers (RabbitMQ, Kafka)	Asynchronous messaging, pub/sub
API Gateways (Kong, AWS API Gateway)	Central entry for synchronous APIs, routing
Workflow engines (Camunda, Temporal)	Orchestration, sagas, long‑running processes
Service mesh (Istio, Linkerd)	Network‑level integration (retries, timeouts, traffic routing)
ESB / integration frameworks (MuleSoft, Apache Camel)	Legacy integration, transformation
Contract testing (Pact)	Ensure processes agree on message formats

11. Summary Table – Process Organization vs. Integration

Aspect	Process Organization	Process Integration
Focus	Structure inside the system	Communication between processes
Key concepts	Decomposition, modularity, cohesion, coupling	Data sharing, APIs, messaging, orchestration, sagas
Representation	Leveled DFDs, activity diagrams	Physical DFDs, sequence diagrams
Decision driver	Manage complexity, parallel development	Data consistency, scalability, resilience
Common mistake	Low cohesion, high internal coupling	Synchronous coupling, no error handling

12. Key Takeaways

Process organization is about breaking a system into manageable, cohesive pieces.
Process integration is about making those pieces work together.
Loose coupling is a goal – prefer asynchronous messaging over synchronous calls.
Orchestration uses a central controller; choreography uses events.
Sagas handle transactions across separate processes.
The DFD and data dictionary must reflect integration decisions.
