These schemas are storing the employee and customer data for GROWYOURMONEY bank pvt ltd.
In the HR schema we have the EMPLOYEES table and the JOB_HISTORY table.
In the CUST schema we have the CUSTOMER table and the DAILY_TRANSACTIONS tables.
On a daily basis new employees are getting added to the company, and people are getting transferred to different departments because some departments are overstaffed and some departments have got new projects. GROWYOURMONEY bank has a good policy. It does not believe in laying off employees. So it cross trains its employees to work in other departments, which helps in the overall GROWTH of the employees and the GROWTH of its customer money.
This results in more customers getting added to the bank, for which the bank has to keep employing more people to mange the customer money more efficiently.
For this the bank also has to manage the growing data of its employees and customers, thus keeping the DBAs and IT staff on their toes.
To manage this growing employee and customer data, there is a need for the GROWYOURMONEY IT team to come up with a good data management policy.
One of the solutions they have come up with is an ARCHIVING strategy to archive the data of customers who dont want their money to grow and have left the bank. This strategy is working very well as has been demonstrated in the post below.
The new challenge being faced by the Database Administrators and IT is the huge high amount of time taken to generate the reports of customers as the customers of GROWYOURMONEY bank are increasing at a very large rate. For this they have done a high level analysis of the systems and have realized that the application system to manage the customer and daily transaction data and backend processes like report generation etc are using the same database.
While the software team is working on optimizing the queries which are slowing down the system, the IT team has decided that from a long term perspective they will need to run the application module and the backend reports module on 2 two different databases.
To achieve this they have to come up with a feasible oracle data replication solution.
Some of the well known oracle data replications solutions that can be considered are
- Oracle Golden Gate
- Oracle Advanced Queues
- Oracle change notification
- Oracle dataguard
Of the above, one approach they have considered is using the Oracle Dataguard configuration.
But considering the budget allocation for IT by the management, and the high licencing costs of Oracle Dataguard, and the fact that the replica database does not require data for all the tables but only a few tables which are required for the heavy reports in the reports module, they have decided to make an inhouse data replication module.
So the IT management has decided to use few supposedly underemployed employees who are not contributing to the bank ITs mission critical projects to build a data archive and data replication module.
This is to give a background on the need for the data archiving and replication.
Below are the steps of the data replication module developed to copy data from the required tables to the replica schema.
Create CUST schema objects
Once the objects in the replication schema, the HR schema and CUST schema are created, we now insert test data into these tables and start the data replication process.
Replication schema used : REPLICA
Below are few grants to be given as DBA or SYSTEM user on tables to be archived/replicated.
GRANT SELECT, DELETE ON HR.EMPLOYEES TO REPLICA;
GRANT SELECT, DELETE ON HR.JOB_HISTORY TO REPLICA;
GRANT SELECT, DELETE ON HR.REPL_QUEUE TO REPLICA;
GRANT SELECT, DELETE ON CUST.CUSTOMER TO REPLICA;
GRANT SELECT, DELETE ON CUST.DAILY_TRANSACTIONS TO REPLICA;
GRANT SELECT, DELETE ON CUST.REPL_QUEUE TO REPLICA;
-- this grant to be given by DBA user is required to shrink space after a delete
GRANT ALTER ANY TABLE TO REPLICA;
-- this grant required on DAILY_TRANSACTIONS to shrink space after a delete for ARCHIVING
ALTER TABLE CUST.DAILY_TRANSACTIONS ENABLE ROW MOVEMENT;
ALTER TABLE CUST.DAILY_TRANSACTIONS SHRINK SPACE CASCADE;
-- this grant required on CUSTOMER to shrink space after a delete for ARCHIVING
ALTER TABLE CUST.CUSTOMER ENABLE ROW MOVEMENT;
ALTER TABLE CUST.CUSTOMER SHRINK SPACE CASCADE;
-- this grant required on REPL_QUEUE to shrink space after a delete for data replication
ALTER TABLE CUST.REPL_QUEUE ENABLE ROW MOVEMENT;
ALTER TABLE CUST.REPL_QUEUE SHRINK SPACE CASCADE;
-- this grant required on EMPLOYEES to shrink space after a delete for ARCHIVING
ALTER TABLE HR.EMPLOYEES ENABLE ROW MOVEMENT;
ALTER TABLE HR.EMPLOYEES SHRINK SPACE CASCADE;
-- this grant required on JOB_HISTORY to shrink space after a delete for ARCHIVING
ALTER TABLE HR.JOB_HISTORY ENABLE ROW MOVEMENT;
ALTER TABLE HR.JOB_HISTORY SHRINK SPACE CASCADE;
-- this grant required on REPL_QUEUE to shrink space after a delete for data replication
ALTER TABLE HR.REPL_QUEUE ENABLE ROW MOVEMENT;
ALTER TABLE HR.REPL_QUEUE SHRINK SPACE CASCADE;
Configuration changes for data replication.
In PKG_ARCHIVE we need to change the RUN_MODE to REPLICATE.
RUN_MODE CONSTANT VARCHAR2(10) := 'REPLICATE';
NUM_RULE_ID | TXT_RULE_DESC | TXT_RULE_QUERY | TXT_ACTIVE_FLG | TXT_ARCHIVE_TABLE | TXT_DELETE_TABLE | TXT_TBL_OWNER |
---|---|---|---|---|---|---|
100002 | replicate the customer data to HR_ARCHIVE schema | SELECT UNIQUE_ID FROM CUST.REPL_QUEUE WHERE REPL_STAT = 'PNDNG' AND TBL_NAME = 'CUSTOMER' | Y | CUSTOMER | REPL_QUEUE | CUST |
100002 | replicate the customer data to HR_ARCHIVE schema | SELECT UNIQUE_ID FROM CUST.REPL_QUEUE WHERE REPL_STAT = 'PNDNG' AND TBL_NAME = 'DAILY_TRANSACTIONS' | Y | DAILY_TRANSACTIONS | REPL_QUEUE | CUST |
100002 | replicate the employee data to hr_archive schema | SELECT UNIQUE_ID FROM HR.REPL_QUEUE WHERE REPL_STAT = 'PNDNG' AND TBL_NAME = 'EMPLOYEES' | Y | EMPLOYEES | REPL_QUEUE | HR |
100002 | replicate the employee data to hr_archive schema | SELECT UNIQUE_ID FROM HR.REPL_QUEUE WHERE REPL_STAT = 'PNDNG' AND TBL_NAME = 'JOB_HISTORY' | Y | JOB_HISTORY | REPL_QUEUE | HR |
Before starting the replication process, we select the count of records in the tables from which data is to be replicated into the REPLICA schema.
SELECT COUNT(1) FROM HR.EMPLOYEES; --407
SELECT COUNT(1) FROM HR.JOB_HISTORY; -- 910
SELECT COUNT(1) FROM HR.REPL_QUEUE WHERE TBL_NAME = 'EMPLOYEES'; --300
SELECT COUNT(1) FROM HR.REPL_QUEUE WHERE TBL_NAME = 'JOB_HISTORY'; --900
SELECT COUNT(1) FROM REPLICA.EMPLOYEES; --0
SELECT COUNT(1) FROM REPLICA.JOB_HISTORY; -- 0
---------------------------------------------------------------------------
SELECT COUNT(1) FROM CUST.CUSTOMER; --16
SELECT COUNT(1) FROM CUST.DAILY_TRANSACTIONS; -- 376
SELECT COUNT(1) FROM CUST.REPL_QUEUE WHERE TBL_NAME = 'CUSTOMER'; --16
SELECT COUNT(1) FROM CUST.REPL_QUEUE WHERE TBL_NAME = 'DAILY_TRANSACTIONS'; --376
SELECT COUNT(1) FROM REPLICA.CUSTOMER; --0
SELECT COUNT(1) FROM REPLICA.DAILY_TRANSACTIONS; -- 0
We now run the archive_machine in replication mode and again verify the table data.
BEGIN
PR_ARCHIVE_MACHINE();
COMMIT;
--rollback;
END;
We will again run the script to get the count of records which are archived.
SELECT COUNT(1) FROM HR.EMPLOYEES; --407
SELECT COUNT(1) FROM HR.JOB_HISTORY; -- 910
SELECT COUNT(1) FROM HR.REPL_QUEUE WHERE TBL_NAME = 'EMPLOYEES'; --0
SELECT COUNT(1) FROM HR.REPL_QUEUE WHERE TBL_NAME = 'JOB_HISTORY'; --0
SELECT COUNT(1) FROM REPLICA.EMPLOYEES; --300
SELECT COUNT(1) FROM REPLICA.JOB_HISTORY; -- 900
---------------------------------------------------------------------------
SELECT COUNT(1) FROM CUST.CUSTOMER; --16
SELECT COUNT(1) FROM CUST.DAILY_TRANSACTIONS; -- 376
SELECT COUNT(1) FROM CUST.REPL_QUEUE WHERE TBL_NAME = 'CUSTOMER'; --0
SELECT COUNT(1) FROM CUST.REPL_QUEUE WHERE TBL_NAME = 'DAILY_TRANSACTIONS'; --0
SELECT COUNT(1) FROM REPLICA.CUSTOMER; --15
SELECT COUNT(1) FROM REPLICA.DAILY_TRANSACTIONS; -- 375
From the counts above we can see that records have got inserted into the corresponding tables in the replication schema.
The count of records in REPL_QUEUE table for each table whose data is to be archived matches with the count of records in the replication schema for that corresponding table.
The count of tables in the main tables is more than that of the correspondging table in the replication schema.
This is because when the replication process is enabled, already there was some data existing in the main schema tables.
For these exisiting records(before the replication is enabled) a one time manual insert operation can be done.
Scripts for manual one time insert of existing data into the replication schema.
AS SYSTEM or a DBA user we give the following grants
GRANT SELECT, INSERT ON REPLICA.JOB_HISTORY TO HR;
GRANT SELECT, INSERT ON REPLICA.EMPLOYEES TO HR;
GRANT SELECT, INSERT ON REPLICA.CUSTOMER TO CUST;
GRANT SELECT, INSERT ON REPLICA.DAILY_TRANSACTIONS TO CUST;
INSERT INTO REPLICA.CUSTOMER
( CUST_ID, CUST_NAME, PHONE_NUMBER, EMAIL, ADDR_LINE_1, ADDR_LINE_2, ADDR_LINE_3, COUNTRY_ID, REGION_ID, GENDER, ARCHIVE_LOG_HEADER_ID, ARCHIVE_DATE )
SELECT
CUST_ID, CUST_NAME, PHONE_NUMBER, EMAIL, ADDR_LINE_1, ADDR_LINE_2, ADDR_LINE_3, COUNTRY_ID, REGION_ID, GENDER, NULL, SYSDATE
FROM CUSTOMER
WHERE CUST_ID NOT IN ( SELECT CUST_ID FROM REPLICA.CUSTOMER);
INSERT INTO REPLICA.DAILY_TRANSACTIONS
( TRANS_ID, TRANS_DATE, CUST_ID, TRANS_AMT, TRANS_CRNCY, TRANS_IND, ACCT_NO, BNK_NAME, BNK_LOCATION, BANK_IFSC, ARCHIVE_LOG_HEADER_ID, ARCHIVE_DATE )
SELECT TRANS_ID, TRANS_DATE, CUST_ID, TRANS_AMT, TRANS_CRNCY, TRANS_IND, ACCT_NO, BNK_NAME, BNK_LOCATION, BANK_IFSC, NULL, SYSDATE FROM DAILY_TRANSACTIONS WHERE TRANS_ID NOT IN
(SELECT TRANS_ID FROM REPLICA.DAILY_TRANSACTIONS);
INSERT INTO REPLICA.EMPLOYEES
( EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, DEPARTMENT_ID, ARCHIVE_LOG_HEADER_ID, ARCHIVE_DATE )
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, DEPARTMENT_ID, NULL, SYSDATE FROM EMPLOYEES
WHERE EMPLOYEE_ID NOT IN (SELECT EMPLOYEE_ID FROM REPLICA.EMPLOYEES);
INSERT INTO REPLICA.JOB_HISTORY
( EMPLOYEE_ID, START_DATE, END_DATE, JOB_ID, DEPARTMENT_ID, ARCHIVE_LOG_HEADER_ID, ARCHIVE_DATE )
SELECT EMPLOYEE_ID, START_DATE, END_DATE, JOB_ID, DEPARTMENT_ID, NULL, SYSDATE FROM JOB_HISTORY
WHERE EMPLOYEE_ID NOT IN (SELECT EMPLOYEE_ID FROM REPLICA.JOB_HISTORY);
Now if we run the counts on both the schemas it should match.
SELECT COUNT(1) FROM HR.EMPLOYEES; --407
SELECT COUNT(1) FROM HR.JOB_HISTORY; -- 910
SELECT COUNT(1) FROM REPLICA.EMPLOYEES; --407
SELECT COUNT(1) FROM REPLICA.JOB_HISTORY; -- 910
---------------------------------------------------------------------------
SELECT COUNT(1) FROM CUST.CUSTOMER; --16
SELECT COUNT(1) FROM CUST.DAILY_TRANSACTIONS; -- 376
SELECT COUNT(1) FROM REPLICA.CUSTOMER; --16
SELECT COUNT(1) FROM REPLICA.DAILY_TRANSACTIONS; -- 376
Thought for the day
Do not grow weary when you pray;
do not neglect to give alms
Sirach 7:10
And do not forget the birth pangs of your mother
Remember that it was of you parents you were born
How can you repay what they have given to you?
Sirach 7:28
Do not hate hard labor
or farm work
which was created by the Most High.
Sirach 7:13
No comments:
Post a Comment