Overview :
Data redundancy and data inconsistency are important concepts in database design. A good database design keeps both to a minimum. This article explains what the two terms mean and how they differ.
Data Redundancy :
Redundancy means duplicate data: the same piece of data is stored in multiple locations in the database. This condition is known as Data Redundancy.
Problems with Data Redundancy :
Data redundancy causes the following problems.
- Wasted Storage Space.
- More Difficult Database Update.
- It will lead to Data Inconsistency.
- Retrieval of data is slow and inefficient.
Example –
Let us take the example of a cricket player table.
Step-1 :
Consider the cricket player table as follows.
Player Name | Age | Team Name | Team Id |
Virat Kohli | 32 | India | 1 |
Rohit Sharma | 34 | India | 1 |
Ross Taylor | 37 | New Zealand | 2 |
Shikhar Dhawan | 35 | India | 1 |
Kane Williamson | 30 | New Zealand | 2 |
Step-2 :
We can clearly see that the Team Name and Team Id values are repeated in multiple rows. We can move this information into a separate table and keep only the Team Id in the player table, which reduces data redundancy.
Player Name | Age | Team Id |
Virat Kohli | 32 | 1 |
Rohit Sharma | 34 | 1 |
Ross Taylor | 37 | 2 |
Shikhar Dhawan | 35 | 1 |
Kane Williamson | 30 | 2 |
Step-3 :
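The team details are stored once in a separate team table. This process is known as Normalization, and it is used to reduce Data Redundancy.
Team Id | Team Name |
1 | India |
2 | New Zealand |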
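The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `normalize` and the tuple layout (player name, age, team name, team id) are assumptions made for this example, not part of any database API.

```python
# Denormalized rows from Step-1: (player_name, age, team_name, team_id)
flat_rows = [
    ("Virat Kohli", 32, "India", 1),
    ("Rohit Sharma", 34, "India", 1),
    ("Ross Taylor", 37, "New Zealand", 2),
    ("Shikhar Dhawan", 35, "India", 1),
    ("Kane Williamson", 30, "New Zealand", 2),
]


def normalize(rows):
    """Split flat rows into a player table and a team lookup table."""
    players = []  # rows of (player_name, age, team_id), as in Step-2
    teams = {}    # team_id -> team_name, as in Step-3
    for name, age, team_name, team_id in rows:
        # Each team name is stored only once, keyed by its id.
        teams.setdefault(team_id, team_name)
        players.append((name, age, team_id))
    return players, teams


players, teams = normalize(flat_rows)
print(players)
print(teams)
```

After the split, each team name appears once in `teams` instead of once per player, and a player's team name can still be recovered by looking up the stored team id.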
Data Inconsistency :
Data Inconsistency is the condition in which the same data exists in different formats, or with different values, in multiple tables. It means that different files contain different information about the same object or person. This can make the information unreliable and meaningless. Data Redundancy leads to Data Inconsistency.
Example –
Suppose the address of a person is stored in many tables. If we change it in only one table, the other tables may still hold the old address, and the data becomes inconsistent.
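The address example can be reproduced with Python's built-in `sqlite3` module. This is a minimal sketch: the table names `customer` and `billing`, the sample person, and the helper `find_mismatches` are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The same address is stored redundantly in two tables.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE billing  (customer_id INTEGER, address TEXT);
    INSERT INTO customer VALUES (1, 'Asha', '12 Old Street');
    INSERT INTO billing  VALUES (1, '12 Old Street');
""")

# Only one of the two copies is updated.
conn.execute("UPDATE customer SET address = '7 New Road' WHERE id = 1")


def find_mismatches(connection):
    """Return rows where the two copies of the address disagree."""
    return connection.execute("""
        SELECT c.id, c.address, b.address
        FROM customer AS c
        JOIN billing AS b ON b.customer_id = c.id
        WHERE c.address <> b.address
    """).fetchall()


mismatches = find_mismatches(conn)
print(mismatches)
```

The query returns one row for the updated person, because the two tables now disagree about the same address.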
Differences :
Data Redundancy | Data Inconsistency |
It occurs when duplicate data exists in multiple places in the database. | It occurs when the duplicated data exists in different formats in multiple tables. |
We can use normalization to minimize Data Redundancy. | We can use constraints on the database to minimize Data Inconsistency. |
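As one illustration of using a constraint, the sketch below declares a foreign key with Python's built-in `sqlite3` module, so that a player row cannot refer to a team that does not exist. The table names `team` and `player` and the helper `try_insert_player` are assumptions for this example. It also assumes an SQLite build with foreign key support, which must be switched on per connection with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE team (team_id INTEGER PRIMARY KEY, team_name TEXT NOT NULL UNIQUE)"
)
conn.execute("""
    CREATE TABLE player (
        player_name TEXT PRIMARY KEY,
        age INTEGER NOT NULL,
        team_id INTEGER NOT NULL REFERENCES team(team_id)
    )
""")
conn.executemany("INSERT INTO team VALUES (?, ?)", [(1, "India"), (2, "New Zealand")])
conn.execute("INSERT INTO player VALUES (?, ?, ?)", ("Virat Kohli", 32, 1))


def try_insert_player(connection, name, age, team_id):
    """Insert a player; return False if a constraint rejects the row."""
    try:
        connection.execute(
            "INSERT INTO player VALUES (?, ?, ?)", (name, age, team_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_player(conn, "Rohit Sharma", 34, 1)     # team 1 exists
rejected = try_insert_player(conn, "Unknown Player", 25, 99)  # team 99 does not
print(accepted, rejected)
```

The second insert is refused by the database itself, so an inconsistent reference never reaches the table.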
What Does Data Redundancy Mean?
Data redundancy is a condition created within a database or data storage technology in which the same piece of data is held in two separate places.
This can mean two different fields within a single database, or two different spots in multiple software environments or platforms. Whenever data is repeated, it basically constitutes data redundancy.
Data redundancy can occur by accident but is also done deliberately for backup and recovery purposes.
Techopedia Explains Data Redundancy
Within the general definition of data redundancy, there are different classifications based on what is considered appropriate in database management, and what is considered excessive or wasteful. Wasteful data redundancy generally occurs when a given piece of data does not need to be repeated but ends up being duplicated due to inefficient coding or process complexity.
For example, wasteful data redundancy might occur when inconsistent duplicates of the same entry are found in the same database. Accidental data redundancy could occur due to inefficient coding or overcomplicated data storing processes, and it represents an issue in terms of efficiency and costs.
Duplicate or unnecessary data fields eventually have to be resolved, and the reconciliation, integration, and normalization operations required to remove inconsistencies can be costly and time-consuming. Errors generated by accessing the wrong redundant data sets might lead to many issues with clients. Lastly, the additional space taken up by redundant data might start to add up over time, leading to bloated databases.
A positive type of data redundancy works to safeguard data and promote consistency. Multiple instances of the same datasets could be leveraged for backup purposes, disaster recovery (DR), and quality checks.
Redundant data can be stored on purpose by creating compressed versions of backup data that can be restored, and become part of specific DR strategies. In the event of a cyberattack or data breach, for example, having the same data stored in several different places can be critical to ensure the continuity of operations as well as damage mitigation.
Data redundancy can also be leveraged to improve the speed of updates and data access if it’s stored on multiple systems that can be accessed by different departments.
Many developers consider it acceptable for data to be stored in multiple places. The key is to have a central, master field or space for this data, so that there is a way to update all of the places where data is redundant through one central access point. Otherwise, data redundancy can lead to big problems with data inconsistency, where one update does not automatically update another field. As a result, pieces of data that are supposed to be identical end up having different values.
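The idea of a central access point can be sketched in Python as follows. This is a toy illustration under stated assumptions, not a description of how any particular database product works: the class `MasterRecord`, its methods, and the two dictionaries standing in for redundant stores are all hypothetical.

```python
class MasterRecord:
    """Holds the authoritative value and pushes updates to every copy."""

    def __init__(self, value):
        self._value = value
        self._copies = []  # list of (store, key) pairs holding a copy

    def register_copy(self, store, key):
        # Write the current value into the redundant store and remember it.
        store[key] = self._value
        self._copies.append((store, key))

    def update(self, new_value):
        # A single update call refreshes the master and all registered copies.
        self._value = new_value
        for store, key in self._copies:
            store[key] = new_value

    @property
    def value(self):
        return self._value


# Two stores that each keep a redundant copy of the same address.
sales_db = {}
support_db = {}

address = MasterRecord("12 Old Street")
address.register_copy(sales_db, "customer_1_address")
address.register_copy(support_db, "customer_1_address")

address.update("7 New Road")
print(sales_db, support_db)
```

Because every change goes through `update`, the copies cannot drift apart the way they do when each store is edited separately.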
Whenever prevention is not enough, database normalization or reconciliation operations can be required to eliminate already existing redundancies. A series of standardization rules are first defined to set what “normal data” actually is. Then, the database is checked to ensure that the dependencies in all columns and tables are enforced correctly and that all unnecessary duplicates are correctly addressed.
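A dependency check of the kind described above can be sketched in Python. The sketch assumes one simple rule, namely that a team id must always map to the same team name. The function `dependency_violations` and the sample rows are hypothetical and only illustrate the idea.

```python
def dependency_violations(rows, determinant, dependent):
    """Find rows where the same determinant value maps to different
    dependent values, for example one team_id with two team names."""
    seen = {}        # determinant value -> first dependent value observed
    violations = []  # (determinant value, expected value, conflicting value)
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        if key in seen and seen[key] != value:
            violations.append((key, seen[key], value))
        else:
            seen.setdefault(key, value)
    return violations


rows = [
    {"player": "Virat Kohli", "team_id": 1, "team_name": "India"},
    {"player": "Rohit Sharma", "team_id": 1, "team_name": "Ind"},
    {"player": "Ross Taylor", "team_id": 2, "team_name": "New Zealand"},
]

violations = dependency_violations(rows, "team_id", "team_name")
print(violations)
```

Each reported tuple names the key, the value seen first, and the conflicting value, which gives a starting point for reconciling the duplicates before the table is normalized.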