When the same data is duplicated in numerous files of a database, what is this known as?

    Overview :
    Data redundancy and data inconsistency are two important terms in database design. A good database design keeps both to a minimum. This article explains what each term means and how the two differ.

    Data Redundancy
    Data redundancy means duplicate data: the same piece of data exists in multiple locations in the database.

    Problems with Data Redundancy :
    Data redundancy causes the following problems.

    1. Wasted storage space.
    2. Updates become harder, because every copy must be changed.
    3. It can lead to data inconsistency.
    4. Data retrieval can become slower and less efficient.

    Example –
    Let us take a table of cricket players as an example.

    Step-1 :
    Consider the cricket player table below.

    Player Name       Player Age   Team Name     Team ID
    Virat Kohli       32           India         1
    Rohit Sharma      34           India         1
    Ross Taylor       37           New Zealand   2
    Shikhar Dhawan    35           India         1
    Kane Williamson   30           New Zealand   2

    Step-2 :
    We can clearly see that Team Name and Team ID are repeated in multiple rows. We can move the team information into a separate table and keep only Team ID in the player table, which reduces data redundancy.

    Player Name       Player Age   Team ID
    Virat Kohli       32           1
    Rohit Sharma      34           1
    Ross Taylor       37           2
    Shikhar Dhawan    35           1
    Kane Williamson   30           2

    Step-3 :
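    The team details go into their own table, where each team name is stored only once. This process is known as normalization, and it is used to reduce data redundancy.

    Team ID   Team Name
    1         India
    2         New Zealand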
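
    The split above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the table and column names (teams, players, team_id and so on) are assumptions chosen to mirror the tables in the example.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical mirror of the example tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (
        team_id   INTEGER PRIMARY KEY,
        team_name TEXT NOT NULL
    );
    CREATE TABLE players (
        player_name TEXT PRIMARY KEY,
        player_age  INTEGER NOT NULL,
        team_id     INTEGER NOT NULL REFERENCES teams(team_id)
    );
""")

# Each team name is stored exactly once.
conn.executemany(
    "INSERT INTO teams VALUES (?, ?)",
    [(1, "India"), (2, "New Zealand")],
)
# Players carry only the team ID.
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [
        ("Virat Kohli", 32, 1),
        ("Rohit Sharma", 34, 1),
        ("Ross Taylor", 37, 2),
        ("Shikhar Dhawan", 35, 1),
        ("Kane Williamson", 30, 2),
    ],
)

# A join rebuilds the original wide table whenever it is needed.
rows = conn.execute("""
    SELECT p.player_name, p.player_age, t.team_name, t.team_id
    FROM players AS p
    JOIN teams AS t ON t.team_id = p.team_id
    ORDER BY p.player_name
""").fetchall()

for row in rows:
    print(row)
```

    Because the team name lives in one row of the teams table, renaming a team is a single update, and the join reflects it for every player.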

    Data Inconsistency : 
    Data inconsistency occurs when the same data exists in different formats, or with different values, in multiple tables. Different files then contain different information about the same object or person, which makes the information unreliable or meaningless. Data redundancy leads to data inconsistency.

    Example – 
    Suppose a person's address is stored in several tables. If we change it in one table but not in the others, the copies no longer agree, and data inconsistency occurs.
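
    The address scenario can be reproduced in a few lines. The sketch below uses Python's sqlite3 module with two hypothetical tables, customers and shipping, that each hold a copy of the same address; the names and sample addresses are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical tables that both store the same person's address.
conn.executescript("""
    CREATE TABLE customers (person_id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE shipping  (person_id INTEGER PRIMARY KEY, address TEXT);
    INSERT INTO customers VALUES (1, '12 Old Street');
    INSERT INTO shipping  VALUES (1, '12 Old Street');
""")

# The address is changed in one table only.
conn.execute("UPDATE customers SET address = '99 New Road' WHERE person_id = 1")

# Comparing the copies exposes the inconsistency.
mismatches = conn.execute("""
    SELECT c.person_id, c.address, s.address
    FROM customers AS c
    JOIN shipping AS s ON s.person_id = c.person_id
    WHERE c.address <> s.address
""").fetchall()

print(mismatches)
```

    A query like this only detects the problem after the fact. Storing the address in one table and referencing it from the others avoids the mismatch in the first place.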

    Differences :

    Topic                 Data Redundancy                                             Data Inconsistency
    Condition             Duplicate data exists in multiple places in the database.   Duplicate data exists in different formats in multiple tables.
    How to minimize it?   Use normalization.                                          Use constraints on the database.
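
    As one possible illustration of constraints, the sketch below uses Python's sqlite3 module with a foreign key and a CHECK constraint. The schema and the try_insert helper are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been run on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE teams (
        team_id   INTEGER PRIMARY KEY,
        team_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE players (
        player_name TEXT PRIMARY KEY,
        player_age  INTEGER NOT NULL CHECK (player_age > 0),
        team_id     INTEGER NOT NULL REFERENCES teams(team_id)
    );
    INSERT INTO teams VALUES (1, 'India');
""")

def try_insert(name, age, team_id):
    """Hypothetical helper: return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO players VALUES (?, ?, ?)", (name, age, team_id))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Virat Kohli", 32, 1)      # team 1 exists
bad_team = try_insert("Ross Taylor", 37, 2)      # team 2 is not in the teams table
bad_age = try_insert("Rohit Sharma", -5, 1)      # violates the CHECK constraint

print(accepted, bad_team, bad_age)
```

    The database itself refuses rows that would contradict existing data, so the rule does not depend on every application remembering to check it.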

    What Does Data Redundancy Mean?

    Data redundancy is a condition created within a database or data storage technology in which the same piece of data is held in two or more separate places.

    This can mean two different fields within a single database, or two different locations across multiple software environments or platforms. Whenever data is repeated, it constitutes data redundancy.

    Data redundancy can occur by accident but is also done deliberately for backup and recovery purposes.

    Techopedia Explains Data Redundancy

    Within the general definition of data redundancy, there are different classifications based on what is considered appropriate in database management, and what is considered excessive or wasteful. Wasteful data redundancy generally occurs when a given piece of data does not need to be repeated but ends up being duplicated due to inefficient coding or process complexity.

    For example, wasteful data redundancy occurs when inconsistent duplicates of the same entry are found in the same database. Accidental redundancy of this kind hurts both efficiency and cost.

    Duplicate or unnecessary data fields eventually have to be resolved, and the reconciliation, integration, and normalization operations required to remove the inconsistencies can be costly and time-consuming. Errors caused by accessing the wrong redundant data set might lead to many issues with clients. Lastly, the additional space taken up by redundant data adds up over time, leading to bloated databases.

    A positive type of data redundancy works to safeguard data and promote consistency. Multiple instances of the same datasets could be leveraged for backup purposes, disaster recovery (DR), and quality checks.

    Redundant data can be stored on purpose, for example as compressed backup copies that can be restored as part of a specific DR strategy. In the event of a cyberattack or data breach, having the same data stored in several different places can be critical to ensuring continuity of operations and mitigating damage.

    Data redundancy can also speed up data access when copies are stored on multiple systems that different departments can reach directly.

    Many developers consider it acceptable for data to be stored in multiple places. The key is to have a central, master field or space for this data, so that there is a way to update all of the places where data is redundant through one central access point. Otherwise, data redundancy can lead to big problems with data inconsistency, where one update does not automatically update another field. As a result, pieces of data that are supposed to be identical end up having different values.
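
    One way to picture a central access point is a single function that every caller uses to change the value, and that updates the master record and each redundant copy inside one transaction. The sketch below uses Python's sqlite3 module. The table names and the update_address function are hypothetical, and a real system might rely on triggers or replication instead.

```python
import sqlite3

# Hypothetical layout: `customers` holds the master copy, the others repeat the address.
COPY_TABLES = ("shipping", "billing")

def update_address(conn, person_id, new_address):
    """Update the master record and every redundant copy in one transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(
            "UPDATE customers SET address = ? WHERE person_id = ?",
            (new_address, person_id),
        )
        for table in COPY_TABLES:  # fixed, trusted names, so the f-string is safe here
            conn.execute(
                f"UPDATE {table} SET address = ? WHERE person_id = ?",
                (new_address, person_id),
            )

# Set up three tables that all start with the same address.
conn = sqlite3.connect(":memory:")
for table in ("customers",) + COPY_TABLES:
    conn.execute(f"CREATE TABLE {table} (person_id INTEGER PRIMARY KEY, address TEXT)")
    conn.execute(f"INSERT INTO {table} VALUES (1, '12 Old Street')")

update_address(conn, 1, "99 New Road")

addresses = {
    table: conn.execute(
        f"SELECT address FROM {table} WHERE person_id = 1"
    ).fetchone()[0]
    for table in ("customers",) + COPY_TABLES
}
print(addresses)
```

    Because all three updates share one transaction, either every copy changes or none does, which keeps the redundant copies identical.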

    When prevention is not enough, database normalization or reconciliation may be required to eliminate existing redundancies. A set of standardization rules is first defined to establish what “normal data” is. The database is then checked to ensure that the dependencies across all columns and tables are enforced correctly and that all unnecessary duplicates are resolved.

    What is duplication of data in database?

    Data duplication means that a data source has multiple records for the same object, usually written with different syntaxes. Many organizations recognize this as an extremely important problem because of the size and complexity of today's database systems.
    Data redundancy occurs when different divisions, functional areas, and groups in an organization independently collect the same piece of information.
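
    Records that describe the same object with different syntaxes can often be grouped by normalizing them first. The sketch below is a deliberately simple assumption: it treats two names as the same object if they match after collapsing whitespace and ignoring case. The normalize and find_duplicates functions are hypothetical, and real deduplication usually needs stronger matching rules.

```python
from collections import defaultdict

def normalize(name):
    """Collapse runs of whitespace and ignore case so variant spellings compare equal."""
    return " ".join(name.split()).casefold()

def find_duplicates(names):
    """Group records by normalized form and keep only groups with more than one variant."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}

# Sample records: three spellings of one player and one unique record.
records = ["Virat Kohli", "virat  kohli", "Ross Taylor", " VIRAT KOHLI "]
duplicates = find_duplicates(records)
print(duplicates)
```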

    What is the duplication of data or the storage of the same data in multiple places?

    Redundancy is the duplication of data or the storing of the same data in more than one place.

    What is data redundancy and inconsistency?

    The main difference is in the condition each term describes. Data redundancy occurs when the same piece of data exists in multiple places in the database, whereas data inconsistency occurs when the same data exists in different formats in multiple tables.