Category: How do I do X?
Updated

This solution is summarized from an archived support forum post. This information may have changed. If you notice an error, please let us know in Discord.

How do I identify redundant data through a query?

Issue

I need to identify repeating values in a column (ID numbers) in a Google Sheet that contains user input from a form. I also want to compare the row index and mark the repeated entry with the text "Repeated" in a specific column. This will help me ensure that no employee can have two requests at once, preventing double booking of leave or day-off requests.

Resolution

To identify and mark repeated values in a specific column of a Google Sheet, you can use the following approach:



  1. Create a new column (for example, "Duplicates") that will contain the "Repeated" text for identified duplicates.

  2. Add a formula to check if the value in the ID column (let's say it's in column A) appears multiple times. The formula would be something like: =if(countif($A$2:$A,A2)>1,"Repeated","")

  3. Drag the formula down the new column to apply it to all rows.

  4. If you want to identify the exact duplicate row, you can add another column (for example, "Index") and use the formula =MATCH(A2,$A$2:$A,0).

  5. To display a message or prevent duplicate requests, you can use data validation rules or create a custom form using Google Apps Script that checks if the ID already exists in the worksheet.

Here's a sample code snippet for checking duplicates and marking them as "Repeated":

function markDuplicates() {
var sheet = SpreadsheetApp.getActiveSheet();
var dataRange = sheet.getDataRange();
var data = dataRange.getValues();
var duplicates = {};
for (var i = 1; i < data.length; i++) {
if (duplicates[data[i][0]]) {
duplicates[data[i][0]].push(i+1);
} else {
duplicates[data[i][0]] = [i+1];
}
}
for (var key in duplicates) {
var indices = duplicates[key];
if (indices.length > 1) {
for (var j = 0; j < indices.length; j++) {
sheet.getRange(indices[j], data[0].length+1).setValue("Repeated");
}
}
}
}

This script checks for duplicates in the first column (replace data[i][0] with the relevant column). It then marks all duplicate rows with the "Repeated" value in a new column to the right of the data. You can customize this script to fit your specific needs.