If this article doesn’t help you, consider dropping me a note with your concern. I have thousands of lines of code for hundreds of Oracle problems I’ve faced and would publish ones that people told me would be helpful. I have written hundreds of macros for Excel to solve basic to even quite complex tasks.
Once in a while I want to write an article on technology. Here is the first one.
Note: In Excel 2010 – this feature is built in and much more sophisticated than this script.
Have you ever had a large dataset (or small) that you imported into Excel somehow and you just want to remove all the rows that duplicate on a certain column? Here’s a couple quick macros you can use to solve your problem. Just copy and paste this code into a module in Excel VBA and run them like any other macro.
This first macro is really good if you just have a single column of data. Your focus (the currently selected cell) must be in the column of your data, and your first row of data should be in row 1. Additionally, you should not have any blanks in the dataset. If you do, just sort the data first and run this macro. It can remove a few hundred duplicates in seconds. If you are in the thousands…be patient, it can run a little while. I just ran it against a dataset with 10000 rows with only 386 unique values and it finished in just over 3 minutes. This doesn’t sound like long, but if you’re sitting there while excel is flashing, you may be tempted to end it early.
Sub DeleteDups() Application.CutCopyMode = False Dim offset, firstRow, lastRow, currCol, i As Integer currCol = ActiveCell.Column firstRow = 1 Cells(firstRow, currCol).Select Selection.End(xlDown).Select lastRow = ActiveCell.Row Cells(firstRow, currCol).Select Range(Selection, Selection.End(xlDown)).Select Selection.sort Key1:=Cells(ActiveCell.Row, ActiveCell.Column), Order1:=xlAscending, Header:=xlGuess, _ OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _ DataOption1:=xlSortNormal offset = 0 For i = firstRow To (lastRow - 1) Step 1 If Cells(i - offset, currCol) = Cells(i + 1 - offset, currCol) Then Rows(i - offset).Delete Shift:=xlUp offset = offset + 1 End If Next i End Sub
The second macro is good for any situation where you have multiple columns of data. You must first SORT the data on the field you are going to remove the duplicates based upon. Your focus (the currently selected cell) must be in the column of your data we are checking for duplicates as well, and your first row of data should be in row 1. It can remove a few hundred duplicates in seconds. If you are in the thousands…be patient, it can run a little while.
Sub DeleteDupsNoSort() Application.CutCopyMode = False Dim offset, firstRow, lastRow, currCol, i As Integer currCol = ActiveCell.Column firstRow = 1 Cells(firstRow, currCol).Select Selection.End(xlDown).Select lastRow = ActiveCell.Row Cells(firstRow, currCol).Select Range(Selection, Selection.End(xlDown)).Select offset = 0 For i = firstRow To (lastRow - 1) Step 1 If Cells(i - offset, currCol) = Cells(i + 1 - offset, currCol) Then Rows(i - offset).Delete Shift:=xlUp offset = offset + 1 End If Next i End Sub
As always, SAVE your work before running this macro. You may even make a copy of your data first, just in case. If I had a penny for every iteration of an infinite loop I’ve written…
If anyone has any other programming needs, especially little things you might just want to be able to do at home, like track a budget, let me know. I’ll post more or just help you one on one.
If this article doesn’t help you, consider dropping me a note with your concern. I have thousands of lines of code for hundreds of Oracle problems I’ve faced and would publish ones that people told me would be helpful. I have written hundreds of macros for Excel to solve basic to even quite complex tasks.