I have an issue with duplicates in a table but the whole rows are not
duplicated just certain values. Could anyone point me in the right
direction for creating a script to fix the problem as described
below. I have a basic understanding of tsql and have used the count
function but this is a bit beyond me at the moment:
Table1
col1 col2 col3 col4 col5
101 abc xyz A other
101 abc uvw A other
101 def xyz A other
102 def uvw A other
103 abc xyz A other
104 def uvw A other
col 2 and col 3 should only ever have the same values in a single row
but as you can see rows 5 and 6 contain duplicates I'd like to be able
to runa script that would find these duplicates and then check col1
and update col 4 from A to D thats should rectify my issues.
Any help much appreciated
RobFirst, once you have corrected the current data problem add a UNIQUE
constraint . The basis of database integrity is to prevent problems
from happening in the first place.
The basic query to find duplicates:
SELECT col2, col3, count(*) as DupCount
FROM Sometable
GROUP BY col2, col3
HAVING count(*) > 1
I think this does what you want:
UPDATE Whatever
SET col4 = 'D'
WHERE EXISTS
(SELECT * FROM Whatever as B
WHERE Whatever.col2 = D.col2
AND Whatever.col3 = D.col3
AND Whatever.col1 < D.col1)
AND col4 <> 'D'
Roy Harvey
Beacon Falls, CT
On Sat, 12 Jan 2008 11:00:34 -0800 (PST), rob.stevenson@.isoftplc.com
wrote:
>I have an issue with duplicates in a table but the whole rows are not
>duplicated just certain values. Could anyone point me in the right
>direction for creating a script to fix the problem as described
>below. I have a basic understanding of tsql and have used the count
>function but this is a bit beyond me at the moment:
>Table1
>col1 col2 col3 col4 col5
>101 abc xyz A other
>101 abc uvw A other
>101 def xyz A other
>102 def uvw A other
>103 abc xyz A other
>104 def uvw A other
>col 2 and col 3 should only ever have the same values in a single row
>but as you can see rows 5 and 6 contain duplicates I'd like to be able
>to runa script that would find these duplicates and then check col1
>and update col 4 from A to D thats should rectify my issues.
>Any help much appreciated
>Rob|||Hi Rob
I don't understand your description of how you want to 'fix' the problem.
Can you give more details, including what this data should like after your
fix?
This query will find your duplicates:
SELECT col2, col3
FROM Table1
GROUP BY col2, col3
HAVING count(*) > 1
--
HTH
Kalen Delaney, SQL Server MVP
www.InsideSQLServer.com
http://blog.kalendelaney.com
<rob.stevenson@.isoftplc.com> wrote in message
news:364750da-441c-41f0-82f9-a7285fc23ba3@.l32g2000hse.googlegroups.com...
>I have an issue with duplicates in a table but the whole rows are not
> duplicated just certain values. Could anyone point me in the right
> direction for creating a script to fix the problem as described
> below. I have a basic understanding of tsql and have used the count
> function but this is a bit beyond me at the moment:
> Table1
> col1 col2 col3 col4 col5
> 101 abc xyz A other
> 101 abc uvw A other
> 101 def xyz A other
> 102 def uvw A other
> 103 abc xyz A other
> 104 def uvw A other
> col 2 and col 3 should only ever have the same values in a single row
> but as you can see rows 5 and 6 contain duplicates I'd like to be able
> to runa script that would find these duplicates and then check col1
> and update col 4 from A to D thats should rectify my issues.
> Any help much appreciated
> Rob|||On 12 Jan, 19:18, "Roy Harvey (SQL Server MVP)" <roy_har...@.snet.net>
wrote:
> First, once you have corrected the current data problem add a UNIQUE
> constraint . =A0The basis of database integrity is to prevent problems
> from happening in the first place.
> The basic query to find duplicates:
> SELECT col2, col3, count(*) as DupCount
> =A0 FROM Sometable
> =A0GROUP BY col2, col3
> HAVING count(*) > 1
> I think this does what you want:
> UPDATE Whatever
> =A0 =A0SET col4 =3D 'D'
> =A0WHERE EXISTS
> =A0 =A0 =A0 =A0(SELECT * FROM Whatever as B
> =A0 =A0 =A0 =A0 =A0WHERE Whatever.col2 =3D D.col2
> =A0 =A0 =A0 =A0 =A0 =A0AND Whatever.col3 =3D D.col3
> =A0 =A0 =A0 =A0 =A0 =A0AND Whatever.col1 < D.col1)
> =A0 =A0AND col4 <> 'D'
> Roy Harvey
> Beacon Falls, CT
> On Sat, 12 Jan 2008 11:00:34 -0800 (PST), rob.steven...@.isoftplc.com
> wrote:
>
> >I have an issue with duplicates in a table but the whole rows are not
> >duplicated just certain values. =A0Could anyone point me in the right
> >direction for creating a script to fix the problem as described
> >below. =A0I have a basic understanding of tsql and have used the count
> >function but this is a bit beyond me at the moment:
> >Table1
> >col1 col2 col3 col4 col5
> >101 =A0abc =A0xyz =A0A =A0other
> >101 =A0abc =A0uvw =A0A =A0other
> >101 =A0def =A0 xyz =A0A =A0other
> >102 =A0def =A0 uvw =A0A =A0other
> >103 =A0abc =A0xyz =A0A =A0other
> >104 =A0def =A0 uvw =A0A =A0other
> >col 2 and col 3 should only ever have the same values in a single row
> >but as you can see rows 5 and 6 contain duplicates I'd like to be able
> >to runa script that would find these duplicates and then check col1
> >and update col 4 from A to D thats should rectify my issues.
> >Any help much appreciated
> >Rob- Hide quoted text -
> - Show quoted text -
Hi Roy,
Thanks for that the second script was spot on with some minor tweaking
very much appreciated.
Thanks, Robsql
No comments:
Post a Comment