Tuesday, January 28, 2014

Deduping rows in Netezza / Deleting rows in Netezza.

netezza has a hidden rowid for every row in all tables. to dedup, you can do this:

delete from omszip_stage
where rowid in (
select  max(rowid) as rowid
from OMSZIP_STAGE
group by couponid, zip, pid
having count(*) > 1)


14 comments:

  1. rowid is a hidden column in all netezza tables

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. My favorite query for finding/deleting duplicates is using rowid:

    select *
    --delete
    from table_with_duplicate t_remove_this
    where exists (select 1 from table_with_duplicate t_keep_this
    where t_remove_this.keyfield1 = t_keep_this.keyfield1
    and t_remove_this.keyfield2 = t_keep_this.keyfield2
    and t_remove_this.keyfield3 = t_keep_this.keyfield3
    and t_remove_this.keyfield1.rowid > t_keep_this.rowid)

    ReplyDelete
    Replies
    1. last line must be
      and t_remove_this.rowid > t_keep_this.rowid)

      Delete
  4. This comment has been removed by a blog administrator.

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete