WHY this SQL Fails
First, I may state my problem.
I am using SQL Server 2005, and participating the 07 kdd cup project.
I have a table named VoteScore(UserId int, MovieId int, Score int, Date datetime) which involves over 100 million records in this table.
Now I want to add a new column RateIndex, which numbers from 1 - 100 Million. So I create a new table with the following SQL code:
create table newTable(RateIndex int identity(1,1), UserId int,
MovieId int, Score Int, Date datetime);
And I want to transfer the data from VoteScore to newTable by this code:
insert into newTable(UserId, MovieId, Score, Date)
select UserId, MovieId, Score, Date from VoteScore;
However, the result frustrates me. I witnessed the log file increases very fast until consumes all my disk space and no record has been insert into this newTable. So would like to give me any direction ?
Thanks a lot
wxuf
[952 byte] By [
wxuf] at [2007-11-20 7:50:31]

# 2 Re: WHY this SQL Fails
Well. Thanks you Shuja Ali.
I am adding a new column RateIndex by executing the SQL
ALTER TABLE VoteScore ADD RateIndex int Identity(1,1);
However, It seems the execution does not run out and log file still increases very fast. It's really weird, By the way, can we limit the log file size ? It's dangerous to run out of the disk space.
wxuf
wxuf at 2007-11-9 13:46:11 >

# 3 Re: WHY this SQL Fails
However, It seems the execution does not run out and log file still increases very fast. It's really weird, By the way, can we limit the log file size ? It's dangerous to run out of the disk space.This is the reason why I always suggest keeping Transaction Log File on a different partition.
Take a look at this KB Article
http://support.microsoft.com/kb/873235
# 6 Re: WHY this SQL Fails
Yes, Thanks again.
I am using another way to update the rateindex.
I group the records by movieId, then each group is relatively small(have
about 10,000 records). The rateIndex for each group is updated. However,
it seems this operation costs much time beyond my expectation: updating 1 Million records in an hour.
My codes are easy to understand.
set nocount on;
declare @movie int, @riValue int, @ri int;
declare @message varchar(100) ;
select @riValue = 1;
declare movie_cursor cursor
for select distinct movieid from kddcup.VoteScore
order by movieid
open movie_cursor;
fetch next from movie_cursor
into @movie
while @@fetch_status = 0
begin
select @message = 'updating Movie : ' + char(@movie + 48) ;
print @message;
declare rate_cursor cursor
for select rateindex from kddcup.VoteScore
where movieid = @movie
for update;
open rate_cursor;
fetch next from rate_cursor into @ri;
while @@fetch_status = 0
begin
update kddcup.VoteScore set rateindex = @riValue
where current of rate_cursor;
select @riValue = @riValue + 1 ;
fetch next from rate_cursor into @ri;
end;
close rate_cursor;
deallocate rate_cursor;
fetch next from movie_cursor
into @movie;
end;
select @message = 'Operation Finished!';
print @message;
close movie_cursor;
deallocate movie_cursor;
wxuf at 2007-11-9 13:50:14 >

# 7 Re: WHY this SQL Fails
Using Cursors will sure slow down the process. The better thing would be to move the transaction log on to a different disk or partition. This way you will eliminate any future problems.
You can also take a look at
http://msdn2.microsoft.com/en-us/library/ms175495.aspx