Groups | Blog | Home
all groups > sql server replication > january 2006 >

sql server replication : trans replication of large tables...


Hilary Cotter
1/14/2006 8:35:41 AM
1.is it possible to replicate such big tables?

Yes, but I tend not to use the snapshot agent to generate the data, and the
distribution agent to send it to the subscribers. I do a no-sync
subscriptions, and put the tables into the subscriber by manual methods like
bcp. Then the problem is getting this in sync with what is in production.
This task, although a moving target is normally easier than having
replication generate and send a subscription. Note in SQL 2005, it has
restartable subscriptions which make this task less painful, and more
resilient to failure. In fact it will restart where it left off.
2.should I use more publications? for each big table separate publication?

Yes, this will work for you if you use the independent agent option AND make
sure you group publications which are grouped by DRI/dependencies.

3.should I use pull subscription instead of push?

- Push is generally more scalable and used when you have lots of
subscribers. With pull you lose a central point of administration
4.what is the cause of this error? and why is it "jumping" from table to
table???

What do you mean by this? can you give me an example.
5. how to increase the push speed?

Independent agents options. I take it you have a multi-proc box on your
subscriber. Also put your subscription database into bulk copy recover
model.

6. how to check these snapshot data files? Before the distributor pushes
them?

You can bcp them into the tables yourself to check their integrity.

7. is there a way to add additional table\column to the publication without
recreating the snapshot? i used sp_add_article +
@force_invalidate_snapshot = 1 with a sp_add_subscription but that doesn't
work the snapshot recreates all the tables not just the needed one (tested
on the dev server)

I would have a look at a separate publication for this.
8. How to setup the –UseInprocLoader on the distributor agent so it don’t
scream about the rights. Both agents starts as an domain\local admins

What error message are you getting here?

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

[quoted text, click to view]

benamis
1/14/2006 2:58:39 PM
Hi


i hope somebody can help me as i can't resolve it for 2 weeks now....


situation:

trans. replication on win2k3 sp1;sql2000 sp4 as a distributor and a
publisher. source and distributor db are on a separate RAID array's.
8cpu+4gb ram. 100mbs LAN.

subscriber win2k3 sp1;sql2005 ENT 4cpu 2 gb ram (reporting and analyses
server) 100mbs LAN.

i need to replicate 15 tables:
5 tables with ~63mil. records each(with column filters)
1 tables ~333mil. records (with column filters)
9 tables ~ 600 mil. records all (the smallest part of replication)

snapshot generates files in ~1h period with -MaxBCPThreads set to 9 at
the distributor\publisher server. the snapshot directory = 59GB

it takes ~ 18h for the distributor agent to push the data to the
subscriber. that is too long and i can't figure out how to improve it ,
but it is not such a big problem for now ....(-MaxBCPThreads set to 5,
LAN utilization is only 25% max and 5% average)

the problem:

at the end of the push job the error accrues:
Error messages:
Unexpected EOF encountered in BCP data-file (Source: MSSQLServer, Error
number: S1000)
Get help: http://help/S1000

I have tried to recreate the snapshot once again, but no luck. this
error shows every time on a different table (from the big once as the
smallest are already replicated.)
at the subscriber all the tables are created and all of them has the
data. the table with the error also is created and has the data.
the last time the difference (source <-->replicated table) was just in
13000 rows ... and i was so "happy" at this moment.....

i have checked the snapshot history and it states that the snapshot was
created successfully and the No. of rows were correct.

during this process log reader was working and tracked changes.

My questions:
1.is it possible to replicate such big tables?
2.should i use more publications? for each big table separate publication?
3.should i use pull subscription instead of push?
4.what is the cause of this error? and why is it "jumping" from table to
table???
5. how to increase the push speed?
6. how to check these snapshot data files? Before the distributor pushes
them?
7. is there a way to add additional table\column to the publication
without recreating the snapshot? i used sp_add_article +
@force_invalidate_snapshot = 1 with a sp_add_subscription but that
doesn't work the snapshot recreates all the tables not just the needed
one (tested on the dev server)
8. How to setup the –UseInprocLoader on the distributor agent so it
don’t screem about the rights. Both agents starts as an domain\local admins

benamis
1/14/2006 5:07:36 PM
thanx 4 a quick answer.

[quoted text, click to view]

I would lieke to stay with automatic sync as i am not that good in
writing manual sync\check SP

[quoted text, click to view]

"grouped by DRI/dependencies" what is that??? i guess BOL will answer :)

[quoted text, click to view]

i have only one subscriber, so pull will be better\fater???

[quoted text, click to view]

i have generated the snapshot 3 times and every time the error :
"Unexpected EOF encountered in BCP data-file" occurred on a different
table. first time i t was on the table with ~63mill records, second on
the ~50mil. and the third on ~300mil. and in all the cases all other
tables were delivered without any errors. So this error rises on a
random big table. Is it possible that snapshot agent for some reason
don't reads all the data from the source table without an error msg?


[quoted text, click to view]

yes the subscriber has 4 Xeon CPU units. No other applications just SQL
2005 agents. sql\reporting\analyses

Ok. ill change into bulk copy recover model. I forgot that :)

[quoted text, click to view]

but this process will take another ~18 h... isn't there any tool or
some sum check option?

[quoted text, click to view]

do you mean that if i have successfully publicized table A with 4
columns and if i want to add additional column to the same table A i'll
need to make another publication with table A and 1 column?

if so log, wouldn't log reader and distributor agents "lock" each
other? as the changes will arise on the same source table and 2
different jobs will "monitor" these changes and push simultaneously to
the subscriber.


[quoted text, click to view]


can't find the snapshot directory. something like:

file not found \\serverA\c$\repldata\200601010115\


I don’t have the exact error msg.





Hilary Cotter
1/15/2006 10:51:48 PM
1) the most significant part of this is generating the stored procedures.
You can use sp_scriptpublicationcustomprocs 'publicationName' to generate
them in the publication database, then run this on the subscription
database.

2) If the tables have primary key foreign key relationships they should be
in the same publication - or if they are queried in the same procs they
should be together.

3) Pull has more options to transfer the snapshot files to the subscriber
and load them there.

4) I can't explain this - is it possible you are running out of disk space
on the subscriber and its bumping into this ceiling at different points.

5) Sounds powerful enough to me:)

6) Not that I know of - perhaps PSS has some tool. You do know that you can
choose to bcp the last records in - there is a first row/last row switch so
you can choose what you think is the last row in the table and see if it
goes in ok.

7) No, adding a column should not entail using a separate publication. Nor
should this cause the locking you are seeing. Run profiler to see if you can
determine exactly what is causing this?

8) this is puzzling. It seems that you have been accessing this share all
along and suddenly it goes missing.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

[quoted text, click to view]

Michael Hotek
1/17/2006 2:10:34 AM
Just because it is set to local admins doesn't mean that it can
automatically access an admin share anywhere on the network. If you
publisher is on ServerA and the distributor is on ServerB, the local admins
group on ServerB doesn't have access to admin shares on ServerA unless you
explicitly grant them the authority.

The EOF error message is actually a bad error message. Snapshot capacity
has always been an issue for 2000 and prior. What you are hitting is
something as simple as a timeout during the application of the snapshot.
(It's a little more complicated, but that's the basics.) It appears to
happen randomly, because you are hitting a resource or response limit at
different points. Once I pass 10GB of data, I don't use the replication
engine to send the snapshot across. (In 2005, I'll kick a TB+ snapshot
across without worrying about it.) For data sizes above that, I'll take an
outage window to get replication setup. During that outage window, the data
is transferred with either DTS or backups. Once the data is on the
subscriber, the subscription is created specifying that the subscriber
already has the schema and data. That means you have to create all of the
replication procedures yourself using the sp_scriptpublicationcustomprocs
procedure.

You don't need multiple publications. The issue is with the snapshot and
can occur even with a single article. Define your publications based on
logical groupings of data flow.

--
Mike
http://www.solidqualitylearning.com
Disclaimer: This communication is an original work and represents my sole
views on the subject. It does not represent the views of any other person
or entity either by inference or direct reference.

[quoted text, click to view]
Hilary Cotter
1/17/2006 10:44:31 AM
Please notice that it is a PUSH subscription with a LOCAL distributor. Being
in the local admin group is sufficient.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

[quoted text, click to view]
Hilary Cotter
1/17/2006 12:09:48 PM
May I note that with multiple publications and the independent agent set to
on, you get parallel streams going into your subscription database. In other
words more than one table is being loaded at the same time. This will
decrease the time required to deploy the snapshot and if your line is
disconnected frequently this decrease in time is important to you. For large
snapshots, as Mike points out, it is best to try alternative methods to
deploy the snapshot.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

[quoted text, click to view]
benamis
1/18/2006 3:11:02 AM
thank you very much for your replies.


[quoted text, click to view]
AddThis Social Bookmark Button