Monday, March 26, 2012

MSRepl_Commands question

SORRY FOR THE LONG POST.
I am using transactional replication. I have an update process that is
run by users (not dbas). The update process among many other things
will delete replication, perform updates, and then reinstall
replication. This process has been working great for us (in testing).
However, there are times when the distribution agent hasn't finished
processing all the rows in msrepl_commands and the replication gets
dropped which causes lost (not replicated) commands.
We started by just running sp_browereplcmds with the proper
publication_database_id value to see if any commands were outstanding.
The thought was to check the recordset, if empty, then proceed, else
wait, until specified time then cancel the update with a nice message
about waiting until replication was complete. Great, right? Wrong.
Even after replication is finished and EVEN after the cleanup agent is
run (we are considering running it manually and rechecking the list),
there are commands STILL in the table. They all have command type
values of -2147xxxx and they all appear to be snapshot commands.
Sorry for the long post, but here's the question: Does anyone know
where I can find a key to these values? I would like to just ignore any
-2147xxxx type command in the table and proceed with dropping
replication.
Any help would be appreciated!
*** Sent via Developersdex http://www.codecomments.com ***
Don't just participate in USENET...get rewarded for it!
this is as near as I can figure
-2147483641 directory check
-2147483598 file check
-2147483597 unknown
-2147483646 apply schema
-2147483646 apply index
-2147483645 sync command
-2147483596 unknown
Hilary Cotter
Looking for a book on SQL Server replication?
http://www.nwsu.com/0974973602.html
"PBarbin" <pbarbin@.nospam.com> wrote in message
news:%23lDmE10aEHA.2516@.TK2MSFTNGP10.phx.gbl...
> SORRY FOR THE LONG POST.
> I am using transactional replication. I have an update process that is
> run by users (not dbas). The update process among many other things
> will delete replication, perform updates, and then reinstall
> replication. This process has been working great for us (in testing).
> However, there are times when the distribution agent hasn't finished
> processing all the rows in msrepl_commands and the replication gets
> dropped which causes lost (not replicated) commands.
> We started by just running sp_browereplcmds with the proper
> publication_database_id value to see if any commands were outstanding.
> The thought was to check the recordset, if empty, then proceed, else
> wait, until specified time then cancel the update with a nice message
> about waiting until replication was complete. Great, right? Wrong.
> Even after replication is finished and EVEN after the cleanup agent is
> run (we are considering running it manually and rechecking the list),
> there are commands STILL in the table. They all have command type
> values of -2147xxxx and they all appear to be snapshot commands.
> Sorry for the long post, but here's the question: Does anyone know
> where I can find a key to these values? I would like to just ignore any
> -2147xxxx type command in the table and proceed with dropping
> replication.
> Any help would be appreciated!
> *** Sent via Developersdex http://www.codecomments.com ***
> Don't just participate in USENET...get rewarded for it!
|||Thanks for the reply. As far as I can tell, transactional replication
commands are of type 30.
As an alternate question, can anyone think of a way that I can be SURE
that transactional replication is finished before deleting it and
readding?
Paul
*** Sent via Developersdex http://www.codecomments.com ***
Don't just participate in USENET...get rewarded for it!
|||Paul,
you could compare the xact_seqno in MSDistribution_history (last processed
transaction sequence number) to the values in msrepl_commands. Presumably,
if the last row in each table has the same xact_seqno (after reformatting
slightly), then you're ok.
HTH,
Paul Ibison
|||Paul, that may help tremendously! I'll look into it. Have you noticed
that sp_browsereplcmds shows commands even when there are none to be
processed. I was hoping it was just because we were in a test
environment and we weren't always playing by the rules, but I didn't
want to ship this to all customers without being pretty confident that
my solution would work. I'll try this get back...
Thanks.
Paul
*** Sent via Developersdex http://www.codecomments.com ***
Don't just participate in USENET...get rewarded for it!
|||Background
When my program is updated in the future, the first step will be to
delete the transactional replication, make all database changes to the
publisher and the subscriber and then reinitialize replication. We want
to make sure that the distribution agent has finished processing all
outstanding transactions before dropping it.
Problem
We considered checking the status of the distribution agent but realized
that we couldn't. One reason is that it never "finishes". This agent
runs continuously. Also, its possible to have the status of the agent
as "Idle" even when there are commands left to be processed (this
process is asynchronous).
So, we decided to query the system table, MSRepl_Commands, directly.
This table has a few interesting problems. One is that commands are not
dropped immediately after being processed by replication. Another is
that rows exist in this table for both snapshot and transactional
replication. As usual, the only identifier to distinguish them was not
documented and we were not completely sure which rows to filter. The
stored procedure that Microsoft recommends using doesn't take these
items into consideration and will return rows when transactional
replication is "finished".
Solution
Profiling the agent gave us all the information we needed. We found
that the subscription database contains a table that tracks the last
transaction number processed for each distribution agent that hits the
database (we only have 1). Now we can filter out all the old commands
by checking their transaction number against this value. Also, we saw
that one of the stored procedures defines a snapshot bit (0x80000000)
and uses a bitwise AND on this undocumented "Type" column to see which
commands are snapshot and which are not. Using both of these pieces of
info, we are able to query the table to accurately determine the status
of transactional replication.
Hope this helps somebody. I find it amazing how little documentation
exists regarding replication. If it weren't for the great posters (Paul
I. and Hilary C) on this group, a LOT of us would be in trouble.
Thanks!
*** Sent via Developersdex http://www.codecomments.com ***
Don't just participate in USENET...get rewarded for it!

No comments:

Post a Comment