A parallel computer system which has a first primary task processor, a second primary task processor, and a secondary task processor acting as a backup for the second primary task processor transfers messages as follows. The first primary task processor sends messages to the second primary task processor, which initially stores each received message in a queue, thereafter reads the message from the queue for processing in accordance with the task associated therewith, and accumulates a count of the messages read from its queue. The first primary task processor also sends the same messages to the secondary task processor, which stores them in its own message queue for possible use if the second primary task processor fails. If a primary task processor fails after processing a given number of messages, the secondary task processor associated therewith first discards that number of messages from the head of its queue and then starts processing the messages that remain.
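The scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `TaskProcessor`, the helper `send`, and the method names are hypothetical, processors are modeled as in-memory objects, and the abstract does not say how the read count reaches the backup, so the sketch simply hands it over as an argument.

```python
from collections import deque


class TaskProcessor:
    """Hypothetical model of a task processor with a FIFO message queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()   # received messages, oldest first
        self.read_count = 0    # count of messages read from the queue
        self.processed = []    # stands in for the task's real work

    def receive(self, message):
        # A received message is first stored in the queue.
        self.queue.append(message)

    def process_next(self):
        # Read the oldest message, count it, then "process" it.
        message = self.queue.popleft()
        self.read_count += 1
        self.processed.append(message)
        return message

    def take_over(self, already_read):
        # Assumption: the failed primary's read count is supplied here.
        # Discard the messages the primary had already read...
        for _ in range(already_read):
            self.queue.popleft()
        # ...then process everything that remains.
        while self.queue:
            self.process_next()


def send(message, primary, backup):
    # The sender delivers the same message to the primary and its backup.
    primary.receive(message)
    backup.receive(message)


primary = TaskProcessor("second primary")
backup = TaskProcessor("secondary")
for msg in ["m1", "m2", "m3", "m4", "m5"]:
    send(msg, primary, backup)

# The primary handles two messages and then fails.
primary.process_next()
primary.process_next()
count_at_failure = primary.read_count

# The backup skips the two already-read messages and continues from "m3".
backup.take_over(count_at_failure)
print(primary.processed + backup.processed)
# prints ['m1', 'm2', 'm3', 'm4', 'm5']
```

Because both queues receive the same messages in the same order, a single integer is enough for the backup to find the point at which the primary stopped; no message contents need to be compared.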
Great software patents: fault tolerance