Resume IO if fencing fails
If DRBD fencing fails, IO stays suspended indefinitely on the surviving node. A line like this appears in /var/log/messages (note the susp( 0 -> 1 ) part):
peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Similar information is also available in /proc/drbd (note the leading s, which marks suspended IO, and the d flag in s---d-):
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C s---d- ns:1125043199 nr:506010 dw:1467758306 dr:1835091217 al:14382773 bm:0 lo:0 pe:9 ua:0 ap:9 ep:1 wo:f oos:0
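The /proc/drbd check can be scripted. The following is only a minimal sketch: suspended_minors is a hypothetical helper name, and it assumes the DRBD 8.x /proc/drbd layout shown above, where the sixth whitespace-separated field of a device line holds the IO flags and a leading s means suspended IO. It only reports devices and does not resume anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): print the minor number of every
# device whose IO flags field starts with "s" (suspended IO).
# Assumption: DRBD 8.x /proc/drbd layout, flags in the sixth field.
# Usage: suspended_minors [file]   (file defaults to /proc/drbd)
suspended_minors() {
    awk '$1 ~ /^[0-9]+:$/ && $2 ~ /^cs:/ && $6 ~ /^s/ {
             sub(/:$/, "", $1)   # strip the colon after the minor number
             print $1
         }' "${1:-/proc/drbd}"
}
```

Calling suspended_minors with no argument reads /proc/drbd and prints one minor number per line. An empty result means no device matched under the assumptions above.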
Once you have confirmed that the failed node is really down, resume IO with:
drbdadm resume-io <resource>