I vote for the slowest, most reliable option for recovery. $400 bytes of RAM (1 KB) is plenty. Anyway, the original code uses $000-$1FF for the comm loop, $200-$2FF as the code buffer, and $300 onward as the buffer for the write message. Each routine is sent on demand: you send the erase routine to $200, it is executed, and you get a positive response. Then you send the write routine to $200, over the erase routine, and overwrite the write buffer.

That way you can make small chunks of code that are sent on demand to save RAM.
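
To make the layout concrete, here is a rough tool-side sketch in Python. Only the three address ranges come from the description above. `FakeTarget`, `send_routine`, and the `b"ERASE"` / `b"WRITE"` payloads are made-up placeholders, not the real protocol or real routines.

```python
# RAM layout as described above ($400 bytes total).
COMM_LOOP_START = 0x000   # $000-$1FF: resident comm loop
CODE_BUF_START = 0x200    # $200-$2FF: code buffer, reused for each routine
CODE_BUF_END = 0x2FF
WRITE_BUF_START = 0x300   # $300 onward: buffer for the write message
RAM_SIZE = 0x400


class FakeTarget:
    """Hypothetical stand-in for the device: RAM plus a log of executed code."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.executed = []

    def load(self, addr, data):
        # Refuse anything that would run past the end of RAM.
        if addr < 0 or addr + len(data) > RAM_SIZE:
            raise ValueError("load outside RAM")
        self.ram[addr:addr + len(data)] = data

    def execute(self, addr, length):
        # A real target would jump to addr; here we only record what ran.
        self.executed.append(bytes(self.ram[addr:addr + length]))
        return True  # stands in for the positive response


def send_routine(target, code):
    """Load one routine into the code buffer, run it, return the response."""
    if len(code) > CODE_BUF_END - CODE_BUF_START + 1:
        raise ValueError("routine does not fit in the $200-$2FF code buffer")
    target.load(CODE_BUF_START, code)
    return target.execute(CODE_BUF_START, len(code))


target = FakeTarget()
erase_ok = send_routine(target, b"ERASE")   # placeholder bytes, not real code
write_ok = send_routine(target, b"WRITE")   # lands on top of the erase routine
target.load(WRITE_BUF_START, b"\x12\x34")   # data for the write routine
```

Each routine only has to fit in $100 bytes on its own, because the previous one is gone by the time the next one is loaded.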

The change of speed can be handled the same way: you send some code that changes the baud rate, followed by some wait time so the tool can switch speed.
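
On the tool side the speed change could look something like the sketch below. It is only an illustration: `FakeLink`, `switch_baud`, the `b"SETBAUD"` payload, the baud rates, and the settle time are all assumptions, and the real stub would have to contain its own matching delay on the target.

```python
import time


class FakeLink:
    """Hypothetical serial link that records the baud rate used per message."""

    def __init__(self, baud):
        self.baud = baud
        self.sent = []

    def send(self, data):
        self.sent.append((self.baud, data))


def switch_baud(link, stub, new_baud, settle_s=0.05):
    """Send the baud-change stub at the old speed, wait, then switch the tool."""
    link.send(stub)        # still goes out at the old baud rate
    time.sleep(settle_s)   # the target's stub waits as well before resuming
    link.baud = new_baud   # from here on the tool talks at the new speed


link = FakeLink(9600)                                   # assumed starting speed
switch_baud(link, b"SETBAUD", 115200, settle_s=0.01)    # placeholder stub bytes
link.send(b"PING")                                      # first message at new speed
```

The wait has to be long enough for both sides to finish switching before the next message goes out, which fits the slow-but-reliable approach.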