It seems odd that the calculated yaw is changing, because that suggests the magnetometer reading itself is changing, independent of the calibration. Here are a few things you can try that might help me figure out what is going on:
Can you take your phone, place it in the exact spot you used for the IMU, move it along the same path, and see if the compass reading changes? (If it does, something might be interfering with the magnetic reading, like metal in the table or in the surface the devices are resting on.)
Can you try running the LSM303 "heading" example (which only uses the magnetometer and accelerometer, not the gyro), doing the same test, and seeing whether the reported heading changes?
Can you also post a picture that shows your entire setup?
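For context on the heading test above, here is a rough sketch of how a heading can be computed from only a magnetometer and an accelerometer. This is not the library's actual code; the names (`Vec3`, `headingDegrees`) are hypothetical, and it assumes a right-handed sensor frame, an at-rest accelerometer reading that points up (away from the ground), and magnetometer values that are already calibrated.

```cpp
#include <cmath>

// Hypothetical 3-axis reading in the sensor's own (right-handed) frame.
struct Vec3 { double x, y, z; };

constexpr double kPi = 3.14159265358979323846;

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 normalize(const Vec3& v) {
    double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// Heading of the sensor axis `from`, in degrees in [0, 360),
// measured clockwise from magnetic north.
// Assumption: `accel` is taken while the sensor is at rest, so it points up.
double headingDegrees(const Vec3& mag, const Vec3& accel, const Vec3& from) {
    Vec3 east  = normalize(cross(mag, accel));   // horizontal, pointing east
    Vec3 north = normalize(cross(accel, east));  // horizontal, pointing north
    double deg = std::atan2(dot(east, from), dot(north, from)) * 180.0 / kPi;
    if (deg < 0.0) deg += 360.0;
    return deg;
}
```

The gyro never enters this calculation, so if the heading from a test like this still drifts as you move along the path, the magnetic field at the sensor is probably what is changing.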