VL53L0X timeout issues

fll-freak · December 28, 2019, 9:59pm

I found a thread on this same issue from two years ago that never seemed to have a conclusion. Rather than bumping the thread, I am starting a new one.

I have three units on a device, but I can see the problem even when I hold two of the units in reset. The issue is that after a random-ish amount of time, the Pololu library's readRangeContinuousMillimeters returns -1 (65535). From code inspection, this only happens if the RESULT_INTERRUPT_STATUS bits stay low for the timeout period. When this happens there is no recovery. It stays stuck from then on. Also, the IR light from the sensor stops coming out, as seen using night vision goggles.

I have not tried to re-initialize the part when it gets stuck, but I think it would likely come back to life, as this is what happens when I do a reset while holding power on to the sensors.

My initialization is to call setTimeout(500), init(), and then startContinuous().

Somehow I think the sensor is losing its programming.

The I2C master is an ESP32. It gets its power either via a USB cable or via battery. The battery architecture is two 18650’s that feed into a Pololu 5 volt buck regulator. The 5V feeds the ESP32 that has its own 3V3 buck to power the board.

I do find that the VL53 enters the "timeout error mode" sooner (on average) when running from USB. Sample size of about 20.

The wiring is point-to-point solder with 26 gauge wire in lengths of less than 8 inches.

The environment can be very benign when the failure occurs. The sensors are mounted in a robot with two drive motors, but the failure can happen even when the motors are disconnected from the system. The rest of the robot includes four reflective light sensors connected to analog pins on the ESP32.

The failure SEEMS to occur if the sensor is looking “into the void” or at a close object.

I am using the Pololu Arduino library with one exception. Since I could not tolerate the busy-loop wait in the readRangeContinuousMillimeters routine, I created a polling method to determine whether a measurement exists before I call the above to retrieve the new measurement. The function looks like this:
bool VL53L0X::rangeAvailable(void)
{
  // A measurement is ready when any of the low three status bits is set
  return (readReg(RESULT_INTERRUPT_STATUS) & 0x07) != 0;
}
(By the way, I suggest this as an addition to your library.)
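
As a rough illustration of how such a poll keeps a main loop from blocking, here is a minimal sketch that runs on a desktop compiler. `FakeSensor`, `readRange` and `pollOnce` are hypothetical names invented for this example and are not part of the Pololu library; only the `& 0x07` mask is taken from the function above. The sketch also assumes that reading a result clears the status, which should be checked against the real driver.

```cpp
#include <cstdint>

// Hypothetical stand-in for the sensor driver, for illustration only.
struct FakeSensor {
    uint8_t interruptStatus = 0;  // simulated RESULT_INTERRUPT_STATUS value
    uint16_t range = 0;           // simulated latest range result in mm

    // Same test as rangeAvailable() above: any of the low three bits set.
    bool rangeAvailable() const { return (interruptStatus & 0x07) != 0; }

    // Return the result and clear the status (assumed behavior of a real read).
    uint16_t readRange() {
        interruptStatus = 0;
        return range;
    }
};

// Poll once. Writes a reading to `out` only when one is ready; never busy-waits.
bool pollOnce(FakeSensor &sensor, uint16_t &out) {
    if (!sensor.rangeAvailable()) {
        return false;  // nothing new yet, the caller can do other work
    }
    out = sensor.readRange();
    return true;
}
```

In a real sketch the caller would invoke something like `pollOnce` on every pass through `loop()` and carry on with other work whenever it returns false.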

So was any progress ever made in finding out why people’s sensors would stop reporting?

-Skye

fll-freak · December 28, 2019, 10:49pm

Ran the experiment where I re-initialize the sensor if it gets into the timeout error condition. It brings the sensor back from the dead. The question now is: why is it dying?

Bad power? I guess I can monitor the power at the sensor with a scope to see.
Bad I2C command? I guess I can monitor the bus with a logic analyzer.
Issue with these sensors?
Overheat? Think not as the re-init works right away.

Right now I am seeing a re-init every minute or so. Sometimes I see a few within just a few seconds. Grrr.

fll-freak · December 30, 2019, 3:07am

Often I can toggle the X_SHUT signal to reboot the processor and then reprogram it. After that it will work for another period of time. Unfortunately, at other times, toggling the X_SHUT signal (20 ms down, 20 ms up) before programming does not work. The very first I2C transaction (getting the model number) fails with a NAK. In fact, at this point everything fails with a NAK.

And yet I do not think the bus is hung, because the SDA and SCL lines are not stuck.

Try as I might, I have yet to catch the transition from working to failed on a logic analyzer. The event is just too random, and I do not have the patience to capture 10 seconds of data at a time hoping to catch an event that might happen up to 20 minutes in the future.

I guess the next solution is to switch the power (3V3) to the sensors to see if that will kick them free.

fll-freak · December 30, 2019, 7:28pm

Heisenberg’s Uncertainty Principle: The act of measuring something changes that something.

I have instrumented the sensor hoping to catch a failure occurring. But it has now been running for 3 hours straight without a failure.

I am wondering if the slight capacitance of the logic analyzer probes might be fixing the problem.

kevin · January 3, 2020, 6:01am

Hello, Skye.

Thanks for sharing your findings on this issue. I don’t think we at Pololu ever really looked into it before now (since it doesn’t seem to be a problem on other platforms, including standard 8-bit AVR-based Arduinos), but I tested this today with an ESP32 board and was able to reproduce something like what you described.

What I observed was that even though the library reports a sensor reading of 65535 when it gets stuck, it is not reporting a timeout, meaning it is actually reading 0xFFFF from the sensor over I2C.
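
To make that distinction visible in a log, the two cases can be told apart in code. The following is only a sketch under stated assumptions: `ReadStatus` and `classify` are hypothetical names, and the `timedOut` argument is assumed to come from the driver's timeout flag for the same read (with the Pololu library that would presumably be `timeoutOccurred()`).

```cpp
#include <cstdint>

// Hypothetical classification of one range read.
enum class ReadStatus { Ok, Timeout, AllOnes };

// reading:  value returned by the range read
// timedOut: the driver's timeout flag for that same read (assumed input)
ReadStatus classify(uint16_t reading, bool timedOut) {
    if (timedOut) {
        return ReadStatus::Timeout;   // the driver gave up waiting
    }
    if (reading == 0xFFFF) {
        return ReadStatus::AllOnes;   // 0xFFFF was actually read over I2C
    }
    return ReadStatus::Ok;
}
```

Logging `AllOnes` separately from `Timeout` would show which of the two failure modes a given setup is hitting.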

In turn, this seems to be caused by some strange behavior that takes place while the library is polling for a new reading to be available. Sometimes, instead of writing just the register address for RESULT_INTERRUPT_STATUS (and then proceeding to read the register value), the ESP32 apparently keeps toggling SCL for a while without driving SDA, effectively writing 0xFF values into all of the VL53L0X’s registers. This goes on for pretty much exactly 50 ms, after which the program tries to continue running normally, but it will then only read 0xFF from the sensor.

If you do manage to confirm that this is the same thing that happens in your setup, that would be useful to know. At this point, I think there’s something wrong with either the ESP32’s I2C hardware or the Wire library implementation for it, but since I’m not aware of any widespread ESP32 I2C issues (do you know of any?), I suspect there’s a specific interaction with something in our library code that is causing this problem.

Kevin

fll-freak · January 3, 2020, 2:06pm

Kevin,

First let me express my admiration that you took the time to actually connect up an ESP32 to the sensor, write some code, and capture my issue. That is truly remarkable. I was expecting more of a “Never seen that one before” type of answer. Kudos.

Second, I am thrilled that you caught something on a scope/logic analyzer. The error happens on my system now so infrequently (sometimes as much as 6 hours when I am looking for the problem) that the chance I catch it in the buffer is small. When I am not looking for the problem (and not instrumented to catch it) it seems to happen every few minutes. Go figure.

I think a runaway I2C peripheral in the ESP32 seems like the probable source of my problem. Armed with this information, I can set up the triggers on my LA to look for an extended I2C transfer. I will try to do that over the weekend and report back. This would explain much but possibly not everything. Once the system does “go crazy” I can connect up an LA and see what is going on. In at least one case, all the reads to the sensor were NAKed, making me believe that the sensor was not responding at all. In fact, in these cases, I would toggle the shutdown pin to ‘reboot’ the sensor. That did not seem to have any effect. But writing this down, I could see a case where the runaway writes would nuke the I2C address register, making the device respond to a different address than the one I am using. That still does not explain why the ‘reboot’ did not restore the system, unless the ‘reboot’ does not reset the internal registers.

Again, you have given me a fantastic clue as to where to look. I also have some logic-level FETs on order to fully power cycle the sensor if it goes spastic in the future. Now it is time to Google weird I2C behavior on the ESP32. I would not be surprised if this was an issue, since it is a multicore part running an RTOS. A poorly written driver could get corrupted if it was not thread-safe.
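
One way to test the changed-address hypothesis would be to scan the whole 7-bit address space once the sensor stops answering. The sketch below keeps the bus access behind a caller-supplied `probe` callback so the logic can run without hardware; `scanBus` and `probe` are hypothetical names. On an Arduino-style core the callback could plausibly be built from `Wire.beginTransmission(addr)` followed by `Wire.endTransmission() == 0`, but that wiring is an assumption and is not shown here.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// probe(addr) should return true when a device ACKs at that 7-bit address.
// The scan covers 0x01..0x7F on purpose, including normally reserved
// addresses, since a corrupted address register could hold any value.
std::vector<uint8_t> scanBus(const std::function<bool(uint8_t)> &probe) {
    std::vector<uint8_t> found;
    for (uint8_t addr = 0x01; addr <= 0x7F; ++addr) {
        if (probe(addr)) {
            found.push_back(addr);
        }
    }
    return found;
}
```

If the sensor shows up at an address other than the one the program uses, the runaway writes would be a likely explanation for the NAKs.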

You folks rock.

anpl0907 · April 15, 2020, 6:29pm

@fll-freak Any updates here? I am working on a project that is surprisingly similar and I am running into a similar issue.

roy.zahor · June 28, 2020, 8:01am

@FlintheartGlomgold, @anpl0907 I have the exact same issue. Did someone manage to solve it?
Thanks!

basic-settings · August 19, 2020, 9:41pm

I’ve got the same issue on an ESP32; tried with multiple VL53L0X libraries to no avail. Just as described, it works initially and after a while it times out and all read requests return 0xFFFF. Powering the ESP32 off/on solves the issue temporarily, just as the OP reported.

Edit: setting the timeout to 0 kept it working for more than 25 minutes (for now); will see if it handles more than 4 hours.
@roy.zahor @anpl0907 FYI, and see also the comment below.
I am using the Pololu library (not the Adafruit one) and setting the timeout to 0 with setTimeout(0).

basic-settings · August 20, 2020, 5:31am

@fll-freak @kevin After 8 hours of single measurements, it’s still working for me.

The solution was to use the Pololu library and call setTimeout(0).

I did have some strange spikes at sunrise and these happened 5 times:

Reading a measurement… Distance (mm): 87
06:57:50.738 -> Reading a measurement… Distance (mm): 85
06:57:51.431 -> Reading a measurement… Distance (mm): 169
06:57:52.129 -> Reading a measurement… Distance (mm): 146
06:57:52.825 -> Reading a measurement… Distance (mm): 114
06:57:53.523 -> Reading a measurement… Distance (mm): 94
06:57:54.220 -> Reading a measurement… Distance (mm): 86

84 - 95 is the expected range, so everything above 110 is fairly strange, but these did appear during sunrise, so I suspect they are due to the big lighting difference.

I perform 25 SingleReads with a read every 500ms and then return an average, so I can easily discard the seemingly false values and check for new ones.
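
A minimal sketch of that kind of filtering, assuming the outliers are rejected with a fixed acceptance window before averaging. The function name `filteredAverage` and the window parameters are hypothetical; the post does not say how the false values are actually discarded.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average only the samples inside [lo, hi].
// Returns false when no sample falls inside the window.
bool filteredAverage(const std::vector<uint16_t> &samples,
                     uint16_t lo, uint16_t hi, uint16_t &avg) {
    uint32_t sum = 0;
    std::size_t kept = 0;
    for (uint16_t s : samples) {
        if (s >= lo && s <= hi) {
            sum += s;
            ++kept;
        }
    }
    if (kept == 0) {
        return false;  // every sample was rejected, caller should re-measure
    }
    avg = static_cast<uint16_t>(sum / kept);
    return true;
}
```

With the seven readings listed above and a window of 84 to 95, the values 87, 85, 94 and 86 are kept and the result is 88.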

Regards

kevin · August 20, 2020, 9:35pm

Hi, basic-settings.

Thanks for reporting your results; I’m glad to hear you found something that seems to work. I’ll try to see if I can reproduce the behavior you observed.

Kevin

basic-settings · August 24, 2020, 6:30pm

Sadly, my joy was short-lived: after about 2 days the read error issue started to show up again. Seemingly no matter what I try now (different settings), it randomly enters the error mode.

sethtmf · March 28, 2021, 11:14pm

I am seeing a very similar problem so would love to know if anyone found a solution?

Thanks

fll-freak · March 28, 2021, 11:33pm

I have not found a solution. I heard that there might be a beta version of an I2C driver that might fix the problem. But that was a year ago. I gave up on the ESP32 and migrated to the Teensy giving up the WiFi.

parthbhat13 · April 26, 2021, 6:16pm

Hey everyone,
I was working with this sensor for a smartAquaguard project. I never thought I would stumble upon this forum and see that many people are facing issues. As I was able to figure out a workaround for my problem, I thought I would drop a solution here for everyone who is facing the issue.
I am also using some inductive loads, like valves, and an ESP32 for the project. I'll drop some pictures if anyone is interested.

Let me first list down the problems i had –

Sensor would send random values greater than 1000
Sensor would send 65535 if any inductive load is operated, the sensor would fail or the i2c would.
Sensor would have to be reset by power On/Off for it to work again.

The solution was quite simple. I had run a 5-wire shielded cable from my device to the sensor, and was checking the sensor data based on the interrupt generated by GPIO1. I decided to discard that idea and go with the generic code that simply reads the data in continuous mode. Of course, I did play around with some variables before I did that, and I will talk about them below. Now that I had one wire to spare out of the 5 I had run, I thought I would follow @fll-freak's idea and use the X-SHUT pin to reset the sensor itself.

Enough talking. Let's go through the code.

Initialization Code

void initSensor(uint8_t sdaPin, uint8_t sclPin, uint32_t i2cFreq, uint8_t resetPin) {
  // Start with Wire Library

  pinMode(resetPin, OUTPUT);
  digitalWrite(resetPin, HIGH);  // Keep this HIGH as we need to pull it low when we reset the sensor

  Wire.begin(sdaPin, sclPin, i2cFreq);

  sensor.init();
  sensor.setTimeout(500);

#if defined LONG_RANGE
  // lower the return signal rate limit (default is 0.25 MCPS)
  sensor.setSignalRateLimit(0.1);
  // increase laser pulse periods (defaults are 14 and 10 PCLKs)
  sensor.setVcselPulsePeriod(VL53L0X::VcselPeriodPreRange, 18);
  sensor.setVcselPulsePeriod(VL53L0X::VcselPeriodFinalRange, 14);
#endif

#if defined HIGH_SPEED
  // reduce timing budget to 20 ms (default is about 33 ms)
  sensor.setMeasurementTimingBudget(20000);  // minimum timing budget 20 ms
#elif defined HIGH_ACCURACY
  // increase timing budget to 200 ms
  sensor.setMeasurementTimingBudget(200000);
#endif

  // Start continuous back-to-back mode (take readings as
  // fast as possible).  To use continuous timed mode
  // instead, provide a desired inter-measurement period in
  // ms (e.g. sensor.startContinuous(100)).
  sensor.startContinuous();

}

Code to get the SensorData

void getSensorData(uint16_t* myData) {
  /* Counts consecutive suspect readings */
  static uint8_t counter = 0;

  *myData = sensor.readRangeContinuousMillimeters();

  Serial.println(*myData);
  /* Treat any value of 1000 or more as an error reading */
  if (*myData >= 1000) {
    /* Increment the counter; after 10 bad readings in a row we reset */
    counter++;

    /* Still getting bad values after 10 tries, so reset */
    if (counter == 10) {
      /* Reset the VL53L0X here */
      vlxReset(SENSOR_RESET_PIN);
      counter = 0;
    }
  }
  else {
    counter = 0;
  }
}

The reset Function

void vlxReset(uint8_t resetPin) {
  Serial.println("Resetting Sensor");
  /* Stop the sensor reading and even stop the I2C transmission */
  sensor.stopContinuous();
  Wire.endTransmission();

  /* Reset the sensor here; you can change the delay */
  digitalWrite(resetPin, LOW);
  vTaskDelay(600 / portTICK_PERIOD_MS);  // Change to delay(600); if not using an RTOS
  digitalWrite(resetPin, HIGH);
  vTaskDelay(600 / portTICK_PERIOD_MS);


  if (!sensor.init()) {
    ESP_LOGE(TAG, "Failed To Detect Sensor.. Restarting!!");
    ESP.restart();
  }
  sensor.setTimeout(500);

#if defined LONG_RANGE
  // lower the return signal rate limit (default is 0.25 MCPS)
  sensor.setSignalRateLimit(0.1);
  // increase laser pulse periods (defaults are 14 and 10 PCLKs)
  sensor.setVcselPulsePeriod(VL53L0X::VcselPeriodPreRange, 18);
  sensor.setVcselPulsePeriod(VL53L0X::VcselPeriodFinalRange, 14);
#endif

#if defined HIGH_SPEED
  // reduce timing budget to 20 ms (default is about 33 ms)
  sensor.setMeasurementTimingBudget(20000);  // minimum timing budget 20 ms
#elif defined HIGH_ACCURACY
  // increase timing budget to 200 ms
  sensor.setMeasurementTimingBudget(200000);
#endif

  sensor.startContinuous();

}

I am sure the code is quite self-explanatory. You will have to pass some variables into the functions, such as the I2C pins, the frequency, and the reset pin.
So far in my tests things are working like a charm. I am also using an RTOS on my ESP32, and this is running in a task.
If you have any suggestions, feel free to share them. You can also add the timeout check to the same place: where we check (*myData >= 1000), we can add sensor.timeoutOccurred().
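
The combined check suggested here could look roughly like the sketch below. It pulls the counting logic out into a small hardware-independent class so it can be tried on a desktop compiler. `FaultCounter` and `update` are hypothetical names, and the `timedOut` argument is assumed to be fed from `sensor.timeoutOccurred()` by the caller.

```cpp
#include <cstdint>

// Counts consecutive bad readings and signals when a reset is due.
class FaultCounter {
public:
    FaultCounter(uint16_t badThreshold, uint8_t limit)
        : badThreshold_(badThreshold), limit_(limit) {}

    // Feed one reading. Returns true when `limit` bad readings have
    // occurred in a row, then starts counting again from zero.
    bool update(uint16_t reading, bool timedOut) {
        if (timedOut || reading >= badThreshold_) {
            if (++count_ >= limit_) {
                count_ = 0;
                return true;   // caller should reset the sensor now
            }
        } else {
            count_ = 0;        // a good reading clears the streak
        }
        return false;
    }

private:
    uint16_t badThreshold_;
    uint8_t limit_;
    uint8_t count_ = 0;
};
```

Used with a threshold of 1000 and a limit of 10, this mirrors the counter in getSensorData while also treating a reported timeout as a bad reading.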

hope this helps.
Cheers! Parth
The Picture…