TCA Multiplexer can permanently lock the I2C Bus #104

Open
dbgen1 opened this issue Jan 21, 2025 · 4 comments
Labels
  • bug: Something isn't working
  • cube killer: Bugs that might cause loss of mission for a CubeSat
  • Understood: Tag for bugs that have been solved!

Comments

@dbgen1
Contributor

dbgen1 commented Jan 21, 2025

In the following scenario, the I2C bus can become permanently locked, causing the code to hang indefinitely without crashing or rebooting:

How to reproduce

  • The TCA device must be malfunctioning or unreachable at its I2C address. To reproduce this in testing, run the code on just the flight controller with no battery board attached.
  • The TCA try_lock() must be called. This happens on its own during a normal run of the code. What happens is the following:

Sequence of events

  1. After boot, the device is initialized in pysquared.py. Despite the device not physically existing at the specified address, the initialization of the TCA9548A succeeds.
  2. In main, once operation reaches the point of trying to get data from the faces, TCA9548A_Channel#try_lock is called (which is correct).
  3. Since the TCA class was initialized, this locks the I2C bus. It then attempts to write to the physical TCA device at its I2C address, but since the device doesn't exist, the write raises an error.
  4. The I2C bus is never unlocked. The next time try_lock is called, it enters an infinite loop waiting for a lock that is never released.
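
The sequence above can be sketched on a desktop Python interpreter. This is a minimal illustration under stated assumptions, not the library's code: `FakeI2C` is a hypothetical stand-in for a `busio.I2C`-style bus with no device attached, the two `channel_try_lock_*` functions are invented for this example, and 0x70 is assumed as the multiplexer address.

```python
class FakeI2C:
    """Hypothetical stand-in for a busio.I2C-style bus with nothing attached."""

    def __init__(self):
        self._locked = False

    def try_lock(self):
        # Returns True only if the lock was free, False if it is already held.
        if self._locked:
            return False
        self._locked = True
        return True

    def unlock(self):
        self._locked = False

    def writeto(self, address, buffer):
        # No device acknowledges on this fake bus, so every write fails.
        raise OSError("No I2C device at address 0x%02x" % address)


def channel_try_lock_leaky(i2c, address, channel):
    """Mirrors steps 3 and 4: the write fails while the bus lock is held."""
    while not i2c.try_lock():
        pass  # would spin forever if an earlier call leaked the lock
    i2c.writeto(address, bytes([1 << channel]))  # raises; lock stays held
    return True


def channel_try_lock_safe(i2c, address, channel):
    """Same operation, but the lock is released if the write fails."""
    while not i2c.try_lock():
        pass
    try:
        i2c.writeto(address, bytes([1 << channel]))
    except OSError:
        i2c.unlock()  # give the bus back before propagating the error
        raise
    return True


leaky_bus = FakeI2C()
try:
    channel_try_lock_leaky(leaky_bus, 0x70, 0)
except OSError:
    pass
# The lock is still held, so a second caller's loop would never exit.
lock_leaked = not leaky_bus.try_lock()

safe_bus = FakeI2C()
try:
    channel_try_lock_safe(safe_bus, 0x70, 0)
except OSError:
    pass
# The lock was released on failure, so the bus can be locked again.
lock_free = safe_bus.try_lock()
```

Releasing the lock on the error path only addresses step 4; it does not explain why initialization succeeded in step 1 with no device present.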

Potential Solution

This may also be an issue for other hardware devices. Ideally, a device driver should not initialize successfully if the device itself is unreachable at its address. One possible fix is to check the I2C address where a device is meant to be before attempting to initialize it. Another is to modify the device driver to perform this check itself before doing anything else, and especially before entering any loops.
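
A minimal sketch of the first option, probing the address before constructing the driver. It assumes a bus object that exposes `try_lock()`, `unlock()`, and `scan()` in the style of CircuitPython's `busio.I2C`; `FakeI2C`, `device_present`, and the `max_attempts` parameter are hypothetical names introduced here, and the addresses are example values.

```python
class FakeI2C:
    """Hypothetical bus that reports a fixed set of responding addresses."""

    def __init__(self, present_addresses):
        self._present = list(present_addresses)
        self._locked = False

    def try_lock(self):
        if self._locked:
            return False
        self._locked = True
        return True

    def unlock(self):
        self._locked = False

    def scan(self):
        # Returns the addresses that acknowledged, like busio.I2C.scan().
        return list(self._present)


def device_present(i2c, address, max_attempts=100):
    """Return True if a device answers at `address`, without ever hanging."""
    for _ in range(max_attempts):
        if i2c.try_lock():
            break
    else:
        # The bus never became free; treat the device as unreachable.
        return False
    try:
        return address in i2c.scan()
    finally:
        i2c.unlock()  # always release, even if scan() raises


bus = FakeI2C(present_addresses=[0x68])
tca_reachable = device_present(bus, 0x70)  # nothing answers at 0x70 here
rtc_reachable = device_present(bus, 0x68)
```

The caller can then skip driver construction when the probe returns False. The lock attempt is bounded so the probe itself cannot become another infinite loop.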

dbgen1 added the bug and cube killer labels Jan 21, 2025
@Mikefly123
Member

Mikefly123 commented Jan 22, 2025

Very interesting! #25 describes how I discovered the last part of what you're talking about, where try_lock() can enter an infinite loop!

This is how I fixed the infinite loop thing:
[Image: screenshot of the fix]

@dbgen1
Contributor Author

dbgen1 commented Jan 22, 2025

For now, we decided to fork the TCA library to a repo in the proveskit org and add a check for the I2C address in the initialization of the TCA. This issue is also the cause of the board going to sleep forever when tested without a battery board attached (caused by the I2C lock). Preventing the TCA initialization seems to fix the problems, so I am marking this as understood for now, but leaving it open because more testing is required with the forked library.
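
For illustration only, a check inside a driver's initializer could look roughly like the sketch below. This is not the code in the forked library: `GuardedMux`, `init_mux`, and `FakeI2C` are hypothetical names, the bus is assumed to expose `try_lock()`, `unlock()`, and `scan()` in the style of `busio.I2C`, and 0x70 is an assumed default address.

```python
class FakeI2C:
    """Hypothetical bus that reports a fixed set of responding addresses."""

    def __init__(self, present_addresses):
        self._present = list(present_addresses)
        self._locked = False

    def try_lock(self):
        if self._locked:
            return False
        self._locked = True
        return True

    def unlock(self):
        self._locked = False

    def scan(self):
        return list(self._present)


class GuardedMux:
    """Hypothetical multiplexer driver that refuses to initialize blindly."""

    def __init__(self, i2c, address=0x70):
        if not i2c.try_lock():
            raise RuntimeError("I2C bus is already locked")
        try:
            found = address in i2c.scan()
        finally:
            i2c.unlock()  # never leave the bus locked after the probe
        if not found:
            raise ValueError("No I2C device at address 0x%02x" % address)
        self.i2c = i2c
        self.address = address


def init_mux(i2c):
    """Return a driver instance, or None when the hardware is absent."""
    try:
        return GuardedMux(i2c)
    except ValueError:
        return None


mux_without_board = init_mux(FakeI2C(present_addresses=[]))
mux_with_board = init_mux(FakeI2C(present_addresses=[0x70]))
```

With the check in the initializer, code that depends on the multiplexer can test for None once at boot and never reach a try_lock call on a device that is not there.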

dbgen1 added the Understood label Jan 22, 2025
@Mikefly123
Member

I'm linking our forked TCA library here: https://github.com/proveskit/Adafruit_CircuitPython_TCA9548A

@Mikefly123
Member

@dbgen1 do you think we could maybe make a PR to merge both our modified TCA library and your updates to the RTC library with main so we can shake down if the changes to the lib are sufficient to quash this issue?
