QUERY · ISSUE

ESP32C3 no output on serial after flashing finished.

open · by weiyshay · opened 2023-08-20 · updated 2023-08-21
bug

I am trying to build MicroPython and flash it to an ESP32C3 board, but after flashing there is
no output/REPL on the serial port. I have verified that my environment works by:

1. Run the 'hello_world' demo in espressif via 'idf.py build flash'.   
2. Flash the official mp image 'esp32c3-20230426-v1.20.0.bin' and run some demos via mpremote.

My env:

- Ubuntu 22.04
- idf: v5.1
- Micropython: 
commit a18d62e06727e0c424da1f43f80a5b0cb76bcc04 (HEAD -> master, origin/master, origin/HEAD)
Author: robert-hh <robert@hammelrath.com>
Date:   Wed Aug 16 08:41:40 2023 +0200

Flashing cmd & output:

  idf.py -D MICROPY_BOARD=GENERIC_C3 clean
  idf.py -D MICROPY_BOARD=GENERIC_C3 build
  idf.py -p /dev/ttyACM0 flash
  idf.py -p /dev/ttyACM0 monitor
  
esp32-dev-micropython#idf.py -D MICROPY_BOARD=GENERIC_C3 flash
/data/2023/src/esp32c3/micropython/idf/esp-idf/tools/check_python_dependencies.py:12: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
Executing action: flash
Serial port /dev/ttyS0
Connecting.......................
/dev/ttyS0 failed to connect: Failed to connect to Espressif device: No serial data received.
For troubleshooting steps visit: https://docs.espressif.com/projects/esptool/en/latest/troubleshooting.html
Serial port /dev/ttyACM0
Connecting....
Detecting chip type... ESP32-C3
Running ninja in directory /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build
Executing "ninja flash"...
[1/225] Performing build step for 'bootloader'
[1/1] cd /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/bootloader/esp-idf/esptool_py && /root/.espressif/python_env/idf5.1_py3.10_env/bin/python /data/2023/src/esp32c3/micropython/idf/esp-idf/components/partition_table/check_sizes.py --offset 0x8000 bootloader 0x0 /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/bootloader/bootloader.bin
Bootloader binary size 0x3fb0 bytes. 0x4050 bytes (50%) free.
[2/223] cd /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/esp-idf/main_esp32c3 && /usr/bin/cmake -E make_directory /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/genhdr && /root/.espressif/python_env/idf5.1_py3.10_env/bin/python /data/2023/src/esp32c3/micropython/micropython/py/makeversionhdr.py /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/genhdr/mpversion.h
[3/10] cd /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/esp-idf/main_esp32c3 && /root/.espressif/python_env/idf5.1_py3.10_env/bin/python /data/2023/src/esp32c3/micropython/micropython/tools/makemanifest.py -o /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/frozen_content.c -v MPY_DIR=/data/2023/src/esp32c3/micropython/micropython -v MPY_LIB_DIR=/data/2023/src/esp32c3/micropython/micropython/lib/micropython-lib -v PORT_DIR=/data/2023/src/esp32c3/micropython/micropython/ports/esp32 -v BOARD_DIR=/data/2023/src/esp32c3/micropython/micropython/ports/esp32/boards/GENERIC_C3 -b /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build -f-march=xtensawin --mpy-tool-flags= /data/2023/src/esp32c3/micropython/micropython/ports/esp32/boards/manifest.py
[4/5] cd /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/esp-idf/esptool_py && /root/.espressif/python_env/idf5.1_py3.10_env/bin/python /data/2023/src/esp32c3/micropython/idf/esp-idf/components/partition_table/check_sizes.py --offset 0x8000 partition --type app /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/partition_table/partition-table.bin /data/2023/src/esp32c3/micropython/micropython/ports/esp32/build/micropython.bin
micropython.bin binary size 0x179dc0 bytes. Smallest app partition is 0x1f0000 bytes. 0x76240 bytes (24%) free.
[4/5] cd /data/2023/src/esp32c3/micropython/idf/esp-idf/components/esptool_py && /usr/bin/cmake -D IDF_PATH=/data/2023/src/esp32c3/micropython/idf/esp-idf -D "SERIAL_TOOL=/root/.espressif/python_env/idf5.1_py3.10_env/bin/python;;/data/2023/src/esp32c3/micropython/idf/esp-idf/components/esptool_py/esptool/esptool.py;--chip;esp32c3" -D "SERIAL_TOOL_ARGS=--before=default_reset;--after=hard_reset;--no-stub;write_flash;@flash_args" -D WORKING_DIRECTORY=/data/2023/src/esp32c3/micropython/micropython/ports/esp32/build -P /data/2023/src/esp32c3/micropython/idf/esp-idf/components/esptool_py/run_serial_tool.cmake
esptool.py --chip esp32c3 -p /dev/ttyACM0 -b 460800 --before=default_reset --after=hard_reset --no-stub write_flash --flash_mode dio --flash_freq 80m --flash_size 4MB 0x0 bootloader/bootloader.bin 0x10000 micropython.bin 0x8000 partition_table/partition-table.bin
esptool.py v4.7.dev1
Serial port /dev/ttyACM0
Connecting....
Chip is ESP32-C3 (QFN32) (revision v0.4)
Features: WiFi, BLE
Crystal is 40MHz
MAC: 48:31:b7:35:cd:bc
Changing baud rate to 460800
Changed.
Enabling default SPI flash mode...
Configuring flash size...
Flash will be erased from 0x00000000 to 0x00003fff...
Flash will be erased from 0x00010000 to 0x00189fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Erasing flash...
Took 0.10s to erase flash block
Writing at 0x00000000... (6 %)
Writing at 0x00000400... (12 %)
Writing at 0x00000800... (18 %)
Writing at 0x00000c00... (25 %)
Writing at 0x00001000... (31 %)
Writing at 0x00001400... (37 %)
Writing at 0x00001800... (43 %)
Writing at 0x00001c00... (50 %)
Writing at 0x00002000... (56 %)
Writing at 0x00002400... (62 %)
Writing at 0x00002800... (68 %)
Writing at 0x00002c00... (75 %)
Writing at 0x00003000... (81 %)
Writing at 0x00003400... (87 %)
Writing at 0x00003800... (93 %)
Writing at 0x00003c00... (100 %)
Wrote 16384 bytes at 0x00000000 in 0.5 seconds (250.6 kbit/s)...
Hash of data verified.
Erasing flash...
Took 2.33s to erase flash block
Writing at 0x00010000... (0 %)
Writing at 0x00010400... (0 %)
Writing at 0x00010800... (0 %)
Writing at 0x00010c00... (0 %)
Writing at 0x00011000... (0 %)
Writing at 0x00011400... (0 %)
Writing at 0x00011800... (0 %)
Writing at 0x00011c00... (0 %)
Writing at 0x00012000... (0 %)
Writing at 0x00012400... (0 %)
Writing at 0x00012800... (0 %)
Writing at 0x00012c00... (0 %)
Writing at 0x00013000... (0 %)
Writing at 0x00013400... (0 %)
Writing at 0x00013800... (0 %)
Writing at 0x00013c00... (1 %)
...
Writing at 0x00188c00... (99 %)
Writing at 0x00189000... (99 %)
Writing at 0x00189400... (99 %)
Writing at 0x00189800... (99 %)
Writing at 0x00189c00... (100 %)
Wrote 1548288 bytes at 0x00010000 in 49.7 seconds (249.4 kbit/s)...
Hash of data verified.
Erasing flash...
Took 0.03s to erase flash block
Writing at 0x00008000... (33 %)
Writing at 0x00008400... (66 %)
Writing at 0x00008800... (100 %)
Wrote 3072 bytes at 0x00008000 in 0.1 seconds (260.7 kbit/s)...
Hash of data verified.

Leaving...
Hard resetting via RTS pin...
Done

  
CANDIDATE · PULL REQUEST

micropython/espflash: A minimal ESP32 bootloader protocol implementation.

closed · by iabdalkader · opened 2022-10-20 · updated 2023-08-19

This is a minimal implementation of the ESP32 ROM bootloader protocol, which allows a board running MicroPython to flash a firmware image to an ESP32 chip. It is not a port of esptool.py because a) esptool.py is too complex and b) it's GPL. It is also not implemented as a serial pass-through (esptool.py <-> mcu <-> ESP32), because that would still require installing esptool first and running the serial pass-through code. Note this is for ESP32, not ESP32_S/C (although it may still work, because it talks to the ROM bootloader and not a stub). While it's mainly for updating the Nina WiFi firmware, I chose to implement it in Python (vs ioctls) because that makes it easier to extend and use for other ESP chips if needed. Example usage, updating the Nina WiFi module firmware: copy the firmware binary to storage (with mpremote or MSC) and run:

import mesptool

# pre-calculated because RP2 doesn't have MD5.
md5sum = b"9a6cf1257769c9f1af08452558e4d60e"
path = "NINA_W102-v1.5.0-Nano-RP2040-Connect.bin"

esp = mesptool()
esp.bootloader()
esp.set_baudrate(921600) # or slower
esp.flash_init()
esp.flash_write(path)
esp.flash_verify(path, md5sum)
esp.reboot()
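
For orientation: the ROM bootloader's serial protocol, as publicly documented for esptool, wraps each request in a SLIP frame, and command 8 is SYNC. The sketch below builds such a SYNC request. It is an illustration under that assumption, not code taken from this PR; `slip_encode` and `build_command` are hypothetical helper names.

```python
import struct

SLIP_END = 0xC0  # frame delimiter
SLIP_ESC = 0xDB  # escape byte


def slip_encode(payload: bytes) -> bytes:
    """Wrap payload in a SLIP frame, escaping 0xC0 and 0xDB inside it."""
    out = bytearray([SLIP_END])
    for b in payload:
        if b == SLIP_END:
            out += b"\xdb\xdc"
        elif b == SLIP_ESC:
            out += b"\xdb\xdd"
        else:
            out.append(b)
    out.append(SLIP_END)
    return bytes(out)


def build_command(op: int, data: bytes = b"", checksum: int = 0) -> bytes:
    """Request layout: direction (0), opcode, data length (u16 LE), checksum (u32 LE), data."""
    header = struct.pack("<BBHI", 0x00, op, len(data), checksum)
    return slip_encode(header + data)


ESP_SYNC = 0x08
# SYNC payload as documented for esptool: 07 07 12 20 followed by 32 x 0x55.
SYNC_DATA = b"\x07\x07\x12\x20" + b"\x55" * 32
sync_packet = build_command(ESP_SYNC, SYNC_DATA)
```

A host typically sends this packet several times until the chip answers, which would be consistent with the repeated sync message in the example output.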

Fixes micropython/micropython#8896

Example output:

Failed to read response to command 8. # normal baudrate sync message, might repeat for 1-3 times.
Changing baudrate => 921600
Flash write size: 1129472 total_blocks: 276 block size: 4096
Writing sequence number 0/276...
Writing sequence number 1/276...
Writing sequence number 2/276...
Writing sequence number 3/276...
Writing sequence number 4/276...
Writing sequence number 5/276...
....
Writing sequence number 273/276...
Writing sequence number 274/276...
Writing sequence number 275/276...
Flash write finished
Flash verify file MD5 b'9a6cf1257769c9f1af08452558e4d60e'
Flash verify flash MD5 b'9a6cf1257769c9f1af08452558e4d60e'
Firmware write verified
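
The block count in this output follows from rounding the image size up to whole 4096-byte blocks. A quick check of that arithmetic (`block_count` is only an illustrative helper name):

```python
def block_count(size: int, block_size: int = 4096) -> int:
    """Number of blocks needed to cover `size` bytes (ceiling division)."""
    return (size + block_size - 1) // block_size


total = block_count(1129472)
remainder = 1129472 % 4096  # image bytes left over for the final block
print(total, remainder)  # 276 3072
```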
206 comments
iabdalkader · 2022-10-20

@jimmo

I think you could plausibly make an argument for putting it in python-ecosys/esptool but perhaps micropython/esp32/esptool ? (Not sure the m prefix is necessary? Maybe we could call it espflash or something that conveys exactly what it does?)

No I agree it definitely belongs to micropython because it's not a compatible-with-reduced functionality module (in the sense of API compatibility, functionally maybe :thinking: ), but since it's a micropython-specific module it can be called anything else, the m is to avoid any issues with ESP/ "esptool" and means micropython's esptool. I can change it to espflash but I think if more functionality is added in the future, it might be limiting to name it espflash.

The other reason for putting it in micropython-lib is that we may (?) choose to freeze it into the Nano Connect

Yes, that's the plan: once this is settled and merged I will update the manifest. The name might change.

iabdalkader · 2022-10-27

I'll get around to processing all the comments later today, but espflash.py is okay I guess. I also wanted to try to make this work locally, so it doubles as something you can import if the firmware is on storage, or a tool you can run like espflash.py --firmware NINA_1.5.0.bin --port /dev/ttyACM0.

dpgeorge · 2022-10-28

I also wanted to try to make this work locally,

Other tools (see eg mip and unittest) do this by having the library be a package, and then there being an additional, optional library which provides __main__.py for the (optional) command-line behaviour. So you'd install espflash for the basic behaviour, and espflash-cmdline for the command-line version (which depends on espflash).

iabdalkader · 2022-10-28

Where would that optional library be? In micropython-lib? It would be very convenient if this were somehow accessible from /tools/ like mpremote, because then you wouldn't have to install anything, but it's also probably going to import mpremote (or duplicate some of its functions). For this to work, it will need to switch the board to raw mode, execute a serial pass-through script (something like this one in the docs), and then read/write from the port. I was thinking of using mpremote for that.

iabdalkader · 2022-11-03

@dpgeorge Actually I realized I don't need to implement any of that serial passthrough stuff, because I could mpremote mount the firmware dir, which removes the copy firmware to storage step, so I just wrote this instead:

https://github.com/micropython/micropython-lib/blob/24abd009cc43accdc5be87b89a2d08d3e4380e56/micropython/espflash/espflash-cmd.py

If espflash is frozen, and mpremote is in the path, you can just do this:

python espflash-cmd.py -f ~/path/to/NINA_W102-v1.5.0-Nano-RP2040-Connect.bin

It will also generate the file MD5 hash, and pass it to the script.
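
Generating that hash on the host side can be done with the standard library. A minimal sketch, assuming the hash is passed on as hex-encoded bytes like the `md5sum` value in the PR description; `file_md5_hex` is a hypothetical name, not necessarily what espflash-cmd.py uses:

```python
import hashlib


def file_md5_hex(path: str, chunk_size: int = 4096) -> bytes:
    """Return the MD5 of a file as hex-encoded bytes, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            md5.update(chunk)
    return md5.hexdigest().encode()
```

Reading in chunks keeps memory use flat even for firmware images larger than 1 MB.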

dpgeorge · 2022-11-04

Actually I realized I don't need to implement any of that serial passthrough stuff, because I could mpremote mount the firmware dir

That sounds much better! I see you use subprocess to run mpremote, but I think you can just as easily do import mpremote and use it as a library.

I see you intend to run espflash-cmd.py via CPython (not unix MicroPython). So that script is not really part of the espflash package but rather a completely separate thing, that should probably live on PyPI (just like mpremote lives on PyPI). There's still a question of where to host the code though...

iabdalkader · 2022-11-04

I see you use subprocess to run mpremote, but I think you can just as easily do import mpremote and use it as a library.

Yes that works and it's much easier, but I have mpremote symlink in /bin, to always use the latest from the repo, so subprocess was just more convenient for me, but I can switch it to import mpremote.main and just call it.

So that script is not really part of the espflash package but rather a completely separate thing, that should probably live on PyPI (just like mpremote lives on PyPI). There's still a question of where to host the code though...

Yes it's not part of the package, I just wanted to share the code. I have no idea where to host it, I can remove it from the PR and try to publish it to pypi if it makes sense.

iabdalkader · 2022-11-04

I think I fixed all the pending issues, anything else I missed? I squashed all commits, renamed to espflash, and removed the command line tool for now. Note this will also generate the MD5 hash when verify is called, if hashlib and hashlib.md5 are enabled and the user didn't supply a hash.

dpgeorge · 2022-11-08

Thanks for updating, this looks good now. Rebased and merged in 82f6b18b8876bf6702dcd7bc55f10f1cd94a1571

robert-hh · 2023-03-29

@iabdalkader coming along that path again: Where do I find the actual NINA firmware, you mentioned NINA_W102-v1.5.0-Nano-RP2040-Connect.bin?

iabdalkader · 2023-03-29

@robert-hh You can find it here, starting from 1.5.0 the binaries are attached to releases:

https://github.com/arduino/nina-fw/releases

robert-hh · 2023-03-29

Thanks. I looked at https://github.com/arduino/nina-fw and did not spot it, just up to 1.4.8.

robert-hh · 2023-03-29

Did you rebuild the NINA firmware recently, and if so, which esp-idf version did you use? I seem to have built it in Jan 2022, but now it fails.

iabdalkader · 2023-03-29

I think it requires an older IDF, see https://github.com/arduino/nina-fw#building

I attached 1.4.8 if you need it, but it will not work with MicroPython.

NINA_W102-v1.4.8-Nano-RP2040-Connect.zip

robert-hh · 2023-03-29

Thanks. I still have the old binaries. And I already tried building with v3.3.1. That works, but doing so you have to re-enter all config options for sdkconfig. That's strange.

robert-hh · 2023-03-29

Problem solved. Syncing the files with the repository brought it back to a good state. The attempts to build it with a newer esp-idf had changed sdkconfig.

robert-hh · 2023-03-29

@iabdalkader Since I came along that file, I have a question about extmod/network_ninaw10.c:

In function network_ninaw10_socket_bind(), line 523 ctd., a symbol type is declared and assigned, but not further used, as I can tell. Is that symbol needed, further use forgotten? Sometimes, the compiler complains about it.

iabdalkader · 2023-03-29

@robert-hh No it's not needed, that whole block can be safely removed. Actually I think it was supposed to be in network_ninaw10_socket_socket not sure how it ended in bind(), the NINA socket types have the same values as modnetwork but that may change in the future, so yes best to move that block to network_ninaw10_socket_socket().

robert-hh · 2023-03-30

@iabdalkader I see that you made a PR for the change. Meanwhile I have a test setup running with a SAMD51 board and a generic ESP32 with NINA software. Besides adapting the config files, I had to change drivers/ninaw10/nina_wifi_bsp.c to accept specific settings for the SPI pins, similar to what is used in the wiznet driver. That could become the default, even for RP2. Then setting the SPI1 default pins would be replaced. The change is in the function bsp_init() (see code section below). What surprised me was the fact that the code increase for WiFi is only ~14k. I had expected a relatively small increase, since most of the code is in the NINA software, but not that small. Maybe something is missing. I did not yet add Bluetooth.
There is at least one SAMD port with the airlift module from Adafruit. That one is a plain vanilla ESP32 with similar connections, only that it uses Pin 14 instead of Pin 12 for MOSI. Using the NINA firmware requires to change that. Did you ever try to run the Airlift firmware with your driver? I made some tests early 2022, but as far as I recall, the command mapping is different.

    // Initialize SPI.
    #ifdef MICROPY_HW_WIFI_SPI_SCK
    mp_obj_t spi_obj = MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_SCK);
    mp_obj_t miso_obj = MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_MISO);
    mp_obj_t mosi_obj = MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_MOSI);
    mp_obj_t args[] = {
        MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_ID),
        MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_BAUDRATE),
        MP_ROM_QSTR(MP_QSTR_sck), mp_pin_make_new(NULL, 1, 0, &spi_obj),
        MP_ROM_QSTR(MP_QSTR_miso), mp_pin_make_new(NULL, 1, 0, &miso_obj),
        MP_ROM_QSTR(MP_QSTR_mosi), mp_pin_make_new(NULL, 1, 0, &mosi_obj),
    };
    MP_STATE_PORT(mp_wifi_spi) = MP_OBJ_TYPE_GET_SLOT(&machine_spi_type, make_new)(
        (mp_obj_t)&machine_spi_type, 2, 3, args);
    #else
    mp_obj_t args[] = {
        MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_ID),
        MP_OBJ_NEW_SMALL_INT(MICROPY_HW_WIFI_SPI_BAUDRATE),
    };
    MP_STATE_PORT(mp_wifi_spi) = MP_OBJ_TYPE_GET_SLOT(&machine_spi_type, make_new)(
        (mp_obj_t)&machine_spi_type, 2, 0, args);
    #endif
robert-hh · 2023-03-30

I tried the upload tool with the Adafruit Airlift breakout, and it works well. The NINA W102 firmware works too, I just had to change the MOSI pin number from 12 to 14 at the end of the file nina-fw-1.5.0-Arduino/arduino/libraries/SPIS/src/SPIS.cpp.

About SPI & Pin configuration in the MP NINA_W10 driver. Instead of or in addition to the configuration in the header file, a config method like wlan.interface(....) may be useful, allowing a kind of generic WiFi build.

iabdalkader · 2023-03-30

I tried the upload tool with the Adafruit Airlift breakout, and it works well. The NINA W102 firmware works too, I just had to change the MOSI pin number from 12 to 14 at the end of the file nina-fw-1.5.0-Arduino/arduino/libraries/SPIS/src/SPIS.cpp.

Sadly I can't fix that, but maybe it can be made configurable somehow with an ioctl maybe ?

About SPI & Pin configuration in the MP NINA_W10 driver. Instead of or in addition to the configuration in the header file, a config method like wlan.interface(....) may be useful, allowing a kind of generic WiFi build.

You can make it configurable if you like, yes that would be a good change.

robert-hh · 2023-03-30

Sadly I can't fix that, but maybe it can be made configurable somehow with an ioctl maybe ?

I was not asking to fix that. Since that affects SPI communication, changing that on-the-fly would require a different comms channel than SPI.

You can make it configurable if you like, yes that would be a good change.

I have to think about that. At the moment, it is not urgent, since including the WiFi package would anyhow require compiling a new firmware, and then changing the board files is not that much of an overhead.

b.t.w.: Together with Bluetooth, the firmware size increase is 100k. That's quite a bit.

iabdalkader · 2023-03-30

b.t.w.: Together with Bluetooth, the firmware size increase is 100k. That's quite a bit.

I'm not following, do you mean the MicroPython firmware when using this driver or the Nina firmware size ?

robert-hh · 2023-03-30

I mean the MicroPython firmware size. The NINA firmware size is something > 1MByte.
The MicroPython firmware size for SAMD51 went up from ~260k to ~360k.

iabdalkader · 2023-03-30

The MicroPython firmware size for SAMD51 went up from ~260k to ~360k.

The driver and module alone can't be taking that much, maybe it's the nimble stack ? Try this maybe you can find out what's taking the space.

arm-none-eabi-nm -S --size-sort --demangle firmware.elf
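
To turn that listing into per-component totals, the output can be post-processed. A rough sketch, assuming the usual `address size type name` columns printed by `nm -S`; the sample lines and symbol names below are made up for illustration:

```python
from collections import defaultdict


def size_by_prefix(nm_output: str, prefixes):
    """Sum symbol sizes from `nm -S` output, grouped by symbol-name prefix."""
    totals = defaultdict(int)
    for line in nm_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # symbol without a size column, or a malformed line
        _addr, size_hex, _type, name = parts
        try:
            size = int(size_hex, 16)
        except ValueError:
            continue
        for prefix in prefixes:
            if name.startswith(prefix):
                totals[prefix] += size
                break
        else:
            totals["other"] += size
    return dict(totals)


# Made-up sample in the assumed column layout.
SAMPLE = """\
00004000 00000010 T nina_send
00004010 00000200 T ble_gap_connect
00004210 00000100 T ble_hs_init
20000000 00000004 b some_flag
"""
result = size_by_prefix(SAMPLE, ["ble_", "nina_"])
print(result)
```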
robert-hh · 2023-03-30

Of course it's the nimble stack and the aioble scripts.

robert-hh · 2023-03-31

There could be a way for the NINA/Airlift module to probe by itself which pin MOSI is connected to, assuming that unused pins are left open. It could set one of the pins (like 12) to input mode with alternating pull-up and pull-down and read back the value. An open pin would follow the pull level, while a pin connected to the host's MOSI would stick at one level.
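
The probing logic can be sketched independently of any hardware. Here `read_with_pull` is a hypothetical callable that configures the pin as an input with the given pull and returns the level read; on a real target it would wrap the platform's GPIO API. This only illustrates the idea and has not been tried on a NINA or Airlift module:

```python
PULL_UP = 1
PULL_DOWN = 0


def is_floating(read_with_pull, rounds: int = 4) -> bool:
    """Return True if the pin level follows the pull direction on every read."""
    for _ in range(rounds):
        if read_with_pull(PULL_UP) != 1:
            return False
        if read_with_pull(PULL_DOWN) != 0:
            return False
    return True


# Simulated pins, for demonstration only.
def open_pin(pull):
    return pull  # an unconnected pin follows the pull resistor


def driven_low_pin(pull):
    return 0  # a pin driven by the host ignores the weak pull


print(is_floating(open_pin))        # True
print(is_floating(driven_low_pin))  # False
```

A pin with a strong external pull-up would also read as "not floating" here, so the result only says whether something else is holding the line.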

About configuring the interface, one could use wlan.config() with a "interface" keyword, expecting a tuple of values with spi-object, CS, Busy and RST pins and maybe more, which then can be used by bsp_init().

I skipped bluetooth for the moment and want to set up WiFi first.

robert-hh · 2023-03-31

I noticed that in the NINA support wlan.config(args) is already used for setting up the AP. So I'd better add a new method, like wlan.wiring(). That's clearer and easier to implement, since it can have just a fixed set of arguments (4 or 6).

iabdalkader · 2023-03-31

Sorry I was thinking of a completely different WiFi module. I don't think we can make those pins configurable in the Nina firmware, you need a working SPI first to communicate with the ESP, so this won't work.

iabdalkader · 2023-03-31

@robert-hh Another idea, you can just take this commit:

https://github.com/arduino/nina-fw/commit/e8c31635e5d9fd03194ccf506b94496487edfb51

Push it to the Adafruit Nina-FW fork here: https://github.com/adafruit/nina-fw

That's the only change needed to make the firmware compatible with MicroPython's modnetwork (and a valid firmware version), and it doesn't affect the old ABI.

robert-hh · 2023-03-31

@iabdalkader But that would not resolve the need of loading a different firmware to a Adafruit module. It would make a difference if that is in the Adafruit firmware by default. And in both cases we have to provide a binary image with the boards instead of pointing to NINA or Adafruit. But providing an own image would anyhow be helpful for people just using a plain vanilla ESP32 module.

iabdalkader · 2023-03-31

Yes you're right, you will need to upload a new firmware anyway for older modules, but future modules (if any) will just work out of the box, but the firmware may change again which doesn't really fix the issue. I think one more option is to automate the firmware build/release on github, make the pins compile-time configurable, then multiple binaries can be released maybe. That involves some work not sure if there's enough interest in reusing the firmware to justify that.

robert-hh · 2023-03-31

That would be an option for the WiFi module firmware. I was just considering an initial simple step of having one firmware being able to detect whether it should use GPIO12 or 14 for MOSI. That could be just a few lines of C code in a C++ file. Only my C++ knowledge never existed, even if I have had Bjarne Stroustrup's book on my bookshelf since it was published in 1987.

At the moment I work at changing network_ninaw10.c and ninaw10_bsp.* to have the interface pins configurable, not have it port specific and keep it backward compatible. Too many tiny differences between ports.

iabdalkader · 2023-03-31

. I was just considering an initial simple step of having one firmware being able to detect, whether it should use GPIO12 or 14 for MOSI. That could be just a few lines of C code in a C++ file.

I'd send it in but I don't think it will work, on the Nano RP2040 for example pin 14 is connected to I2C and pulled-up, so I don't think there's a way to reliably probe the pins. However, if the firmware gets used on something else and becomes that popular, I can give the workflow a shot (we could make the firmware have build targets for example).

At the moment I work at changing network_ninaw10.c and ninaw10_bsp.* to have the interface pins configurable, not have it port specific and keep it backward compatible. Too many tiny differences between ports.

That would be a great change, thank you!

EDIT @robert-hh How about this, for now we can just attach the binary to the firmware release ?

robert-hh · 2023-03-31

on the Nano RP2040 for example pin 14 is connected to I2C

At the Airlift devices, Pin 12 is not connected. So if that one is floating, one can assume it's an airlift device. Then Pin 14 has to be used for MOSI.

for now we can just attach the binary to the firmware release ?

Would that be a separate entry? At the moment, a firmware release is a single file. And the WiFi firmware would be used by several boards & ports. That's a novelty.

robert-hh · 2023-03-31

I have a branch for configuring the NINA wifi modules pin connections. It's at https://github.com/robert-hh/micropython/tree/drivers_ninaw10. The last commit represents the final state of the files:
drivers/ninaw10/nina_wifi_bsp.c, drivers/ninaw10/nina_wifi_drv.h and extmod/network_ninaw10.c.
Still works with the Arduino Nano Connect 2040, and my trial setup for SAMD.

iabdalkader · 2023-03-31

At the Airlift devices, Pin 12 is not connected. So if that one is floating, one can assume it's an airlift device. Then Pin 14 has to be used for MOSI.

Still sounds too board-specific, then some other module will be using another pin and we'd have to add special code for it, I can propose the change though and see what happens.

I have a branch for configuring the NINA wifi modules pin connections

It looks good to me, but I don't understand why do you need to set the pins dynamically (from Python) like that when you can just define the pins used by the board in its mpconfigboard.h file ?

robert-hh · 2023-03-31

then some other module will be using another pin

Possible, but at the moment the only separate boards which are easy to use without creating a new board are the Adafruit Airlift ones or generic ESP32 modules. But if we have to provide the firmware anyhow, we can provide two versions. Airlift and NINAW102. Generic ESP32 can work with both.

but I don't understand why do you need to set the pins dynamically (from Python)

Because it should be able to add the WiFi module to a board as an external device, like one would attach a sensor, without requiring to use a specific pin set. The driver would be in the firmware, but not the way how it is attached. That way I can add WiFi e.g. to all samd51 or mimxrt boards.
b.t.w: it now seems to me easier to add the wiring parameters to the wlan constructor and not have a separate method.

By the way. At which place in the code is the UART for Bluetooth configured?

iabdalkader · 2023-03-31

But if we have to provide the firmware anyhow, we can provide two versions. Airlift and NINAW102. Generic ESP32 can work with both.

That's another option yes, okay will raise the issue and see where it goes.

Because it should be able to add the WiFi module to a board as an external device

Ah now I see what you're trying to do, yes that makes sense.. BLE UART is configured in mpbthciport.c

robert-hh · 2023-03-31

raise the issue

?

iabdalkader · 2023-03-31

@robert-hh I will ask the maintainer and propose the changes and see what they say about it, or if you want you can go ahead and open an issue instead in the nina-fw repo.

robert-hh · 2023-04-01

@iabdalkader Looking at possible business interests, it seems to me that the best chance would be to ask Adafruit to add your BSD extension https://github.com/arduino/nina-fw/commit/e8c31635e5d9fd03194ccf506b94496487edfb51 into their firmware. That would extend their possible customer range for selling Airlift components. And it increases the functionality. In contrast, NINA should have no interest in their software running on Adafruit hardware.

b.t.w.: I moved the wiring definition to the WLAN constructor. Defining the GPIO0 pin is not required for WiFi operation. It may be sufficient to keep it pulled-up at the WLAN modem. I have to check in the driver whether the uses of it can be dropped. At least I can disconnect it without any change in behavior. It is read in nina_bsp_read_irq(), but that function is not called anywhere. It is of course required for firmware upgrade.

iabdalkader · 2023-04-01

It seems to me that the best chance would be to ask Adafruit to add your BSD extension https://github.com/arduino/nina-fw/commit/e8c31635e5d9fd03194ccf506b94496487edfb51 into their firmware

Yes that would make their boards work with MicroPython, you can cherry-pick the commit and push it in a PR if you like, note their firmware version needs to be at least 1.5.0.

Defining the GPIO0 pin is not required for WiFi operation

It might be used in the future so I suggest you keep it, one extra pin object won't hurt, otherwise I like the idea of making the driver both compile-time and run-time configurable, thanks!

robert-hh · 2023-04-02

A few notes:

  • It turns out that trying to apply the BSD extension https://github.com/arduino/nina-fw/commit/e8c31635e5d9fd03194ccf506b94496487edfb51 causes a few non-trivial merge conflicts. I did not dig further into that.
  • Adding NINA WiFi support to the mimxrt port seems to be possible only for boards without Ethernet support. The latter use LWIP, and there seems to be a collision with the socket support provided by the NINA firmware. But at least for the other boards one can get simple network capability.
  • It seems that with NINA the block size for socket.send/recv is limited to 4096.
  • The whole mechanical set-up is not stable. Wiring on breadboards is critical. Too many wires, and often at least one has a bad connection.
robert-hh · 2023-04-03

@iabdalkader So I have some sign of response from the bluetooth part of the NINA module. It took me quite a few moments to determine, that the NINA module has to be reset in a special mode to switch to bluetooth. About that, I have a few questions, which you seemed to have solved for the Arduino port.

  • Switching to WiFi or Bluetooth requires a reset with the CS pin either high or low. I see that sequence in the WiFi driver, but not in the Bluetooth driver. But Bluetooth works on the Arduino board. Does it start by default in that mode, or is that mode selection somewhere else? I see that the UART is set up in mpbthciport.c. At that place I would have expected the mode selection. But it could as well be done by Python code.
  • The NINA firmware uses the same pin for MOSI/RTS and for ACK/CTS. While the latter can be freely assigned (it's just Pin objects), having SPI MOSI and UART RTS on the same pin is very hardware-specific and fits the RP2 device, but not generally. Is the NINA firmware generic or made specifically to match the RP2? It is of course possible to use different pins in the NINA firmware, and RTS can as well be set statically.
iabdalkader · 2023-04-03
  • Switching to WiFi or Bluetooth requires a reset with the CS pin either high or low. I see that sequence in the WiFi driver, but not in the Bluetooth driver

That sequence is in drivers/ninaw10/nina_bt_hci.c

  • Is the NINA firmware generic or made specifically to match the RP2. It

I don't think so, the Nina firmware predates the RP2040 chip.

robert-hh · 2023-04-03

That sequence is in drivers/ninaw10/nina_bt_hci.c

Thanks. I had not included that file, and due to the weak links there was no error thrown. The comment below is only valid for the Arduino Nano Connect board, which wires the GPIO1/CS and RX signals to the same RP2040 pin.

    // The UART must be re-initialize here because the GPIO1/RX pin is used initially
    // to reset the module in Bluetooth mode. This will change back the pin to UART RX.
iabdalkader · 2023-04-04

The comment below is only valid for the Arduino Nano Connect board, which wires the GPIO1/CS and RX signals to the same RP2040 pin

Yes that's only for the Nano, but re-initializing the uart for any board should be harmless, so should work if the pin is multiplexed or not.

robert-hh · 2023-04-04

Yes that's only for the Nano, but re-initializing the uart for any board should be harmless, so should work if the pin is multiplexed or not.

Kind of. If the UART is initialized before calling BLE(), since the UART object was supplied, it will not be re-initialized. The problem is, that the parameters are port- and board-specific. Even if all ports in question have a UART-init method, not all might support hardware flow control, and even if, the pins may not be available. So re-init is only done where the UART parameters are hardcoded in the board's config files.
Using HW flow control seems not to be needed. Just RTS (MISO) must be set low. But I have to test more.

But if the UART is not touched any more, then reset causes the ESP32 boot message to stick in the UART input buffer. So I had to add a little loop into the function nina_hci_cmd() to flush the input buffer before a command is sent to the BLE.
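
For illustration, the flush described above can be sketched in Python. This is only a model: `FakeUART`, `flush_input` and `send_command` are hypothetical names, and the actual change was made in the C function nina_hci_cmd(). The sketch merely assumes a UART object offering `any()`, `read()` and `write()`.

```python
class FakeUART:
    """Hypothetical stand-in for a UART with any()/read()/write()."""

    def __init__(self, pending=b""):
        self._rx = bytearray(pending)
        self.sent = bytearray()

    def any(self):
        # Number of bytes waiting in the receive buffer.
        return len(self._rx)

    def read(self, n):
        data = bytes(self._rx[:n])
        del self._rx[:n]
        return data

    def write(self, data):
        self.sent += data
        return len(data)


def flush_input(uart):
    """Discard everything in the receive buffer; return the number of bytes dropped."""
    dropped = 0
    while uart.any():
        dropped += len(uart.read(uart.any()))
    return dropped


def send_command(uart, packet):
    # Drop stale bytes (for example a boot message) before sending.
    flush_input(uart)
    return uart.write(packet)


uart = FakeUART(pending=b"stale boot text\r\n")
send_command(uart, b"\x01\x03\x0c\x00")  # HCI Reset command packet
```

Flushing right before each command keeps a leftover boot message from being parsed as the response to that command.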

iabdalkader · 2023-04-04

The problem is, that the parameters are port- and board-specific. Even if all ports in question have a UART-init method, not all might support hardware flow control, and even if, the pins may not be available. So re-init is only done where the UART parameters are hardcoded in the board's config files.

If reinit is the problem you can make it configurable (NINA_REINIT_UART_ENABLE or something like that), but there's usually a global mp_bthci_uart handle I'm not sure how you can override that.

robert-hh · 2023-04-04

If reinit is the problem

Reinit is not a problem per se. It is just not needed if RX and CS (or GPIO1) do not share the same pin, as the Arduino board does. And of course one can use the UART handle to call uart.init(). It's just that the number and names of the parameters are board&port-specific, and I want to keep that out of the code as far as possible.

robert-hh · 2023-04-04

The actual BLE for SAMD branch is here: https://github.com/robert-hh/micropython/tree/samd_ble. It surely needs some clean-up and straightening. ble.active(1) works, I have to test actual operation.

robert-hh · 2023-04-05

After testing the WiFi support with various hardware set-ups, it seems that these two lines from nina_wifi_bsp.c cause trouble.

    mp_hal_pin_write(MICROPY_HW_NINA_GPIO0, 0);
    mp_hal_pin_input(MICROPY_HW_NINA_GPIO0);

They set GPIO0 low, which does not happen on a typical ESP32 board. And they sometimes cause the module to fall into the bootloader if it takes too long to start after reset. Deleting these lines makes the startup more robust. Besides that, it is a good approach to force GPIO0 high, even if it has an internal pull-up. The Bluetooth init does not set GPIO0.

iabdalkader · 2023-04-05

I followed the same sequence that was in the Arduino C++ driver, but if it's not needed feel free to remove it. We still need to test that WiFi/BT works without it; I will give it a test tomorrow and report back.

robert-hh · 2023-04-06

It may be that the major problem here was a pin config specialty at the mimxrt boards, where some pins of the connector are defined, but not wired in the PCB.
I scanned as well through the NINA wifi firmware, and the only pin that is read by normal code is pin 5. GPIO0 is used for CTS if UNO_WIFI_REV2 is set. For RP2040 pin 14 is used for that purpose. With UNO_WIFI_REV2, also Pin 19 is used. Strangely enough, Pin 19 is also connected on the SAMD M4 Express Airlift board for RTS. It may be used by the Adafruit firmware.
I have a few more code size figures.

WiFi + utilities: 20k
BLE + nimble stack: 80k
mbedtls: 100k

So having WiFi + mbedtls by default is only feasible for SAMD51x20 and mimxrt boards, until the PR using external flash is merged. There still seem to be problems with the network tests and asyncio. The SAMD boards simply stall. The mimxrt boards continue working, but complain at some point about errors in the test scripts, like in ssl_errors.py. So it's progressing with the usual hiccups.

iabdalkader · 2023-04-06

GPIO0 is used for CTS if UNO_WIFI_REV2 is set

That's correct, the firmware supports other boards. Well, if it's not used for the Nano-RP2040 it's safe to remove that init sequence.

mbedtls: 100k

Might want to try disabling MBEDTLS_ECP_NIST_OPTIM

robert-hh · 2023-04-06

Might want to try disabling MBEDTLS_ECP_NIST_OPTIM

It was not enabled, and the difference is ~3kB. mbedtls is simply large.

robert-hh · 2023-04-06

It seems that GPIO0 was intended to be used as IRQ line from the NINA to the host. But there is no sign of it being used.

iabdalkader · 2023-04-06

Yes that IRQ feature seemed useless, and it did seem like GPIO0 is not used, but I just followed that init sequence like I said. It seems the important I/O is MICROPY_HW_NINA_GPIO1; in that case GPIO0 can be removed, but we still need to test WiFi and BT just to confirm they still work with it. You can remove it for now in your open PR if you like.

robert-hh · 2023-04-07

This lousy GPIO0! Looking into the NINA firmware again, the GPIO pin is actually used to signal internal states in void CommandHandlerClass::updateGpio0Pin(). There, the pin is called GPIO_IRQ. It is set as an output when the command handler starts, and signals internal states. Only the MP driver does not use it. So if connected, the MP driver has to set that pin as an input. I see no place in the NINA firmware where GPIO0 is read as input.
So I suggest to keep it out of the interface definition, and even if it is defined, just set it as input. That avoids potential collisions of either a too-slow-starting Wifi module, which falls into bootloader mode, or two outputs switched against each other, the one low, the other high.

robert-hh · 2023-04-08

@iabdalkader Just to stress that non-related PR again: How did you test BLE when porting to the Arduino Nano Connect? Did you use the tests from micropython/tests, and what was your test set-up?

iabdalkader · 2023-04-08

How did you test BLE when porting to the Arduino Nano Connect? Did you use the tests from micropython/tests, and what was your test set-up?

No, I never ran the standard BLE tests, but it might be a good idea to start doing so. I use the old ble_advertising.py and this blinky example and nrfconnect:

https://github.com/openmv/openmv/blob/master/scripts/examples/10-Arduino-Boards/Nano-RP2040/04-Bluetooth/ble_blinky.py

I remember that aioble had an issue, that was a long time ago it might be fixed now.

robert-hh · 2023-04-08

Thanks. Blinky works. I'll try a few other examples.

robert-hh · 2023-04-09

I tried to run the tests from multi-bluetooth using the command line:
./run-multitests.py -i pyb:a0 -i pyb:a1 multi_bluetooth/*.py
One has to use two boards as test pair. The results are strange.

  • Using a SAMD and the Arduino RP2040 as test pair, 2 or 4 of the 16 tests pass, all others fail.
  • Using a SAMD and a PYBD_SF6 as test pair, 2 or 4 of the 16 tests pass, all others fail.
  • Using a PYBD as instance 0, the test crashes.
  • Best results are with a ESP32 as instance 0 and PYBD as instance 1, where 12 of 16 tests pass.

Typical error conditions are timeout, like in the log below.

./run-multitests.py -i pyb:a0 -i pyb:a1 multi_bluetooth/perf_l2cap.py
multi_bluetooth/perf_l2cap.py on ttyACM0|ttyACM1: FAIL
### TEST ###
--- instance0 ---

--- instance1 ---

Traceback (most recent call last):
  File "<stdin>", line 211, in <module>
  File "<stdin>", line 127, in instance1
  File "<stdin>", line 58, in wait_for_event
ValueError: Timeout waiting for 23

### TRUTH ###
--- instance0 ---

--- instance1 ---

### DIFF ###
--- /tmp/tmpkj17n9m1    2023-04-09 11:34:10.358979270 +0200
+++ /tmp/tmptmr8x_1s    2023-04-09 11:34:10.358979270 +0200
@@ -2,3 +2,9 @@
 
 --- instance1 ---
 
+Traceback (most recent call last):
+  File "<stdin>", line 211, in <module>
+  File "<stdin>", line 127, in instance1
+  File "<stdin>", line 58, in wait_for_event
+ValueError: Timeout waiting for 23
+
1 tests performed
0 tests passed
1 tests failed: multi_bluetooth/perf_l2cap.py
iabdalkader · 2023-04-09

multi_bluetooth/perf_l2cap.py

You could try to increase the default timeout in that script from 1000 ms to maybe 5000 ms and see if it helps ? Note comparing those different ports to rp2 port might not be very useful.
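
For illustration, the failing call in the traceback is a polling wait with a deadline. Below is a simplified CPython sketch of such a helper; it is not the code from the test script, and the `events` dict and the function signature are assumptions made for this example.

```python
import time


def wait_for_event(events, event, timeout_ms):
    """Poll `events` (a dict filled elsewhere, e.g. by an IRQ handler)
    until `event` appears, or raise once `timeout_ms` has elapsed."""
    deadline = time.monotonic() + timeout_ms / 1000
    while time.monotonic() < deadline:
        if event in events:
            return events.pop(event)
        time.sleep(0.001)
    raise ValueError("Timeout waiting for {}".format(event))


events = {7: "payload"}
print(wait_for_event(events, 7, timeout_ms=50))
```

A larger timeout only helps when the event arrives late. If the event never arrives, the failure is merely delayed.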

robert-hh · 2023-04-09

I tried already before posting to increase the timeout - no change. And as rp2 board to compare with a SAMD port I use the Arduino Nano rp2040 connect, which is intended to have the same software set-up.
The test success rate is not the same when redoing the test runs, which is bad as well.

iabdalkader · 2023-04-09

Yes but we don't have any other rp2 bluetooth board to compare to (is PICO_W BT support working?)

It seems both fail on wait_for_event so gets stuck, it might have something to do with the irq function or the default Bluetooth config ?

if(MICROPY_PY_BLUETOOTH)
    list(APPEND MICROPY_SOURCE_PORT mpbthciport.c)
    target_compile_definitions(${MICROPY_TARGET} PRIVATE
        MICROPY_PY_BLUETOOTH=1
        MICROPY_PY_BLUETOOTH_USE_SYNC_EVENTS=1
        MICROPY_PY_BLUETOOTH_ENABLE_CENTRAL_MODE=1
        MICROPY_PY_BLUETOOTH_ENABLE_PAIRING_BONDING=1
        MICROPY_PY_BLUETOOTH_ENABLE_L2CAP_CHANNELS=1
    )
endif()

I'd change those and test see if they make any difference.

robert-hh · 2023-04-09

is PICO_W BT support working?

There is a PR and from the communication it seems to be working. I could try that. But the test scripts even fail when using two PYBD_SFx boards. I tried a combination of PYBD_SF2 and PYBD_SF6.
Next thing would be to look into the irq/callback/scheduling path. With the Arduino, you used an alarm timer, with the SAMD/MIMXRT, I used the software timer mechanism just like PYBD. But with respect to the test scripts, there is no difference between Arduino-rp2 and SAMD.
There are subtle differences with respect to the test scripts. For instance the SAMD port does not have a deepsleep() method. And so on.

robert-hh · 2023-04-09

If I skip one test (ble_deepsleep.py), the test runs well with two PYBD boards:

15 tests performed
14 tests passed
1 tests failed: multi_bluetooth/stress_log_filesystem.py

With a PICO W & PR10739 build, the figures are slightly worse:

15 tests performed
8 tests passed
4 tests skipped: multi_bluetooth/ble_gap_pair.py multi_bluetooth/ble_gap_pair_bond.py multi_bluetooth/ble_l2cap.py multi_bluetooth/perf_l2cap.py
3 tests failed: multi_bluetooth/ble_mtu.py multi_bluetooth/ble_subscribe.py multi_bluetooth/stress_log_filesystem.py

So the test itself works, at least for PYBD. So let's see if we can find the reason for the timeout, using the NINA firmware approach.

robert-hh · 2023-04-09

I'd change those and test see if they make any difference.

MICROPY_PY_BLUETOOTH_USE_SYNC_EVENTS and MICROPY_PY_BLUETOOTH_ENABLE_PAIRING_BONDING are unconditionally set in nimble.mk. Changing that causes the build to fail.
MICROPY_PY_BLUETOOTH_ENABLE_CENTRAL_MODE and MICROPY_PY_BLUETOOTH_ENABLE_L2CAP_CHANNELS can be enabled or not, but affect functionality and timing. Initially, I did not have them enabled.

iabdalkader · 2023-04-09

./run-multitests.py -i pyb:a1 -i pyb:a0 multi_bluetooth/ble_characteristic.py

This seems to work if rp2 side is central role, and fails if it's peripheral.

iabdalkader · 2023-04-09

@robert-hh Found the issue, it's add_alarm_in_ms running out of timer slots, needs to be canceled first.
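
For illustration, here is a toy Python model of why rescheduling without cancelling can exhaust a fixed-size alarm pool. Everything in it is an assumption made for the sketch: `AlarmPool` is not the pico-sdk API, the pool size of 4 is arbitrary, returning -1 for "no free slot" is just the convention chosen here, and alarms never fire in this model, so every uncancelled alarm keeps its slot.

```python
class AlarmPool:
    """Toy model of a fixed-size alarm pool (hypothetical, not the pico-sdk API)."""

    def __init__(self, slots):
        self._slots = slots
        self._pending = {}
        self._next_id = 1

    def add_alarm(self, delay_ms):
        if len(self._pending) >= self._slots:
            return -1  # no free slot
        alarm_id = self._next_id
        self._next_id += 1
        self._pending[alarm_id] = delay_ms
        return alarm_id

    def cancel_alarm(self, alarm_id):
        # True if a pending alarm was removed, freeing its slot.
        return self._pending.pop(alarm_id, None) is not None


def reschedule(pool, current_id, delay_ms, cancel_first):
    # Cancelling the still-pending alarm releases its slot before a new one is taken.
    if cancel_first and current_id > 0:
        pool.cancel_alarm(current_id)
    return pool.add_alarm(delay_ms)


leaky, fixed = AlarmPool(slots=4), AlarmPool(slots=4)
leaky_id = fixed_id = -1
for _ in range(6):
    leaky_id = reschedule(leaky, leaky_id, 128, cancel_first=False)
    fixed_id = reschedule(fixed, fixed_id, 128, cancel_first=True)
print(leaky_id, fixed_id)
```

In this model the pool that never cancels fails on the fifth request, while the pool that cancels first keeps a single slot in use.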

robert-hh · 2023-04-09

multi_bluetooth/ble_characteristic.py works often with the SAMD port in both roles, but not always.

it's add_alarm_in_ms running out of timer slots

Excellent. Did you try it with the other test cases as well? Ignore ble_deepsleep.py. That one is weird and fails on the PYBD as well.

iabdalkader · 2023-04-09

It's passing most tests but there's an issue with UART needs to be flushed like you mentioned due to soft-resets maybe, note even if GPIO0 is not changed I get some unexpected HCI packets. I'm working on it now will push a small fix later.

robert-hh · 2023-04-09

For the SAMD, it actually passes 10 out of the 15 tests when in central mode. UART flush may be an issue. GPIO0 is not touched at all for bluetooth, and there is no activity at that line during the tests.
Edit: That's the summary for SAMD as central:

15 tests performed
10 tests passed
5 tests failed: multi_bluetooth/ble_l2cap.py multi_bluetooth/ble_mtu.py multi_bluetooth/ble_subscribe.py multi_bluetooth/perf_gatt_char_write.py multi_bluetooth/perf_l2cap.py
robert-hh · 2023-04-09

Looking through the code here I see that mp_bluetooth_hci_poll() is simply not called periodically by systick or UART RX irq as it should be. STM32 calls it in the RX irq, but I have to add it into the systick timer handler. That's what always puzzled me: how does the BT stack get notified of new events? Now I know: by polling.

iabdalkader · 2023-04-09

I think it's mostly timing issues, these tests were probably designed for ST chips, with so many timing assumptions, anyway I had to increase the timeout for a few tests and fix some others which fail even on the PYBD_SF2 side, some fixes:

  • ble_deepsleep.py SKIP
  • ble_l2cap.py increase timeout
  • perf_gatt_notify.py increase timeout
  • perf_l2cap.py increase timeout
  • perf_gatt_char_write.py fails with ENOMEM (reduce the number of notifications it works)
  • ble_gatt_data_transfer.py add a small delay after wait_for_event(_IRQ_GATTC_CHARACTERISTIC_DONE
  • ble_gap_device_name.py add a small delay

Now with my local changes peripheral:

16 tests performed
11 tests passed
1 tests skipped: multi_bluetooth/ble_deepsleep.py
4 tests failed: multi_bluetooth/ble_gap_pair_bond.py multi_bluetooth/ble_mtu.py multi_bluetooth/perf_gatt_notify.py multi_bluetooth/stress_log_filesystem.py

Central:

./run-multitests.py -i pyb:a1 -i pyb:a0 multi_bluetooth/*py
16 tests performed
13 tests passed
1 tests skipped: multi_bluetooth/ble_deepsleep.py
2 tests failed: multi_bluetooth/ble_gap_advertise.py multi_bluetooth/perf_gatt_char_write.py

I'll do some more tests and get back to you.

robert-hh · 2023-04-10

That looks much better. Did you change anything of the Arduino RP2 software? You mentioned that in an earlier post.
b.t.w: The ble_deepsleep.py test has a note, that it only applies to devices which do not disconnect USB on calling deepsleep, like the early ESP32 with UART on REPL.
Edit: The SAMD often locks up at the end of the function ble_hs_stop_done(). It processes a linked list there, and that list seems broken, pointing at itself.

iabdalkader · 2023-04-10

Yes I made some minor fixes and was going to push a PR, but I also see random lockups when running tests, not sure why but I'm investigating.

robert-hh · 2023-04-10

If I stop the code with the debugger when it locks up, it's in ble_hs_stop_done() at that loop

    SLIST_FOREACH(listener, &slist, link) {
        listener->fn(status, listener->arg);
    }

slist is a classical linked list, and when it locks up, it is at a list element where the next entry points to itself. That often happens as the result of a concurrent access, but I do not see one.
If I add a crowbar exit to that loop, it works. But that's not a good fix.

iabdalkader · 2023-04-10

Is this in the SAMD port or rp2 ?

robert-hh · 2023-04-10

SAMD. I have no debugger connected to the RP2, but I could. But there are no good connect points. Or I could just try the RP2 port with the crowbar hack I made to break the endless loop. You could try it, just as an indication. It's at the end of ble_hs_stop_done() in the file ble_hs_stop.c in the nimble lib. This is the modified loop:

    SLIST_FOREACH(listener, &slist, link) {
        listener->fn(status, listener->arg);
        // Crowbar exit: break on endless loops
        if (listener == SLIST_NEXT(listener, link)) {
            break;
        }
    }

It is not a valid fix, but it shows that this list is broken.
Edit: With that change, up to 12 tests in peripheral mode pass even without changing the timeout values in the test. The com ports swapped and I did not notice it.
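
For illustration, the self-reference check used by the crowbar exit can be modelled in Python. `Node` and `visit_all` are hypothetical names for this sketch; it only shows how a list element whose next pointer refers to itself makes a plain traversal endless, and how the check detects that.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None


def visit_all(head, limit=1000):
    """Walk a singly linked list; stop if an element's next points to itself."""
    visited = []
    node = head
    while node is not None and len(visited) < limit:
        visited.append(node.name)
        if node.next is node:
            # Same check as the crowbar exit: the list is corrupted here.
            return visited, True
        node = node.next
    return visited, False


a, b = Node("a"), Node("b")
a.next = b
b.next = b  # corrupted: element points to itself
print(visit_all(a))
```

The check only catches a one-element cycle. It reports the corruption but does not explain how the list got into that state.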

robert-hh · 2023-04-10

I have a possible reason for the list crash. When the board is in the locked state, the call stack shows an entry for pendsv as well as the call to mp_bluetooth_deinit(), which is called by the test scripts. The timer task called by pendsv polls the UART for new BT events. Handling these events can also modify the list. So we have mp_bluetooth_deinit(), which tries to shut down BT, and the timer task which processes tasks. The only port-specific place which is called during BT shutdown is mp_bluetooth_hci_controller_deinit(), but that is called AFTER mp_bluetooth_nimble_port_shutdown(), which in the end calls ble_hs_stop_done().

Long introduction, change:

I added code to stop the timer to mp_bluetooth_hci_controller_deinit() in mpbthciport.c, and in the function mp_bluetooth_deinit() in modbluetooth_nimble.c I moved the call to mp_bluetooth_nimble_port_hci_deinit() to the start of the function, before calling mp_bluetooth_nimble_port_shutdown(). That seems to fix the lock-up during deinit.

iabdalkader · 2023-04-10

Do you mean mpbthciport.c in the SAMD port? I can't find your port, so I'm not sure how you implemented mpbthciport.c, but in the rp2 port the timer schedules the callback with mp_schedule, not pendsv, and in either case the bluetooth module is configured by default to lock when needed (I actually will remove this because it's not needed). I also made fixes to the HCI UART and timer, and now I don't see any more lockups. The test count isn't great but improves a lot if timeouts are increased.

robert-hh · 2023-04-10

I mean the SAMD port, which uses softtimer like the STM32 port, and softtimer uses pendsv. But the change did not work for PYBD. So I have to use something else. I know that the rp2 port uses mp_schedule.
I have seen the locking mechanism, but it is not needed for SAMD, and I agree that the RP2 port does not need it either. The SAMD port is not in the repository, at least not the latest state.

iabdalkader · 2023-04-10

I mean the SAMD port, which uses softtimer like the STM32 port, and softtimer uses pendsv.

That's not what the locking is for, it's to protect the ringbuf if pendsv_dispatch is used to schedule mp_bluetooth_hci_poll vs mp_sched_schedule_node (SYNC events mode), it doesn't matter if pendsv is used for the timer callback, note stm32 port disables the locking, see: https://github.com/micropython/micropython/blob/master/ports/stm32/mpconfigport.h#L297

robert-hh · 2023-04-10

The SAMD code uses SYNC_EVENTS, and then the MICROPY_PY_BLUETOOTH_ENTER/_EXIT defines are practically empty. And it also uses schedule to run mp_bluetooth_hci_poll(). I wonder if/why that does not apply to the rp2 port, because it uses the same mechanism, that mp_bluetooth_hci_poll() calls mp_bluetooth_hci_poll_in_ms(128); to scan again the interfaces. The way the alarm works should not make a difference.

b.t.w: The samd_ble branch is now in the repository. Could you make a copy of the modified test scripts available for comparison of the ports?

iabdalkader · 2023-04-10

Yes the rp2 uses the same mechanism, I still get a lot of timeouts, but no crashes so far, I get between 6 and 12 tests with most recent changes here: https://github.com/micropython/micropython/pull/11234

And in nina_bt_hci.c I skip 0xFE:

        // There seems to be a sync issue with this fw/module.
        if (i == 0 && (buf[0] == 0xFF || buf[0] == 0xFE)) {
            continue;
        }
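
For illustration, the effect of that check can be modelled in Python: stray 0xFF or 0xFE bytes are dropped only while nothing has been stored yet, and the same values are kept once the packet has started. `read_packet` and the byte values in the example stream are made up for this sketch and do not reproduce the driver's read loop.

```python
def read_packet(byte_source, size):
    """Collect `size` bytes, ignoring 0xFF/0xFE seen before the first stored byte."""
    buf = bytearray()
    for b in byte_source:
        if len(buf) == 0 and b in (0xFF, 0xFE):
            continue  # stray byte, do not store it
        buf.append(b)
        if len(buf) == size:
            return bytes(buf)
    raise ValueError("source ended after {} of {} bytes".format(len(buf), size))


stream = iter([0xFF, 0xFE, 0x04, 0x0E, 0x04, 0x01])
print(read_packet(stream, 4).hex())
```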

Could you make a copy of the modified test scripts available for comparison of the ports?

I only increased TIMEOUT_MS of all scripts, disabled multi_bluetooth/ble_deepsleep.py and also disabled multi_bluetooth/stress_log_filesystem.py because it fails on pyboards too.

iabdalkader · 2023-04-10

@robert-hh Actually running the tests a few times causes instance0 (RP2) to lock up, since SAMD locks up too, I can only assume that it's probably something in the nimble stack.

robert-hh · 2023-04-10

The changes look pretty straight. Did you have the impression that increasing rxbuf is needed? I tried with 512 and did not have an impression of a change. But it should not hurt. It still locks up.

robert-hh · 2023-04-10

Actually running the tests a few times causes instance0 (RP2) to lock up

Welcome to the party. As written above, it's the list of tasks in ble_hs_stop_done() which gets corrupted, such that this function never stops. That happens in mp_bluetooth_deinit(), when at the same time mp_bluetooth_hci_poll() is called by a running alarm/timer.
But for today, I'm done.

iabdalkader · 2023-04-10

That happens in mp_bluetooth_deinit(), when at the same time mp_bluetooth_hci_poll() is called by a running alarm/timer.

but mp_bluetooth_hci_poll is Not called directly in the timer callback, it's scheduled via mp_sched_schedule_node, so it's called by the VM and can't be called at the same time as mp_bluetooth_deinit. Anyway, no matter what the module is sending (or not) this shouldn't happen, so it's a bug somewhere else.

Did you have the impression that increasing rxbuf is needed?

Maybe not but it's the same size used in stm32 port, just trying to cover all the possibilities.

robert-hh · 2023-04-10

mp_bluetooth_deinit() calls mp_bluetooth_nimble_port_shutdown(), and there is a wait loop with MICROPY_EVENT_POLL_HOOK, which AFAIK allows scheduled tasks to run.

iabdalkader · 2023-04-10

Ah I see, well if that's the case it should be an easy fix.

iabdalkader · 2023-04-11

FWIW if I comment out that MICROPY_EVENT_POLL_HOOK it still locks up, so it could be something else, but your earlier "fix" seems to work, so far no lockups. Anyway, getting good test results now, latest numbers:

./run-multitests.py -i pyb:a0 -i pyb:a1 multi_bluetooth/*py
16 tests performed
9 tests passed
2 tests skipped: multi_bluetooth/ble_deepsleep.py multi_bluetooth/stress_log_filesystem.py
5 tests failed: multi_bluetooth/ble_gap_pair.py multi_bluetooth/ble_gattc_discover_services.py multi_bluetooth/ble_mtu.py multi_bluetooth/ble_subscribe.py multi_bluetooth/perf_l2cap.py

And

./run-multitests.py -i pyb:a1 -i pyb:a0 multi_bluetooth/*py
16 tests performed
13 tests passed
2 tests skipped: multi_bluetooth/ble_deepsleep.py multi_bluetooth/stress_log_filesystem.py
1 tests failed: multi_bluetooth/ble_subscribe.py
iabdalkader · 2023-04-11

Maybe @dpgeorge can comment on this issue ?

https://github.com/micropython/micropython-lib/pull/555#issuecomment-1501772821

robert-hh · 2023-04-11

I learned as well that skipping MICROPY_EVENT_POLL_HOOK did not help. Meanwhile I made a few tests blocking packet processing in mp_bluetooth_hci_poll() if the state is MP_BLUETOOTH_NIMBLE_BLE_STATE_STOPPING. That prevents the lock-up, but the device does not reach the stopped state. So the processing is required. It's simply weird. The only good fix would prevent modifying that list concurrently.

iabdalkader · 2023-04-11

@robert-hh I noticed that I can get it to lockup very easily if I just trigger an HCI command timeout (in nina_hci_cmd), for example if you just set a mismatching baudrate, HCI command will timeout and the board locks up immediately after. Note there's MICROPY_EVENT_POLL_HOOK in that loop but it's not causing the issue, I think it actually might be the error_printf in that loop, for some reason it's causing the lockup. If I comment it out, I don't get any lockups without your linked list fix and even if all HCI commands timeout, i.e this line:

error_printf("timeout waiting for HCI packet %d\n");
robert-hh · 2023-04-11

So I try to sort it out, so I do not have to recall that again and again.

  • the list ble_hs_stop_listeners is modified by calls to ble_hs_stop_register_listener(), which gets the element to be inserted at argument.
  • ble_hs_stop_register_listener() is called by ble_hs_stop_begin() which in turn gets this to-be-added element (which can be a list) as parameter.
  • ble_hs_stop_begin() is called by ble_hs_stop() which in turn gets this to-be-added element (which can be a list) as parameter.
  • ble_hs_stop() is called by mp_bluetooth_nimble_port_shutdown() which provides a list called ble_hs_shutdown_stop_listener as argument, which is local to modbluetooth_nimble.c.

About the error_printf: that seems weird, but I'll cross-check. Lock-ups happen here pretty fast. It seems that I removed a different MICROPY_EVENT_POLL_HOOK for no improvement.

iabdalkader · 2023-04-11

There might be more than one issue, but what I'm seeing is that commenting out the error_printf fixes lockups for me even with your linked list fix reverted, and HCI command timing out, I just get no more lock ups 🤷🏼‍♂️ it might be because the printf was allocating/failing to alloc memory or something.

robert-hh · 2023-04-11

It's running here well so far. I'll let it go for at least a dozen cycles. That would be an easy fix. Did you ever see that printout text on the console?
Edit: What puzzles me is that this part of the code is run only once at startup. It is not really part of the BT stack. And there are more error_printf() in the code.

iabdalkader · 2023-04-11

It's running here well so far. I'll let it go for at least a dozen cycles. That would be an easy fix. Did you ever see that printout text on the console?

Yes, you can trigger it by just setting any wrong baudrate, it will print the error messages and lock up immediately, the next test(s) will keep failing, until the run-tests script serial read raises an error.

Edit: What puzzles me is that this part of the code is run only once at startup

Yes but if it prints even once, something breaks, maybe an assert, maybe something that leads to memory corruption, I'm not sure.

robert-hh · 2023-04-11

But that would be true for all other print statements in the function as well. I commented out the definition of error_printf. Saves a bit of code.

iabdalkader · 2023-04-11

But that would be true for all other print statements in the function as well. I commented out the definition of error_printf. Saves a bit of code.

Yes but that line is the most likely to happen (due to the 0xFF or 0xFE byte sometimes received on reset) but I was just about to tell you that you should disable error_printf like so, or replace with debug_printf anyway

#define error_printf(...)   //mp_printf(&mp_plat_print, "nina_bt_hci.c: " __VA_ARGS__)

EDIT: Actually any of the printfs could be triggering the issue, anyway the above fix works.

robert-hh · 2023-04-11

@iabdalkader As expected, the problem is not gone. It does not lock-up if I use the pre-existing timeout values of the test scripts. It locks if I increase the timeout values. Not surprisingly, this problem is very timing sensitive. And it is some relief that it does not depend on a print statement being there or not.

iabdalkader · 2023-04-11

Sigh, well then I really have no idea what the issue might be, note that whatever the Nina module is doing/not doing should not cause this to happen (i.e the stack should handle this case gracefully).

And it is some relief that it does not depend on a print statement being there or not.

Note that print statement in the nina_hci_cmd has the effect of making that function run for longer, and linger in there, increasing the likelihood of whatever is causing this to occur. Like I said before, if you set a mismatching baudrate, leaving the printfs on, you get a lockup almost immediately after the first test. I have no way of debugging the RP2040, so guess it's up to you now.

iabdalkader · 2023-04-11

@robert-hh BTW did you test the PICO_W bluetooth PR ? This is what I get:

16 tests performed
6 tests passed
6 tests skipped: multi_bluetooth/ble_deepsleep.py multi_bluetooth/ble_gap_pair.py multi_bluetooth/ble_gap_pair_bond.py multi_bluetooth/ble_l2cap.py multi_bluetooth/perf_l2cap.py multi_bluetooth/stress_log_filesystem.py
4 tests failed: multi_bluetooth/ble_gap_connect.py multi_bluetooth/ble_gatt_data_transfer.py multi_bluetooth/ble_mtu.py multi_bluetooth/ble_subscribe.py
robert-hh · 2023-04-11

Interesting. I cannot recall whether I tested it. Another note about my test here with changed timeout: if I do a power cycle before each test, it runs well.

iabdalkader · 2023-04-11

If you enable more BT options in CMake, more tests fail:

16 tests performed
7 tests passed
2 tests skipped: multi_bluetooth/ble_deepsleep.py multi_bluetooth/stress_log_filesystem.py
7 tests failed: multi_bluetooth/ble_gap_connect.py multi_bluetooth/ble_gap_pair.py multi_bluetooth/ble_gap_pair_bond.py multi_bluetooth/ble_l2cap.py multi_bluetooth/ble_mtu.py multi_bluetooth/ble_subscribe.py multi_bluetooth/perf_l2cap.py

But no lockups, but it's using a different BLE stack.

robert-hh · 2023-04-11

The number of pass and fail changes with every run.

robert-hh · 2023-04-12

Just today, Damien published a PR for the PICO W using the nimble stack. So we could check whether that locks up as well. Looking at the PR, I see that he uses a 512+5 byte buffer for TX as well. We could try that with the NINA adapter. Although it should not matter. The default TX buffer size is 256, and the timeout is 1000. So the call to write would not return unless a timeout happens. And at 115200 that is not likely to happen, especially since I have never seen any activity on the CTS line, telling that the NINA is not able to accept more data.
The good thing with the PICO W is that it has the DEBUG signals easily accessible. I just have to wire them.

iabdalkader · 2023-04-12

I could test it, but where is this PR ?

robert-hh · 2023-04-12

PR #10739, the old one. I just ran a test.

16 tests performed
7 tests passed
6 tests skipped: multi_ble_nina/ble_deepsleep.py multi_ble_nina/ble_gap_pair.py multi_ble_nina/ble_gap_pair_bond.py multi_ble_nina/ble_l2cap.py multi_ble_nina/perf_l2cap.py multi_ble_nina/stress_log_filesystem.py
3 tests failed: multi_ble_nina/ble_gap_advertise.py multi_ble_nina/ble_mtu.py multi_ble_nina/ble_subscribe.py

More to follow.

iabdalkader · 2023-04-12

That one uses btstack not nimble I think, did you change it to build with nimble ?

iabdalkader · 2023-04-12

Well either way, whatever is causing it to fail those tests is failing on the Nina too and I think the "reference" should work first before trying to fix anything.

robert-hh · 2023-04-12

Seen that. I have to try again. This one with nimble:

16 tests performed
12 tests passed
2 tests skipped: multi_ble_nina/ble_deepsleep.py multi_ble_nina/stress_log_filesystem.py
2 tests failed: multi_ble_nina/ble_gap_advertise.py multi_ble_nina/ble_mtu.py
iabdalkader · 2023-04-12

I found a way to reproduce the lockup issue more consistently, at least on the Nano, maybe it could help with debugging this issue on SAMD too ? The attached firmware blocks before initializing the BT controller (in ninafw/main/sketch.ino.cpp) if you load it with espflash, the HCI command times out, then the lockup happens and the second test will raise the serial errors.

NINA_W102-v1.5.1-Nano-RP2040-Connect.zip

robert-hh · 2023-04-12

I can try that. Besides that I will make a set-up with a genuine RP2 and a ESP32 breakout board to run the Arduino Nano firmware. Then I have easy access to all pins and can connect a debugger.

robert-hh · 2023-04-12

That approach does not work. I can build the firmware, load it and connect it to a BLE modem. That gets connected. And I can connect the debugger, even if it is only GDB at the moment, but when running the test the board fails with an error in the USB stack while trying to write a single byte to USB. That's something else.
Edit: If I run the tests slowly one-by-one, the board does not crash.

iabdalkader · 2023-04-12

Sorry I'm not following you.. The firmware attached above blocks before initializing the BT controller ( I changed it to do that) and that triggers the lockup on/after the first test for some reason. Note I'm on this branch though https://github.com/micropython/micropython/pull/11234 then I load the firmware above, and run just 1 test:

./run-multitests.py -i pyb:a0 -i pyb:a1 multi_bluetooth/ble_characteristic.py

After that the Nano is locked up, serial port doesn't respond.

robert-hh · 2023-04-12

I did not test with that firmware yet. Instead I first set up a working version of an RP2, which I can debug. That one works now, more or less. But it's a breadboard set-up, and the wiring is not reliable. So I may go back to using the SAMD. One observation which matches yours: if I power cycle or reset the host (RP2 or SAMD), the operation is way more reliable. The Nimble stack has an internal state machine, which is reset on a power cycle. So one reason may be that this state machine was not properly reset with one test, and the next test will break it. That's to be tested.

robert-hh · 2023-04-12

OK. So I loaded that firmware, and it locks up, but while waiting for a response. So it seems to be different.

iabdalkader · 2023-04-12

@robert-hh Yes, but it can happen with any HCI command that times out, so it shouldn't cause the board to lock up like that. I checked the code and it locks up because mp_bluetooth_nimble_hci_uart_wfi keeps getting called and calling mp_bluetooth_hci_uart_readchar in a tight loop for 2 seconds without calling MICROPY_EVENT_POLL_HOOK, so the USB task etc. is not serviced. If you add an event poll at the end of mp_bluetooth_nimble_hci_uart_wfi, it will not lock up. I don't know if that's the right place to add an event poll or if it fixes the other issue or not, but it was worth a shot.

robert-hh · 2023-04-12

That's indeed the place which pops up in the debugger. And fixing that is worth it on its own. Another test I made here with the SAMD set-up that locks up at about every second run: if I add a delay of 5 seconds into run-multitests.py between running each test, it does not lock up any more. That would be another indication that the problem is caused by new messages coming in while the shutdown is in progress. Still that ble_hs_stop_listeners list should not get corrupted.

robert-hh · 2023-04-13

Still that ble_hs_stop_listeners list should not get corrupted.

Some more insight. There are quite a few places from which elements are added to the list as part of command handling. These elements are typically single static data structures with just one element with a constant head address. If by chance an element is inserted twice in a row, the "next" pointer points to the same address as its head, and that is what we see.
As a proof: if I block inserting an element into the list which is already at its top, then there is no lock-up. Next step would be to analyze why that is happening.
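The effect described above can be reproduced in a few lines. This is only a minimal sketch in Python, not the C code of the BLE stack; `Listener`, `SList` and `walk` are hypothetical names made up for illustration, and they merely mimic what a head insert on a singly linked list does when the same static element is inserted twice.

```python
class Listener:
    """Stand-in for a statically allocated list element (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.next = None


class SList:
    """Minimal singly linked list with head insertion (hypothetical)."""

    def __init__(self):
        self.first = None

    def insert_head(self, node):
        # Same idea as a head insert: new node points at the old head.
        node.next = self.first
        self.first = node

    def insert_head_guarded(self, node):
        # Skip the insert if the node is already at the head of the list.
        if self.first is not node:
            self.insert_head(node)


def walk(slist, limit=10):
    """Count visited nodes, giving up after `limit` steps."""
    steps = 0
    node = slist.first
    while node is not None and steps < limit:
        node = node.next
        steps += 1
    return steps


static_listener = Listener("shutdown")

# Inserting the same element twice makes its next pointer refer to itself.
plain = SList()
plain.insert_head(static_listener)
plain.insert_head(static_listener)
self_loop = static_listener.next is static_listener
plain_steps = walk(plain)  # only stops because of the step limit

# With the guard, the second insert is ignored and the list stays finite.
static_listener.next = None
guarded = SList()
guarded.insert_head_guarded(static_listener)
guarded.insert_head_guarded(static_listener)
guarded_steps = walk(guarded)

print(self_loop, plain_steps, guarded_steps)
```

Without the step limit, the walk over the unguarded list would never terminate, which matches the lock-up seen on the board. The guard only checks the head, so it would not catch a duplicate further down the list.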

iabdalkader · 2023-04-13

If you suspect that it has something to do with the timer/scheduling, you could call the poll function directly from UART1's IRQ handler (this should work I think), and disable the timer, also the BT enter/exit can lock the scheduler.

robert-hh · 2023-04-13

The PYBD calls the poll function from the UART IRQ handler. But that's not possible for SAMD, because the UART in question is not always the same.
It's still strange. There is just one function which inserts this special list element, and that is mp_bluetooth_nimble_port_shutdown(), which in turn is only called by mp_bluetooth_deinit(). The latter is called in two cases: either directly from main.c after a soft reset, or from mp_bluetooth_init() when the init times out. The actual time window for the failure is pretty small, just a few clock cycles, because once ble_hs_stop_done() starts, it will clean out the head of the ble_hs_stop_listeners list, and the lockup cannot happen any more.

iabdalkader · 2023-04-13

I'm not sure I understand what that function is doing, so it seems like it's storing NULL in the list head but then loops over the list ? What is the point of doing that ? Does it expect a listener to be registered when unlock is called and before the loop ? Or am I missing something ?

EDIT Oh okay I see list head is copied first, but even if the store works yeah that wouldn't fix it because the list head is copied first anyway (so ble_hs_stop_listeners is corrupted).

iabdalkader · 2023-04-13

@robert-hh Sounds familiar 🤔 ?

https://github.com/apache/mynewt-nimble/issues/736

iabdalkader · 2023-04-13

This might also be related:

https://github.com/micropython/micropython/issues/10477

robert-hh · 2023-04-13

This function is called at the end of the shutdown process. During that, handlers can be registered in the list, which are then called. Here, it is only a single callback. The function saves the list head into the variable listener, sets the list head to 0 and then processes the list. The inserted list elements are single static variables. Therefore a loop is created if that element is added twice.
Setting the list head to 0 seems to fail. But magically, when I trace it in the debugger, the list head is zero again.
I have a patch at the insert function that checks whether the element is already at the head, and if so does not insert it. That's almost as crude as the previous crowbar hack, but it works:

static void
ble_hs_stop_register_listener(struct ble_hs_stop_listener *listener,
                              ble_hs_stop_fn *fn, void *arg)
{
    BLE_HS_DBG_ASSERT(fn != NULL);

    listener->fn = fn;
    listener->arg = arg;
    if (listener != SLIST_FIRST(&ble_hs_stop_listeners)) {
        SLIST_INSERT_HEAD(&ble_hs_stop_listeners, listener, link);
    }
}

But that's a bad solution: a) because it only compares against the list head, and b) because it is in the BLE stack. I would prefer a fix in the MP part of the code. I'll ask Jimmo if he has a clue. He wrote a great part of the code.
By the way: would you accept it if I rename MICROPY_HW_NINA_GPIO1 to MICROPY_HW_NINA_CS? That name has confused me all along, because it seems wrong and does not match the purpose.

robert-hh · 2023-04-13

Oh yes, the two issues you linked seem familiar. It's exactly the same situation. And no fix. I have also looked at the source repository to see whether the code has changed. But it did not.

robert-hh · 2023-04-14

I had the test running overnight for 100 iterations, and it worked fine. Between 3 and 11 tests passed. But I am now testing another variant providing a set of nodes that could be added to the ble_hs_stop_listeners list, such that the same static node is not used if the function is called more than once. That change is in extmod/bluetooth_nimble.c within the MP source tree. After adding some debug statements, it seems that this second call of shutdown happens after a 2 second timeout somewhere in the BT stack.

robert-hh · 2023-04-14

This bug drives me crazy: alternating the to-be-inserted node is NOT a fix. It seems that the same insert call is just repeated for some weird reason, if a timer interrupt happens at the wrong moment.

iabdalkader · 2023-04-14

I may get a jig for the Nano to be able to help with debugging it but may take some time. Re the pin names, they just match the pins names on the ESP32 (except for ACK), I think it will be confusing if they are renamed.

BTW when you use any temp workaround for that bug, does the test pass ratio improve ?

robert-hh · 2023-04-14

they just match the pins names on the ESP32 (except for ACK),

Actually not. MICROPY_HW_NINA_GPIO1 is connected to GPIO5, and its logical function is CS. That's why it is confusing. GPIO0 and RESET match the pin names. ACK is called BUSY by Adafruit, and its function can be interpreted either way. So it's just MICROPY_HW_NINA_GPIO1 which is weird.

when you use any temp workaround for that bug, does the test pass ratio improve ?

No. The pass ratio varies between 3 and 11.

I may get a jig for the Nano to be able to help with debugging it but may take some time.

I get more and more to the conclusion that this is not directly related to the code, but to the way the MCU behaves when it is interrupted and returns from the interrupt. As if then the interrupted statement is repeated or the stack is affected. No magic accepted!

robert-hh · 2023-04-14

So I think I understand now what happens.

At the end of the call chain for mp_bluetooth_deinit(), ble_hs_stop_done() is NOT called if there are still connections open. Instead, a timed event is scheduled to call ble_hs_stop_done() after a timeout with a default of 2000ms. If the board is restarted before that timeout happens, ble_hs_stop_done() was not called and the list ble_hs_stop_listeners was not reset. Then, the next call to mp_bluetooth_deinit() would add a list element which is already in the list, and the lockup happens eventually. That explains, why:

  • when I do a hard reset between tests, all runs fine - being static data, the list is reset.
  • when I add a delay between the individual tests, all runs fine - I used 5 seconds, and the event timeout is 2 seconds, so the clean-up event happens.

Finally, it is more a test artifact than a coding problem. Wrong: mp_bluetooth_deinit() will wait until ble_hs_stop_done() is called, since the callback setting MP_BLUETOOTH_NIMBLE_BLE_STATE_OFF is called from there.

iabdalkader · 2023-04-14

Actually not. MICROPY_HW_NINA_GPIO1 is connected to GPIO5

Ah yes you're right, I checked the schematics... Well in that case I guess it's okay to rename it if you want.

If the board is restarted before that timeout happens

That makes a lot of sense, but just wondering isn't the stack (or modbluetooth) reinitialized on soft-reboot ? Maybe it should be fixed in mp_bluetooth_deinit(); I did also notice that tests that fail in batch mode pass when run individually, I wonder if this is fixed if it could help with the test pass/fail ratio...

robert-hh · 2023-04-14

but just wondering isn't the stack (or modbluetooth) reinitialized on soft-reboot ?

It is. But the cursed list and the variable counting connections are local to ble_hs_stop.c, and there is no interface to access them directly. So the only way to reset the list is running ble_hs_stop_done(), which is also declared as static and will only be called through ble_hs_stop(), which will not call it if connections are still open, or through the timeout event.
b.t.w: 3 seconds wait between tests is sufficient.

iabdalkader · 2023-04-14

Okay at least we know the issue, great job! You should probably report it or send a fix to https://github.com/micropython/mynewt-nimble/
BTW I've been testing configuration and code changes in ninafw build to see if I can improve the test pass rate, I tried disabling sleep/low-power, faster baudrate, higher CPU frequency, increasing queues, heaps, thresholds, but nothing seems to have any detectable effect on stability. I suspect it might be the very old esp-idf, but it's not easy to build against a newer one, will try it later.

robert-hh · 2023-04-14

There seems to be a fix. Instead of waiting in mp_bluetooth_nimble_port_shutdown() for
mp_bluetooth_nimble_ble_state != MP_BLUETOOTH_NIMBLE_BLE_STATE_OFF one can wait for
ble_hs_enabled_state != BLE_HS_ENABLED_STATE_OFF. That value is set by ble_hs_stop() and ble_hs_enabled_state is public. Then, the callback which caused the trouble is not needed any more and the cursed list stays empty.
b.t.w.: My analysis above looks wrong.

robert-hh · 2023-04-14

With that change I could run the test 100 cycles, but the pass rate was lousy.
About the lockup: since mp_bluetooth_deinit() cannot terminate regularly without the cursed list being reset, my recent hypothesis is that the script is prematurely terminated by the test environment sending Ctrl-C. But I'll give that problem a break for a while. Sometimes it helps.

robert-hh · 2023-04-16

I added a PR to micropython/mynewt-nimble preventing duplicate entries in the stop_listeners list: https://github.com/micropython/mynewt-nimble/pull/2 From the program structure this is more straightforward than my other change. I hope someone will care.

iabdalkader · 2023-04-16

Yes hopefully that will be merged but might take some time to make it back to micropython. In the meantime, I'll try building the firmware with a more recent esp-idf, see if there's any improvement in the test pass rate.

robert-hh · 2023-04-16

There is a constant pattern in the test logs of failed tests:
They start with a section where tests seem to be repeated because of a timeout, but then the last pair of instance0 and instance1 logs shows the expected result. And it's both instance0 and instance1 that have timeouts.
So besides the timeout, the communication works. I have set the timeout in the tests to 10 seconds, but that did not change the general pattern.

robert-hh · 2023-04-17

Unless I get a better idea, I suspend BLE testing now. The chance that a test passes is about 50%. Few tests pass always, few never, and the others sometimes. Did you make any progress in test coverage? Or the port to a newer IDF version?

iabdalkader · 2023-04-17

Unless I get a better idea, I suspend BLE testing now. The chance that a test passes is about 50%. Few tests pass always, few never, and the others sometimes. Did you make any progress in test coverage? Or the port to a newer IDF version?

No, may get around to it tomorrow, and I'm getting a breakout so will be able to debug the board very soon.

iabdalkader · 2023-04-19

@robert-hh I tried rebuilding the firmware with a more recent esp-idf, I tried a few tags they all either don't build, or don't work or don't make any difference, anything > v3.x.x doesn't link.

robert-hh · 2023-04-19

Yes. I ran into that by accident when attempting to build the firmware. Adapting it to 4.x.x seems like a lot of work. v3.3.6 builds, but I did not try it. I'm still not sure whether the low test pass rate is a NINA firmware issue or an issue of the test scripts. In the latter case, changing the NINA firmware might not help.

iabdalkader · 2023-04-20

I'm still not sure whether the low test pass rate is a NINA firmware issue or an issue of the test scripts. In the latter case, changing the NINA firmware might not help.

Ah yes, I noticed that too, making minor changes to where events are polled, poll divisor, locking the scheduler, using a higher baudrate etc.. seem to change the test results, so I'm not sure either, but I thought I should test making changes to the firmware just in case, but none of my changes seem to have any effect.

robert-hh · 2023-04-22

From time to time I try something else with BLE. This time, I added a call to mp_bluetooth_hci_poll_now() into the UART IRQ handler and verified in the debugger that it is called. It did not improve the test pass rate. Only the communication timing got a little bit more compact.

robert-hh · 2023-04-24

This night I again ran the test suite 100 times, with a SAMD board, a MIMXRT board and the Arduino Nano RP2040 Connect as test vehicles. Results below. They show that at least for the MIMXRT and RP2 boards all tests pass sometimes, with varying probability. The test for perf_gatt_notify is interesting insofar as the test rating was often fail, but the line showing that the test was indeed successful was printed. In that case I rated it as pass. It seems that the termination of the test failed. That may be the case for other tests as well.

Total of 100          SAMD51 ItsyBitsy M4

     77        77 %   ble_characteristic
    100       100 %   ble_gap_advertise
    100       100 %   ble_gap_connect
     63        63 %   ble_gap_device_name
     40        40 %   ble_gap_pair_bond
     54        54 %   ble_gap_pair
     55        55 %   ble_gattc_discover_services
     63        63 %   ble_gatt_data_transfer
     83        83 %   ble_l2cap
      0         0 %   ble_mtu
     57        57 %   ble_subscribe
     94        94 %   perf_gatt_char_write
     58        58 %   perf_gatt_notify
     58        58 %   perf_l2cap

     64        64 %   Avg

Total of 100          Teensy 4.0

     83        83 %   ble_characteristic
    100       100 %   ble_gap_advertise
    100       100 %   ble_gap_connect
     25        25 %   ble_gap_device_name
     68        68 %   ble_gap_pair_bond
     50        50 %   ble_gap_pair
     47        47 %   ble_gattc_discover_services
     62        62 %   ble_gatt_data_transfer
     85        85 %   ble_l2cap
      3         3 %   ble_mtu
     60        60 %   ble_subscribe
     92        92 %   perf_gatt_char_write
     34        34 %   perf_gatt_notify
     78        78 %   perf_l2cap

     63        63 %   Avg

Total of 100          Arduino Nano RP2040 Connect

     40        40 %   ble_characteristic
    100       100 %   ble_gap_advertise
    100       100 %   ble_gap_connect
     71        71 %   ble_gap_device_name
     71        71 %   ble_gap_pair_bond
     53        53 %   ble_gap_pair
     72        72 %   ble_gattc_discover_services
     52        52 %   ble_gatt_data_transfer
     31        31 %   ble_l2cap
      1         1 %   ble_mtu
     33        33 %   ble_subscribe
     91        91 %   perf_gatt_char_write
     42        42 %   perf_gatt_notify
     31        31 %   perf_l2cap

     56        56 %   Avg

iabdalkader · 2023-04-26

It seems that ble_mtu may hold a clue as to why the tests fail, since it almost always fails. I don't have any progress on my side. I did try something which seemed promising: I tested removing the soft-reboot loop (not breaking from the for loop). The first few tests seem to always pass, but then tests start to fail. If I have more time this week, I will try to get it working with btstack and see if it makes any difference.

robert-hh · 2023-04-26

I gave btstack a short attempt, but it had a lot of compile errors. It may work better. But still the problem may be the espressif implementation. It seems that the NINA code just starts a BT app of espressif, which then handles it.

iabdalkader · 2023-04-26

I think there were recent fixes to btstack in the PR that adds CYW43 BT support.

It seems, that the NINA code just starts a BT app of espressif, which then handles it.

It does yes, see main/sketch.ino.cpp

robert-hh · 2023-04-26

It does yes, see main/sketch.ino.cpp

That's where I looked. There were as well changes to the test suite. I tried them as well with no change but for one test, which is already included in the above test summaries.
Edit: I did not look into the log for ble_mtu in detail, but at first glance it seems to be a synchronization problem. The test varies the MTU setting, and it may be that both sides just went out of sync, indicating a test timing issue.
P.S.: Using two PYBD, all tests but stress_filesystem pass. The latter always comes up with an EIO error while writing.

robert-hh · 2023-04-26

If I have more time this week will try to get it working with btstack and see if it makes any difference.

btstack would be favorable. I made a short attempt to build it with the SAMD port - just compile & link, no operation.
The binary is ~20k smaller, and it uses way less static data. Static data increases the BSS section, and a large BSS causes a problem with the MIMXRT port on i.mx 1010 and 1015, which have both BSS and stack in the 32k DTCM section. For nimble to fit, I must either decrease the stack size or move it to OCRAM.

iabdalkader · 2023-04-26

Well if it passes the test then at least we could rule out issues in other port/nina fw, and could possibly switch the Nano too. I've seen some recent fixes to btstack in/for the Pico CYW43-BT PR, did you try those ?

robert-hh · 2023-04-26

I did not make any tests yet, so I do not know if it works at all. I used the most recent version, which was merged this night and is part of v1.20.

robert-hh · 2023-04-26

It turned out that at least the communication path seems to work. If I start BT with ble.active(1), some activity happens at the interface, and then it gets a timeout. With debug printf enabled, I get the log below. I wonder if the type of messages exchanged with the BT device (NINA, CYW43, ..) is the same for each.

log:

btstack: mp_bluetooth_init
btstack: mp_bluetooth_deinit
btstack: mp_bluetooth_init: waiting for stack startup
btstack: btstack_packet_handler_generic(packet_type=4, packet=2002fe24)
btstack:   --> btstack event state 0x01
btstack: btstack_packet_handler_generic(packet_type=4, packet=588c8)
btstack:   --> hci transport packet sent
btstack: btstack_packet_handler_generic(packet_type=4, packet=20001666)
btstack:   --> hci command complete
btstack: btstack_packet_handler_generic(packet_type=4, packet=588c8)
btstack:   --> hci transport packet sent
btstack: btstack_packet_handler_generic(packet_type=4, packet=20001666)
btstack:   --> hci command complete
btstack: mp_bluetooth_init: stack startup timed out
btstack: btstack_packet_handler_generic(packet_type=4, packet=2002fe1c)
btstack:   --> btstack event state 0x00
btstack: btstack_packet_handler_generic(packet_type=4, packet=2002fe1c)
btstack:   --> btstack event state 0x00
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 116] ETIMEDOUT

iabdalkader · 2023-04-26

I don't think the btstack changes/fixes were merged, PR 10739 is still open.

iabdalkader · 2023-04-26

I may find more time to test it in the weekend, a little bit busy right now trying to get CYW43 to work in ports/mimxrt

robert-hh · 2023-04-27

I don't think the btstack changes/fixes were merged, PR 10739 is still open.

I assumed these fixes were in PR #11239

a little bit busy right now trying to get CYW43 to work in ports/mimxrt

Which stack do you use?

iabdalkader · 2023-04-27

Which stack do you use?

Right now I'm just trying to get WiFi working first, then will move on to BT, very likely will use nimble too, if it passes all the tests.

robert-hh · 2023-04-30

On a different topic:
I just found the reason for a problem with the NINA driver and the mimxrt port. Under some circumstances, the NINA driver causes 0 length SPI transfers. These were considered an error by the MIMXRT SPI driver. I changed that in the SPI driver. But it might be interesting to determine why that happens. It happens when socket.recv() is called on a DGRAM or RAW socket. Example: ntptime.time().

During debugging I found an error in network_ninaw10.c, line 124. It reads:
debug_printf(&mp_plat_print, "poll_connect() status: %d reason %d\n", status, reason);
But the first argument must not be present. I did not change that.

iabdalkader · 2023-05-02

It happens when socket.recv() is called on a DGRAM or RAW socket. Example: ntptime.time().

Might have some time next weekend to test it, probably something that can be fixed easily.

robert-hh · 2023-05-02

I considered that and decided to change machine.spi(). Yes, it should be easy to prevent it in the nina_wifi_bsp c driver.

iabdalkader · 2023-05-02

If you want to send a fix go ahead, if not sure just let me know will look into it, but can't allocate any time for this before the weekend.

robert-hh · 2023-05-02

Not calling spi_transfer() if size == 0 is the trivial part. It would be interesting to know why that happens.
And it is not urgent.

robert-hh · 2023-05-02

Just in case it helps: this is the call stack when it happens. It seems like the NINA reports back a zero length value to read, and that is then done.

call_stack_zero_len

Edit: I looked for a few moments into the NINA Arduino firmware, and a zero length is reported if in socket_getsocketopt() the call to lwip_getsockopt_r() returns -1, for whatever reason. So it might be best not to call spi_transfer() in nina_read_response() if the transfer size is 0.

robert-hh · 2023-05-10

Looking at the NINA BT interface: Do you know the structure of the commands and the command codes?

iabdalkader · 2023-05-10

Those are a standardized Bluetooth interface:

https://software-dl.ti.com/simplelink/esd/simplelink_cc13x2_sdk/1.60.00.29_new/exports/docs/ble5stack/vendor_specific_guide/BLE_Vendor_Specific_HCI_Guide/hci_interface.html

robert-hh · 2023-07-06

Hello Ibrahim. You mentioned the Portenta C33 WiFi/BT module. Do you know the URL of this module's source code?

iabdalkader · 2023-07-06

Hello Ibrahim. You mentioned the Portenta C33 WiFi/BT module. Do you know the URL of this module's source code?

Yes it's hosted here, note we patched the firmware to fix the pinout, otherwise nothing else changed.

https://github.com/espressif/esp-hosted/

robert-hh · 2023-07-06

Thanks. That might be an alternative to the NINA firmware. According to your tests, it works better.

robert-hh · 2023-07-07

A few questions about it:

  • is the path esp-hosted/esp_hosted_fg/esp/esp_driver/network_adapter the one with the code for the ESP?
  • Which esp-idf version did you use to compile the ESP32 firmware?
  • the readme.md says that BT/BLE is not readily available. Does that just mean that e.g. the NIMBLE stack has to be used, like you do in the Portenta C33 port?

iabdalkader · 2023-07-07

* is the path esp-hosted/esp_hosted_fg/esp/esp_driver/network_adapter the one with the code for the ESP?

Yes

* Which esp-idf version did you use to compile the ESP32 firmware?

It builds with IDF-5 just fine.

* the readme.md says that BT/BLE is not readily available. Does that just mean that e.g. the NIMBLE stack has to be used, like you do in the Portenta C33 port?

As far as I know yes, the host must run the BT/Networking stacks.

robert-hh · 2023-07-07

Thank you for your swift reply. The firmware builds fine here as well. Just a few more questions:

  • did you enable BT over UART in the sdkconfig files?
  • I noticed that the build of the ESP32 firmware included mbedtls. Does that mean that the host does not have to include mbedtls in its build? That would be a great advantage.

iabdalkader · 2023-07-08

Like I said, we only patched the pinout to match the C33, and no the host must have mbedtls too.

robert-hh · 2023-07-08

I started to write a complaint about esp_hosted_hal_init() etc. assuming that a SPI declaration is done by device number only, but then I noticed that you declared it as MP_WEAK, allowing to have port-specific versions of these functions. That's convenient.

robert-hh · 2023-07-17

Below is a modified version of NINA's combine.py for creating a single file with the esp-hosted firmware that can be uploaded with your espflash.py utility. That's for ESP32. For other ESP32 variants the offsets may differ.

#!/usr/bin/env python

import sys


def read_file(path):
    # read a build artifact as bytes and close the file again
    with open(path, "rb") as f:
        return f.read()


bootloaderData = read_file("build/bootloader/bootloader.bin")
partitionData = read_file("build/partition_table/partition-table.bin")
otaData = read_file("build/ota_data_initial.bin")
network_adapterData = read_file("build/network_adapter.bin")

# calculate the output binary size: app offset plus app size,
# rounded up to a multiple of 1024 bytes
outputSize = 0x10000 + len(network_adapterData)
if outputSize % 1024:
    outputSize += 1024 - (outputSize % 1024)

# allocate and init to 0xff
outputData = bytearray(b'\xff') * outputSize

# copy data: bootloader, partitions, OTA data, app
outputData[0x1000:0x1000 + len(bootloaderData)] = bootloaderData
outputData[0x8000:0x8000 + len(partitionData)] = partitionData
outputData[0xd000:0xd000 + len(otaData)] = otaData
outputData[0x10000:0x10000 + len(network_adapterData)] = network_adapterData

outputFilename = "esp_hosted.bin"
if len(sys.argv) > 1:
    outputFilename = sys.argv[1]

# write out
with open(outputFilename, "wb") as f:
    f.write(outputData)

robert-hh · 2023-07-17

note we patched the firmware to fix the pinout

I'm a little bit lost again with the ESP32 SDK.
At which place did you patch the pinout? I see it in spi_slave_api.c, but there also seem to be settings to be made in the sdkconfig files, like for ESP_SPI_GPIO_HANDSHAKE and ESP_SPI_GPIO_DATAREADY.

iabdalkader · 2023-07-17

At which place did you patch the pinout.

You can find the patches here along with all the other binaries, and it seems there's already a combine.py script:

https://github.com/arduino/ArduinoCore-renesas/tree/main/libraries/WiFi/extra

robert-hh · 2023-07-18

Thanks. I am trying to adapt this driver to the Adafruit airlift and NINA W102 hardware. So many of the pin settings have to be changed.

Did you make your changes to these files, or did you use the patches as they are in this link to modify the espressif esp_hosted files? It seems that the sdkconfig.default at your link is the sdkconfig after a build of the esp_hosted driver has been made.
Looking at the patch files, it seems that the patches only affect the UART for BTM part, not the SPI interface.

b.t.w.: Does our PR for the Renesas port require to upload a different firmware to the C33 board, or is the proper firmware already installed/provided by Arduino?

What I also did not find are things like CONFIG_ESP_SPI_GPIO_DATA_READY, which are used in the code but not set at any place, not even in the build directory.

Another difficulty: the NINA/Adafruit port uses GPIO33 for the handshake. The esp_hosted code uses direct port access to set & clear the pins by creating a bit mask in the form 1 << GPIO_NR. That fits only for GPIO numbers < 32.
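The limitation can be illustrated with a small sketch. It is written in Python and only models a 32-bit register by masking; `naive_mask` and `bank_and_mask` are hypothetical helpers, not functions from esp_hosted or the ESP-IDF. The assumption is that pins 0..31 and pins 32 and up live in separate 32-bit register banks, so a higher pin needs the second bank and the bit position GPIO_NR - 32.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit wide register


def naive_mask(gpio_nr):
    # 1 << GPIO_NR truncated to 32 bits: the bit is lost for GPIO_NR >= 32.
    # (In C, shifting a 32-bit int by 32 or more is undefined behaviour;
    # the truncation here is only a model of what reaches the register.)
    return (1 << gpio_nr) & MASK32


def bank_and_mask(gpio_nr):
    # Pick the register bank first, then the bit inside that bank.
    bank = gpio_nr // 32
    return bank, 1 << (gpio_nr % 32)


print(hex(naive_mask(2)))    # pin below 32: bit survives
print(hex(naive_mask(33)))   # pin 33: nothing is left of the mask
print(bank_and_mask(33))     # second bank, bit 1
```

Going through the regular GPIO API calls instead of direct port access would hide this bank selection from the caller.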

The whole approach would fit well for the MIMXRT port, since it makes use of LWIP. Then WiFi can coexist with Ethernet. It does not fit for the SAMD port, since including LWIP will increase the code by another ~100k, and that's too much for most SAMD51 ports. It might work for BT only.

Edit: Found the settings in idf.py menuconfig. Surprise: reading the instructions helps!

robert-hh · 2023-07-18

So it compiles now, only for the MIMXRT1010 MCU it's hardly feasible. The RAM use is increased by ~30k, leaving only 32k for the heap. That's not good. It could work with MIMXRT1020 and up, like Teensy 4.x. They have much more RAM, and LWIP is already included.

iabdalkader · 2023-07-18

Did you make your changes to these files

No those are Arduino's patches for the pins, and yes seems they only patch BT pins. The patched firmware is provided in the link, but it also ships with any C33 (firmware is preinstalled at the factory).

The whole approach would fit well for the MIMXRT port

I have a PR to add networking/bluetooth support for mimxrt (lwip/nimble/cyw43).

robert-hh · 2023-07-18

I have a PR to add networking/bluetooth support for mimxrt (lwip/nimble/cyw43).

That's a very good solution, but limited to a single board, as far as I could tell. And the extension HW seems not to be readily available. Using esp_hosted or NINA fw one can use Adafruit add-on boards with the MIMXRT EVK or Teensy 4.x boards.

iabdalkader · 2023-07-18

I'm not saying it's better, just saying that some of the work required is already done here:

https://github.com/micropython/micropython/pull/11397

robert-hh · 2023-07-18

Yes, I know. But, as said, there seems to be no easily available and usable breakout board for the CYW43, as opposed to ESP32 boards or modules.
The PR for esp_hosted can be transferred easily to the MIMXRT port. It "just" requires adding some lines for mpconfigboard.x, Makefile and testing ............

robert-hh · 2023-07-19

There is a subtle hiccup in esp_hosted_hal.c and the way the symbols like MICROPY_HW_WIFI_DATAREADY are used. The actual code assumes that these are constants. In the actual MIMXRT port, they are defined e.g. as a function call, like pin_find(MP_OBJ_NEW_SMALL_INT(6)), which creates a new pin object with default configuration on every call. So using expressions like mp_hal_pin_write(MICROPY_HW_WIFI_SPI_CS, 0) may fail. So it's not critical, but "just" some overhead.

robert-hh · 2023-07-19

Some progress with a MIMXRT1020 EVK board. That board has wired LAN too, but I cannot run them both at the same time. The esphosted variant connects and supports basic operations. Still slow and not overly reliable. Ping success rate is low and slow. Ping IN is fine, Ping out success rate is about 50%.
What took me longest was the pin selection in the esphosted firmware. The default is Pin 2 for handshake. But Pin 2 is also a strapping pin, causing the ESP32 firmware to stall. Changing it to a different Pin made it work.
When DEBUG is enabled in extmod/esphosted, I get a lot of warnings about invalid data frames. When tracing the SPI bus, I see a lot of transmissions with all bits 0.

The test adapter is an Arduino-shaped hat with a generic ESP32 handwired to the pins of the hat. That way I have access to all pins and can load firmware using the ESP32's USB port.

iabdalkader · 2023-07-19

Ping success rate is low and slow. Ping IN is fine, Ping out success rate is about 50%.

Yes, I noticed UDP performance is bad. The UDP unit tests always fail and I have no idea why. It might be something to tune in the IDF config. Otherwise everything else seems okay.

When in extmod/esphosted DEBUG is enabled, I get a lot of warning about invalid data frames. When tracing the SPI bus, I see a lot of transmissions with all bits 0.

I know, it seems to be a bug in the firmware: it just sends a buffer full of zeros, but it's harmless.

robert-hh · 2023-07-19

I have a simple udp echo test with a small server on a RPi in my LAN. That test works fine, albeit slow.
So I have to do some polishing to make it more reliable, and have to change the ESP32 firmware, such that Pin 33 can be used for handshake or data_ready. I do not know why they made direct port access instead of the API calls. That is a useless optimization.

robert-hh · 2023-07-20

Attached is a copy of the esp_hosted_hal.c file, adapted for the mimxrt port. Instead of using the #define macro names for the pins, it assigns them to local variables and uses these. That way, fewer assumptions are made about the content of the defines, only that they create an lvalue of type mp_hal_pin_obj_t.

esp_hosted_hal.zip
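
The difference between using the pin macro directly and assigning it to a local variable first can be sketched as follows. This is only an illustration under stated assumptions: the types and the functions pin_find and mp_hal_pin_write below are simplified stand-ins written for this example, not the real mimxrt port code, and the call counter exists only to make the overhead visible.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-ins (assumptions for this sketch, not the port's real types). */
typedef struct { int id; int level; } pin_obj_t;
typedef pin_obj_t *mp_hal_pin_obj_t;

int pin_find_calls = 0;          /* counts how often the lookup runs */
pin_obj_t pin_table[8];

/* Stand-in lookup: returns a pin object reset to its default configuration. */
mp_hal_pin_obj_t pin_find(int id) {
    pin_find_calls++;
    pin_table[id].id = id;
    pin_table[id].level = 0;
    return &pin_table[id];
}

/* Board definition that expands to a function call, not a constant. */
#define MICROPY_HW_WIFI_SPI_CS (pin_find(6))

/* Stand-in for the HAL write: just records the level. */
void mp_hal_pin_write(mp_hal_pin_obj_t pin, int value) {
    pin->level = value;
}

/* Using the macro directly repeats the lookup on every access. */
void toggle_cs_with_macro(int n) {
    for (int i = 0; i < n; i++) {
        mp_hal_pin_write(MICROPY_HW_WIFI_SPI_CS, 0);
        mp_hal_pin_write(MICROPY_HW_WIFI_SPI_CS, 1);
    }
}

/* Evaluating the macro once and reusing the local avoids the repeated lookup. */
void toggle_cs_with_local(int n) {
    mp_hal_pin_obj_t cs = MICROPY_HW_WIFI_SPI_CS;
    for (int i = 0; i < n; i++) {
        mp_hal_pin_write(cs, 0);
        mp_hal_pin_write(cs, 1);
    }
}
```

With three toggle cycles, the macro variant runs the lookup six times, while the local-variable variant runs it once, and the local variable works the same whether the define is a constant or a call.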

robert-hh · 2023-07-22

Now I started to get BLE running with the ESP hosted firmware. The Arduino patches you linked were helpful, but there was more to be configured, especially telling the ESP hosted firmware to use UART instead of SPI. The complication with the Adafruit modules is that they have UART at pins 1 and 3, which is also used as the log port of the esp_hosted firmware. But BLE overrides that, and so it seems to work now. Baud rate switching and test is t.b.d.

I do not know whether in the esp_hosted firmware both WiFi and Bluetooth are active at the same time. Otherwise the SPI pins could be used for UART. Maybe using SPI for communication is an option.

b.t.w.: I see that the Copyright notice for the RA port's mpbthciport.c file is now Arduino SA, while a previous version for the RP2 port shows your name, even if these two files are mostly the same. And all are in the essential logic similar to the initial version made by Damien and more similar to the version I tailored for SAMD. So I think it's not fair to omit Damien or you in the notice.

iabdalkader · 2023-07-22

b.t.w.: I see that the Copyright notice for the RA port's mpbthciport.c file is now Arduino SA, while a previous version for the RP2 port shows your name, even if these two files are mostly the same

If you remove the license and weak/empty functions, mpbthciport.c in renesas-ra and stm32 for example have only 34% in common (including function names), so no they are not mostly the same, that needed to be said and to be clear, but I will add Damien to the file.

robert-hh · 2023-07-22

Test coverage looks much better than for the NINA fw:

18 tests performed
12 tests passed
2 tests skipped: multi_bluetooth/ble_l2cap.py multi_bluetooth/perf_l2cap.py
4 tests failed: multi_bluetooth/ble_deepsleep.py multi_bluetooth/ble_mtu_peripheral.py multi_bluetooth/perf_gatt_notify.py multi_bluetooth/stress_log_filesystem.py

about (C): I have a version of mpbthciport.c adapted for SAMD, which looks almost identical to the renesas version, besides the way the UART is configured. For that version I kept your name in the copyright notice. I cannot recall from which port it was, but cannot be the Renesas one, and the rp2 one is different as well. Maybe a mix of all.
b.t.w.: Did you move from openmv.io to Arduino?

iabdalkader · 2023-07-22

cannot recall from which port it was, but cannot be the Renesas one, and the rp2 one is different as well. Maybe a mix of all.

Most likely from the rp2 port, I sent that one too and it was adapted from stm32 (they all are) but this renesas work is made specifically for Arduino, so it has Arduino copyright notice, and I added Damien too.

b.t.w.: Did you move from openmv.io to Arduino?

No I just work full-time for Arduino now.

robert-hh · 2023-07-22

Most likely from the rp2 port,

Kind of. The rp2 port uses the Pico alarm mechanism for timing, while the SAMD one uses SoftTimer like the STM32. So it's more likely an evolution of both.

robert-hh · 2023-08-13

but I will add Damien to the file.

With all the forth & back, the (C) note of Damien disappeared again. And keeping it is not only a matter of respect. Reading the Copyright note it seems clear to me that even if you modify code and add your own code to it, the initial Copyright notice must be retained. And because it's hard to write code for MicroPython which does not contain code designed by Damien, I keep him in every file.

iabdalkader · 2023-08-15

With all the forth & back, the (C) note of Damien disappeared again

Which file are you referring to ? Please comment on the PR/file you think is missing a License. I added Damien to all of the BT files copyright notices.

robert-hh · 2023-08-15

I thought I had looked at mpbthciport.c. Looking again, Damien is indeed mentioned. Thank you.

On a completely different topic:
Last week I spent a day implementing, in the MIMXRT esp_hosted trial, a callback on RX line idle, which is called at the end of a received data chunk. The hardware supports it, only I had to modify fsl_lpuart.c to work like it should. In that callback mp_bluetooth_hci_poll_now() is called, similar to the STM32 port. Technically it works, and I can see in the logic analyzer that the response to a message from the esp_hosted firmware is sent earlier, usually after about 1-2 ms instead of polling_time/2. But a) the total execution time of the BLE tests does not improve, b) transfer rates in the perf_xxx.py tests are the same, and c) the test coverage is somewhat worse. Maybe a) and b) are not surprising. Most of the time the board waits for a data exchange cycle. But c) is disappointing.
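
The polling_time/2 figure mentioned here is the average wait when data can arrive at any moment within a polling interval. A rough sketch of that arithmetic follows. The function name and the 128 ms interval in the usage note are hypothetical, chosen only for illustration; the thread does not state the actual polling interval.

```c
#include <assert.h>
#include <stdio.h>

/* Average wait (in ms) until the next poll tick, assuming data arrives at
 * every whole-millisecond offset within one polling interval equally often.
 * interval_ms is a hypothetical parameter for this sketch. */
double average_poll_wait_ms(int interval_ms) {
    long total = 0;
    for (int arrival = 0; arrival < interval_ms; arrival++) {
        /* Data arriving exactly on a tick waits 0 ms, otherwise it waits
         * for the remainder of the interval. */
        total += (interval_ms - arrival) % interval_ms;
    }
    return (double)total / interval_ms;
}
```

For a hypothetical 128 ms interval this gives 63.5 ms, i.e. about half the interval, compared with the 1-2 ms observed with the idle callback.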

iabdalkader · 2023-08-15

I still can't test anything right now, I will be back next week, but I'm not too concerned about performance at the moment, I only want to get something stable merged then anyone is welcome to improve it.

robert-hh · 2023-08-15

No problem. The Portenta C33 port has to be merged first as the baseline. I do not know whether the C33 UART supports an idle event like the STM32 or MIMXRT. The rp2 and SAMD do not.

iabdalkader · 2023-08-19

I thought I had looked at mpbthciport.c. Looking again, Damien is indeed mentioned. Thank you.

@robert-hh maybe you meant the mpbthciport.c from the imxrt CYW43 support PR ?

https://github.com/micropython/micropython/pull/11397/files#diff-4b7d7538852dcd8a55e7c899c645e989058b83c508673de245b8104c664e74a2

Although I did write it all, including sched_node bits which I first used in rp2 as far as I remember, but it does share the stubs, so I will amend the license for that one too.

robert-hh · 2023-08-19

Maybe it was this one. It looks as well pretty similar to the mpbthciport.c file of the stm32 port. Anyhow it's hard to write code for MicroPython without using code and designs made by Damien.
