ESP8266 stack trace: find position of "last failed alloc call" in code - stack-trace

I'm trying to debug a large sketch, I'm occasionally getting a crash related to the web interface, and am trying to find out where exactly this is going wrong.
The stack trace ends with:
last failed alloc call: 4022D552(1480)
This particular address is not present in the decode stack trace itself, but may hold the key to finding the source of the problem. Any suggestions on how to track this one down?
Decoding 36 results
0x402222c5: BearSSL::WiFiClientSecure::_installClientX509Validator() at /home/wouter/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-3-20ed2b9/xtensa-lx106-elf/include/c++/4.8.2/bits/shared_ptr_base.h line 986
: (inlined by) ?? at /home/wouter/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-3-20ed2b9/xtensa-lx106-elf/include/c++/4.8.2/bits/shared_ptr.h line 316
: (inlined by) ?? at /home/wouter/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-3-20ed2b9/xtensa-lx106-elf/include/c++/4.8.2/bits/shared_ptr.h line 598
: (inlined by) ?? at /home/wouter/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-3-20ed2b9/xtensa-lx106-elf/include/c++/4.8.2/bits/shared_ptr.h line 614
: (inlined by) BearSSL::WiFiClientSecure::_installClientX509Validator() at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 877
0x40222cfc: BearSSL::WiFiClientSecure::_connectSSL(char const*) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 962
0x40219c70: esp_yield at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/core_esp8266_main.cpp line 91
0x4021a8c3: delay at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/core_esp8266_wiring.cpp line 54
0x40206d6d: WiFiClient::connect(IPAddress, unsigned short) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266WiFi/src/include/ClientContext.h line 136
: (inlined by) WiFiClient::connect(IPAddress, unsigned short) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266WiFi/src/WiFiClient.cpp line 170
0x40222eed: BearSSL::WiFiClientSecure::connect(char const*, unsigned short) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 231
0x40225230: BearSSL::PrivateKey::getEC() const at ?? line ?
0x40225230: BearSSL::PrivateKey::getEC() const at ?? line ?
0x40216b5c: HTTPClient::connect() at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp line 1165
0x4021c16e: uart_write at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/uart.cpp line 498
0x40217820: HTTPClient::sendRequest(char const*, unsigned char*, unsigned int) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp line 655
0x40217d98: HardwareSerial::write(unsigned char const*, unsigned int) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/HardwareSerial.h line 158
0x40273eba: sleep_reset_analog_rtcreg_8266 at ?? line ?
0x402180a5: Print::write(char const*) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/Print.h line 60
0x40217d83: HardwareSerial::write(unsigned char) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/HardwareSerial.h line 154
0x402179ba: HTTPClient::GET() at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp line 575
0x402118e8: HydroMonitorLogging::sendPostData(char*) at /home/wouter/Arduino/libraries/HydroMonitor/src/HydroMonitorLogging.cpp line 730
0x40225470: BearSSL::PrivateKey::getEC() const at ?? line ?
0x4022c4c0: _vsprintf_r at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/vsprintf.c line 65
0x40217d98: HardwareSerial::write(unsigned char const*, unsigned int) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/HardwareSerial.h line 158
0x4022000a: spiffs_object_truncate at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/spiffs/spiffs_nucleus.cpp line 1727
0x40211d41: HydroMonitorLogging::transmitMessages() at /home/wouter/Arduino/libraries/HydroMonitor/src/HydroMonitorLogging.cpp line 337
0x40211caa: HydroMonitorLogging::transmitMessages() at /home/wouter/Arduino/libraries/HydroMonitor/src/HydroMonitorLogging.cpp line 326
0x40204dab: handleAPI() at /home/wouter/Arduino/Williams_fridge/Fridge_control/webAPI.ino line 149
0x402251c8: BearSSL::PrivateKey::getEC() const at ?? line ?
0x40217d98: HardwareSerial::write(unsigned char const*, unsigned int) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/HardwareSerial.h line 158
0x40218130: Print::println() at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/Print.cpp line 178
0x402121b5: HydroMonitorLogging::logData() at /home/wouter/Arduino/libraries/HydroMonitor/src/HydroMonitorLogging.cpp line 250
0x40217d83: HardwareSerial::write(unsigned char) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/HardwareSerial.h line 154
0x40224948: Print::write(char) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/Print.h line 73
0x40218114: Print::print(char) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/Print.cpp line 126
0x402106f4: HydroMonitorIsolatedSensorBoard::readSensor(bool) at /home/wouter/Arduino/libraries/HydroMonitor/src/HydroMonitorIsolatedSensorBoard.cpp line 36
0x40207a07: TwoWire::requestFrom(int, int) at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/libraries/Wire/Wire.cpp line 134
0x402042f9: loop at /home/wouter/Arduino/Williams_fridge/Fridge_control/loop.ino line 36
0x40219d20: loop_wrapper() at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/core_esp8266_main.cpp line 125
0x401015d5: cont_wrapper at /home/wouter/.arduino15/packages/esp8266/hardware/esp8266/2.5.2/cores/esp8266/cont.S line 81
Note: I also asked on the esp8266 forum, and opened an issue on github about the same.

Intermittent ESP crashes are most often due to running out of RAM. A "failed alloc call" likely refers to a failure to allocate memory for an operation (encryption takes a good bit, as does every open request in the queue). The stack trace will not point to a specific spot in your code because client libraries run "in the background" (triggered by interrupts or once-per-loop calls to a function that does everything) and no specific event in your firmware caused it. On a memory-starved ESP with a long response time, sending several AJAX requests at once from the frontend can cause the overload.
To see whether this is the case, try printing the result of ESP.getFreeHeap() at various points in your loop. If it falls much below 10kb, there is a risk that another request will drive it over the edge.

I tested the free heap and got around 32 kB. So didn't expect any memory issues.
Turns out, 32 kB is less than 17 kB + 5 kB - for clientSecure and server respectively. And that's what caused the crashes. Dropping SSL encryption completely solved the problem. The development team mentioned it must be contiguous memory - which for whatever reason that 32 kB of mine apparently isn't.
Simple conclusion: you can not run the server (just http - no encryption) AND have a secure client in the same program.
No AJAX, by the way. Haven't gotten around to that level yet.

Related

Error with show variable in data viewer for jupyter notebook

In the recent VS Code release, they added this feature to view the active variables in the Jupyter Notebook and also, view the values in the variable with Data Viewer.
However, every time I am trying to view the values in Data Viewer, VS Code is throwing error below. It says that the reason is that the object of data type is Int64 and not string, but I am sure that should not be the reason to not show the variable. Anyone facing similar issues. I tried with a simple data frame and it's working fine.
Error: Failure during variable extraction:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-16-eae5f1f55b35> in <module>
97
98 # Transform this back into a string
---> 99 print(_VSCODE_json.dumps(_VSCODE_targetVariable))
100 del _VSCODE_targetVariable
101
~/anaconda3/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
229 cls is None and indent is None and separators is None and
230 default is None and not sort_keys and not kw):
--> 231 return _default_encoder.encode(obj)
232 if cls is None:
233 cls = JSONEncoder
~/anaconda3/lib/python3.7/json/encoder.py in encode(self, o)
197 # exceptions aren't as detailed. The list call should be roughly
198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
200 if not isinstance(chunks, (list, tuple)):
201 chunks = list(chunks)
~/anaconda3/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
255 self.key_separator, self.item_separator, self.sort_keys,
256 self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)
258
259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
~/anaconda3/lib/python3.7/json/encoder.py in default(self, o)
177
178 """
--> 179 raise TypeError(f'Object of type {o.__class__.__name__} '
180 f'is not JSON serializable')
181
TypeError: Object of type int64 is not JSON serializable

IPython - Raise exception when a shell command fails

I'm using IPython as a system shell.
Is there a way to make IPython to raise an exception when the shell command fails? (non-zero exit codes)
The default makes them fail silently.
As of IPython 4.0.1, !cmd is transformed into get_ipython().system(repr(cmd)) (IPython.core.inputtransformer._tr_system()).
In the source, it's actually InteractiveShell.system_raw(), as inspect.getsourcefile() and inspect.getsource() can tell.
It delegates to os.system() in Windows and subprocess.call() in other OSes. Not configurable, as you can see from the code.
So, you need to replace it with something that would call subprocess.check_call().
Apart from monkey-patching by hand, this can be done with the IPython configuration system. Available options (viewable with the %config magic) don't allow to replace TerminalInteractiveShell with another class but several TerminalIPythonApp options allow to execute code on startup.
Do double-check whether you really need this though: a look through the system_raw()'s source reveals that it sets the _exit_code variable - so it doesn't actually fail completely silently.
If you use ! to execute shell commands, errors will pass silently
!echo "hello" && exit 1
hello
If you use the %%sh cell magic to execute the shell command, errors will raise:
%%sh
echo "hello" && exit 1
hello
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-10-9229f76cae28> in <module>
----> 1 get_ipython().run_cell_magic('sh', '', 'echo "hello" && exit 1\n')
~/anaconda/envs/altair-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2360 with self.builtin_trap:
2361 args = (magic_arg_s, cell)
-> 2362 result = fn(*args, **kwargs)
2363 return result
2364
~/anaconda/envs/altair-dev/lib/python3.6/site-packages/IPython/core/magics/script.py in named_script_magic(line, cell)
140 else:
141 line = script
--> 142 return self.shebang(line, cell)
143
144 # write a basic docstring:
<decorator-gen-110> in shebang(self, line, cell)
~/anaconda/envs/altair-dev/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~/anaconda/envs/altair-dev/lib/python3.6/site-packages/IPython/core/magics/script.py in shebang(self, line, cell)
243 sys.stderr.flush()
244 if args.raise_error and p.returncode!=0:
--> 245 raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
246
247 def _run_script(self, p, cell, to_close):
CalledProcessError: Command 'b'echo "hello" && exit 1\n'' returned non-zero exit status 1.

Displaying human-readable text in perl Log::Report stack traces

A library that I'm using XML::Compile::Translate::Reader calls Log::Report's error method
or error __x"data for element or block starting with `{tag}' missing at {path}"
, tag => $label, path => $path, _class => 'misfit';
As I've got Log::Report set to debug mode, it returns a stack trace for an error.
[11 07 2014 22:17:39] [2804] error: data for element or block starting with `MSISDN' missing at {http://www.sigvalue.com/acc}TA
at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
Log::Report::error("Log::Report::Message=HASH(0x2871cf8)") at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
<snip>
XML::Compile::SOAP::Daemon::LWPutil::lwp_run_request("HTTP::Request=HASH(0x2882858)", "CODE(0x231ba38)", "HTTP::Daemon::ClientConn::SSL=GLOB(0x231b9c0)", undef) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/LWPutil.pm line 95
Any::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "child_task", "CODE(0x2548128)", "max_childs", 36, "background", 1) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/AnyDaemon.pm line 75
XML::Compile::SOAP::Daemon::AnyDaemon::_run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 99
(eval)("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 94
XML::Compile::SOAP::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "name", "rizserver.pl", "background", 1, "max_childs", 36, "socket", [7 more]) at ./rizserver.pl line 95
There is lots of juicy data in those HASH, SCALAR, GLOB, and other elements that I want to get logged; as we are having trouble logging the original request in case it doesn't match.
I've explored using
Some leads that I don't know how to use are using Log::Dispatch, or some sort of Filter on Log::Report; but in the end, all I really want is to apply Data::Dumper to those elements.

Why sometimes 'self' isn't available while debugging with lldb?

A lot of time (when it's not every time) I got the following error when I try to print objects on lldb. Is there some build/debug configuration to change or is this an error inside lldb?
(lldb) po userLevel
error: warning: Stopped in an Objective-C method, but 'self' isn't available; pretending we are in a generic context
error: use of undeclared identifier 'userLevel'
error: 1 errors parsing expression
I build with llvm and do not strip debug symbols.
Edit: Here is the backtrace:
(lldb) bt
* thread #1: tid = 0x1c03, 0x001169c5 FanCake-Beta`-[KWUserLevelController addPoints:](, _cmd=0x0029187b, points=15) + 179 at KWUserLevelController.m:53, stop reason = step over
frame #0: 0x001169c5 FanCake-Beta`-[KWUserLevelController addPoints:](, _cmd=0x0029187b, points=15) + 179 at KWUserLevelController.m:53
frame #1: 0x00112172 FanCake-Beta`-[KWEventRealTimeControllergameEngine:hostedGame:didSucceedIn:withScore:](self=0x0b9d7740, _cmd=0x0027a2a7, engine=0x1be5af40, game=0x1be5a850, completionTime=3.59473554800206, score=0) + 421 at KWEventRealTimeController.m:647
frame #2: 0x0007189a FanCake-Beta`__35-[KMCatchEmGameEngine animateStep6]_block_invoke197(, finished='\x01') + 257 at KMCatchEmGameEngine.m:214
frame #3: 0x01990df6 UIKit`-[UIViewAnimationBlockDelegate _didEndBlockAnimation:finished:context:] + 223
frame #4: 0x01983d66 UIKit`-[UIViewAnimationState sendDelegateAnimationDidStop:finished:] + 237
frame #5: 0x01983f04 UIKit`-[UIViewAnimationState animationDidStop:finished:] + 68
frame #6: 0x017587d8 QuartzCore`CA::Layer::run_animation_callbacks(void*) + 284
frame #7: 0x03634014 libdispatch.dylib`_dispatch_client_callout + 14
frame #8: 0x036247d5 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 296
frame #9: 0x04737af5 CoreFoundation`__CFRunLoopRun + 1925
frame #10: 0x04736f44 CoreFoundation`CFRunLoopRunSpecific + 276
frame #11: 0x04736e1b CoreFoundation`CFRunLoopRunInMode + 123
frame #12: 0x040367e3 GraphicsServices`GSEventRunModal + 88
frame #13: 0x04036668 GraphicsServices`GSEventRun + 104
frame #14: 0x01945ffc UIKit`UIApplicationMain + 1211
frame #15: 0x000039a8 FanCake-Beta`main(argc=1, argv=0xbffff354) + 94 at main.m:13
frame #16: 0x00002dc5 FanCake-Beta`start + 53
Edit 2: Same thing when I try to print local vars
(lldb) po currentUser
error: variable not available
If I put some NSLog() in the code, the correct value is printing. But not with the po command.
I usually get this error when I have compiler optimization turned on. The compiler will generate code which does not necessarily follow your code logic flow.
Go to your project in the navigator -> Target -> Build settings -> Search for optimization level -> expand optimization level -> select the debug line -> change to none in both columns of your project and target.
Hope this helps.
I was having this problem and it went away when I edited my scheme and set the Run build to Debug. I had set it to AdHoc for testing Push Notifications and that apparently makes LLDB unhappy.
Because it's optimized away. Let's use an example:
void f(int self) {
// Here, "self" is live:Its value is used in the call to NSLog()
NSLog(#"I am %d",self);
// Here, "self" is dead: Its value is never used.
// I could do self=0 or self=self*self and nobody would know.
// An optimizing compiler will typically optimize it away.
// It might still be on the stack (in the parameters passed to NSLog())
// but the compiler shouldn't assume this.
NSLog(#"Another function call: %d", 1);
// If "self" was on the stack, it will probably now have been overwritten.
}
Compilers do a lot of things to make your code faster/smaller; forgetting about variables which are no longer needed is a very common optimization.
As already said in this other question:
In Build Settings, setting Precompile Prefix Header to NO fixed it for me.

Windows debugging - WinDbg

I got the following error while debuggging a process with its core dump.
0:000> !lmi test.exe
Loaded Module Info: [test.exe]
Module: test
Base Address: 00400000
Image Name: test.exe
Machine Type: 332 (I386)
Time Stamp: 4a3a38ec Thu Jun 18 07:54:04 2009
Size: 27000
CheckSum: 54c30
Characteristics: 10f
Debug Data Dirs: Type Size VA Pointer
MISC 110, 0, 21000 [Debug data not mapped]
FPO 50, 0, 21110 [Debug data not mapped]
CODEVIEW 31820, 0, 21160 [Debug data not mapped] - Can't validate symbols, if present.
Image Type: FILE - Image read successfully from debugger.
test.exe
Symbol Type: CV - Symbols loaded successfully from image path.
Load Report: cv symbols & lines
Does any body know what the error CODEVIEW 31820, 0, 21160 [Debug data not mapped] - Can't validate symbols, if present. really mean?
Is this error meant that i can't read public/private symbols from the executable?
If it is not so, why does the WinDbg debugger throws this typr of error?
Thanks in advance,
Santhosh.
Debug data not mapped can mean the section of the executable that holds the debug information hasn't been mapped into memory. If this is a crash dump, your options are limited, but if it's a live debug session. you can use the WinDbg .pagein command to retrieve the data. To do that you need to know the address to page in. If you use the !dh command on the module start address (which you can see with lm - in my case, lm mmsvcr90 for msvcr90.dll), you may see something like this (scrolling down a ways):
Debug Directories(1)
Type Size Address Pointer
cv 29 217d0 20bd0 Can't read debug data cb=0
This shows you that the debug data is at offset 217d0 from the module start and is length 29. If you attempt to dump those bytes you'll see (78520000 is the module's start address):
kd> db 78520000+217d0 l29
785417d0 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
785417e0 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
785417f0 ?? ?? ?? ?? ?? ?? ?? ??-?? ?????????
If you execute .pagein /p 82218b90 785417d0, then F5, when the debugger breaks back in you'll see (82218b90 is the _EPROCESS address of the process that I'm debugging):
kd> db 78520000+217d0 l29
785417d0 52 53 44 53 3f d4 6e 7a-e8 62 44 48 b2 54 ec 49 RSDS?.nz.bDH.T.I
785417e0 ae f1 07 8c 01 00 00 00-6d 73 76 63 72 39 30 2e ........msvcr90.
785417f0 69 33 38 36 2e 70 64 62-00 i386.pdb.
Now executing .reload /f msvcr90.dll will load the symbols. For a crash dump, if you can find the 0x29 bytes you're missing (from another dump maybe), you may be able to insert them and get the symbols loaded that way.
Have you set your symbol path for WinDbg (see Step 2 # http://blogs.msdn.com/iliast/archive/2006/12/10/windbg-tutorials.aspx) and are your PDB files in the symbol path?
I assume you're testing an executable built in debug mode which generates the necessary PDB files.