UEFI edk2/FmpDevicePkg in tianocore crashes after execution

I built FmpDevicePkg in tianocore/edk2, then loaded the EFI driver in EmulatorX64 and on a Minnow Board. Both crashed without any message.
I can set breakpoints with Visual Studio for other driver/app packages. With FmpDevicePkg, however, I cannot set a breakpoint, and it crashes immediately after I load the driver.
Does anyone know how to debug it? Or how can I test FmpDevicePkg driver? (or associated capsule)
Any suggestion is highly appreciated.
Thanks!

I traced the code and found that it hangs at the following calls:
PcdGetBool (PcdTestKeyUsed);
PcdSetBoolS (PcdTestKeyUsed, TRUE);
Details:
\edk2\FmpDevicePkg\FmpDxe\DetectTestKey.c
DetectTestKey (
VOID
)
{
….
// If PcdTestKeyUsed is already TRUE, then skip test key detection
//
TestKeyUsed = PcdGetBool (PcdTestKeyUsed); -> system hang
..
// If test key detected or an error occurred checking for the test key, then
// set PcdTestKeyUsed to TRUE.
//
if (TestKeyUsed) {
DEBUG ((DEBUG_INFO, "FmpDxe(%s): Test key detected in PcdFmpDevicePkcs7CertBufferXdr.\n", mImageIdName));
PcdSetBoolS (PcdTestKeyUsed, TRUE); -> system hang
I modified \MdeModulePkg\MdeModulePkg.dec from
[PcdsDynamic, PcdsDynamicEx]
gEfiMdeModulePkgTokenSpaceGuid.PcdTestKeyUsed|FALSE|BOOLEAN|0x00030003
to
[PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
gEfiMdeModulePkgTokenSpaceGuid.PcdTestKeyUsed|FALSE|BOOLEAN|0x00030003
After this change it worked fine.

Related

How to Troubleshoot Dexie bound on IDBKeyRange Error

I'm using Dexie.js version 3.0.3-rc.3 in a Vue JS project and I occasionally run into this exception in Chrome (86):
Failed to execute 'bound' on 'IDBKeyRange': The parameter is not a valid key.
DataError: Failed to execute 'bound' on 'IDBKeyRange': The parameter is not a valid key.
Here's a screenshot of the full error:
I'm fairly certain the problem lies with something in my data being undefined, but I'm trying to find a good way to troubleshoot this. I paused the Chrome dev tools on exceptions and inspected the code around this particular part of Dexie, but it doesn't reveal what data was used to make this exception occur.
Does anyone have any suggestions on how to find out what's actually wrong? It feels a bit like a needle in a haystack.
== Update ==
Below is the full call stack:
Try inspecting the call stack. I know it can be long until you reach a frame within your application code, but the failing call should be there!

OpenCASCADE crash when calling Transfer()

I have tested two cases:
I used STEPCAFControl_Reader and then STEPControl_Reader to read my STEP file, but both crash when I call STEPCAFControl_Reader::Transfer and STEPControl_Reader::TransferRoots, respectively.
With STEPControl_Reader, I printed the log to my console, and it contains a message like this:
1 F:(BOUNDED_SURFACE,B_SPLINE_SURFACE,B_SPLINE_SURFACE_WITH_KNOTS,GEOMETRIC_REPRESENTATION_ITEM,RATIONAL_B_SPLINE_SURFACE,REPRESENTATION_ITEM,SURFACE): Count of Parameters is not 1 for representation_item
EDIT:
There is a null reference inside the TransferRoots() method.
const Handle(Transfer_TransientProcess) &proc = thesession->TransferReader()->TransientProcess();
if (proc->GetProgress().IsNull())
{
// This check does not exist in the original source code
std::cout << "GetProgress is null" << std::endl;
return 0;
}
Message_ProgressSentry PS ( proc->GetProgress(), "Root", 0, nb, 1 );
My app and FreeCAD both crash, but CAD Assistant, which is the official OCC viewer, loads the file.
It looks like the comments already provide an answer to the question, or more precisely several answers:
STEPCAFControl_Reader::ReadFile() returns reading status, which should be checked before calling STEPCAFControl_Reader::Transfer().
Normally, it is a good practice to put OCCT algorithm into try/catch block and check for OCCT exceptions (Standard_Failure).
Add OCC_CATCH_SIGNALS at the beginning of try statements (required only on Linux) and call OSD::SetSignal(false) when creating each working thread to redirect abnormal cases (access violation, NULL dereference and others) to C++ exceptions (OSD_Signal, which is a subclass of Standard_Failure). This may conflict with other signal handlers in a mixed environment, so also check the documentation of the other frameworks used by the application.
If you catch failures like a NULL dereference when calling an OCCT algorithm with valid arguments, this is a bug in OCCT that should be fixed one way or another, even if the input STEP file contains syntax or logical errors that trigger it. Report the issue on the OCCT Bugtracker with enough information to reproduce the bug, including sample files; just saying that OCCT crashes somewhere does not help the developers. Consider also contributing to this open source project by debugging the OCCT code and suggesting patches.
Check the STEP file reading log for possible errors in the file itself. Consider reporting an issue to the vendor of the system that produced the broken file, even if the main file content can be loaded by STEP readers.
It is common practice to use OSD::SetSignal() within OCCT-based applications (like CAD Assistant) to make them more robust against non-fatal errors in application/OCCT code. Reporting an internal error message is more user friendly than silently crashing.
Note, however, that OSD::SetSignal() guarantees neither that the application will not crash nor that it can keep working properly after catching such a failure. Because some signals are asynchronous, memory may already be corrupted by the time the C++ exception is raised, which can lead to all kinds of undesired behavior. For that reason, it is better not to ignore such exceptions, even if the application appears to work fine afterwards.
OSD::SetSignal(false); // should be called once at application startup
STEPCAFControl_Reader aReader;
try
{
OCC_CATCH_SIGNALS // necessary for redirecting signals on Linux
if (aReader.ReadFile (theFilePath) != IFSelect_RetDone) { return false; }
if (!aReader.Transfer (myXdeDoc)) { return false; }
}
catch (Standard_Failure const& theFailure)
{
std::cerr << "STEP import failed: " << theFailure.GetMessageString() << "\n";
return false;
}
return true;

Crash inside http_client constructor (Casablanca SDK)

I'm trying to use Casablanca to consume a REST api.
I've been following the Microsoft tutorial, but I'm getting a crash that I cannot figure out.
I'm using Visual Studio 2017 with C++11.
I've written a function GetRequest() that works in a new empty project, but it crashes when I use it in my actual project (a very big project with millions of lines of code).
The crash is in the constructor of http_client, in the file xmemory0 at line 118:
const uintptr_t _Ptr_container = _Ptr_user[-1];
This is a link to the callstack : https://i.imgur.com/lBm0Hv7.png
void RestManager::GetRequest()
{
auto fileStream = std::make_shared<ostream>();
// Open stream to output file.
pplx::task<void> requestTask = fstream::open_ostream(U("results.html")).then([=](ostream outFile)
{
*fileStream = outFile;
// Create http_client to send the request.
http_client client(U("XXX/XXX.svc/"));
// Build request URI and start the request.
uri_builder builder(U("/IsLive"));
builder.append_query(U("q"), U("cpprestsdk github"));
return client.request(methods::GET, builder.to_string());
})
// Handle response headers arriving.
.then([=](http_response response)
{
printf("Received response status code:%u\n", response.status_code());
// Write response body into the file.
return response.body().read_to_end(fileStream->streambuf());
})
// Close the file stream.
.then([=](size_t)
{
return fileStream->close();
});
// Wait for all the outstanding I/O to complete and handle any exceptions
try
{
requestTask.wait();
}
catch (const std::exception &e)
{
printf("Error exception:%s\n", e.what());
}
}
EDIT: I just want to add that the http_client constructor is the issue. It always crashes inside it, no matter what I pass as a parameter.
The weird thing is that it does not crash when I just write a main() that calls this function.
I guess it must be due to some memory issue, but I have no idea how to debug that.
Does anyone have an idea about it?
Thanks and have a great day!
I've experienced a similar issue on Ubuntu. It works in an empty project, but crashes randomly when put into an existing large project, with reports of memory corruption.
It turns out that the existing project loaded a proprietary library that uses cpprestsdk (Casablanca) internally. Even though cpprestsdk is statically linked into it, its symbols are still exported as weak symbols. So either my code crashes, or the proprietary library crashes.
Ideally, my project could be divided into several libraries, each loaded with RTLD_LOCAL to avoid symbol clashes. But the proprietary library in my project only accepts RTLD_GLOBAL; otherwise it crashes. So the load order and flags become important:
dlopen("my-lib-uses-cpprest", RTLD_NOW | RTLD_LOCAL); // local, to avoid polluting the global symbol namespace
dlopen("proprietary-lib-with-built-in-cpprest", RTLD_NOW | RTLD_GLOBAL); // in my case, this lib must be global
dlopen("another-lib-uses-cpprest", RTLD_NOW | RTLD_DEEPBIND); // deep bind, to avoid being affected by global symbols
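A minimal sketch of how such loads could be checked, assuming Linux with glibc. The helper name open_library and the LoadResult struct are hypothetical and not part of cpprestsdk or any other library. Note that dlopen(3) expects one of RTLD_NOW or RTLD_LAZY to be combined with binding flags such as RTLD_LOCAL, so the helper always adds RTLD_NOW, and it reports failure through dlerror() instead of silently continuing with a null handle.

```cpp
#include <cassert>
#include <dlfcn.h>   // dlopen, dlerror (POSIX); RTLD_DEEPBIND is a glibc extension
#include <string>

// Hypothetical result type: handle is nullptr on failure, and error then
// holds the text returned by dlerror().
struct LoadResult {
    void* handle;
    std::string error;
};

// Hypothetical helper: open a shared library with the given binding flags
// (RTLD_LOCAL, RTLD_GLOBAL, RTLD_DEEPBIND, ...) and immediate symbol resolution.
LoadResult open_library(const char* path, int binding_flags) {
    assert(path != nullptr);
    dlerror();  // clear any stale error message from an earlier call
    void* handle = dlopen(path, RTLD_NOW | binding_flags);
    LoadResult result{handle, std::string()};
    if (handle == nullptr) {
        const char* message = dlerror();
        result.error = (message != nullptr) ? message : "unknown dlopen failure";
    }
    return result;
}
```

Calling open_library("my-lib-uses-cpprest", RTLD_LOCAL) first, then the proprietary library with RTLD_GLOBAL, and checking each result's error string before continuing, would make a failed or mis-ordered load visible right away.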
"it will probably never concern anyone."
I agree with that.
I guess this issue was very specific, and it will probably never concern anyone, but I'm still going to share everything I found out about it.
In this project we use a custom allocator. If I'm not mistaken, it is not possible to pass our custom allocator to this library, which results in many random crashes.
A good way to fix it would be to use the static version of this library. However, since we use a lot of dynamic libraries, this option wasn't possible for us.
If you are in the same situation, I would advise using libcurl and rapidjson instead. They are a bit harder to use, but you can achieve the same goal.

Problems in exit code using C++ AMP

Environment: Visual Studio 2017, Windows 10 ver. 1709. Compiling mode: release.
When I call:
accelerator_view acc_view = accelerator().default_view;
an exception is raised (see figure link below), but the code performs fine afterwards.
But when the executable process exits and I call:
::GetExitCodeProcess(hChildProcess, &retVal);
from a caller process, instead of returning 0, it returns a garbage value in retVal.
Digging into the source code, the problem seems to be in the snippet below (SchedulerBase.cpp, line 149):
// Auto-reset event that is not signalled initially
m_hThrottlingEvent = platform::__CreateAutoResetEvent();
// Use a trampoline for UMS
if (!RegisterWaitForSingleObject(&m_hThrottlingWait, m_hThrottlingEvent, SchedulerBase::ThrottlerTrampoline, this, INFINITE, WT_EXECUTEDEFAULT))
{
throw scheduler_resource_allocation_error(HRESULT_FROM_WIN32(GetLastError()));
}
I think fixing it is out of my hands, because the code above is inside MFC. The same code works well when compiled with Visual Studio 2013. Refer to the attached figure of the stack, which shows the exception that is raised (and caught internally) when I call
accelerator_view acc_view = accelerator().default_view;
The question: how do I clean up AMP before exiting so that GetExitCodeProcess() returns the correct result?
Here is the figure:
Solved! If you add
concurrency::amp_uninitialize();
after you have finished using the AMP framework, then when the caller process calls
::GetExitCodeProcess(hChildProcess, &retVal);
the retVal parameter is filled in correctly.
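One way to make sure the cleanup call runs on every exit path is a small scope guard. The sketch below is generic C++; ScopeExit and run_gpu_work are hypothetical names, and because the AMP headers are Windows-only, the call to concurrency::amp_uninitialize() appears only as a comment where it would go. The guard is declared before any AMP objects so that, by reverse destruction order, it runs after they have been destroyed.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: runs the stored callable once when the guard
// goes out of scope, including on early returns.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
    ~ScopeExit() { if (fn_) fn_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;
private:
    std::function<void()> fn_;
};

// Hypothetical stand-in for the code that uses AMP; the return value is
// meant to become the process exit code.
int run_gpu_work(bool& cleaned_up) {
    // Declared first, so it is destroyed last, after any local AMP objects.
    ScopeExit guard([&cleaned_up] {
        // On Windows with <amp.h>: concurrency::amp_uninitialize();
        cleaned_up = true;  // placeholder so the effect is observable here
    });
    // ... accelerator().default_view and the AMP work would go here ...
    return 0;
}
```

In a real program, main() would return the value of run_gpu_work(), so the cleanup has already happened by the time the process exits.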

CLIPS (clear) command fails / throws exception in pyclips

I have a pyclips / clips program for which I wrote some unit tests using pytest.
Each test case involves an initial clips.Clear() followed by the execution of real clips COOL code via clips.Load(rule_file.clp). Running each test individually works fine.
Yet, when telling pytest to run all tests, some fail with ClipsError: S03: environment could not be cleared. In fact, it depends on the order of the tests in the .py file. There seem to be test cases that cause the subsequent test case to throw the exception.
Maybe some clips code is still "in use" so that the clearing fails?
I read here that (clear)
Clears CLIPS. Removes all constructs and all associated data structures (such as facts and instances) from the CLIPS environment. A clear may be performed safely at any time, however, certain constructs will not allow themselves to be deleted while they are in use.
Could this be the case here? What is causing the (clear) command to fail?
EDIT:
I was able to narrow down the problem. It occurs under the following circumstances:
test_case_A comes right before test_case_B.
In test_case_A there is a test such as
(test (eq (type ?f_bio_puts) clips_FUNCTION))
but f_bio_puts has been set to
(slot f_bio_puts (default [nil]))
So testing the type of a slot variable, which has been set to [nil] initially, seems to cause the (clear) command to fail. Any ideas?
EDIT 2
I think I know what is causing the problem. It is the test line. I adapted my code to make it run in the CLIPS Dialog Window, and I got this error when loading via (batch ...)
[INSFUN2] No such instance nil in function type.
[DRIVE1] This error occurred in the join network
Problem resided in associated join
Of pattern #1 in rule part_1
I guess it is a bug in pyclips that this error is masked.
Change the EnvClear function in the construct.c file of the CLIPS source code by adding the following lines, which reset the error flags:
globle void EnvClear(
void *theEnv)
{
struct callFunctionItem *theFunction;
/*==============================*/
/* Clear error flags if issued */
/* from an embedded controller. */
/*==============================*/
if ((EvaluationData(theEnv)->CurrentEvaluationDepth == 0) &&
(! CommandLineData(theEnv)->EvaluatingTopLevelCommand) &&
(EvaluationData(theEnv)->CurrentExpression == NULL))
{
SetEvaluationError(theEnv,FALSE);
SetHaltExecution(theEnv,FALSE);
}