Integrated Intel GPU is faster than the discrete Radeon on a MacBook - Swift

I'm writing an app that makes extensive use of Core Image filters plus custom shaders. The usual flow is:
1. Load and cache a large RAW file from disk through CIFilter(imageURL:)
2. Apply corrections (tint, temperature, exposure)
3. Apply assorted predefined Core Image filters
4. Apply custom-written shaders
5. Render everything to an MTLTexture
6. Render from the MTLTexture to the screen
7. Go to step 2.
We have observed on various MacBooks (e.g. late 2014) that this code runs faster when the target MTLDevice is the integrated Intel GPU rather than the high-performance Radeon attached to the MacBook Pro (a device-selection sketch follows the code below).
Any idea why that is? I would expect the Radeon to be much faster.
Edit:
Tested GPUs:
"Radeon Pro 460 4096 MB" vs "Intel HD Graphics 530 1536 MB"
"NVIDIA GeForce GT 750M 2GB GDDR5" vs "Intel Iris Pro Graphics"
Simplified version of the code we're using:
let filter = CIFilter(imageURL: urlToRawFile20MBLarge, options: nil)

class Renderer: MTKView {
    override func draw(_ rect: CGRect) {
        filter.setValue(temp, forKey: kCIInputNeutralTemperatureKey)

        let image: CIImage = filter.outputImage!
            .cropped(to: rect)
            // uses CIFilter(name: "CIGaussianBlur").outputImage
            .applyBlurFilter(radius: radius)
            .applyCustomShader1(param: x)
            .applyCustomShader2(param: y)

        // ... create command buffer and `CIRenderDestination`
        do {
            try ciContext.startTask(toClear: dest)
            try ciContext.startTask(toRender: image, to: dest)
        } catch {
            log(error)
        }

        if let drawable = currentDrawable {
            commandBuffer.present(drawable)
        }
    }
}
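For testing, the target GPU is switched along these lines (an illustrative sketch, not our exact setup; the MTKView wiring is omitted):
import Metal
import CoreImage

// Sketch: enumerate the GPUs on a dual-GPU MacBook Pro so the same filter
// chain can be timed on the integrated and on the discrete device.
let devices = MTLCopyAllDevices()
for device in devices {
    print(device.name, "isLowPower:", device.isLowPower) // the Intel GPU reports isLowPower == true
}

let integrated = devices.first { $0.isLowPower }
let discrete = devices.first { !$0.isLowPower }

// Bind the CIContext (and the MTKView's `device`) to whichever GPU is under test.
let targetDevice = integrated ?? MTLCreateSystemDefaultDevice()!
let ciContext = CIContext(mtlDevice: targetDevice)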

Related

How can I get rid of blurry textures in Metal?

I just created a Metal template and changed the code a little. I swapped the default colormap for the Minecraft 16x16 lapis ore texture, but because the texture is low resolution it comes out blurred. I'm trying to achieve that pixelated Minecraft look, so I'd like to know how to disable this blurring/filtering.
Is there a way to load/present assets without this blurring? Here is my texture-loading function:
class func loadTexture(device: MTLDevice, textureName: String) throws -> MTLTexture {
    /// Load texture data with optimal parameters for sampling
    return try MTKTextureLoader(device: device).newTexture(name: textureName, scaleFactor: 1.0, bundle: nil, options: [
        MTKTextureLoader.Option.textureUsage: NSNumber(value: MTLTextureUsage.shaderRead.rawValue),
        MTKTextureLoader.Option.textureStorageMode: NSNumber(value: MTLStorageMode.`private`.rawValue)
    ])
}
Here is a screenshot of the blurry cube I'm getting:
In your texture sample call (in the shader), you need to set your magnification filter to 'nearest', instead of 'linear', like so (assuming your sampler is declared inline inside your shader):
constexpr sampler textureSampler (mag_filter::nearest,   // <-- set this to 'nearest' if you don't want any filtering on magnification
                                  min_filter::nearest);

// Sample the texture to obtain a color
const half4 colorSample = colorTexture.sample(textureSampler, in.textureCoordinate);
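If the sampler is instead created on the API side, the equivalent setup looks roughly like this (a sketch in Swift; `device` and `renderEncoder` are assumed to already exist):
// Host-side alternative: a sampler state with nearest-neighbour filtering,
// bound to the fragment stage instead of a constexpr sampler in the shader.
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.minFilter = .nearest
samplerDescriptor.magFilter = .nearest
samplerDescriptor.mipFilter = .notMipmapped

let samplerState = device.makeSamplerState(descriptor: samplerDescriptor)!

// In the draw call:
renderEncoder.setFragmentSamplerState(samplerState, index: 0)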

CoreImage: CIImage write JPG is shifting colors [macOS]

Using Core Image to filter photos, I have found that saving to a JPG file results in an image with a subtle but visible blue hue. In this example using a B&W image, the histogram reveals how the colors have been shifted in the saved file.
[Input image]
[Output image: the histogram shows the color layers are offset]
-- Issue demonstrated with the macOS 'Preview' app
I can show a similar result using only the Preview app.
Test image here: https://i.stack.imgur.com/Y3f03.jpg
1. Open the JPG image using Preview.
2. Export to JPEG at any 'Quality' other than the default (85%?).
3. Open the exported file and look at the histogram; the same color shifting can be seen as I experience within my app.
-- Issue demonstrated in a custom macOS app
The code here is as bare-bones as possible: it creates a CIImage from the photo and immediately saves it without applying any filters. In this example I chose 0.61 for compression because it resulted in a similar file size to the original. The distortion seems to be more pronounced at higher compression ratios, but I could not find any value that would eliminate it.
if let img = CIImage(contentsOf: url) {
    let dest = procFolder.url(named: "InOut.jpg")
    img.jpgWrite(url: dest)
}

extension CIImage {

    func jpgWrite(url: URL) {
        let prop: [NSBitmapImageRep.PropertyKey: Any] = [
            .compressionFactor: 0.61
        ]
        let bitmap = NSBitmapImageRep(ciImage: self)
        let data = bitmap.representation(using: NSBitmapImageRep.FileType.jpeg, properties: prop)
        do {
            try data?.write(to: url, options: .atomic)
        } catch {
            log.error(error)
        }
    }
}
Update 1: Using @Frank Schlegel's answer for saving the JPG file
The JPG now carries a ColorSync profile, and I can (unscientifically) measure a ~10% performance boost for portrait images (less for landscape), which are nice improvements. But unfortunately the resulting file still skews the colors in the same way demonstrated in the histograms above.
extension CIImage {

    static let writeContext = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!, options: [
        // using an extended working color space allows you to retain wide gamut information, e.g., if the input is in DisplayP3
        .workingColorSpace: CGColorSpace(name: CGColorSpace.extendedSRGB)!,
        .workingFormat: CIFormat.RGBAh // 16 bit color depth, needed in extended space
    ])

    func jpgWrite(url: URL) {
        // write the output in the same color space as the input; fall back to sRGB if it can't be determined
        let outputColorSpace = colorSpace ?? CGColorSpace(name: CGColorSpace.sRGB)!
        do {
            try CIImage.writeContext.writeJPEGRepresentation(of: self, to: url, colorSpace: outputColorSpace, options: [:])
        } catch {
            log.error(error)
        }
    }
}
Question:
How can I open a B&W JPG as a CIImage, and re-save a JPG file avoiding any color shifting?
This looks like a color sync issue (as Leo pointed out) – more specifically a mismatch/misinterpretation of color spaces between input, processing, and output.
When you are calling NSBitmapImageRep(ciImage:), there's actually a lot happening under the hood. The system needs to render the CIImage you are providing to get the bitmap data of the result. It does so by creating a CIContext with default (device-specific) settings, using it to process your image (with all filters and transformations applied to it), and then giving you the raw bitmap data of the result. In the process, there are multiple color space conversions happening that you can't control when using this API (and that seemingly don't lead to the result you intended). I don't like these "convenience" APIs for rendering CIImages for this reason, and I see a lot of questions on SO that are related to them.
I recommend you instead use a CIContext to render your CIImage into a JPEG file. This gives you direct control over color spaces and more:
let input = CIImage(contentsOf: url)!

// ideally you create this context once and re-use it because it's an expensive object
let context = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!, options: [
    // using an extended working color space allows you to retain wide gamut information, e.g., if the input is in DisplayP3
    .workingColorSpace: CGColorSpace(name: CGColorSpace.extendedSRGB)!,
    .workingFormat: CIFormat.RGBAh // 16 bit color depth, needed in extended space
])

// write the output in the same color space as the input; fall back to sRGB if it can't be determined
let outputColorSpace = input.colorSpace ?? CGColorSpace(name: CGColorSpace.sRGB)!

try context.writeJPEGRepresentation(of: input, to: dest, colorSpace: outputColorSpace,
                                    options: [CIImageRepresentationOption(rawValue: kCGImageDestinationLossyCompressionQuality as String): 0.61])
Please let me know if you still see a discrepancy when using this API.
I never found the underlying cause of this issue, and therefore no 'true' solution as I was seeking. Discussion with @Frank Schlegel led to the belief that it is an artifact of Apple's JPEG converter. The issue was certainly more apparent when using test files that appear monochrome but actually contain a small amount of color information.
The simplest fix for my app was to ensure there was no color in the source image, so I drop the saturation to 0 prior to saving the file.
let params = [
    "inputBrightness": brightness, // -1...1, this filter calculates brightness by adding a bias value: color.rgb + vec3(brightness)
    "inputContrast": contrast,     // 0...2, this filter uses the following formula: (color.rgb - vec3(0.5)) * contrast + vec3(0.5)
    "inputSaturation": saturation  // 0...2
]

image.applyingFilter("CIColorControls", parameters: params)
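Concretely, the only parameter that matters for this workaround is the saturation; for example (reusing the jpgWrite helper from above, with `image` and `dest` as before):
// Drop all chroma before encoding so the JPEG writer has no color left to shift.
let desaturated = image.applyingFilter("CIColorControls",
                                       parameters: ["inputSaturation": 0.0])
desaturated.jpgWrite(url: dest)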

vImageAlphaBlend crashes

I'm trying to alpha blend some layers: [CGImageRef] in the drawLayer(thisLayer: CALayer!, inContext ctx: CGContext!) routine of my custom NSView. Until now I used CGContextDrawImage() to draw those layers into the drawLayer context. While profiling I noticed that CGContextDrawImage() takes 70% of the CPU time, so I decided to try the Accelerate framework. I changed the code, but it just crashes and I have no clue what the reason could be.
I'm creating those layers like this:
func addLayer() {
    let colorSpace: CGColorSpaceRef = CGColorSpaceCreateWithName(kCGColorSpaceGenericRGB)
    let bitmapInfo = CGBitmapInfo(CGImageAlphaInfo.PremultipliedFirst.rawValue)
    var layerContext = CGBitmapContextCreate(nil, UInt(canvasSize.width), UInt(canvasSize.height), 8, UInt(canvasSize.width * 4), colorSpace, bitmapInfo)
    var newLayer = CGBitmapContextCreateImage(layerContext)
    layers.append(newLayer)
}
My drawLayers routine looks like this:
override func drawLayer(thisLayer: CALayer!, inContext ctx: CGContext!)
{
    var ctxImageBuffer = vImage_Buffer(data: CGBitmapContextGetData(ctx),
                                       height: CGBitmapContextGetHeight(ctx),
                                       width: CGBitmapContextGetWidth(ctx),
                                       rowBytes: CGBitmapContextGetBytesPerRow(ctx))
    for imageLayer in layers
    {
        //CGContextDrawImage(ctx, CGRect(origin: frameOffset, size: canvasSize), imageLayer)
        var inProvider: CGDataProviderRef = CGImageGetDataProvider(imageLayer)
        var inBitmapData: CFDataRef = CGDataProviderCopyData(inProvider)
        var buffer: vImage_Buffer = vImage_Buffer(data: &inBitmapData,
                                                  height: CGImageGetHeight(imageLayer),
                                                  width: CGImageGetWidth(imageLayer),
                                                  rowBytes: CGImageGetBytesPerRow(imageLayer))
        vImageAlphaBlend_ARGB8888(&buffer, &ctxImageBuffer, &ctxImageBuffer, 0)
    }
}
The canvasSize is always the same and all the layers have the same size, so I don't understand why the last line crashes.
Also, I don't see how to use the new convenience functions to create vImage_Buffers directly from CGLayerRefs. That's why I do it the complicated way.
Any help appreciated.
EDIT
inBitmapData indeed holds pixel data that reflects the background color I set. However, the debugger cannot po &inBitmapData and fails with this message:
error: reference to 'CFData' not used to initialize a inout parameter &inBitmapData
So I looked for a way to get a pointer to inBitmapData. This is what I came up with:
var bitmapPtr: UnsafeMutablePointer<CFDataRef> = UnsafeMutablePointer<CFDataRef>.alloc(1)
bitmapPtr.initialize(inBitmapData)
I also had to change the way I point at my data for both buffers that I need for the alpha blend input. Now it's not crashing anymore, and luckily the speed boost shows up in the profiler (vImageAlphaBlend takes only about a third of CGContextDrawImage), but unfortunately the result is a transparent image with pixel artifacts instead of the white image background.
So far I don't get any runtime errors anymore, but since the result is not as expected I fear that I still don't use the alpha blend function correctly.
vImage_Buffer.data should point to the CFData data (pixel data), not the CFDataRef.
Also, not all images store their data as four channel, 8-bit per channel data. If it turns out to be three channel or RGBA or monochrome, you may get more crashing or funny colors. Also, you have assumed that the raw image data is not premultiplied, which may not be a safe assumption.
You are better off using vImageBuffer_InitWithCGImage so that you can guarantee the format and color space of the raw image data. A more specific question about that function might help us resolve your confusion about it.
Some CG calls fall back on vImage to do the work. Rewriting your code in this way might be unprofitable in such cases. Usually the right thing to do first is to look carefully at the backtraces in the CG call to try to understand why you are causing so much work for it. Often the answer is colorspace conversion. I would look carefully at the CGBitmapInfo and colorspace of the drawing surface and your images and see if there wasn't something I could do to get those to match up a bit better.
IIRC, CALayerRefs usually have their data in non-cacheable storage for better GPU access. That could cause problems for the CPU. If the data is in a CALayerRef I would use CA to do the compositing. Also, I thought that CALayers are nearly always BGRA 8-bit premultiplied. If you are not going to use CA to do the compositing, then the right vImage function is probably vImagePremultipliedAlphaBlend_RGBA/BGRA8888.
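A minimal sketch of that suggestion in current Swift (the pixel format, `cgImage`, and `destinationBuffer` here are assumptions for illustration; match them to your actual layer data):
import Accelerate

// Describe the format we want the pixels in: 8-bit, 4-channel ARGB, premultiplied.
let colorSpace = CGColorSpaceCreateDeviceRGB()
var format = vImage_CGImageFormat(
    bitsPerComponent: 8,
    bitsPerPixel: 32,
    colorSpace: Unmanaged.passUnretained(colorSpace),
    bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedFirst.rawValue),
    version: 0,
    decode: nil,
    renderingIntent: .defaultIntent)

// Let vImage copy/convert the CGImage into a buffer with exactly that format.
var layerBuffer = vImage_Buffer()
let initErr = vImageBuffer_InitWithCGImage(&layerBuffer, &format, nil, cgImage, vImage_Flags(kvImageNoFlags))
assert(initErr == kvImageNoError)

// Premultiplied ARGB blend of the layer over the destination, in place.
let blendErr = vImagePremultipliedAlphaBlend_ARGB8888(&layerBuffer, &destinationBuffer, &destinationBuffer,
                                                      vImage_Flags(kvImageNoFlags))
assert(blendErr == kvImageNoError)

// vImageBuffer_InitWithCGImage allocates layerBuffer.data; free it when done.
free(layerBuffer.data)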

Flash to iPhone Very Slow Performance

I'm creating an iPhone app in Flash, and I've run into performance problems. I've stripped the entire thing down to a simple example (below). It draws a box to the screen, and uses TouchEvent to track finger gestures. Here's the problem: it is extremely sluggish on the iPhone 3G I am testing on. The box stutters up and down the page.
GPU mode is enabled in the application.xml, and when I set the -renderingdiagnostics flag, the text turns blue (meaning it is being rendered each time, which is correct), but the square stays white. It doesn't turn any of the three colors of diagnostics mode. Here is a screen of that:
http://whit.info/dev/flashapp/screen.jpg
And here is a video of the sluggishness:
http://vimeo.com/25160240
So, given that this is only one cached sprite moving vertically, am I missing something about enabling the GPU or bitmap caching? Or is this as good as it gets on this hardware? Other apps seem to glide brilliantly.
Can anyone assist?
Many thanks!
-Whit
package {
    import flash.display.MovieClip;
    import flash.display.Sprite;
    import flash.events.TouchEvent;
    import flash.ui.Multitouch;
    import flash.ui.MultitouchInputMode;
    import flash.text.TextField;
    import flash.text.TextFormat;
    import flash.text.TextFieldAutoSize;

    [SWF(width='320', height='480', backgroundColor='#BACC00', frameRate='60')]
    public class Main extends MovieClip {

        private var square:Sprite;
        private var txt:TextField;
        private var startDragY:Number;
        private var startObjY:Number;

        public function Main() {
            Multitouch.inputMode = MultitouchInputMode.TOUCH_POINT;
            stage.addEventListener(TouchEvent.TOUCH_BEGIN, beginhandler);
            stage.addEventListener(TouchEvent.TOUCH_MOVE, movehandler);
            stage.addEventListener(TouchEvent.TOUCH_END, endhandler);
            drawBox(0xffffff);
            makeOutput();
        }

        private function beginhandler(evt:TouchEvent):void {
            startDragY = evt.stageY;
            startObjY = square.y;
        }

        private function movehandler(evt:TouchEvent):void {
            out(String(evt.stageY));
            square.y = startObjY - (startDragY - evt.stageY);
        }

        private function drawBox(fill:Number):void {
            square = new Sprite();
            square.graphics.beginFill(fill);
            square.graphics.drawRect(20, 60, 40, 40);
            square.graphics.endFill();
            stage.addChild(square);
            square.cacheAsBitmap = true;
        }

        private function makeOutput():void {
            txt = new TextField();
            stage.addChild(txt);
            txt.selectable = false;
            txt.autoSize = TextFieldAutoSize.CENTER;
            txt.defaultTextFormat = new TextFormat("Arial", 22, 0x000000);
            txt.text = "Touch Screen";
            txt.x = stage.stageWidth/2 - txt.width/2;
            txt.y = stage.stageHeight/2 - txt.height/2;
        }

        private function out(str:String):void {
            txt.text = str;
        }
    }
}
Also, here are the commands I'm using to compile:
.amxmlc ~/Files/Code/iOS/MyApp/Main.as
.pfi -package -renderingdiagnostics -target ipa-test -provisioning-profile MyApp.mobileprovision -storetype pkcs12 -keystore Certificates.p12 -storepass MyPass MyApp.ipa application.xml Main.swf Default.png icons
The latest Adobe updates are aimed at these devices.
Our team was facing the same problem and more or less solved it by updating the tools.
We updated to:
Flash Professional CS5.5
AIR 2.7
and the performance difference is noticeable.
Which version of the Flash exporter are you using? The CS5.5 update allegedly came with some speed boosts.
Also, the iPhone 3G on iOS 4 is already memory-starved, and I wouldn't be surprised if you were running out of memory with AIR loaded.
If you want good animation performance on an iPhone 3G (perhaps on any iOS device), write your app in Objective-C using Apple's provided Xcode tools. The result will run far faster and eat up far less of your customer's device's battery and memory.

How do I program a stereo-capable graphics card to display stereo images?

I'd like to write my own stereo image viewer, because there are certain features I need which are missing from the one bundled with my NVidia/EVGA GTX 580.
I can't figure out how to program the card to enter "shutterglass" mode, where every other frame (at 120 Hz) alternates left and right.
I've looked at the OpenGL, Direct3D, and XNA APIs, as well as information from NVIDIA, and can't figure out how to get started. How do I set separate left and right images, how do I tell the screen to display them, and how do I tell the driver to activate the shutterglass transmitter?
(Another disconcerting thing is that whenever I use the bundled software to view stereo images and video in shutterglass mode, it's in fullscreen, and the screen blinks when entering that mode, even though I run the screen at 120 Hz in 2D. Is there a way to have a 3D surface in a window without upsetting the rest of the screen on the NVidia "gamer" cards that are 3D capable (570, 580)?)
I'm a bit late to this, but I just got the stereoscopic 3D to work using nothing but a GTX 580 and OpenGL. No need for a quadro card or DirectX.
I have the nVidia 3D Vision driver and IR emitter and simply set the emitter to "Always on" in the nVidia control panel.
In my game engine, I switched to a full screen mode with 120Hz and render the scene twice with a slight frustum offset (as per nVidia's own documentation PDF on the manual implementation "2010_GTC2010.pdf").
No quad buffers or any other tricks needed, it works great. Plus, I am in control of all the settings, like convergence etc.
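For reference, the "slight frustum offset" mentioned above is the standard off-axis projection; a sketch of the math (parameter names are mine, not from NVIDIA's document):
import Foundation

// Asymmetric (off-axis) frustum for one eye. The camera is also translated by
// -/+ eyeSeparation/2 along its right vector for the left/right eye.
func stereoFrustum(fovY: Double, aspect: Double, near: Double,
                   convergence: Double, eyeSeparation: Double, leftEye: Bool)
    -> (left: Double, right: Double, bottom: Double, top: Double) {
    let top = near * tan(fovY / 2)
    let halfWidth = top * aspect
    // Shift the frustum window so both eyes converge at the convergence plane.
    let shift = (eyeSeparation / 2) * (near / convergence)
    let offset = leftEye ? shift : -shift
    return (left: -halfWidth + offset, right: halfWidth + offset, bottom: -top, top: top)
}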
For NVIDIA 3D Vision with the GeForce range, you need to write a full-screen DirectX surface twice the width of the display, with the left image on the left and the right image on the right (duh).
Then you need to write a magic value into the bottom left of the image, which the 3D Vision driver picks up and uses to turn on the glasses; you don't need nvapi.dll.
With the NVIDIA pro glasses and a Quadro card you can use the regular OpenGL stereo API.
P.S. I did find some sample code that manages to do this with a normal window.
Edit: it was low-level USB code talking to the transmitter that I could never get to build; I think it eventually became this: http://sourceforge.net/projects/libnvstusb/
Here is some sample code for full screen with the NVision glasses.
I'm not a DirectX expert, so some of this might be less than optimal.
My app is also based on Qt; there might be some Qt bits left in the code.
-----------------------------------------------------------------
// header
void create3D();
void set3D();
IDirect3D9 *_d3d;
IDirect3DDevice9 *_d3ddev;
QSize _size; // full screen size
IDirect3DSurface9 *_imageBuf; //Source stereo image
IDirect3DSurface9 *_backBuf;
--------------------------------------------------------
// the code
#include <windows.h>
#include <windowsx.h>
#include <d3d9.h>
#include <d3dx9.h>
#include <strsafe.h>
#pragma comment (lib, "d3d9.lib")
#define NVSTEREO_IMAGE_SIGNATURE 0x4433564e //NV3D
typedef struct _Nv_Stereo_Image_Header
{
unsigned int dwSignature;
unsigned int dwWidth;
unsigned int dwHeight;
unsigned int dwBPP;
unsigned int dwFlags;
} NVSTEREOIMAGEHEADER, *LPNVSTEREOIMAGEHEADER;
// OR'ed flags for the dwFlags field of the _Nv_Stereo_Image_Header structure above
#define SIH_SWAP_EYES 0x00000001
#define SIH_SCALE_TO_FIT 0x00000002
// call at start to set things up
void DisplayWidget::create3D()
{
_size = QSize(1680,1050); //resolution of my Samsung 2233z
_d3d = Direct3DCreate9(D3D_SDK_VERSION); // create the Direct3D interface
D3DPRESENT_PARAMETERS d3dpp; // create a struct to hold various device information
ZeroMemory(&d3dpp, sizeof(d3dpp)); // clear out the struct for use
d3dpp.Windowed = FALSE; // program fullscreen
d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD; // discard old frames
d3dpp.hDeviceWindow = winId(); // set the window to be used by Direct3D
d3dpp.BackBufferFormat = D3DFMT_A8R8G8B8; // set the back buffer format to 32 bit // or D3DFMT_R8G8B8
d3dpp.BackBufferWidth = _size.width();
d3dpp.BackBufferHeight = _size.height();
d3dpp.PresentationInterval = D3DPRESENT_INTERVAL_ONE;
d3dpp.BackBufferCount = 1;
// create a device class using this information and information from the d3dpp stuct
_d3d->CreateDevice(D3DADAPTER_DEFAULT,
D3DDEVTYPE_HAL,
winId(),
D3DCREATE_SOFTWARE_VERTEXPROCESSING,
&d3dpp,
&_d3ddev);
// 3D Vision uses a single surface twice the image width and one image high
// create the surface
_d3ddev->CreateOffscreenPlainSurface(_size.width()*2, _size.height(), D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &_imageBuf, NULL);
set3D();
}
// call to put 3d signature in image
void DisplayWidget::set3D()
{
// Lock the stereo image
D3DLOCKED_RECT lock;
_imageBuf->LockRect(&lock,NULL,0);
// write stereo signature in the last row of the stereo image
LPNVSTEREOIMAGEHEADER pSIH = (LPNVSTEREOIMAGEHEADER)(((unsigned char *) lock.pBits) + (lock.Pitch * (_size.height()-1)));
// Update the signature header values
pSIH->dwSignature = NVSTEREO_IMAGE_SIGNATURE;
pSIH->dwBPP = 32;
//pSIH->dwFlags = SIH_SWAP_EYES; // Src image has left on left and right on right, that's why this flag is not needed.
pSIH->dwFlags = SIH_SCALE_TO_FIT;
pSIH->dwWidth = _size.width() *2;
pSIH->dwHeight = _size.height();
// Unlock surface
_imageBuf->UnlockRect();
}
// call in display loop
void DisplayWidget::paintEvent()
{
// clear the window to a deep blue
//_d3ddev->Clear(0, NULL, D3DCLEAR_TARGET, D3DCOLOR_XRGB(0, 40, 100), 1.0f, 0);
_d3ddev->BeginScene(); // begins the 3D scene
// do 3D rendering on the back buffer here
RECT destRect;
destRect.left = 0;
destRect.top = 0;
destRect.bottom = _size.height();
destRect.right = _size.width();
// Get the Backbuffer then Stretch the Surface on it.
_d3ddev->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &_backBuf);
_d3ddev->StretchRect(_imageBuf, NULL, _backBuf, &destRect, D3DTEXF_NONE);
_backBuf->Release();
_d3ddev->EndScene(); // ends the 3D scene
_d3ddev->Present(NULL, NULL, NULL, NULL); // displays the created frame
}
// my images come from a camera
// _left and _right are QImages but it should be obvious what the functions do
void DisplayWidget::getImages()
{
RECT srcRect;
srcRect.left = 0;
srcRect.top = 0;
srcRect.bottom = _size.height();
srcRect.right = _size.width();
RECT destRect;
destRect.top = 0;
destRect.bottom = _size.height();
if ( isOdd() ) {
destRect.left = _size.width();
destRect.right = _size.width()*2;
// get camera data for _left here, code not shown
D3DXLoadSurfaceFromMemory(_imageBuf, NULL, &destRect,_right.bits(),D3DFMT_A8R8G8B8,_right.bytesPerLine(),NULL,&srcRect,D3DX_DEFAULT,0);
} else {
destRect.left = 0;
destRect.right = _size.width();
// get camera data for _right here, code not shown
D3DXLoadSurfaceFromMemory(_imageBuf, NULL, &destRect,_left.bits(),D3DFMT_A8R8G8B8,_left.bytesPerLine(),NULL,&srcRect,D3DX_DEFAULT,0);
}
set3D(); // add NVidia signature
}
DisplayWidget::~DisplayWidget()
{
_d3ddev->Release(); // close and release the 3D device
_d3d->Release(); // close and release Direct3D
}