I’m setting up a new building door phone. I decided on the 2N Helios IP Vario. The build quality seems very good, but the software is absolutely terrible. It took a little while, but everything seems to be working now.
I wasted nearly a full day on this, so I thought I’d share my findings with everyone. The YUV (really Y’CbCr, although I’ll use the terms interchangeably) video output from the iPhone/iPod/iPad camera takes a bit of tweaking to spit out a standard planar YUV frame. In my case, I needed to convert to I420 to feed into a video codec function. I got it working in a plain C loop, but that’s very inefficient. I finally found this blog, but there were some slight issues with the code. I’ll review what I did to get it to work in this post.
The iPhone 3GS, 4, 4S and the fourth-generation iPod touch all share the same default video format: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange. kCVPixelFormatType_420YpCbCr8BiPlanarFullRange has the same layout with a wider value range (luma spans 0-255 instead of 16-235). This format outputs frames in two planes of memory: one for the Y (luma) and one for the interleaved CbCr (chroma). The data for 8 pixels looks like this:
Plane 1 - YYYYYYYY
Plane 2 - UVUV
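To make the layout concrete, here is a minimal sketch in plain C of how the three components of one pixel could be looked up in the two planes. The helper biplanar_sample, the YCbCrSample struct and the stride parameters are hypothetical names used only for illustration, not part of any iOS API; the strides stand in for the bytes-per-row values that CVPixelBufferGetBytesPerRowOfPlane reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel's components exactly as stored (no range conversion). */
typedef struct {
    uint8_t y, cb, cr;
} YCbCrSample;

/* Hypothetical helper: look up pixel (x, y) in a bi-planar 4:2:0 frame.
 * yStride and cbcrStride are the bytes per row of each plane. */
static YCbCrSample biplanar_sample(const uint8_t *yPlane, size_t yStride,
                                   const uint8_t *cbcrPlane, size_t cbcrStride,
                                   size_t x, size_t y)
{
    assert(yPlane != NULL && cbcrPlane != NULL);
    YCbCrSample s;
    /* Plane 1: one luma byte per pixel. */
    s.y = yPlane[y * yStride + x];
    /* Plane 2: each 2x2 block of pixels shares one Cb/Cr byte pair. */
    const uint8_t *pair = &cbcrPlane[(y / 2) * cbcrStride + (x / 2) * 2];
    s.cb = pair[0];
    s.cr = pair[1];
    return s;
}
```

Neighbouring pixels in the same 2x2 block get different Y values but the same Cb/Cr pair, which is why the chroma plane is half the size of the luma plane.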
Now, there are a few ways we could get this data out. The Y-plane (luma) is easy enough, and iOS gives us some nice functions to get the base memory address. The first thing is to set up a memory buffer. Since there is one Y value for every pixel, and one U and one V value for each 2x2 square, we need a buffer 1.5 times the number of pixels in the frame (width × height).
size_t bufferLen = width_ * height_ * 3 / 2; // 1.5 bytes per pixel, in integer math
unsigned char *imgBuffer = malloc(bufferLen);
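The same arithmetic also tells us where each plane starts inside the I420 buffer. The following is a small sketch under the assumption that width and height are both even; the I420Layout struct and the i420_layout function are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes and offsets (in bytes) of the three planes of a packed I420 frame. */
typedef struct {
    size_t ySize;      /* bytes in the Y plane */
    size_t chromaSize; /* bytes in each of the U and V planes */
    size_t uOffset;    /* where the U plane starts */
    size_t vOffset;    /* where the V plane starts */
    size_t total;      /* whole buffer: 1.5 bytes per pixel */
} I420Layout;

/* Hypothetical helper: compute the layout for an even-sized frame. */
static I420Layout i420_layout(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    I420Layout l;
    l.ySize = width * height;
    l.chromaSize = (width / 2) * (height / 2);
    l.uOffset = l.ySize;
    l.vOffset = l.ySize + l.chromaSize;
    l.total = l.ySize + 2 * l.chromaSize;
    return l;
}
```

For a 640x480 frame this gives a 307200-byte Y plane followed by two 76800-byte chroma planes, 460800 bytes in total.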
I won’t go into the details of setting up the AVCaptureSession as it’s quite straightforward. Your processPixelBuffer callback should look something like this:
- (void)processPixelBuffer:(CVImageBufferRef)pixelBuffer {
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    size_t bufferHeight = CVPixelBufferGetHeight(pixelBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    unsigned char *rowBase = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    // NOTE: assumes bytesPerRow equals the frame width (no row padding);
    // with padded rows the Y-plane has to be copied one row at a time.
    size_t bufferSize = bufferHeight * bytesPerRow;
    // Copy the Y-plane to memory buffer
    memcpy(imgBuffer, rowBase, bufferSize);
    // De-interleave the UV-plane and copy it to memory buffer
    // This function uses ARM NEON intrinsics (declared in <arm_neon.h>)
    // http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0205j/BABGHIFH.html
#ifdef _ARM_ARCH_7
    size_t uvPlane = 1; // Y plane is 0, CbCr plane is 1
    size_t planeWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, uvPlane);
    size_t planeHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, uvPlane);
    uint8_t *planeBaseAddress = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, uvPlane);
    size_t planeSize = planeWidth * planeHeight; // Number of U samples (and of V samples)
    uint8_t *uPlane = (uint8_t *)malloc(planeSize);
    uint8_t *vPlane = (uint8_t *)malloc(planeSize);
    // Each pass consumes 16 interleaved bytes and produces 8 U and 8 V bytes.
    // NOTE: assumes planeSize is a multiple of 8 and the CbCr rows have no padding.
    for (size_t i = 0; i < planeSize / 8; i++) {
        uint8_t *uvSrc = &planeBaseAddress[i * 16];
        uint8_t *uDest = &uPlane[i * 8];
        uint8_t *vDest = &vPlane[i * 8];
        uint8x8x2_t loaded = vld2_u8(uvSrc); // Load 16 bytes, de-interleaving them into two 8-byte vectors
        vst1_u8(uDest, loaded.val[0]); // Even-numbered bytes (Cb) go into the U-plane
        vst1_u8(vDest, loaded.val[1]); // Odd-numbered bytes (Cr) go into the V-plane
    }
    memcpy(((unsigned char *)imgBuffer) + bufferSize, uPlane, planeSize);
    memcpy(((unsigned char *)imgBuffer) + bufferSize + planeSize, vPlane, planeSize);
    free(uPlane);
    free(vPlane);
#else
    // Fallback: a standard C loop would go here. In my case it isn't needed, since we only support video on ARMv7
#endif
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    // Do something with your image buffer
}
I’ll briefly go over what the code is doing. vld2_u8 is the function where the magic happens: it reads 16 consecutive bytes and de-interleaves them into a uint8x8x2_t, which is a pair of 8-byte vectors. The even-numbered bytes (Cb) land in val[0] and the odd-numbered bytes (Cr) land in val[1]. After that, it’s as easy as storing each vector with vst1_u8 into the memory we allocated for uPlane and vPlane. Every vld2_u8 call therefore yields eight U bytes and eight V bytes, so the source pointer has to advance by 16 bytes and each destination pointer by 8 bytes per pass. If you still have questions, the URL in the code comments on the ARM website gives a great overview of the NEON functionality in the ARM core.
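When the number of chroma samples is not a multiple of eight, a NEON loop needs a scalar tail so that it never reads or writes past the end of its buffers. The sketch below shows one way this could be done; deinterleave_cbcr is a hypothetical helper name, and the preprocessor guard is an assumption that lets the same file compile (using only the scalar loop) on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split `count` interleaved Cb/Cr pairs from `src`
 * (2 * count bytes) into `uDst` and `vDst`, which hold `count` bytes each. */
static void deinterleave_cbcr(const uint8_t *src, uint8_t *uDst,
                              uint8_t *vDst, size_t count)
{
    assert(src != NULL && uDst != NULL && vDst != NULL);
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Full blocks: 16 source bytes in, 8 Cb bytes and 8 Cr bytes out. */
    for (; i + 8 <= count; i += 8) {
        uint8x8x2_t pairs = vld2_u8(src + 2 * i);
        vst1_u8(uDst + i, pairs.val[0]);
        vst1_u8(vDst + i, pairs.val[1]);
    }
#endif
    /* Scalar tail for the leftover pairs (or everything without NEON). */
    for (; i < count; i++) {
        uDst[i] = src[2 * i];
        vDst[i] = src[2 * i + 1];
    }
}
```

Calling this once per chroma row, with src pointing at the start of that row, would also cope with rows that carry padding bytes.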
I left out the optional basic C code to do the same thing, as it wasn’t needed in our application, but it would be trivial to implement a fallback. I hope this snippet helps you out in understanding the output of the iOS cameras and a quick way to de-interleave the video for standard YUV processing.
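For reference, a plain C fallback might look something like the sketch below. This is not the code from our application: biplanar_to_i420 and its parameters are hypothetical, and the stride arguments are assumed to be the bytes-per-row values of the two source planes, which may be larger than the visible width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fallback: convert one bi-planar 4:2:0 frame to packed I420.
 * `dst` must hold width * height * 3 / 2 bytes. */
static void biplanar_to_i420(const uint8_t *ySrc, size_t yStride,
                             const uint8_t *cbcrSrc, size_t cbcrStride,
                             size_t width, size_t height, uint8_t *dst)
{
    assert(width % 2 == 0 && height % 2 == 0);
    assert(yStride >= width && cbcrStride >= width);
    size_t chromaW = width / 2;
    size_t chromaH = height / 2;
    uint8_t *uDst = dst + width * height;
    uint8_t *vDst = uDst + chromaW * chromaH;

    /* Copy luma one row at a time so that any row padding is dropped. */
    for (size_t row = 0; row < height; row++) {
        memcpy(dst + row * width, ySrc + row * yStride, width);
    }

    /* Split each interleaved CbCr row into the U and V planes. */
    for (size_t row = 0; row < chromaH; row++) {
        const uint8_t *src = cbcrSrc + row * cbcrStride;
        for (size_t col = 0; col < chromaW; col++) {
            *uDst++ = src[2 * col];
            *vDst++ = src[2 * col + 1];
        }
    }
}
```

It is much slower than the NEON path, but it makes a handy reference for checking the vectorised output byte for byte.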
Practicing my cable lacing on small bundles. These are attached to the ladder rack with a Kansas City stitch. A normal clove hitch and reef knot are used in a few places. Eventually, when the bundles are larger, they will be bound with RipWrap and the Kansas City stitches will be spaced further apart (right now they are every two feet on straight sections, one foot on curves).
I lost some hair over this SUP720-3B. It saw exactly 50% ingress packet loss on any fabric-connected line card. Turns out it was bad.













