I wasted nearly a full day on this, so I thought I’d share my findings with everyone.  The YUV (really Y’CbCr, although I’ll use the terms interchangeably) video output from the iPhone/iPod/iPad camera takes a bit of tweaking to turn into a standard planar YUV frame.  In my case, I needed to convert to I420 to feed into a video codec function.  I got it working with a plain C loop, but obviously that’s very inefficient.  I finally found this blog, but there were some slight issues with the code.  I’ll review what I did to get it to work in this post.

The iPhone 3GS, 4, 4S and the fourth generation iPod all share the same default video format:  kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange.  kCVPixelFormatType_420YpCbCr8BiPlanarFullRange has the same layout with wider value ranges: luma spans 0–255 rather than 16–235, and the chroma range is wider as well.  This format outputs frames in two planes of memory: one for the Y (luma) and one for the interleaved CbCr (chroma).  The data for 8 pixels looks like this:

Plane 1 - YYYYYYYY
Plane 2 - UVUV

Now, there are a few ways we could get this data out.  The Y-plane (luma) is easy enough, and iOS gives us some nice functions to get the base memory address.  The first thing is to set up a memory buffer. Since there is one Y value for every pixel, and one U and one V value for each 2x2 square (a quarter of a byte each per pixel), we need a buffer 1.5 times the size of our full frame dimensions.

size_t bufferLen = width_ * height_ * 3 / 2; // integer math: Y plane plus quarter-size U and V planes
unsigned char *imgBuffer = malloc(bufferLen);
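
To make the 1.5x arithmetic concrete, here is a minimal sketch of where each plane starts in a tightly packed I420 buffer. The `i420_layout` struct and `i420_layout_for` function are hypothetical names used only for illustration, and the sketch assumes even frame dimensions and no row padding in the destination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: byte offsets of the three planes
   in a tightly packed I420 frame (no row padding). */
typedef struct {
    size_t y_offset;   /* luma plane, always at the start */
    size_t u_offset;   /* Cb plane, directly after Y */
    size_t v_offset;   /* Cr plane, directly after U */
    size_t total_size; /* bytes needed for the whole frame */
} i420_layout;

static i420_layout i420_layout_for(size_t width, size_t height) {
    /* 4:2:0 subsampling assumes each 2x2 block shares one U and one V. */
    assert(width % 2 == 0 && height % 2 == 0);

    i420_layout layout;
    size_t chroma_size = (width / 2) * (height / 2);

    layout.y_offset = 0;
    layout.u_offset = width * height;
    layout.v_offset = layout.u_offset + chroma_size;
    layout.total_size = layout.v_offset + chroma_size;
    return layout;
}
```

For a 640x480 frame this gives a 307200-byte Y plane followed by two 76800-byte chroma planes, 460800 bytes in total, which matches width * height * 3 / 2.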

I won’t go into the details of setting up the AVCaptureSession as it’s quite straightforward. Your processPixelBuffer callback should look something like this:

- (void)processPixelBuffer:(CVImageBufferRef)pixelBuffer {

    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    // imgBuffer must hold width * height * 3 / 2 bytes (see above)
    uint8_t *dest = (uint8_t *)imgBuffer;

    // Copy the Y-plane row by row: bytesPerRow can include padding past the visible width
    size_t yWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
    size_t yHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
    size_t yBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    uint8_t *yBase = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    for (size_t row = 0; row < yHeight; row++) {
        memcpy(dest + row * yWidth, yBase + row * yBytesPerRow, yWidth);
    }
    size_t ySize = yWidth * yHeight;

    // This function uses ARM NEON intrinsics (needs #include <arm_neon.h>)
    // http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0205j/BABGHIFH.html
    #ifdef _ARM_ARCH_7
    size_t uvPlane = 1; // Y plane is 0, CbCr plane is 1

    size_t planeWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, uvPlane);   // counted in CbCr pairs
    size_t planeHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, uvPlane);
    size_t uvBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, uvPlane);

    uint8_t *planeBaseAddress = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, uvPlane);

    // De-interleave straight into the I420 buffer: U follows Y, V follows U
    size_t planeSize = planeWidth * planeHeight;
    uint8_t *uPlane = dest + ySize;
    uint8_t *vPlane = uPlane + planeSize;

    for (size_t row = 0; row < planeHeight; row++) {
        uint8_t *uvSrc = planeBaseAddress + row * uvBytesPerRow;
        uint8_t *uDest = uPlane + row * planeWidth;
        uint8_t *vDest = vPlane + row * planeWidth;
        size_t col = 0;
        for (; col + 8 <= planeWidth; col += 8) {
            uint8x8x2_t loaded = vld2_u8(uvSrc + col * 2); // Load 16 interleaved bytes, de-interleaving along the way
            vst1_u8(uDest + col, loaded.val[0]);           // Eight U bytes go into the U-plane
            vst1_u8(vDest + col, loaded.val[1]);           // Eight V bytes go into the V-plane
        }
        for (; col < planeWidth; col++) {                  // Scalar tail if the width is not a multiple of 8
            uDest[col] = uvSrc[col * 2];
            vDest[col] = uvSrc[col * 2 + 1];
        }
    }

    #else
    // Fall back to a standard C loop here; in my case nothing is needed, since we only support video on ARMv7
    #endif

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);

    // Do something with your image buffer
}

I’ll briefly go over what the code is doing.  vld2_u8 is the function where the magic happens: it reads 16 interleaved bytes and returns a uint8x8x2_t, which is a pair of eight-byte vectors.  The even-indexed bytes (U) land in val[0] and the odd-indexed bytes (V) land in val[1], so each call de-interleaves eight chroma pairs at once.  After that, it’s as easy as storing each vector at the U and V destinations (uPlane and vPlane) with vst1_u8, which writes eight bytes per call. If you still have questions, the URL in the code comments on the ARM website gives a great overview of the NEON functionality in the ARM core.
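
If the intrinsic is hard to picture, the following scalar model may help. It is only a sketch of the de-interleaving behaviour, not the NEON implementation; `u8x8x2_model` and `model_vld2_u8` are hypothetical stand-ins for `uint8x8x2_t` and `vld2_u8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for uint8x8x2_t: two vectors of eight bytes. */
typedef struct {
    uint8_t val[2][8];
} u8x8x2_model;

/* Scalar model of a de-interleaving load: reads 16 bytes from src,
   sending even-indexed bytes to val[0] and odd-indexed bytes to val[1]. */
static u8x8x2_model model_vld2_u8(const uint8_t *src) {
    assert(src != NULL);

    u8x8x2_model out;
    for (int lane = 0; lane < 8; lane++) {
        out.val[0][lane] = src[2 * lane];     /* U (Cb) samples */
        out.val[1][lane] = src[2 * lane + 1]; /* V (Cr) samples */
    }
    return out;
}
```

The point to take away is that one load consumes 16 source bytes and yields eight samples for each plane, so a loop built on it should advance the source by 16 and each destination by 8.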

I left out the optional basic C code to do the same thing, as it wasn’t needed in our application, but it would be trivial to implement a fallback.  I hope this snippet helps you out in understanding the output of the iOS cameras and a quick way to de-interleave the video for standard YUV processing.
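
For anyone who does need the fallback, a plain C version might look roughly like this. It is a sketch rather than the code from our application: `deinterleave_cbcr` and its parameters are hypothetical names, the source stride corresponds to the value CVPixelBufferGetBytesPerRowOfPlane returns for plane 1, and the destination planes are assumed to be tightly packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback: split an interleaved CbCr plane into separate
   U and V planes.
   src           - start of the interleaved CbCr plane
   src_stride    - bytes per source row, which may include padding
   chroma_width  - number of CbCr pairs per row (frame width / 2)
   chroma_height - number of chroma rows (frame height / 2)
   u_dst, v_dst  - tightly packed outputs, chroma_width * chroma_height bytes each */
static void deinterleave_cbcr(const uint8_t *src, size_t src_stride,
                              size_t chroma_width, size_t chroma_height,
                              uint8_t *u_dst, uint8_t *v_dst) {
    assert(src != NULL && u_dst != NULL && v_dst != NULL);
    assert(src_stride >= chroma_width * 2);

    for (size_t row = 0; row < chroma_height; row++) {
        const uint8_t *in = src + row * src_stride;
        uint8_t *u = u_dst + row * chroma_width;
        uint8_t *v = v_dst + row * chroma_width;

        for (size_t col = 0; col < chroma_width; col++) {
            u[col] = in[2 * col];     /* even bytes are Cb */
            v[col] = in[2 * col + 1]; /* odd bytes are Cr */
        }
    }
}
```

In the callback above this would sit in the #else branch, with u_dst pointing just past the Y plane in the image buffer and v_dst pointing just past the U plane.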


I’m setting up a new building door phone.  I decided on the 2N Helios IP Vario.  The build quality seems very good, but the software is absolutely terrible.  It took a little while, but everything seems to be working now.


A few customer servers. I’m a big fan of the 11th generation Dells.


Rack cabling cleanup. Kudos to Marc.


New monitor


Let there be RAM.





Chicago style stitch where cables exit the ladder rack.


Practicing my cable lacing on small bundles. These are attached to the ladder rack with a Kansas City stitch. A normal clove hitch and reef knot are used in a few places. Eventually, when the bundles are larger, they will be bound with RipWrap and the Kansas City stitches will be spaced further apart (right now they are every two feet on straight sections, one foot on curves).


Fiber trough install is pretty much completed.


Fiber trough going up


I lost some hair over this SUP720-3B. It saw exactly 50% ingress packet loss on any fabric-connected line card. Turns out it was bad.


If you look carefully you can see the pinball machine.