QueTwo's Blog

thoughts on telecommunications, programming, education and technology

Tag Archives: Computer Science

Integrating the Microsoft Kinect into your Flex application

One of the topics I will be talking about during my MAX session this year is integrating game controllers into your Flex/AIR or ActionScript games in order to make them more naturally interactive.  There are already a lot of examples online of using the Wii controller, and with the next generation of video game consoles coming out soon, I thought it would be best to spend my time on the latest and greatest: the Microsoft Kinect.

For those who don’t know much about the Kinect, it is a pair of cameras: one RGB (like a normal Web cam), and one that sees only infrared.  The combination of the two (along with a bunch of other electronics and software) gives the Kinect real depth perception.  This means that tracking people, hands and everything else becomes trivial.  The Microsoft Kinect was actually developed by PrimeSense, who originally designed the device for PCs rather than game consoles.  This means the devices are natively compatible with PCs, rather than relying on a whole stack of hacks to get them to work.

The biggest difficulty in getting the Kinect working is wading through all the pieces and parts that are needed.  Documentation is in very short supply (and usually found within text files on GitHub).  Compounding the problem, there are tens, if not hundreds, of open-source projects revolving around the Kinect, but unfortunately each depends on a different set of drivers, middleware and APIs, many of which are NOT compatible with each other.

I will state right here that I am not an expert in all of these, but I did follow one path that made things work (after following MANY that didn’t).  Matt LeGrand and I are planning a separate website and blog that will explore the other paths and help people make sense of what to choose and what to avoid.  My primary experience is getting the Kinect working with Windows 7, as I am well aware that my aging Mac laptop is not powerful enough to do much with the Kinect.  Yes, you need a modern computer in order to use the Kinect.

So, where do you start?  You obviously need the actual Microsoft Kinect controller.  There are two models on the market right now: one that is USB only, and one that comes with a USB + power supply kit.  The USB-only version will ONLY work with the Xbox, while the USB + power supply version also works with PCs (the Kinect draws a few amps of power, which the Xbox apparently can supply over USB).  The only difference between the two packages is the power brick and a USB power-injection module.  If you end up with the wrong one, you are supposed to be able to buy the power brick and injection module separately, but I have no idea where one would pick that up.  I got my kit from Best Buy in the refurbished video game aisle (it was about $100, after an instant coupon).  The brand-new ones were about $140.

Before you plug it in, you need to install the drivers, middleware and APIs.  There are three well-known sets of drivers and middleware, but the ones that worked for me come directly from PrimeSense.  The drivers and middleware are hosted on OpenNI’s website at http://www.openni.org/downloadfiles/opennimodules.  There are three downloads you need: the OpenNI Module (OpenNI Binaries), the PrimeSense NITE Middleware, and the PrimeSense Drivers (OpenNI Compliant Hardware Binaries).  Download only the 32-bit versions, even if you are on a 64-bit Windows 7 install!!  I’ve found LOTS of issues with the 64-bit drivers that cause things to break.  Install the drivers, then the middleware, then the OpenNI APIs (in that order).

Finally, you will need the AS3 modules.  There is an open-source project known as AS3OpenNI that makes programming against the OpenNI APIs very simple.  Because AIR can’t directly talk to the APIs, you have to use the included C++ application that proxies the driver calls over a TCP/IP connection.  I’m sure this will become easier in future versions of AIR.  AS3OpenNI takes on all the hard work of processing the blob data that comes back from OpenNI and gives you either skeletal data (as a class), RGB, depth, or “multitouch blob” data.  With this data you can track multiple users and follow their hands, heads, necks, legs, etc.  I built all of my stuff on version 1.3.0, which is really stable.
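To demystify the TCP/IP part a bit: under the hood, the AIR side is just a socket client reading whatever the C++ proxy streams out.  The sketch below shows only that raw mechanism; the port number is illustrative, and AS3OpenNI wraps all of this for you, decoding the bytes and re-dispatching them as typed ActionScript events so you never have to touch the socket yourself.

import flash.events.ProgressEvent;
import flash.net.Socket;
import flash.utils.ByteArray;

// Conceptual sketch only: AS3OpenNI manages a socket like this internally
// and turns the binary frames into events such as ONISkeletonEvent.
var socket:Socket = new Socket();
socket.addEventListener(ProgressEvent.SOCKET_DATA, onSocketData);
socket.connect("127.0.0.1", 20000);   // hypothetical port; the C++ proxy defines the real one

function onSocketData(event:ProgressEvent):void
{
    var blob:ByteArray = new ByteArray();
    socket.readBytes(blob, 0, socket.bytesAvailable);
    // in AS3OpenNI this blob is decoded into skeleton, depth, RGB or
    // "multitouch blob" data before you ever see it
}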

Take a look at the examples in the AS3OpenNI project — they are pretty descriptive, and once you get all the parts working together, they work really, really well :) 

So, what did I make using the Kinect?  One of the first games I threw together was a simple version of Space Invaders.  I have AS3OpenNI track the right hand to move the player left and right, and when the right hand moves above the center point of the neck, I fire a missile.  The following code is all that is really needed (minus the setup of the player, etc.):

protected function gotShipMove(event:ONISkeletonEvent):void
{
    // grab the right hand's x,y,z in real-world (depth camera) coordinates
    var rightHand3D:NiPoint3D = event.rightHand;

    // map the 3D point onto our 2D stage
    var rightHand:NiPoint2D = NiPoint3DUtil.convertRealWorldToScreen(rightHand3D, this.stage.width, this.stage.height);

    // the ship simply follows the hand left and right
    shipIcon.x = rightHand.pointX;

    // raising the right hand past the neck fires a missile
    // (canFire and fireInProgress are Booleans declared in the setup code)
    if ((event.skeleton.rightHand.pointY > event.skeleton.neck.pointY) && canFire)
    {
        if (!fireInProgress)
        {
            fireInProgress = true;   // prevent 5,000 missiles from firing at once...
            var missile:MissileItem = new MissileItem(spaceInvadersCluster);
            missile.x = rightHand.pointX;
            missile.y = height - 64;   // launch from just above the bottom of the screen
            missile.addEventListener("MissileFireComplete", missileFireComplete);
            missile.addEventListener("MissileFireHit", missileFireComplete);
            addElement(missile);
        }
    }
    else
    {
        // hand dropped back down; re-arm so the next raise fires again
        fireInProgress = false;
    }
}

Watch my Twitter stream for more information about the new website Matt and I are putting together, dedicated to the Kinect and AS3 :)  I will also cover the game a lot more during MAX, so make sure to come!

The BikePOV. Adobe AIR + Arduino + Blinking lights on a bike

So, for the past month I have been working on a side project called the BikePOV.  If you have been reading my tweets, I’m sure you’ve picked up on my cursing at it, explaining it, and slowly getting it to work.

This evening I finally got everything working just the right way — and it actually works!

So, first let me explain what is going on.  I took an Arduino prototyping board and designed a circuit around it.  Essentially, I soldered 12 RGB (red, green, blue) LEDs onto a circuit board, then mounted the board between the spokes of a bike wheel.  The theory is that as the wheel turns, I can control the LEDs and make them flash in a pattern that represents letters, patterns or images.  This is called a POV, or persistence-of-vision, display.
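To make the idea concrete, here is a rough sketch of the AIR-side math (my own illustrative code, not the actual BikePOV source): render a message into a bitmap that is 12 pixels tall, one pixel per LED, then read each column out as a 12-bit mask.  The Arduino can then replay one mask per angular slice as the wheel spins.

import flash.display.BitmapData;
import flash.text.TextField;
import flash.text.TextFormat;

// Turn a message into one 12-bit LED mask per column of pixels.
function textToColumnMasks(message:String):Vector.<uint>
{
    var tf:TextField = new TextField();
    tf.defaultTextFormat = new TextFormat("_sans", 12, 0xFFFFFF);
    tf.text = message;
    tf.width = tf.textWidth + 4;
    tf.height = 12;                        // one pixel row per LED

    var bmp:BitmapData = new BitmapData(int(tf.width), 12, false, 0x000000);
    bmp.draw(tf);

    var masks:Vector.<uint> = new Vector.<uint>();
    for (var x:int = 0; x < bmp.width; x++)
    {
        var mask:uint = 0;
        for (var y:int = 0; y < 12; y++)
        {
            if (bmp.getPixel(x, y) != 0x000000)   // lit pixel = LED on
            {
                mask |= (1 << y);
            }
        }
        masks.push(mask);                  // one column per angular slice
    }
    return masks;
}

Each mask then just needs to get down to the Arduino and be clocked out in time with the wheel’s rotation.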

This idea has been done before; there are pre-made kits that you can buy from a company called Adafruit.  A company called MonkeyLectric also sells a POV kit for about $60 (which is MUCH nicer than my setup, but it only supports pre-done patterns).

H.264 Being removed from Google’s Chrome Browser

This afternoon Google announced that they were dropping support for the H.264 codec in future releases of their browser, Chrome.  If you want to read more about the announcement, check it out here.

Essentially, their argument is that they want to support open source.  And there is no better way to support open source than to include support for the new codec they just bought (VP8) and announced would be open source (WebM).  On paper it sounds great, and the world should cheer for them.  I personally support ALL the browsers adding support for WebM, so at least there is some consistency across the landscape for a single codec.

However, the way Google is introducing WebM into their browser is by removing support for the H.264 codec.  For those of you who don’t know what H.264 is, it is by far the most widely supported video codec today, being directly supported by DVD players, Blu-ray players, video game systems, and, more importantly, enterprise video systems.

Over the last few years we have seen enterprise video make the shift from tape to MPEG-2, and then to MPEG-4 (H.264) video for recording, editing and storage.  The great thing about the industry was that as time passed, more and more devices gained support for H.264, so there was no need to re-encode your video to support every device under the sun.  It was starting to become a world where we could publish our video in a single format and push it out to a wide audience.  Re-encoding video takes time and storage space, and it typically reduces the quality of the end product.

What is even more unfortunate is that recently we have seen H.264 decoding move into hardware on many devices, allowing them to decode high-definition content and display it full-screen without hitting the processor.  This is starting to allow developers to focus on making more interactive and better-looking user interfaces without worrying about degrading the quality of the video.  A good example is the H.264 decoder built into the newer Samsung televisions: you can run HD content as long as it is H.264; otherwise, you are lucky to play a 640×480 clip without taxing the processor.  One of the reasons HD video on the web was so bad on Mac-based computers was that Apple didn’t allow the Flash Player to access the H.264 hardware accelerator.  Newer releases of the Flash Player now support it, and HD video on the web is smooth as silk.

By moving to WebM and abandoning H.264, Google forces manufacturers to reassess their decision to support H.264 in hardware.  It will also force content producers to re-encode their video in yet another format (can you imagine YouTube re-encoding its entire library into WebM just to support Chrome users?), and it will make the HTML5 video tag even harder to work with.  Content producers will need to produce their content in H.264 for Flash, many WebKit browsers, and IE9; in WebM for Chrome; and in Ogg Theora for Firefox.  The really tricky part is that, to date, there are virtually no commercial products that can encode all three required formats.

So, if you are looking to support video on the web, what are you to do?  Right now the majority of users’ browsers don’t support HTML5 video, but they do support H.264 through the Flash Player.  iPad/iPhone/Android devices support H.264 directly, and embedded systems like the Samsung HDTVs, Sony PS3, etc. all support H.264-encoded video.  The only reason to encode in a different format today is to support Firefox and Chrome users who don’t have the Flash Player installed (less than 1% of all users?).  I’m sure this recommendation will change over the years, but for right now you still pretty much have a one-codec-meets-most-demands solution that you can count on.
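To put that in code: this is about all it takes to play an H.264 .mp4 through the Flash Player (a minimal plain-ActionScript sketch; the URL is a placeholder, and in a Flex app you would wrap the Video object in a UIComponent):

import flash.media.Video;
import flash.net.NetConnection;
import flash.net.NetStream;

// Progressive playback of an H.264 file; no media server required.
var nc:NetConnection = new NetConnection();
nc.connect(null);                           // null = progressive download

var ns:NetStream = new NetStream(nc);
ns.client = { onMetaData: function(info:Object):void { } };  // swallow metadata callbacks

var video:Video = new Video(640, 360);
video.attachNetStream(ns);
addChild(video);

ns.play("http://example.com/clip.mp4");     // placeholder URL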

Know the code you are in…

Last week I had the pleasure of attending the Detroit Java User’s Group presentation by Josh Holmes on application simplicity.  Walking into the presentation, I, along with most of the people who attended, expected the typical UX presentation (“keep it simple, stupid!” among other common slogans), but I was pleasantly surprised by one of the tangents we went on.

“If you have to think about how to explain the solution, it is entirely too complex.”

We then went into a great dialog about how you can best arrive at the simplest solution.  One of the things he kept drilling into us was this: if you don’t understand what your code is doing in the background, you can’t write good code.

Today we live in a world of code hinting, code wizards, code cookbooks, and frameworks that write our applications for us.  Far too often I’ve been asked to help debug an application or solve an issue where the person has no idea what their code is really doing in the background.  Dragging and dropping a data grid onto the stage, dragging a database connection onto it, and publishing the application is great, but can you actually tell me what is happening in the background?  Would you be able to explain why it starts to slow down after the application has been running for 5 minutes?

In the United States we tend to have two camps of people who create applications.  One camp is grounded in computer science or computer engineering; these people have taken formal classes in computer science theory and understand memory management, multi-threaded computing, etc.  The other camp is primarily developers: people who typically don’t have formal CS/CE training and often rely on tools or frameworks to complete their projects.  The difference is engineering software versus developing software.

Do you know what kind of code your tools or frameworks are generating?  How much overhead does storing that variable as a String instead of a Boolean have?  What about the cost of creating your own class?  How does your framework or language manage CPU cycles when you store large amounts of data?
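The nice part is that you don’t have to guess; you can measure.  Here is a quick (and admittedly crude) micro-benchmark sketch along those lines, in ActionScript:

import flash.utils.getTimer;

// Rough measurement: pushing a million Strings vs. a million Booleans.
var start:int = getTimer();
var strings:Vector.<String> = new Vector.<String>();
for (var i:int = 0; i < 1000000; i++)
{
    strings.push("true");
}
trace("Strings:  " + (getTimer() - start) + " ms");

start = getTimer();
var bools:Vector.<Boolean> = new Vector.<Boolean>();
for (i = 0; i < 1000000; i++)
{
    bools.push(true);
}
trace("Booleans: " + (getTimer() - start) + " ms");

The absolute numbers matter far less than the habit of checking them.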

Just things to think about.  Knowing how your underlying code works, rather than relying on a tool to create your application for you, can help you write better code.  Anybody can use a tool to connect a web page to a database, but not everybody can engineer an application.
