Using MediaPipe to detect the YMCA dance

Patrick Ryan
2 min read · Nov 30, 2021


Ok, this blog might not be groundbreaking, but it sure was a fun way to show a group of people the kinds of problems machine learning can solve that would be very difficult with traditional techniques.

If by chance you have no idea what the YMCA dance is or what song I am referring to, you can check out the official video on YouTube.

For more details on this fun, silly project, see my GitHub repo.

MediaPipe

A number of blog posts have been written about MediaPipe. All I am going to say about it is that it was the framework I used to capture pose landmark information.

I was able to collect a number of training samples of what the ‘Y’, ‘M’, ‘C’, ‘A’, and ‘dance’ poses look like. Each sample includes the x, y, z values for the body landmarks of interest.

I created a script to collect frames from a webcam and append the pose landmark data to a CSV file. You can see that script, 01_pose_training-data.py, here.
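The repo has the full script, but the core capture loop looks roughly like the sketch below. This is a minimal, hypothetical version: the label, output path, and exit key are placeholders I chose for illustration; only the standard MediaPipe Pose and OpenCV calls are real.

```python
import csv
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

POSE_LABEL = "Y"                      # hypothetical: label for the pose being recorded
CSV_PATH = "pose_training_data.csv"   # hypothetical output file

cap = cv2.VideoCapture(0)
with mp_pose.Pose(min_detection_confidence=0.5) as pose, \
        open(CSV_PATH, "a", newline="") as f:
    writer = csv.writer(f)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Flatten the 33 landmarks into one row: label, x0, y0, z0, x1, ...
            row = [POSE_LABEL]
            for lm in results.pose_landmarks.landmark:
                row.extend([lm.x, lm.y, lm.z])
            writer.writerow(row)
        cv2.imshow("capture", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to stop recording
            break
cap.release()
cv2.destroyAllWindows()
```

Run something like this once per pose, changing the label each time, and the rows accumulate into a single labeled dataset.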

Scikit-Learn

Once the data was collected for all of the poses, the problem became a very typical Scikit-Learn supervised machine learning problem.

We have a labeled dataset in CSV format. I ran through a number of different models with GridSearchCV to determine the best model for the given data.

I created a script to run through all of the models. You can see that script, 02_pose_model_training.py, here.
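To give a flavor of that model search, here is a minimal sketch of trying a few candidate models with GridSearchCV. The CSV layout (label first, flattened landmarks after) and the particular models and parameter grids are my assumptions, not the exact contents of the repo script.

```python
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Assumed CSV layout: first column is the pose label, the rest are
# the flattened x, y, z landmark values.
df = pd.read_csv("pose_training_data.csv", header=None)
X, y = df.iloc[:, 1:], df.iloc[:, 0]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

candidates = {
    "logreg": (
        Pipeline([("scale", StandardScaler()),
                  ("clf", LogisticRegression(max_iter=1000))]),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "forest": (
        Pipeline([("clf", RandomForestClassifier(random_state=42))]),
        {"clf__n_estimators": [100, 300]},
    ),
}

# Fit each candidate with cross-validated grid search and compare
# held-out accuracy to pick the best model.
for name, (pipeline, grid) in candidates.items():
    search = GridSearchCV(pipeline, grid, cv=5)
    search.fit(X_train, y_train)
    print(name, search.best_params_,
          "test accuracy:", search.best_estimator_.score(X_test, y_test))
```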

Making Predictions

This is where it gets fun!

Using the trained model, we can start to capture new images from the computer webcam and make predictions on the dance pose.

Because it is not much fun to dance with yourself, even though Billy Idol had a hit song on that very theme, I added an option to the prediction script that puts some additional ‘virtual’ dancers on screen to make the whole thing that much more fun and silly.

I created a script to run the predictions. You can see that script, 03_pose_predictions.py, here.
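Stripped of the virtual dancers, the prediction loop is essentially the capture loop with the classifier dropped in. A minimal sketch, assuming the winning model was saved with pickle (the file name and the text overlay are placeholders of mine):

```python
import pickle
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

# Hypothetical: the best model from the training step, saved with pickle.
with open("pose_model.pkl", "rb") as f:
    model = pickle.load(f)

cap = cv2.VideoCapture(0)
with mp_pose.Pose(min_detection_confidence=0.5) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Build the same flattened feature row used at training time.
            row = []
            for lm in results.pose_landmarks.landmark:
                row.extend([lm.x, lm.y, lm.z])
            label = model.predict([row])[0]
            # Overlay the predicted pose letter on the frame.
            cv2.putText(frame, str(label), (30, 60),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
        cv2.imshow("YMCA", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```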

End of the song

My goal in putting this together was to create a fun and engaging way to show a group of people the kinds of problems that machine learning can solve. I think the best way to spark interest and imagination in someone is to let them interact with the technology and get them excited about its possibilities.

Again, check out my GitHub repo if you would like to try this. The scripts are not YMCA-specific: they will capture, train on, and predict whatever poses you choose. It just so happens I picked something from my past.

If you try this yourself, send me the YouTube link!
