Here’s the second part of the ML Kit series, and it’s going to be Face Detection! You pass in an image and you get the coordinates of each face’s eyes, ears, etc., and can recognise facial expressions like people’s sweet smiles!
Or you can pass in a video and then track and manipulate people’s faces in real-time (in the video of course, real life will have to wait). We won’t dive too much into the video part of it yet (I’ll try to get this updated ASAP to include it), so we’ll be focusing on image face detection for now.
If this is the first time you’ve heard about the Firebase ML Kit, check out its introduction here.
Add the Dependencies and Metadata
implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
As with any other Firebase Service, we’ll start by importing this dependency which is the same one used for all the ML Kit features.
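For context, here’s roughly where that line sits (a minimal sketch of an app-level build.gradle; it assumes you’ve already connected the app to Firebase and dropped in your google-services.json):

dependencies {
    // Same dependency for every ML Kit feature
    implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
}

// At the bottom of the same file, so the google-services plugin can pick up your config
apply plugin: 'com.google.gms.google-services'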
<application ...>
    ...
    <meta-data
        android:name="com.google.firebase.ml.vision.DEPENDENCIES"
        android:value="face" />
    <!-- To use multiple models: android:value="face,model2,model3" -->
</application>
Although this is optional, it’s highly recommended to add this to your AndroidManifest.xml as well. Doing so will have the machine learning model downloaded along with your app from the Play Store. Otherwise, the model will only be downloaded when you make your first ML request, and no ML operations can return results until that download finishes.
Configuring Face Detection Settings
There are a few settings you might want to configure based on your app’s needs. I’m just going to rip this table straight off of the official docs.
Setting | Options | Description
---|---|---
Detection mode | FAST_MODE (default) / ACCURATE_MODE | Favor speed or accuracy when detecting faces.
Detect landmarks | NO_LANDMARKS (default) / ALL_LANDMARKS | Whether or not to attempt to identify facial “landmarks”: eyes, ears, nose, cheeks, mouth.
Classify faces | NO_CLASSIFICATIONS (default) / ALL_CLASSIFICATIONS | Whether or not to classify faces into categories such as “smiling” and “eyes open”.
Minimum face size | float (default: 0.1f) | The minimum size, relative to the image, of faces to detect.
Enable face tracking | false (default) / true | Whether or not to assign faces an ID, which can be used to track faces across images.
And here’s an example, also ripped straight off of the official docs.
FirebaseVisionFaceDetectorOptions options =
        new FirebaseVisionFaceDetectorOptions.Builder()
                .setModeType(FirebaseVisionFaceDetectorOptions.ACCURATE_MODE)
                .setLandmarkType(FirebaseVisionFaceDetectorOptions.ALL_LANDMARKS)
                .setClassificationType(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
                .setMinFaceSize(0.15f)
                .setTrackingEnabled(true)
                .build();
Create the FirebaseVisionImage
Here’s where my interpretation of the article starts differing from the official docs, although this first step is identical to the one in Text Recognition, so if you’ve already read how to create a FirebaseVisionImage there, this step will be EXACTLY the same.
FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);
This object will prepare the image for ML Kit processing. You can make a FirebaseVisionImage from a bitmap, media.Image, ByteBuffer, byte array, or a file on the device.
From Bitmap
The simplest way to do it. The above code will work as long as your image is upright.
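If your bitmap isn’t upright, you can rotate it yourself before handing it over. Here’s a quick sketch; rotateBitmap is a hypothetical helper and rotationDegrees is whatever angle you’ve worked out for your image, not something ML Kit gives you:

// Hypothetical helper: rotates a bitmap so it's upright before detection
private Bitmap rotateBitmap(Bitmap source, float rotationDegrees) {
    Matrix matrix = new Matrix();
    matrix.postRotate(rotationDegrees);
    return Bitmap.createBitmap(source, 0, 0, source.getWidth(), source.getHeight(), matrix, true);
}

// Usage: e.g. an image that needs a 90-degree turn
FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(rotateBitmap(bitmap, 90f));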
From media.Image
This is for cases like taking a photo with your device’s camera. You’ll need to work out the angle by which the image must be rotated to be upright, based on the device’s orientation when the photo was taken and the device’s default camera sensor orientation (90 on most devices, but it can differ on others).
private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 90);
    ORIENTATIONS.append(Surface.ROTATION_90, 0);
    ORIENTATIONS.append(Surface.ROTATION_180, 270);
    ORIENTATIONS.append(Surface.ROTATION_270, 180);
}

private int getRotationCompensation(String cameraId) throws CameraAccessException {
    int deviceRotation = getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    CameraManager cameraManager = (CameraManager) getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

    // Return the corresponding FirebaseVisionImageMetadata rotation value.
    int result;
    switch (rotationCompensation) {
        case 0:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            break;
        case 90:
            result = FirebaseVisionImageMetadata.ROTATION_90;
            break;
        case 180:
            result = FirebaseVisionImageMetadata.ROTATION_180;
            break;
        case 270:
            result = FirebaseVisionImageMetadata.ROTATION_270;
            break;
        default:
            result = FirebaseVisionImageMetadata.ROTATION_0;
            Log.e(LOG_TAG, "Bad rotation value: " + rotationCompensation);
    }
    return result;
}

private void someOtherMethod() throws CameraAccessException {
    int rotation = getRotationCompensation(cameraId);
    FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
}
It’s a long method to make all those calculations, but it’s pretty copy-pastable. Then you can pass the mediaImage and the rotation in to generate your FirebaseVisionImage.
From ByteBuffer
FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(1280)
        .setHeight(720)
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
You’ll need the above (from media.Image) rotation method as well, on top of having to build the FirebaseVisionImage with the metadata of your image.
From File
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);
Simple to present here in one line, but you’ll be wrapping this in a try-catch block.
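In full it looks something like this (just a sketch of the try-catch; what you do when the file can’t be read is up to you):

FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    // The file couldn't be opened or read, e.g. a bad uri
    e.printStackTrace();
    return;
}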
Instantiate a FirebaseVisionFaceDetector
FirebaseVisionFaceDetector detector = FirebaseVision.getInstance()
        .getVisionFaceDetector(options);
The actual face detection method belongs to this object.
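If the default settings are all you need, you can skip the options object, and since the detector holds on to model resources it doesn’t hurt to close it once you’re done with it. A small sketch:

// Detector with the default settings (fast mode, no landmarks, no classification)
FirebaseVisionFaceDetector detector = FirebaseVision.getInstance().getVisionFaceDetector();

// ... run your detections ...

// Release the underlying resources when you no longer need the detector
try {
    detector.close();
} catch (IOException e) {
    e.printStackTrace();
}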
Call detectInImage
detector.detectInImage(image)
        .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionFace>>() {
            @Override
            public void onSuccess(List<FirebaseVisionFace> firebaseVisionFaces) {
                // Task completed successfully
                // ...
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                // Task failed with an exception
                // ...
            }
        });
Use the detector: call detectInImage, pass in the image, and add success and failure listeners. The success listener gives you a list of FirebaseVisionFaces. The code above says it all, really.
What you can do with each FirebaseVisionFace
Here’s where the fun begins… All the following code assumes you loop through the FirebaseVisionFaces and are currently handling an object called face, as sketched below.
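In other words, inside your onSuccess you’d end up with something like this (just the skeleton of the loop):

@Override
public void onSuccess(List<FirebaseVisionFace> firebaseVisionFaces) {
    for (FirebaseVisionFace face : firebaseVisionFaces) {
        // Everything in the snippets below goes in here
    }
}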
Get Face coordinates and rotation
Rect bounds = face.getBoundingBox();
float rotY = face.getHeadEulerAngleY(); // Head is rotated to the right rotY degrees
float rotZ = face.getHeadEulerAngleZ(); // Head is tilted sideways rotZ degrees
Get Facial Landmark Positions (Requires Landmark Detection enabled)
FirebaseVisionFaceLandmark leftEar = face.getLandmark(FirebaseVisionFaceLandmark.LEFT_EAR);
if (leftEar != null) {
    FirebaseVisionPoint leftEarPos = leftEar.getPosition();
}
Identify Facial Expressions (Requires Face Classification enabled)
if (face.getSmilingProbability() != FirebaseVisionFace.UNCOMPUTED_PROBABILITY) {
    float smileProb = face.getSmilingProbability();
}
Get Face Tracking ID (Requires Face Tracking enabled)
if (face.getTrackingId() != FirebaseVisionFace.INVALID_ID) {
    int id = face.getTrackingId();
}