Bumble Releases Dick-Pic-Fighting AI Into the Open-Source Wilds


At Bumble, safety has always been at the heart of our mission. Since 2018, we’ve worked to help pass legislation in both the U.S. and U.K. to combat the sending of unsolicited nudes online, known as cyberflashing. In 2019, we harnessed technology to better shield our community from unwanted lewd images, launching our Private Detector™ A.I. feature in the Bumble app.

Private Detector™ works by automatically blurring a potential nude image shared within a conversation on Bumble. You’ll be notified, and it’s up to you to decide whether to view or block the image. (You can also easily report it to Bumble. We don’t tolerate any bad behavior at all on our app.)

Now, Bumble’s Data Science team has written a white paper explaining the technology of Private Detector™ and has made an open-source version of it available on GitHub. It’s our hope that the feature will be adopted by the wider tech community as we work in tandem to make the internet a safer place.

For more on Bumble’s legislative work, see



At Bumble Inc., the parent company of Bumble, Badoo, and Fruitz, safety has been a central part of our mission and a core value that informs the company’s product innovations and roadmap. We’ve leveraged the latest advancements in technology and Artificial Intelligence (AI) to help provide our community of users with the tools and resources they need to have a safe experience on our platforms. In 2019 we launched Private Detector™ across the Bumble and Badoo apps, an AI-powered feature that detects and blurs lewd images and sends users a warning about the photo before they open it.

As just one of many players in the world of dating apps and social media at large, we also recognize that there’s a need to address this issue beyond Bumble’s product ecosystem and engage in a larger conversation about how to address unsolicited lewd photos (also known as cyberflashing) to make the internet a safer and kinder place for everyone.

In an effort to help address this larger issue of cyberflashing, Bumble teamed up with legislators from across the aisle in Texas in 2019 to pass a bill that effectively made sending unsolicited lewd photos a punishable offense. Since the passing of HB 2789 in Texas in 2019, Bumble has continued to successfully advocate for similar laws across the United States and globally.

In 2022, Bumble reached another milestone in public policy by helping to pass SB 493 in Virginia and most recently SB 53 in California, adding another layer of online safety in one of the most populous states in the United States.

These new laws are the first step to creating accountability and consequences for this everyday form of harassment that causes victims (predominantly women) to feel distressed, violated, and vulnerable online.

As Bumble continues to help curb cyberflashing through legislative efforts and provide safety tools such as Private Detector™ to help keep our community safe from unsolicited nudes within our apps, we hope to create a ripple effect of change across the internet and social media at large. This is why today we are extremely proud to release a version of the Private Detector™ to the wider tech community, with the hope of democratizing access to our technology and helping scientists and engineers facing the same challenges around the world to improve their approach to online safety.

How does it work?

Since the early days of Badoo, we have always been pioneers in leveraging technology and advanced procedures to improve both our match-making experience and our integrity and safety capabilities. Behind the scenes, we have been designing and implementing machine learning solutions for lewd image detection for about a decade, leveraging both our best-in-class knowledge in the tech space and the insights collected by our apps, thanks to our dominant position in the dating industry.

Machine learning (ML) is a field devoted to understanding and building methods that learn (or better, mimic) how to reach human-level performance on specific tasks, leveraging data to improve their accuracy. The development cycle requires you to carefully design and develop a neural network’s architecture and to feed it iteratively with a curated set of samples (a dataset) from the problem; in our case, detecting whether a picture contains lewd content or not.

Even though the number of users sending lewd images on our apps is luckily a negligible minority (just 0.1%), our scale allows us to collect a best-in-the-industry dataset of both lewd and non-lewd images, tailored to achieve the best possible performance on the task. Our Private Detector™ is trained on very high-volume datasets, with the negative samples (the ones not containing any lewd content) carefully selected to better reflect edge cases and other parts of the human body (e.g. legs, arms) so that they are not flagged as abusive. Iteratively adding samples to the training dataset to reflect actual users’ behavior or test misclassifications has proved to be a successful practice that we have applied over the years in all our machine learning endeavors. Even if the downstream task is framed as a binary classification problem (as in our case!), nothing prevents data scientists from defining more concepts (or labels) and merging them back right before the actual training epochs.
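The last point, annotating with finer-grained labels and collapsing them to two classes just before training, can be sketched as follows. This is only an illustration under stated assumptions: the label names and the mapping are hypothetical and are not taken from the Private Detector™ dataset.

```python
from collections import Counter

# Hypothetical fine-grained annotation labels (assumption for illustration;
# the real label set is not described in this article).
FINE_TO_BINARY = {
    "lewd": 1,
    "legs": 0,
    "arms": 0,
    "swimwear": 0,
    "other": 0,
}


def to_binary(fine_labels):
    """Collapse fine-grained labels into 1 (lewd) or 0 (not lewd)."""
    return [FINE_TO_BINARY[label] for label in fine_labels]


annotations = ["legs", "lewd", "arms", "other", "lewd", "swimwear"]
binary = to_binary(annotations)

# The fine-grained counts stay available for auditing edge cases,
# while the classifier only ever sees the two merged classes.
fine_counts = Counter(annotations)
binary_counts = Counter(binary)
print(fine_counts)
print(binary_counts)
```

Keeping the finer labels around makes it easier to check whether a particular edge case (for example, pictures of legs) is under-represented among the negative samples.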

Navigating the trade-offs between state-of-the-art performance and the ability to serve our user base at scale, we implemented (in its latest iteration) an EfficientNetV2-based binary classifier: a convolutional network with faster training speed and better overall parameter efficiency. It combines a better-designed architecture and scaling with layers like MBConv (which uses 1×1 convolutions to widen the space and depth-wise convolutions to reduce the overall number of parameters) and FusedMBConv (which merges some steps of the MBConv above for faster execution) to jointly optimize training speed and parameter efficiency. The model has been trained on our GPU-powered data centers in a continuous exercise of dataset, network, and hyperparameter (the settings used to speed up or improve training performance) optimization.
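To make the parameter argument concrete, the weight counts of the two block types can be worked out by hand. The sketch below is a simplification, not the Private Detector™ architecture: it ignores biases, batch normalization, and squeeze-and-excitation layers, and the channel sizes and expansion factor are arbitrary assumptions.

```python
def mbconv_weights(c_in, c_out, expand=4, k=3):
    """Weights in a simplified MBConv block: 1x1 expand, k x k depthwise, 1x1 project."""
    mid = c_in * expand
    expand_1x1 = c_in * mid      # pointwise convolution that widens the channels
    depthwise = k * k * mid      # one k x k filter per expanded channel
    project_1x1 = mid * c_out    # pointwise convolution back down to c_out
    return expand_1x1 + depthwise + project_1x1


def fused_mbconv_weights(c_in, c_out, expand=4, k=3):
    """Weights in a simplified FusedMBConv block: one regular k x k conv, then 1x1 project."""
    mid = c_in * expand
    fused_kxk = k * k * c_in * mid   # expansion and spatial filtering in one step
    project_1x1 = mid * c_out
    return fused_kxk + project_1x1


def regular_conv_weights(channels, k=3):
    """Weights in a regular k x k convolution that keeps the channel count."""
    return k * k * channels * channels


# Arbitrary example sizes: 32 channels in and out, expansion factor 4.
print(mbconv_weights(32, 32))         # 9344
print(fused_mbconv_weights(32, 32))   # 40960
print(regular_conv_weights(128))      # 147456, a regular 3x3 conv at the expanded width
```

In this simplified count, the depth-wise step is what keeps MBConv small, while the fused variant accepts more weights in exchange for fewer separate operations, which is why it is typically used where execution speed matters more than parameter count.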

When analyzing its performance in different conditions (both offline and online), we are proud to state that it achieves world-class performance (>98% accuracy in both upsampled and production-like settings, with no clear trade-offs between precision and recall).
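For readers less familiar with these metrics, the sketch below shows how accuracy, precision, and recall are derived from confusion-matrix counts, and why it matters to evaluate in both an upsampled (balanced) and a production-like (heavily imbalanced) setting. All rates and dataset sizes here are hypothetical and are not measurements of Private Detector™.

```python
def confusion_counts(n_images, prevalence, recall, specificity):
    """Expected confusion-matrix counts for a classifier with fixed per-class rates."""
    positives = round(n_images * prevalence)
    negatives = n_images - positives
    tp = round(positives * recall)        # lewd images correctly flagged
    fn = positives - tp                   # lewd images missed
    tn = round(negatives * specificity)   # clean images correctly passed
    fp = negatives - tn                   # clean images wrongly flagged
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical classifier: 98% recall and 98% specificity in both settings.
upsampled = metrics(*confusion_counts(2_000, 0.5, recall=0.98, specificity=0.98))
production = metrics(*confusion_counts(1_000_000, 0.001, recall=0.98, specificity=0.98))

print("upsampled:      ", upsampled)
print("production-like:", production)
```

With these made-up rates, accuracy and recall are identical in both settings, but precision is far lower when only 0.1% of images are lewd, because even a small false-positive rate applies to a very large number of clean images. That is why a model needs to be checked under production-like class balance and not only on an upsampled test set.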

What are we releasing today?

Concomitantly with this white paper, we are releasing on GitHub the source code we used to train the machine learning engine powering the Private Detector™, together with a SavedModel to deploy the model as it is (using TensorFlow Serving) and a checkpoint to fine-tune it with additional images, improving its performance on samples that are important for specific use cases. In both scenarios, the repository comes with extensive documentation and a guide on how to perform those actions, in order to make the experience as smooth as possible for scientists, engineers, and product folks around the world.
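As a rough idea of what calling a SavedModel behind TensorFlow Serving looks like, the sketch below builds (but does not send) a request for TensorFlow Serving's REST predict endpoint. The host, the model name `private_detector`, and the tiny placeholder image are assumptions made for illustration; the input shape and preprocessing the released model actually expects are described in the repository's documentation, not here.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder input: a single 2x2 RGB "image" of floats. A real request would
# carry an image preprocessed the way the released model expects (assumption).
placeholder_image = [
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
    [[0.5, 0.5, 0.5], [0.2, 0.4, 0.6]],
]

# "private_detector" is a hypothetical model name chosen when starting the server.
request = build_predict_request("localhost", "private_detector", [placeholder_image])
print(request.get_method(), request.full_url)

# Sending it requires a running TensorFlow Serving instance:
# with urllib.request.urlopen(request) as response:
#     predictions = json.load(response)["predictions"]
```

Building the request separately from sending it keeps the example runnable without a server and makes the payload easy to inspect.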

This version of the Private Detector™ is released under the Apache License, so that it is available for everyone to implement as the standard for blurring lewd images, either as it is or after fine-tuning it with additional training samples. Improvements to the architecture or to the overall code quality and structure are welcome.
Check out bumble-tech for any other exciting projects happening at Bumble.

Source: https://bumble.com/en/the-buzz/bumble-open-source-private-detector-ai-cyberflashing-dick-pics