Microsoft quietly deletes face recognition database of 10 million images

WASHINGTON, June 6, 2019. As international concern grows over human rights violations linked to the spread of facial recognition data, Microsoft has quietly deleted its largest public face recognition database. Stanford University and Duke University have also removed facial recognition data.

According to the Financial Times, Microsoft in recent days quietly took the database offline. It had been used to train facial recognition systems around the world, and the removal deleted about 10 million face images. Microsoft said, “The site was intended for academic purposes. It was run by an employee who is no longer with Microsoft, and it has since been removed.”

The database, called MS Celeb, was released in 2016, and Microsoft described it as the world’s largest public face recognition database. It holds more than 10 million images of nearly 100,000 individuals. MS Celeb has been used to train facial recognition systems in many countries, and its users include military researchers and Chinese companies such as SenseTime and Megvii.

According to citations in artificial intelligence papers, many commercial organizations have used the MS Celeb database, including IBM, Panasonic, Alibaba, Nvidia, Hitachi, SenseTime and Megvii. SenseTime and Megvii supply equipment to the Chinese Communist Party authorities in Xinjiang, where large numbers of Uighurs and other Muslim minorities are tracked by the authorities and held in detention camps.

The people whose photos appear in the database were not asked for their consent. Microsoft collected their facial images from search engines and videos under the terms of the Creative Commons license.

In addition to Microsoft, two academic institutions have deleted related data: the Duke MTMC surveillance database built by Duke University researchers, and the Brainwash database of Stanford University.

The Brainwash database consists of images of customers at the Brainwash Cafe in the San Francisco Bay Area, captured through a live camera. A Stanford University spokesperson said that the database had been deleted at the request of one of the researchers, and that the university was committed to protecting the privacy of the university and its community.

The privacy problems with these three databases were uncovered by Berlin-based researcher Adam Harvey, whose Megapixels project documents the details of many databases and how they are used. Harvey’s investigation found that Microsoft itself has used these databases to train facial recognition algorithms.

Microsoft named the database “Celeb” (short for celebrity), indicating that the faces it collected are photos of public figures. According to media reports, MS Celeb also includes information on specific individuals and media figures, such as Kim Zetter, a senior journalist who has covered cybercrime, civil liberties, privacy and security for Wired magazine; Adrian Chen; and Shoshana Zuboff, author of The Age of Surveillance Capitalism.

Harvey pointed out that Microsoft stretched the term “celebrity” to include people who simply work online and are visible in the digital world. When the Financial Times contacted people included in the database, they had not been aware that their photos were in it. Adam Greenfield, a technology writer, said, “I am in no sense a public figure. There is no way I have given up my right to privacy.” He added, “This shows that Microsoft cannot hold its own researchers to standards of honesty and integrity. The database was scrapped only because an employee left.”

Michael Veale, a technology policy researcher at the Alan Turing Institute, said that Microsoft may have violated the European Union’s General Data Protection Regulation (GDPR), which came into force last year, because the MS Celeb database remained in use after the regulation took effect.

In response, Microsoft said that the site has been deleted and that it is unclear whether the General Data Protection Regulation applies.

Although Microsoft has removed the database, it is still available to researchers and companies that downloaded it earlier. Harvey said that it is still being shared on open-source websites.