e dot dot dot - a mostly about the Internet blog by James Raposa 16 07 2020

Content Moderation Case Study: Dealing With Misinformation In Search (2004)

Furnished content.

This series of case studies is published in partnership with the Trust & Safety Foundation to examine the difficult choices and tradeoffs involved in content moderation.

Summary: Google's biggest early innovation in search was that it used inbound links as a tool for determining the popularity of a website, and thus what its relevance to a particular search might be. That feature, however, created some side effects that raised concerns about how search results might lead to misinformation, or how the search engine might be gamed.

One of the earliest examples of this was the discovery in 2004 that the first result of a search on the word jew pointed to a blatantly anti-semitic website, Jewwatch. It was widely theorized that the reason for this was that the singular noun jew was more likely to be used by those pushing anti-semitic arguments, rather than the more common adjective jewish or the phrase jewish wo/man etc. Also, the site Jewwatch had been in existence for many years, and had many inbound links from other sources.

Some also believed that the people behind Jewwatch had used an early search engine optimization technique known as Googlebombing to purposefully game the results, by deliberately linking to Jewwatch from other sites and using the word jew as the link text.

As this result got attention, Google came under tremendous pressure to change the search result, as people accused the company of anti-semitism or of deliberately pointing to the Jewwatch site in search results. The Anti-Defamation League sent a letter to Google asking it to explore whether or not its ranking system needed to be changed (though the ADL also posted an article to its own site telling people that it was clear that the result was not intentional, or done for nefarious reasons). Some politicians, including Senator Chuck Schumer, also got involved to pressure Google to change its results.

Decisions to be made by Google:

Should the top search results be manually changed when it is discovered they lead to misinformation and hate?
Should the algorithm be changed to try to avoid these results?
Should the company do nothing and say that the algorithm decides the results, period?
Should any decision set a precedent for future decisions, and if so, what policies and guidelines need to be put in place to deal with future cases?
Are there other ways to respond to this situation?
How should Google handle attempts to game search via things like Googlebombing?

Questions and policy implications to consider:

If any changes are made, will lots of others expect similar changes to be made as well?
Will making changes lead to questions regarding the credibility of search results and the Google algorithm?
What sorts of policies and processes need to be in place to deal with these kinds of requests?
Will any changes have other, unintended consequences as well?
Are search engine optimization techniques nefarious? Can they be? If so, how do you distinguish between good intentions and bad intentions?
If you block certain techniques, such as Googlebombing, will that stop the practice when used for good purposes as well?

Resolution: Google responded by clearly stating that it had no direct intention of changing its algorithm. However, it did decide to provide more information, by using the advertising space above the top result to encourage people to click through for an explanation of how the results came about.

The company also stated that it would explore additional ways of addressing issues like this in the future.

Perhaps more interesting, however, was that Google's users took matters into their own hands, and realized that if Jewwatch was Googlebombing, they could use the same tools to diminish the result. A campaign was quickly organized online, with many people linking the word jew to Wikipedia's page on Judaism, and indeed, this worked to get that result to the top of the rankings.

Over time, Google's algorithms were adjusted globally to try to diminish the power of Googlebombing for any reason (good or bad). In 2007, the company announced that it believed its algorithm would filter out attempts at Googlebombing. In that discussion, the employees who helped stop the effectiveness of Googlebombing explained why they did so, and how they believed it was better to take a holistic approach (which was more scalable) than to respond to individual bad results:

People have asked about how we feel about Googlebombs, and we have talked about them in the past. Because these pranks are normally for phrases that are well off the beaten path, they haven't been a very high priority for us. But over time, we've seen more people assume that they are Google's opinion, or that Google has hand-coded the results for these Googlebombed queries. That's not true, and it seemed like it was worth trying to correct that misperception. So a few of us who work here got together and came up with an algorithm that minimizes the impact of many Googlebombs.

The next natural question to ask is "Why doesn't Google just edit these search results by hand?" To answer that, you need to know a little bit about how Google works. When we're faced with a bad search result or a relevance problem, our first instinct is to look for an automatic way to solve the problem instead of trying to fix a particular search by hand. Algorithms are great because they scale well: computers can process lots of data very fast, and robust algorithms often work well in many different languages. That's what we did in this case, and the extra effort to find a good algorithm helps detect Googlebombs in many different languages. We wouldn't claim that this change handles every prank that someone has attempted. But if you are aware of other potential Googlebombs, we are happy to hear feedback in our Google Web Search Help Group.
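
To make the mechanics discussed above concrete, here is a deliberately simplified sketch of why link text can be gamed, and why a counter-campaign can work the same way. It is a toy model only: it ranks pages purely by how many inbound links use the query as their link text, which is an assumption made for illustration and not a description of Google's actual algorithm. The function name, the site names, and the link counts are all hypothetical.

```python
from collections import Counter


def rank_by_anchor_text(links, query):
    """Rank target pages by how many inbound links use `query` as link text.

    `links` is a list of (anchor_text, target_page) pairs. This is a toy
    stand-in for link-based ranking, not a real search engine's method.
    """
    counts = Counter(
        target for anchor, target in links if anchor.lower() == query.lower()
    )
    # most_common() orders targets by link count, highest first.
    return [target for target, _ in counts.most_common()]


# Hypothetical link graph: three links point the word "widget" at site-a,
# two point it at site-b.
links = [("widget", "site-a.example")] * 3 + [("widget", "site-b.example")] * 2
before = rank_by_anchor_text(links, "widget")

# A coordinated campaign adds four more links with the same link text.
campaign = [("widget", "site-b.example")] * 4
after = rank_by_anchor_text(links + campaign, "widget")

print(before)  # ['site-a.example', 'site-b.example']
print(after)   # ['site-b.example', 'site-a.example']
```

In this toy model nothing distinguishes a malicious campaign from a well-meaning one, which is the dilemma raised in the questions above: any defense that discounts coordinated link text would discount both.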


posted at: 12:01am on 16-Jul-2020

Fan Uses AI Software To Lipread What Actors Really Said In TV Series Before Chinese Authorities Censored Them

Furnished content.

It's hardly news to Techdirt readers that China carries out censorship on a massive scale. What may be more surprising is that its censorship extends to even the most innocuous aspects of life. The ChinAI Newsletter, which provides translations by Jeff Ding of interesting texts from the world of Chinese AI, flags up one such case. It concerns a Chinese online TV series called "The Bad Kids". Here's how the site Sixth Tone describes it:

Since its first episodes were released on China's Netflix-like video platform iQiyi in mid-June, "The Bad Kids" has earned sweeping praise for its plot, cinematography, casting, dialogue, pacing, and soundtrack. It's also generated wide-ranging online discussion on human nature due to the psychology and complex motivations of its characters.

However, as the Sixth Tone article points out, the authorities required "a lot of changes" for the series to be approved. One fan of "The Bad Kids", Eury Chen, wanted to find out what exactly had been changed, and why that might be. In a blog post translated by ChinAI, Chen explained how he went about this:

Two days ago, I watched the TV series "The Bad Kids" in one go, and the plot was quite exciting. The disadvantage is that in order for the series to pass the review (of the National Radio and Television Administration), the edited sequences for episodes 11 and 12 were disrupted, even to the point that lines were modified, so that there are several places in the film where the actor's mouth movements and lines are not matched, which makes the plot confusing to people. Therefore, I tried to restore the modified lines through artificial intelligence technology, thereby restoring some of the original plot, which contained a darker truth.

The AI technology involved using Google's Facemesh package, which can track key "landmarks" on faces in images and videos. By analyzing the lip movements, it is possible to predict the sounds of a Chinese syllable. However, there is a particular problem that makes it hard to lipread Chinese using AI. There are many homophones in Chinese (similar sounds, different meanings). In order to get around this problem, Chen explored the possible sequences of Chinese characters to find the ones that best match the plot at that point. As his blog post (and the ChinAI translation) explains, this allowed him to work out why certain lines were blocked by the Chinese authorities -- turns out it was for totally petty reasons.

Perhaps more interesting than the details of this particular case is the fact that it was possible to use AI to carry out most of the lipreading, leaving human knowledge to choose among the list of possible Chinese phrases. Most languages don't require that extra stage, since they rarely have as many homophones as Chinese does. Indeed, for English phrases, researchers already claimed in 2016 that their AI-based LipNet achieved "95.2% accuracy in sentence-level, overlapped speaker split task, outperforming experienced human lipreaders".

It's clear that we are fast approaching a situation where AI is able to lipread a video in any language. That is obviously a boon for the deaf or hard of hearing, but there's a serious downside. It means that soon all those millions of high-quality CCTV systems around the world will not only be able to use facial recognition software to work out who we are, but also run AI modules to lipread what we are saying.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.
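
The homophone step described above can be illustrated with a small sketch. It assumes the lipreading stage has already produced a sequence of syllables, expands each syllable into its possible characters, and lists the most plausible combinations for a person to choose from. This is not Chen's code and does not use the Facemesh API; the homophone table, the pair scores, and the function names are hypothetical and made up for illustration.

```python
from itertools import product

# Hypothetical homophone table: each predicted syllable maps to several
# characters that sound alike and so cannot be told apart from lip shape.
HOMOPHONES = {
    "ta": ["他", "她", "它"],
    "shi": ["是", "事", "市"],
}

# Hypothetical plausibility scores for adjacent character pairs.
# Pairs that are not listed score 0.
PAIR_SCORE = {("他", "是"): 3, ("她", "是"): 3, ("它", "是"): 2, ("他", "事"): 1}


def candidate_sequences(syllables):
    """Enumerate every character sequence consistent with the syllables."""
    return ["".join(chars) for chars in product(*(HOMOPHONES[s] for s in syllables))]


def score(sequence):
    """Sum the plausibility scores of each adjacent character pair."""
    return sum(PAIR_SCORE.get(pair, 0) for pair in zip(sequence, sequence[1:]))


def top_candidates(syllables, k=3):
    """Return the k highest-scoring sequences, for a human to pick from."""
    return sorted(candidate_sequences(syllables), key=score, reverse=True)[:k]


print(top_candidates(["ta", "shi"]))  # ['他是', '她是', '它是']
```

The top two candidates tie, which mirrors the point made in the article: software can narrow the list, but knowledge of the plot is still needed to decide which reading fits the scene.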


posted at: 12:01am on 16-Jul-2020
