e dot dot dot
a mostly about the Internet blog by

January 2021


Seven Years Ago, CERN Gave Open Access A Huge Boost; Now It's Doing The Same For Open Data

Furnished content.


Techdirt readers will be very familiar with CERN, the European Organization for Nuclear Research (the acronym comes from the French name of the council that founded it: Conseil Européen pour la Recherche Nucléaire). It's best known for two things: being the birthplace of the World Wide Web, and home to the Large Hadron Collider (LHC), the world's largest and most powerful particle accelerator. Over 12,000 scientists of 110 nationalities, from institutes in more than 70 countries, work at CERN. Between them, they produce a huge quantity of scientific papers. That made CERN's decision in 2013 to release nearly all of its published articles as open access one of the most important milestones in the field of academic publishing. Since 2014, CERN has published 40,000 open access articles.

But as Techdirt has noted, open access is just the start. As well as the final reports on academic work, what is also needed is the underlying data. Making that data freely available allows others to check the analysis, and to use it for further investigation -- for example, by combining it with data from elsewhere. The push for open data has been underway for a while, and has just received a big boost from CERN:

The four main LHC collaborations (ALICE, ATLAS, CMS and LHCb) have unanimously endorsed a new open data policy for scientific experiments at the Large Hadron Collider (LHC), which was presented to the CERN Council today. The policy commits to publicly releasing so-called level 3 scientific data, the type required to make scientific studies, collected by the LHC experiments. Data will start to be released approximately five years after collection, and the aim is for the full dataset to be publicly available by the close of the experiment concerned. The policy addresses the growing movement of open science, which aims to make scientific research more reproducible, accessible, and collaborative.

The level 3 data released can contribute to scientific research in particle physics, as well as research in the field of scientific computing, for example to improve reconstruction or analysis methods based on machine learning techniques, an approach that requires rich data sets for training and validation.

CERN's open data portal already contains 2 petabytes of data -- a figure that is likely to rise rapidly, since LHC experiments typically generate massive quantities of data. However, the raw data will not in general be released. The open data policy document (pdf) explains why:
This is due to the complexity of the data, metadata and software, the required knowledge of the detector itself and the methods of reconstruction, the extensive computing resources necessary and the access issues for the enormous volume of data stored in archival media. It should be noted that, for these reasons, general direct access to the raw data is not even available to individuals within the collaboration, and that instead the production of reconstructed data (i.e. Level-3 data) is performed centrally. Access to representative subsets of raw data -- useful for example for studies in the machine learning domain and beyond -- can be released together with Level-3 formats, at the discretion of each experiment.
There will also be Level 2 data, "provided in simplified, portable and self-contained formats suitable for educational and public understanding purposes". CERN says that it may create "lightweight" environments to allow such data to be explored more easily. Virtual computing environments for the Level 3 data will be made available to aid the re-use of this primary research material. Although the data is being released using a Creative Commons CC0 waiver, acknowledgements of the data's origin are required, and any new publications that result must be clearly distinguishable from those written by the original CERN teams.

As with the move to open access in 2013, the new open data policy is unlikely to have much of a direct impact for people outside the high energy physics community. But it does represent an extremely strong and important signal that CERN believes open data must and will become the norm.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.

Read more here

posted at: 12:01am on 05-Jan-2021
path: /Policy



Obscure Analytics Tool Helps Cops Make Sense Of All That Location Data They're Grabbing Without A Warrant

Furnished content.


FOIA requests, leaked documents, data breaches, Congressional testimony… all of these have led to the outing of cellphone surveillance tech utilized by law enforcement. As far back as 2014, Chris Soghoian -- former ACLU "technologist" and current Senator Wyden advisor -- was telling cops their "secret" Stingray devices weren't all that secret anymore.

But the market for tracking people via their cellphones remains uncornered. For the most part, Stingrays (cell tower spoofers) need warrants to operate. The same goes for demanding weeks or months of historical cell site location data from service providers.

The courts may be deciding there's a bit more Fourth Amendment to go around these days, but cops seem to be deciding there's more Fourth than ever that should be avoided. New tools, toys, and tactics are in play. "Reverse warrants" contain the word "warrant," but they demand info on every cellphone user in a certain area at a certain time, flipping probable cause on its head. Data brokers collecting location data from apps sell access to law enforcement agencies, allowing them to engage in tracking that would be unconstitutional if it involved cell service providers.

There's a lot of data flowing towards law enforcement agencies. But it's useless if it can't be analyzed. That's where a little known company steps in, giving cops a way to wrangle all that subpoenaed data into something actionable. The Intercept's Sam Richards has the details.

Until now, the Bartonville, Texas, company Hawk Analytics and its product CellHawk have largely escaped public scrutiny. CellHawk has been in wide use by law enforcement; the software is helping police departments, the FBI, and private investigators around the United States convert information collected by cellular providers into maps of people’s locations, movements, and relationships. Police records obtained by The Intercept reveal a troublingly powerful surveillance tool operated in obscurity, with scant oversight.

CellHawk’s maker says it can process a year’s worth of cellphone records in 20 minutes, automating a process that used to require painstaking work by investigators, including hand-drawn paper plots. The web-based product can ingest call detail records, or CDRs, which track cellular contact between devices on behalf of mobile service providers, showing who is talking to whom. It can also handle cellular location records, created when phones connect to various towers as their owners move around.

It's yet another law enforcement middle man. Service providers and brokers collect data. Law enforcement asks for this data -- often without a warrant. CellHawk ensures the data is usable. And it's apparently been flying under the radar for years, mainly because it's not engaged in harvesting this info.

But CellHawk isn't a harmless interloper. It's maximizing the impact of data gathered from hundreds of apps and innate cellphone behavior. The company claims it can turn ride-hailing app data into usable "intelligence" and morph millions of GPS data points into a rich tapestry of movements and interpersonal connections.

While CellHawk remains mostly diplomatic on its website, materials sent to potential customers bluntly inform them how powerful the company's analytic software is. Marketing materials tell investigators they'll not only be able to determine whether a suspect was at a crime scene, but CellHawk will let them know where they spend most of their time, whether they've been anywhere "unusual," and the most likely location of their bedroom.

The software has plenty of law enforcement fans. And those fans are writing their own rules to use the tool, since it appears to fall between the cracks of Fourth Amendment jurisprudence. This is the gap where law enforcement does a lot of work, guided only by their certainty that rights haven't been violated in this particular way in their jurisdiction prior to their deployment of tools like CellHawk. While Supreme Court precedent suggests warrants are best for tracking people via cellphone data, agencies like the Hennepin County Sheriff's Department (MN) seem to believe a lower standard applies.

It stated that the office needed “[r]easonable suspicion,” which was deemed “present when sufficient facts are established to give … a basis to believe that there is, or has been, a reasonable possibility that an individual or organization is involved in a definable criminal activity or enterprise.”

Make hay while law remains unsettled, as the saying goes. Hawk Analytics is roping in as many law enforcement customers as it can, taking advantage of the lack of on-point court rulings to sell data analytics to agencies engaged in questionable harvesting of location-tracking data. The Intercept's informal FOIA poll found CellHawk in use in multiple states by multiple departments. It appears the FBI is also a fan of CellHawk's work, which means the company now has nationwide coverage (so to speak), even if it has yet to find local buyers.

Hawk Analytics is smart. It's built for the long run. Even if courts find mass collections of location data worthy of Fourth Amendment protections, cops will still be looking for someone to help them parse all the data they've obtained. Third-party data brokers may see their fortunes fall, but CellHawk will still be there to analyze the output of those that survive.

Read more here

posted at: 12:01am on 05-Jan-2021
path: /Policy







