A Large-Scale Study of the Evolution of Web Pages
Dennis Fetterly
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
dennis.fetterly@hp.com

Mark Manasse
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
manasse@microsoft.com

Marc Najork
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
najork@microsoft.com

Janet Wiener
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
janet.wiener@hp.com

ABSTRACT

1. INTRODUCTION

How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated to any other property of the page? All of these questions are of interest to those who mine the web, including all the popular search engines, but few studies have been performed to date to answer them.
One notable exception is a study by Cho and Garcia-Molina, who crawled a set of 720,000 pages on a daily basis over four months, and counted pages as having changed if their MD5 checksum changed. They found that 40% of all web pages in their set changed within a week, and 23% of those pages that fell into the .com domain changed daily.
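The checksum criterion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used in either study: the function names `md5_checksum`, `has_changed`, and `fraction_changed` are hypothetical, and the snapshots are assumed to be plain dictionaries mapping a URL to the raw bytes of one download.

```python
import hashlib


def md5_checksum(page_bytes: bytes) -> str:
    """Return the MD5 checksum of a page's raw bytes as a hex string."""
    return hashlib.md5(page_bytes).hexdigest()


def has_changed(previous: bytes, current: bytes) -> bool:
    """A page counts as changed if its checksum differs between downloads."""
    return md5_checksum(previous) != md5_checksum(current)


def fraction_changed(before: dict, after: dict) -> float:
    """Fraction of URLs present in both snapshots whose checksum changed.

    `before` and `after` are hypothetical snapshots: URL -> page bytes.
    """
    common = before.keys() & after.keys()
    if not common:
        return 0.0
    changed = sum(1 for url in common if has_changed(before[url], after[url]))
    return changed / len(common)
```

Because a checksum differs whenever a single byte differs, this test reports that a page changed but says nothing about how much it changed.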
This paper expands on Cho and Garcia-Molina's study, both in terms of coverage and in terms of sensitivity to change. We crawled a set of 150,836,209 HTML pages once every week, over a span of 11 weeks. For each page, we recorded a checksum of the page, and a feature vector of the words on the page, plus various other data such as the page length, the HTTP status code, etc. Moreover, we pseudo-randomly selected 0.1% of all of our URLs, and saved the full text of each download of the corresponding pages.
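One common way to draw a repeatable pseudo-random 0.1% sample of URLs is to hash each URL and keep those whose hash falls into one bucket out of 1,000. The sketch below assumes that approach; the text does not say how the selection was actually made, and the names `in_sample` and `SAMPLE_BUCKETS` as well as the choice of SHA-1 are illustrative assumptions.

```python
import hashlib

# Assumption: one bucket out of 1,000 corresponds to the 0.1% sample.
SAMPLE_BUCKETS = 1000


def in_sample(url: str) -> bool:
    """Decide deterministically whether a URL belongs to the 0.1% sample.

    The same URL always gets the same answer, so every weekly download
    of a sampled page can be saved without storing a list of chosen URLs.
    """
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % SAMPLE_BUCKETS
    return bucket == 0


if __name__ == "__main__":
    # Hypothetical URLs, used only to show the sampling rate.
    urls = [f"http://example.com/page/{i}" for i in range(200_000)]
    kept = sum(in_sample(u) for u in urls)
    print(f"kept {kept} of {len(urls)} URLs")
```

Hashing the URL rather than calling a random number generator keeps the selection stable across crawls, which matters when the full text of the same pages must be saved every week.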
After completion of the crawl, we analyzed the degree of change of each page, and investigated which factors are correlated with
