archive.today

Type of site: Web archiving
Available in: Multilingual
URL:
  • archive-is.tumblr.com
  • lj.rossia.org/~archive-today
Registration: No
Launched: 16 May 2012[1]

archive.today (also known as archive.is, among other domains) is a web archiving website that saves snapshots on demand. It has support for JavaScript-heavy sites such as Google Maps and X.[2] archive.today records two snapshots: one replicates the original webpage including any functional live links; the other is a screenshot of the page.[3]

In 2025, the U.S. Federal Bureau of Investigation (FBI) began a criminal investigation into the owner of the archive.today domain name, who it believes is possibly based in Russia. archive.today came under further scrutiny in 2026, when it was discovered that it had been threatening Jani Patokallio and performing a denial-of-service attack on his blog, which had investigated the website.

History

archive.today was founded in 2012 as a web archive. It allegedly registered its trademark in the Czech Republic in 2013.[4] The site originally branded itself as archive.today, but changed the primary mirror to archive.is in May 2015.[5] It began to deprecate the archive.is domain in favor of other mirrors in January 2019.[6] According to the archive.today blog, the website had saved about 500 million pages by 2021,[7][8] amounting to 700 terabytes in total size.[9]

In July 2013, archive.today began supporting the API of the Memento Project at Los Alamos National Laboratory.[10][11] Due to budget constraints at LANL, the Memento Project was disestablished in September 2025.[12]
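As background on what Memento support involves: the protocol (RFC 7089) lets a client ask a "TimeGate" for the capture closest to a given moment by sending an Accept-Datetime header. The sketch below only builds such a request and does not send it. The /timegate/ path prefix is an assumption made for illustration, not something confirmed by the sources cited here, and the function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate prefix, used purely for illustration.
TIMEGATE_PREFIX = "https://archive.today/timegate/"


def accept_datetime(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 7089 Accept-Datetime value.

    The value is an HTTP-date expressed in GMT.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def timegate_request(target_url: str, moment: datetime):
    """Return the (url, headers) pair a Memento client would send to a TimeGate."""
    headers = {"Accept-Datetime": accept_datetime(moment)}
    return TIMEGATE_PREFIX + target_url, headers


if __name__ == "__main__":
    when = datetime(2013, 7, 9, 12, 0, tzinfo=timezone.utc)
    print(timegate_request("http://example.com/", when))
```

A compliant TimeGate would normally answer with a redirect to the chosen capture and a Memento-Datetime header. Handling that response is left out here.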

The Russian independent media outlet Mediazona uses the website to preserve social media profiles and posts of Russian servicemen killed in the Russo-Ukrainian war, as part of its Russia 200 project, a named database of confirmed Russian military casualties compiled jointly with the BBC Russian Service and a team of volunteers.[13][14] Individual profile pages on 200.zona.media link to snapshots of social media posts by relatives, obituaries in local media, and other open-source evidence used to verify each death.

In early 2023, a team of researchers at the University of Amsterdam identified archive.today as the most-used open-access archiving service among fact-checking organisations, based on the European Digital Media Observatory's dataset on the Russo-Ukrainian war.[15][16]

In August 2023, the Wikitravel Press co-founder and Google Cloud executive Jani Patokallio (the eldest son of the Finnish diplomat and renowned writer Pasi Patokallio) published an investigation on his blog Gyrovague regarding archive.today's funding sources and the founder's identity.[8][17] The founder believes that the Patokallio family's increased interest is linked to how the archive.today website is being used in the context of the Russo-Ukrainian war.[18]

On 30 October 2025, the US Federal Bureau of Investigation (FBI) subpoenaed archive.today's domain registrar, Tucows. The subpoena stated its purpose was to identify the owner(s) of the archive.today domain name, and that it was part of a criminal investigation conducted by the FBI, the nature of which was not disclosed.[19][20] The Catalan daily Ara interpreted the action as part of a campaign to selectively criminalize anonymous digital archives reliant on micro-donations (such as Anna's Archive, eliminated by Google from its search results), even though industrial datasets used for training large language models (such as the Common Crawl, financed by OpenAI and Anthropic) also fail to compensate content creators and owners.[9] News coverage of the subpoena mentioned Patokallio's report. The FBI has said there were "several indications" the founder was based in Russia.[17][21]

In November 2025, the DNS provider AdGuard DNS reported that it had been pressured by a French organization calling itself Web Abuse Association Defense (WAAD) to block archive.today and its mirror domains. WAAD alleged that archive.today had refused to remove child sexual abuse material since 2023, invoking French LCEN law to demand action. AdGuard DNS contacted archive.today directly and reported that the flagged content was promptly removed upon notification, and that archive.today stated it had never received prior complaints about those URLs. AdGuard's investigation found that WAAD was a recently registered association with minimal public presence, and described the complaints as suspicious, noting evidence of possible impersonation of a real French lawyer in prior similar complaints sent to other companies. AdGuard announced it would file a criminal complaint with French police.[22][23]

Screenshot of archive.today performing a DDoS attack on gyrovague.com: the "Network" tab of LibreWolf's developer tools shows multiple automated connections to gyrovague.com made by a JavaScript script, all of them blocked by the uBlock Origin browser extension

On 8 January 2026, Patokallio's hosting provider Automattic notified him that it had received a GDPR complaint from a person identifying herself as "Nora". The complaint alleged that the 2023 Gyrovague investigation "contains extensive personal data… presented in a narrative that is defamatory in tone and context." After Patokallio submitted a rebuttal, Automattic sided with him and left the post up.[24] Subsequent investigation suggested that "Nora" was likely an appropriated identity — the name belonged to either a real person or a trademark of a clothing brand, whose only connection to archive.today had been a prior content takedown request.[25][24] On 10 January 2026, the archive.today webmaster sent Patokallio an email asking him to temporarily remove the 2023 post; when Patokallio declined, the DDoS attack began several days later.[24]

On 14 January 2026, it emerged that archive.today had modified its CAPTCHA page to discreetly send repeated requests to Gyrovague, thereby causing visitors to unwittingly contribute to a DDoS attack against the blog. A Tumblr account seemingly associated with archive.today had recently posted several public criticisms of Patokallio. Emails released by Patokallio show archive.today requesting the temporary removal of his report and later threatening him with AI-generated pornography.[17][21] On 20 February 2026, the English Wikipedia banned links to archive.today, citing the DDoS attack and evidence that archived content was tampered with to insert Patokallio's name.[26] The decision was made despite concerns over maintaining content verifiability[26] while removing and replacing the second-largest archiving service used across the Wikimedia Foundation's projects.[27] The Wikimedia Foundation had stated its readiness to take action regardless of the community verdict.[26][27] Patokallio expressed his satisfaction with the outcome.[4]

During the community discussion, editors discovered that archive.today's operator had tampered with archived snapshots of webpages. In captures of a blog post related to the "Nora" pseudonym, the operator had replaced instances of "Nora" with "Jani Patokallio", including in comment fields that previously read "Comment as: Nora [surname]".[25] The alterations were subsequently reverted. The discovery was cited as a key factor in the blacklisting decision, as it undermined the premise that archived snapshots were faithful reproductions of the original pages.[25]

This was not the first time Wikipedia had restricted links to archive.today. In 2013, the community blacklisted archive.is, citing concerns about botnets, linkspamming, and the opaque manner in which the site was operated. The decision was overturned in 2016 following a new request for comment, and archive.today was removed from the spam blacklist. At the time of the 2026 ban, the site was the second-largest archiving service used across all Wikimedia Foundation projects, with over 695,000 links spread across approximately 400,000 pages.[25]

Funding

The site's funding model has been a persistent source of uncertainty. According to the creator, as of 2021 advertising and donations together covered less than 20% of operating expenses, with donations amounting to approximately €6,000.[8] PayPal donations, previously accepted, were discontinued around 2022 because the creator could no longer top up the account — interpreted by Patokallio as evidence that the operator is located in Russia, given the creator's complaints about the difficulty of cross-border payments "across the Iron Curtain."[8] Current donation channels include Liberapay, a French non-profit micropayment platform, and BuyMeACoffee.[8] The creator has expressed skepticism toward cryptocurrencies and has not adopted them.[8]

Advertising revenue has been volatile. The FAQ once included a "promise it will have no ads at least till the end of 2014,"[28] but Yahoo! network ads were subsequently injected into mobile web page views (though not on desktop). The creator has stated that on good days ads "almost cover expenses," but on bad days the site is de-monetized by ad networks because archived content inevitably includes advertiser-unfriendly material.[8]

An anonymous comment on Patokallio's post alleged that Btdigg, a BitTorrent DHT search engine, is operated by the same people behind archive.today,[29] though this claim has not been independently verified by any secondary source.

Features

Archiving

archive.today can capture individual pages in response to explicit user requests.[30][31][28] Since its launch, it has supported crawling pages with URLs containing the now-deprecated hash-bang fragment (#!).[32] The website records only text and images, excluding XML, RTF, spreadsheet (xls or ods) and other non-static content. However, videos for certain sites, like Twitter, are saved.[33] It keeps track of the history of snapshots saved, requesting confirmation before adding a new snapshot of an already saved page.[34][35] Once a web page is archived, it cannot be deleted directly by any Internet user.[36] Users can download archived pages as a ZIP file, except pages archived since 29 November 2019,[37] when archive.today changed its browser engine from PhantomJS to Chromium (non-headless).[38] archive.today does not obey robots.txt because it acts "as a direct agent of the human user."[28]
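A hash-bang URL keeps its page state in the fragment after #!, which browsers do not send to the server, so an archiver has to retain the fragment and let client-side JavaScript render the page. The following is a minimal sketch of detecting such URLs with Python's standard library. The function name and example URLs are hypothetical, and it does not describe archive.today's actual code.

```python
from urllib.parse import urlsplit


def split_hashbang(url: str):
    """Split a URL into (url_without_fragment, hashbang_state).

    Returns (url, None) when the URL has no "#!" fragment.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url, None
    # Drop the fragment to obtain what is actually requested from the server.
    base = parts._replace(fragment="").geturl()
    # Everything after "#!" is client-side state that only JavaScript interprets.
    return base, parts.fragment[1:]


if __name__ == "__main__":
    print(split_hashbang("https://example.com/app#!/user/42"))
    print(split_hashbang("https://example.com/page#section"))
```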

Pages are captured at a browser width of 1,024 pixels. CSS is converted to inline CSS, removing responsive web design and selectors such as :hover and :active. Content generated using JavaScript during the crawling process appears in a frozen state.[39] HTML class names are preserved inside the old-class attribute. When text is selected, a JavaScript applet generates a URL fragment seen in the browser's address bar that automatically highlights that portion of the text when visited again. Web pages can be duplicated from archive.today to web.archive.org as second-level backup, but archive.today does not save its snapshots in WARC format. The reverse—from web.archive.org to archive.today—is also possible,[40] but the copy usually takes more time than a direct capture.
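The snapshot links cited in references 39 and 40 have the form https://archive.today/<14 digits>/<original URL>, and the second of them wraps a web.archive.org capture. The sketch below parses that form. Reading the 14 digits as a YYYYMMDDhhmmss timestamp is inferred from those two examples rather than from any documentation, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches hosts such as archive.today, archive.is or archive.ph, followed by
# a 14-digit timestamp and the original URL. The pattern is an assumption
# based on the example links, not an official specification.
SNAPSHOT_RE = re.compile(r"^https?://archive\.[a-z]+/(\d{14})/(.+)$")


def parse_snapshot_url(url: str):
    """Return (capture_time, original_url), or None if the URL does not match."""
    match = SNAPSHOT_RE.match(url)
    if match is None:
        return None
    # The time zone of the timestamp is not stated, so the result is naive.
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)


if __name__ == "__main__":
    nested = (
        "https://archive.today/20190324174341/"
        "https://web.archive.org/web/20130520191911/"
        "https://es.wikipedia.org/wiki/Wikipedia"
    )
    print(parse_snapshot_url(nested))
```

For the nested example, the extracted original URL is itself a web.archive.org capture, which illustrates the copy from web.archive.org to archive.today described above.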

Archive of a Wikipedia webpage by archive.today on 5 January 2026

While saving a page, a list of URLs for individual page elements and their content sizes, HTTP statuses and MIME types is shown. This list can only be viewed during the crawling process. Users can ask the owner, via his blog, to remove advertisements and pop-ups from archived pages or to expand links.[41]

According to the site's FAQ, archive.today's storage layer runs on Apache Hadoop and Apache Accumulo, with all data stored on the Hadoop Distributed File System (HDFS). Textual content is replicated three times across servers in two data centers, both located in Europe, with at least one hosted by the French provider OVH; images are replicated twice.[28] The site does not store snapshots in WARC format.[28]

The scraping component has used a modified version of the Chromium browser since November 2019, replacing the previous PhantomJS-based engine.[28] To circumvent anti-scraping measures, archive.today routes its scraping requests through a rotating pool of IP addresses, which Patokallio described as a "botnet".[8]

The search toolbar supports advanced keyword operators, using * as the wildcard character. Paired quotation marks restrict the search to an exact sequence of keywords present in the title or in the body of the webpage, whereas the insite operator restricts it to a specific Internet domain.[42] When a dynamic list is saved, the archive.today search box shows only a result that links to the previous and the following section of the list (e.g. 20 links per page).[43] The other saved web pages are filtered out, and can sometimes be found through one of their occurrences.[34] The search feature is backed by Google Custom Search. If it delivers no results, archive.today attempts to use Yandex Search.[44]
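The operators described above can be combined into a single query string. The sketch below assembles one and URL-encodes it against the /search/?q= path that appears in reference 42. The exact operator syntax accepted by the site (for example, whether insite: takes a bare domain) is an assumption, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Search path taken from the example link in reference 42.
SEARCH_ENDPOINT = "https://archive.today/search/"


def build_search_url(phrase=None, domain=None, terms=()):
    """Build a search URL from an exact phrase, a domain filter and extra terms.

    Extra terms may contain * as a wildcard character.
    """
    parts = []
    if domain:
        parts.append(f"insite:{domain}")  # restrict results to one domain
    if phrase:
        parts.append(f'"{phrase}"')       # exact sequence of keywords
    parts.extend(terms)
    return SEARCH_ENDPOINT + "?" + urlencode({"q": " ".join(parts)})


if __name__ == "__main__":
    print(build_search_url(phrase="World Cup", domain="en.wikipedia.org"))
    print(build_search_url(terms=("archiv*",)))
```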

Bypassing paywalls

archive.today users often use the service to bypass paywalls, similarly to the defunct website 12ft.[19][45]

Worldwide availability

Australia and New Zealand

In March 2019, the site was blocked for six months by several internet providers in Australia and New Zealand in the aftermath of the Christchurch mosque shootings in an attempt to limit distribution of the footage of the attack.[46][47]

China

According to GreatFire.org, archive.today has been blocked in mainland China since March 2016,[48] archive.li since September 2017,[49] archive.fo since July 2018,[50] and archive.ph since December 2019.[51]

Finland

On 21 July 2015, the operators blocked access to the service from all Finnish IP addresses, stating on Twitter that they did this in order to avoid escalating a dispute they allegedly had with the Finnish government.[52][53]

Russia

In 2016, the Russian communications agency Roskomnadzor began blocking access to archive.is from Russia.[54][55][53]

On 23 March 2026, archive.today and several mirror domains were blocked by Russian authorities.[56]

See also

  • Digital preservation – Practice to keep digital assets accessible in long term
  • Link rot – URLs ceasing to function
  • List of web archiving initiatives

References

  1. ^ "When did the Archive-is site originally launch?". Archive.today Blog. 18 February 2014 – via Tumblr.
  2. ^ Brinkmann, Martin (22 April 2015). "Create publicly available web page archives with Archive.is". Ghacks. Archived from the original on 12 April 2019. Retrieved 13 June 2015.
  3. ^ Brunelle, Justin F.; Kelly, Mat; Weigle, Michele C.; Nelson, Michael L. (25 January 2015). "The impact of JavaScript on archivability" (PDF). International Journal on Digital Libraries. 17 (2): 95–117. doi:10.1007/s00799-015-0140-8. S2CID 8433375. Archived (PDF) from the original on 27 May 2019.
  4. ^ a b McCurdy, Will (21 February 2026). "Wikipedia Blacklists Archive.today Links Over Alleged DDoS Attack on Blogger". PC Magazine. Archived from the original on 21 February 2026. Retrieved 2 March 2026.
  5. ^ "Why did you change the URL back from archive-today to archive-is?". Archive.is Blog. 3 May 2015.
  6. ^ @archiveis (4 January 2019). "Please do not use archive.IS mirror for linking, use others mirrors [.TODAY .FO .LI .VN .MD .PH]. .IS might stop working soon" (Tweet). Archived from the original on 6 January 2019 – via Twitter.
  7. ^ "What percentage of 5-char-codes is used now? [...]". Archive.is blog. Tumblr. 3 September 2021. Archived from the original on 29 January 2026. Retrieved 11 February 2026.
  8. ^ a b c d e f g h Patokallio, Jani (5 August 2023). "archive.today: On the trail of the mysterious guerrilla archivist of the Internet". Gyrovague. Archived from the original on 13 August 2023. Retrieved 1 January 2024.
  9. ^ a b Cuesta, Albert (15 November 2025). "L'FBI, a la caça del web arxivat que incomoda els mitjans". Ara (in Catalan). Archived from the original on 17 November 2025. Retrieved 2 March 2026.
  10. ^ Nelson, Michael L. (9 July 2013). "Archive.is Supports Memento". Research and Teaching Updates. Web Science and Digital Libraries Research Group at Old Dominion University. Archived from the original on 27 July 2013. Retrieved 17 September 2013.
  11. ^ "archive.is". Memento Protocol Information. Memento Development Group. Archived from the original on 15 September 2013. Retrieved 17 September 2013.
  12. ^ Taylor, Nicholas (7 August 2025). "Memento TimeTravel sunset". memento-dev (Mailing list). Archived from the original on 18 August 2025. Retrieved 21 November 2025.
  13. ^ "Потери России в войне с Украиной. Сводка «Медиазоны»". Mediazona. Retrieved 27 April 2026.
  14. ^ "Russian losses in the war with Ukraine. Mediazona count, updated". Mediazona. Retrieved 27 April 2026.
  15. ^ "Losing our memory of fake news". Community Research and Development Information Service. 24 February 2023. Archived from the original on 9 December 2025. Retrieved 2 March 2026.
  16. ^ Porcellini, Valentin. "Mapping the "memory loss" of disinformation in fact-checks: the challenge of preserving disinformation traces". vera.ai. Archived from the original on 26 January 2023. Retrieved 2 March 2026.
  17. ^ a b c Brodkin, Jon (10 February 2026). "Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site". Ars Technica. Archived from the original on 10 February 2026. Retrieved 11 February 2026.
  18. ^ "Archive.today blog".
  19. ^ a b Koebler, Jason. "FBI Tries to Unmask Owner of Infamous Archive.is Site". 404 Media. Archived from the original on 6 November 2025. Retrieved 6 November 2025.
  20. ^ Kirchner, Malte (5 November 2025). "Archive.today: FBI Demands Data from Provider Tucows". heise.de.
  21. ^ a b Ferreira, Bruno (15 February 2026). "Notorious 'Archive Today' website allegedly leads bizarre DDoS campaign against security blogger — Wikipedia considers removing all links to the Archive". Tom's Hardware. Retrieved 15 February 2026.
  22. ^ Meshkov, Andrey (13 November 2025). "Behind the complaints: Our investigation into the suspicious pressure on Archive.today". AdGuard DNS Blog. Retrieved 27 April 2026.
  23. ^ "AdGuard DNS publishes investigation results revealing that the organization pressuring Archive.today is highly suspicious". GIGAZINE. 17 November 2025. Retrieved 27 April 2026.
  24. ^ a b c Brodkin, Jon (10 February 2026). "Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site". Ars Technica. Retrieved 27 April 2026.
  25. ^ a b c d Brodkin, Jon (20 February 2026). "Wikipedia blacklists Archive.today, starts removing 695,000 archive links". Ars Technica. Retrieved 27 April 2026.
  26. ^ a b c Brodkin, Jon (20 February 2026). "Wikipedia blacklists Archive.today, starts removing 695,000 archive links". Ars Technica. Archived from the original on 20 February 2026. Retrieved 20 February 2026.
  27. ^ a b Lewczuk, Maciej (11 February 2026). "Archive.today zamienił użytkowników w nieświadomych hakerów. Wikipedia reaguje na atak DDoS". PurePC (in Polish). Archived from the original on 11 February 2026. Retrieved 2 March 2026.
  28. ^ a b c d e f "Archive.today FAQ". archive.today.
  29. ^ "Anonymous comment (6 August 2023)". Gyrovague. 5 August 2023. Retrieved 27 April 2026.
  30. ^ Dascalescu, Dan (18 February 2013). "Web page archiving". Dan Dascalescu's Wiki. Archived from the original on 22 September 2013. Retrieved 3 October 2013.
  31. ^ Koebler, Jason (29 October 2014). "Dear GamerGate: Please Stop Stealing Our Shit". Motherboard. Archived from the original on 1 February 2026. Retrieved 22 March 2017. There is no way for a website to protect itself from having an Archive.today user mirror the site.
  32. ^ "Home page of Archive.is in 2013".
  33. ^ "Archive.today blog".
  34. ^ a b Occhipinti, Kris (15 April 2016), Archiving Websites with the Archive.is, archived from the original on 27 January 2022, retrieved 27 January 2022 – via YouTube
  35. ^ "Example snapshot history on archive.is".
  36. ^ "Some Frequently Asked Question". Archive.today Blog. 24 January 2013 – via Tumblr.
  37. ^ "The "download zip" button has been giving a "Not found" error for quite some time". Archive.is blog. 17 July 2020.
  38. ^ "What scraper or headless browser are you using? it works so well". Archive.is blog. 20 May 2020.
  39. ^ JavaScript-generated loading animation of Dailymotion video https://archive.today/20200121182128/https://www.dailymotion.com/video/x3sexy8 appearing in a frozen state
  40. ^ https://archive.today/20190324174341/https://web.archive.org/web/20130520191911/https://es.wikipedia.org/wiki/Wikipedia
  41. ^ "Example user request on the Archive.is blog". Archive.is blog.
  42. ^ For example, the string insite: https://en.wikipedia.org "World Cup" returns the https://archive.today/search/?q=insite%3A+http%3Aen.wikipedia.org+ "World+Cup"/ related snapshots
  43. ^ Example of dynamic list: "au:"thomas aquinas"". WorldCat. Archived from the original on 23 March 2019. Retrieved 15 December 2018.
  44. ^ "Just realized that I can search for keywords in the search bar for archive today, was this a recently added feature?". Archive.is. 18 January 2022.
  45. ^ Bonifield, Stevie (6 November 2025). "FBI subpoenas the web registrar behind Archive.is". The Verge. Retrieved 18 February 2026. The site is commonly used to dodge paywalls, similar to 12ft.io, which the News/Media Alliance successfully had taken down earlier this year, claiming it 'offered illegal circumvention technology' to access copyrighted content without paying for it.
  46. ^ "ISPs in AU and NZ start censoring the internet without legal precedent". Private Internet Access. 19 March 2019. Archived from the original on 28 April 2023. Retrieved 20 March 2019.
  47. ^ "New Zealand ISPs Say They're Blocking Sites That Fail To Remove Christchurch Shooting Video". Gizmodo Australia. 19 March 2019. Archived from the original on 18 May 2019. Retrieved 20 March 2019.
  48. ^ "archive.is is 100% blocked in China". GreatFire Analyzer. 12 August 2018.
  49. ^ "archive.li is 100% blocked in China". Great Fire Analyzer. 12 August 2018.
  50. ^ "archive.fo is 100% blocked in China". Great Fire Analyzer. 12 August 2018.
  51. ^ "archive.ph is 100% blocked in China". en.greatfire.org.
  52. ^ Lapintie, Lassi (22 July 2015). "Suomalaisilta estettiin haktivistien suosimalla verkkosivulla käynti" [Finns' access to website used by hacktivists blocked]. Iltalehti (in Finnish). Archived from the original on 27 May 2019. Retrieved 4 March 2016.
  53. ^ a b Toler, Aric (22 February 2018). "How to Archive Open Source Materials". bellingcat. Archived from the original on 17 August 2025. Retrieved 17 February 2026.
  54. ^ Elistratov, Vladimir (29 January 2016). "Roskomnadzor zablokiroval servis archive.is, khranyashchiy kopii veb-saytov" Роскомнадзор заблокировал сервис archive.is, хранящий копии веб-сайтов. TJournal (in Russian). Archived from the original on 30 August 2017. Retrieved 30 January 2016.
  55. ^ Cushing, Tim (4 February 2016). "Russia Blocks Another Archive Site Because It Might Contain Old Pages About Drugs". Techdirt. Archived from the original on 23 March 2019. Retrieved 26 February 2016.
  56. ^ "Russian authorities block paywall removal site Archive.today". TechCrunch. 23 March 2026. Archived from the original on 23 March 2026. Retrieved 23 March 2026.