How to Recover Your Content from the Wayback Machine (Internet Archive)

If your website was lost or hacked, you might have the unfortunate task of recovering the content. We always recommend making regular backups of your site, but if none are available, you have another option.

The Wayback Machine, a service run by the Internet Archive, takes periodic snapshots of many sites across the internet and may have a copy of yours. Follow along and we’ll show you how to search for archived snapshots and recover your content from the Wayback Machine. You can then use these pieces to rebuild your site from scratch.

Search for Archives

  1. Visit the Wayback Machine at https://archive.org/web.
  2. Type your web address in the search field, then click the Browse History button. The results show how many times your site was saved and over what time period. For example:
    Saved 34 times between November 9, 2008 and May 28, 2019.
  3. You will also see a timeline and a calendar. Click a year to see which dates your site was archived.
  4. Click a date on the calendar to view the snapshot saved that day. You can try navigating the archived site to view any available content. Keep in mind that it may not look exactly like your site, since it depends on what was captured at the time.
  5. We recommend checking each year and date to make sure you find all of the content.
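
If clicking through every year and date is tedious, the same capture list can be requested in one go from the Wayback Machine's CDX server. The following is a minimal Python sketch, not an official client. The function names are our own, example.com stands in for your domain, and the filter and collapse settings are just one reasonable choice for skipping error pages and repeated identical captures.

```python
import json
from collections import defaultdict
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(site, limit=None):
    """Build a CDX query URL that lists captures for `site`."""
    params = {
        "url": site,
        "output": "json",
        "filter": "statuscode:200",  # assumption: only successful captures matter
        "collapse": "digest",        # drop consecutive captures with identical content
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(rows):
    """Turn the CDX JSON (one header row, then data rows) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def group_by_year(captures):
    """Map each year to the sorted capture timestamps found in it."""
    years = defaultdict(list)
    for capture in captures:
        # Timestamps look like YYYYMMDDhhmmss, so the first 4 characters are the year.
        years[capture["timestamp"][:4]].append(capture["timestamp"])
    return {year: sorted(stamps) for year, stamps in sorted(years.items())}


def fetch_captures(site, limit=None):
    """Query the live CDX server (requires network access)."""
    with urlopen(build_cdx_url(site, limit), timeout=30) as response:
        return parse_cdx_rows(json.load(response))
```

Calling group_by_year(fetch_captures("example.com")) should give a year-by-year overview similar to the calendar view, which makes it easier to decide which dates are worth opening in the browser.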

Copy Content Manually

Now that you know how to search for and find your website snapshots, you can begin copying the text and images to your computer.

  1. Navigate to each page of the site and copy the text, then paste it into a text editor such as Notepad, Google Docs, or MS Word.
  2. Visit each page in the Wayback Machine, then right-click any image you want to recover and save it to a folder on your computer.
  3. In some cases, you may be able to recover some of the website code. Right-click the page, then select View page source to see the code. Copy it into a text editor and save it for later use.
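
Once you have saved a page's source, the text and image addresses can also be pulled out of it automatically instead of by copy and paste. Below is a rough sketch using only Python's standard library. The class and function names are hypothetical, and it assumes the page is ordinary HTML; heavily scripted pages may need more work.

```python
from html.parser import HTMLParser


class PageContent(HTMLParser):
    """Collect visible text and image addresses from one HTML page."""

    SKIPPED = {"script", "style", "noscript"}  # tags whose contents are not page text

    def __init__(self):
        super().__init__()
        self.text = []
        self.images = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.text.append(data.strip())


def extract_content(html):
    """Return (text, image_addresses) for the given HTML source."""
    parser = PageContent()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.text), parser.images
```

For example, extract_content(open("saved_page.html", encoding="utf-8").read()) returns the page text as one string plus a list of image addresses you can then download. The file name here is only a placeholder.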

Scrape Internet Archive Content

If you don’t have time to manually copy each page of the website you’re recovering, another option is to pull, or scrape, all the site content using a script. Several scripts are available for this purpose. Keep in mind that these are often written by 3rd parties or individuals and may require testing and troubleshooting before they work reliably.
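
To give an idea of what such a script does, here is a small Python sketch of the core download loop. It is an illustration under stated assumptions rather than a finished tool: the function names and the "recovered" folder are made up for this example, query strings are ignored when naming files, and it relies on the id_ suffix after the timestamp, which asks the Wayback Machine for the capture as originally saved, without its toolbar.

```python
import time
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit
from urllib.request import urlopen


def raw_snapshot_url(timestamp, original_url):
    """Address of a capture as originally saved (note the id_ suffix)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original_url}"


def local_path(original_url, root="recovered"):
    """Map an original page address to a file path under `root`."""
    path = urlsplit(original_url).path
    if path == "" or path.endswith("/"):
        path += "index.html"  # assumption: directory-style URLs are HTML pages
    # Drop the leading slash and any "." or ".." parts so files stay inside root.
    parts = [p for p in PurePosixPath(path).parts if p not in ("/", "..", ".")]
    return Path(root).joinpath(*parts)


def download_captures(captures, root="recovered"):
    """Save each capture to disk (requires network access).

    `captures` is an iterable of (timestamp, original_url) pairs.
    A later pair overwrites an earlier one that maps to the same file.
    """
    for timestamp, original_url in captures:
        target = local_path(original_url, root)
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(raw_snapshot_url(timestamp, original_url), timeout=30) as response:
            target.write_bytes(response.read())
        time.sleep(1)  # pause between requests to avoid hammering the archive
```

A call such as download_captures([("20190528000000", "http://example.com/")]) would write the capture to recovered/index.html. The timestamp and address are placeholders; use the ones you found when searching for archives.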

3rd Party Services

Want to save time? You can pay a 3rd party service to scrape and recover your website for you. Some will even restore content from CMSs such as WordPress. The pricing and scope of service will differ based on the site, so we recommend checking and comparing them to see which one best meets your needs.

Now that you know how to find and recover website content from the Wayback Machine (Internet Archive), you can begin rebuilding your site. Hopefully, your site will return to its former glory with help from the archived copy. Going forward, we recommend archiving your website with the Wayback Machine regularly so that up-to-date snapshots are available if you ever need them again.
