How to Use the wget Command in Linux

Introduction

wget is a command-line utility in Linux for downloading files from the web. It supports multiple protocols, such as HTTP, HTTPS, and FTP, and is designed to work non-interactively, meaning it can run in the background without requiring user input. This makes it an excellent tool for retrieving large files, downloading entire websites, and handling interrupted downloads. In this guide, we’ll focus on the most common options and arguments to help you get the most out of wget.

TL;DR

You can find a shorter cheat sheet version of this article here.

Basic Syntax of wget

At its simplest, the wget command is used like this:

wget [URL]

This command downloads the file located at the specified URL and saves it in the current working directory.

For example:

wget https://learntheshell.com/sample.zip

This will download sample.zip to your current folder.


Downloading Files to a Specific Directory

By default, wget saves files in the directory from which you run the command. If you want to specify a different location, use the -P option followed by the directory path:

wget -P /path/to/directory [URL]

For example, to download a file and save it to the /home/user/Downloads directory:

wget -P /home/user/Downloads https://learntheshell.com/sample.zip

Resuming Interrupted Downloads

If your download is interrupted, you don’t have to start over. Using the -c option (short for “continue”), wget will resume the download from where it left off:

wget -c https://learntheshell.com/sample.zip

This feature is particularly useful for downloading large files.

Downloading Multiple Files

To download several files at once, you can create a text file with each URL on a separate line, then pass the file to wget with the -i option:

  1. Create a text file (urls.txt) with the URLs you want to download:

    https://learntheshell.com/file1.txt
    https://learntheshell.com/file2.txt
  2. Use wget to download all the files listed in that file:

    wget -i urls.txt

wget will download each file in the list one after the other.
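
If you prefer to do everything from the terminal, one way (a sketch reusing the example URLs above) is to create the list with a short here-document and then run the same -i invocation:

cat > urls.txt << EOF
https://learntheshell.com/file1.txt
https://learntheshell.com/file2.txt
EOF
wget -i urls.txt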

Downloading in the Background

To download files in the background, freeing up your terminal for other tasks, use the -b option:

wget -b https://learntheshell.com/sample.zip

When using this option, wget runs in the background and logs output to a file named wget-log. To check the progress of the download, use:

tail -f wget-log

Limiting Download Speed

If you don’t want wget to use all available bandwidth, you can limit the download speed using the --limit-rate option. This can be helpful when you need to conserve bandwidth or run other network-intensive tasks:

wget --limit-rate=200k https://learntheshell.com/sample.zip

In this example, the download speed is limited to 200 KB/s. You can specify the rate in bytes per second (no suffix), or append k for kilobytes and m for megabytes.
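
The --limit-rate option also combines well with the other flags covered above. For example, to resume an interrupted download while capping the bandwidth:

wget -c --limit-rate=200k https://learntheshell.com/sample.zip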

Recursive Downloading (Downloading Entire Websites)

To download a website or directory recursively (i.e., download all linked files within the target page), use the -r option:

wget -r https://learntheshell.com/

This will download the website, including all linked pages. You can limit the depth of recursion by adding the -l (lowercase L) option:

wget -r -l 2 https://learntheshell.com/

This restricts wget to downloading two levels deep.
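
Recursion can be combined with the options shown earlier as well. For instance, to crawl two levels deep and save everything into a specific directory (the path below is just an illustration):

wget -r -l 2 -P /home/user/mirror https://learntheshell.com/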

Downloading for Offline Viewing

To download an entire website for offline viewing, including all assets like images and CSS files, use the -p option along with recursive downloading:

wget -r -p https://learntheshell.com/

Additionally, to make the links in the downloaded HTML files suitable for local browsing, use the --convert-links option:

wget -r -p --convert-links https://learntheshell.com/

This ensures that all links are converted to point to your local copies of the files.

Using Custom User Agents

Sometimes, websites block downloads from non-browser clients like wget. You can bypass this by specifying a user agent, making wget appear like a typical web browser:

wget --user-agent="Mozilla/5.0" https://learntheshell.com/sample.zip

This command makes wget identify itself with a browser-like user-agent string, which may help avoid blocking by some websites.
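
Real browser user-agent strings are usually much longer than "Mozilla/5.0". As an illustration only (the exact string varies by browser and version), you can pass a fuller one:

wget --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" https://learntheshell.com/sample.zip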

Handling Authentication (Username and Password)

For files behind HTTP authentication (such as password-protected areas), wget supports basic authentication using the --user and --password options:

wget --user=username --password=password https://learntheshell.com/protected-file.zip

This command lets you download files that require a login.

If you don’t want to specify the password on the command line, use the --ask-password option instead; wget will prompt you for it:

wget --user=username --ask-password https://learntheshell.com/protected-file.zip

Note that --password and --ask-password are mutually exclusive.

Checking Links Without Downloading

To check if a URL or multiple URLs are valid without downloading the content, you can use the --spider option:

wget --spider https://learntheshell.com/

This is useful for verifying links in scripts or ensuring web pages are accessible without actually downloading them.
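
The --spider option also works together with the -i option shown earlier, so you can check every URL in a list without downloading anything (reusing the urls.txt file from above):

wget --spider -i urls.txt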

Mirroring Websites

To mirror a website, including all its pages and directory structures, use the --mirror option. This option is equivalent to -r -N -l inf --no-remove-listing, which ensures a complete mirror of the website:

wget --mirror https://learntheshell.com/

The --mirror option preserves timestamps and directory structure, creating a complete local copy of the website.
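
If you also want the mirrored copy to be browsable offline, --mirror is commonly combined with the -p and --convert-links options covered earlier (the target directory here is just an illustration):

wget --mirror -p --convert-links -P /home/user/mirror https://learntheshell.com/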


Conclusion

The wget command is a powerful and flexible tool for downloading files from the web, supporting many protocols and options. Whether downloading individual files, mirroring entire websites, or managing bandwidth usage, wget provides a wealth of functionality to suit almost any need. By mastering the most commonly used options and arguments, you can efficiently handle everything from simple downloads to complex web scraping tasks.