Use the wget command to download an entire subdirectory under a parent directory

Use the wget command to download the entire subdirectory under the parent directory. The command is as follows:

wget -r --level=0 -E --ignore-length -x -k -p -erobots=off -np -N http://www.remote.com/remote/presentation/dir

The entire folder on the remote server will be downloaded into the current directory on your computer.
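In the command above, --level=0 is treated by wget as unlimited recursion depth (the same as -l inf, see the option list later in this article). If you only want to go a couple of levels deep, you can cap the depth instead. A minimal sketch, with a hypothetical URL:

# Recurse at most two directory levels below the starting URL
# (www.example.com is a placeholder; substitute your own server)
wget -r -l 2 -np -k -p -N http://www.example.com/remote/presentation/dir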
How to use wget to download all the files in a directory

wget -r -np -nH -R index.html http://url/including/files/you/want/to/download/

The meaning of each parameter:

-r : recurse into all subdirectories
-np : do not ascend to the parent directory
-nH : do not create a directory named after the host
-R index.html : do not download index.html files
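Even with -nH, wget still recreates the remote path under the current directory. If you also want to trim leading path components, --cut-dirs (described in the directory options later in this article) can be combined with -nH. A sketch, with a hypothetical URL:

# Without -nH:             ./example.com/pub/docs/manual/file.html
# With -nH:                ./pub/docs/manual/file.html
# With -nH --cut-dirs=2:   ./manual/file.html  (the first two remote path levels are dropped)
wget -r -np -nH --cut-dirs=2 -R index.html http://example.com/pub/docs/manual/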
Use wget to download an entire website or a specific directory

To download all the files in a certain directory, the command is as follows:

wget -c -r -np -k -L -p www.xxx.org/pub/path/
If the pages contain images or links that point to external domains and you want to download those as well, you must add the -H parameter:

wget -np -nH -r --span-hosts www.xxx.org/pub/path/
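Because --span-hosts (-H) lets the recursion follow links onto other hosts, it is usually paired with -D/--domains (listed in the accept/reject options later in this article) so the crawl does not wander off. A sketch; the domain names are placeholders:

# Follow links onto other hosts, but only within the listed domains
wget -r -np -p -H -D xxx.org,yyy.org www.xxx.org/pub/path/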
The meaning of each parameter:

-c : resume an interrupted download
-r : recursive download; download all files in the specified page's directory (including subdirectories)
-nd : do not recreate the directory hierarchy when downloading recursively; save all files into the current directory
-np : do not ascend to the parent directory when downloading recursively. For example, with wget -c -r www.xxx.org/pub/path/, omitting -np would also download the other files in the pub directory above path.
-k : convert absolute links to relative links. It is best to add this parameter when downloading an entire site for offline browsing.
-L : follow only relative links, which keeps the recursion from wandering onto other hosts. For example, with wget -c -r www.xxx.org/, if the site contains a link to www.yyy.org and -L is omitted, wget will recursively download www.yyy.org as well, spreading like wildfire.
-p : download all files the page needs, such as images
-A : specify a list of file name patterns to download, with multiple patterns separated by commas
-i : followed by a file that lists the URLs to download

Below are some other uses I found online, written down here for future reference.

Common uses of wget

wget usage format:

Usage: wget [OPTION]... [URL]...

* Mirror a site with wget:

wget -r -p -np -k http://dsec.pku.edu.cn/~usr_name/
# or
wget -m http://www.tldp.org/LDP/abs/html/

* Resume a partially downloaded file on an unstable network, or download during idle time:

wget -t 0 -w 31 -c http://dsec.pku.edu.cn/BBC.avi -o down.log &
# Or read the list of files to download from filelist.txt
wget -t 0 -w 31 -c -B ftp://dsec.pku.edu.cn/linuxsoft -i filelist.txt -o down.log &

The commands above can also be used to download during periods when the network is relatively idle. My own workflow: in Mozilla, copy URLs that are inconvenient to download at the moment, paste them into the file filelist.txt, and run the second command above before leaving the system for the night.

* Download through a proxy:

wget -Y on -p -k https://sourceforge.net/projects/wvware/

The proxy can be set in an environment variable or in the wgetrc file:

# Set the proxy in an environment variable
export PROXY=http://211.90.168.94:8080/
# Set the proxy in ~/.wgetrc
http_proxy = http://proxy.yoyodyne.com:18023/
ftp_proxy = http://proxy.yoyodyne.com:18023/
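If you only want to use a proxy for a single invocation, the -e/--execute option (described in the startup options below) can pass wgetrc-style settings on the command line instead. A minimal sketch, with a hypothetical proxy address:

# Enable the proxy for this run only, without touching ~/.wgetrc
wget -e use_proxy=yes -e http_proxy=http://proxy.example.com:8080/ -p -k https://sourceforge.net/projects/wvware/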
List of various wget options

* Startup

-V, --version : display the version of wget and exit
-h, --help : print syntax help
-b, --background : go to background after startup
-e, --execute=COMMAND : execute a command in `.wgetrc' format (for the wgetrc format, refer to /etc/wgetrc or ~/.wgetrc)

* Logging and input files

-o, --output-file=FILE : write log messages to FILE
-a, --append-output=FILE : append log messages to FILE
-d, --debug : print debug output
-q, --quiet : quiet mode (no output)
-v, --verbose : verbose mode (this is the default)
-nv, --non-verbose : turn off verbose mode, but not quiet mode
-i, --input-file=FILE : download the URLs listed in FILE
-F, --force-html : treat the input file as HTML
-B, --base=URL : use URL as the prefix for relative links in the file given with -F -i
--sslcertfile=FILE : optional client certificate
--sslcertkey=KEYFILE : optional key file for the client certificate
--egd-file=FILE : file name of the EGD socket

* Download

--bind-address=ADDRESS : bind to the local address (host name or IP, useful when the machine has several)
-t, --tries=NUMBER : set the maximum number of retries (0 means unlimited)
-O, --output-document=FILE : write the document to FILE
-nc, --no-clobber : do not clobber existing files (skip downloads that would overwrite them)
-c, --continue : continue a partially downloaded file
--progress=TYPE : select the progress indicator style
-N, --timestamping : do not re-download files unless they are newer than the local copy
-S, --server-response : print server responses
--spider : do not download anything
-T, --timeout=SECONDS : set the response timeout in seconds
-w, --wait=SECONDS : wait SECONDS between retrievals
--waitretry=SECONDS : wait 1...SECONDS seconds between retries
--random-wait : wait 0...2*WAIT seconds between downloads
-Y, --proxy=on/off : turn the proxy on or off
-Q, --quota=NUMBER : set the download quota
--limit-rate=RATE : limit the download rate

* Directories

-nd, --no-directories : do not create directories
-x, --force-directories : force creation of directories
-nH, --no-host-directories : do not create host-name directories
-P, --directory-prefix=PREFIX : save files under the directory PREFIX/...
--cut-dirs=NUMBER : ignore NUMBER levels of remote directories

* HTTP options

--http-user=USER : set the HTTP user name to USER
--http-passwd=PASS : set the HTTP password to PASS
-C, --cache=on/off : enable/disable use of server-side cached data (normally enabled)
-E, --html-extension : save all text/html documents with an .html extension
--ignore-length : ignore the `Content-Length' header field
--header=STRING : insert STRING into the request headers
--proxy-user=USER : set the proxy user name to USER
--proxy-passwd=PASS : set the proxy password to PASS
--referer=URL : include a `Referer: URL' header in the HTTP request
-s, --save-headers : save the HTTP headers to the file
-U, --user-agent=AGENT : identify as AGENT instead of Wget/VERSION
--no-http-keep-alive : disable HTTP keep-alive (persistent connections)
--cookies=off : do not use cookies
--load-cookies=FILE : load cookies from FILE before the session starts
--save-cookies=FILE : save cookies to FILE after the session ends

* FTP options

-nr, --dont-remove-listing : do not remove the `.listing' files
-g, --glob=on/off : turn file-name globbing on or off
--passive-ftp : use passive transfer mode (the default)
--active-ftp : use active transfer mode
--retr-symlinks : when recursing, retrieve the files that symbolic links point to (not directories)
* Recursive download

-r, --recursive : recursive download - use with caution!
-l, --level=NUMBER : maximum recursion depth (inf or 0 for unlimited)
--delete-after : delete each downloaded file locally after the download completes
-k, --convert-links : convert non-relative links to relative links
-K, --backup-converted : before converting file X, back it up as X.orig
-m, --mirror : equivalent to -r -N -l inf -nr
-p, --page-requisites : download all files needed to display the HTML page (images, etc.)

* Inclusion and exclusion in recursive downloads (accept/reject)

-A, --accept=LIST : comma-separated list of accepted extensions
-R, --reject=LIST : comma-separated list of rejected extensions
-D, --domains=LIST : comma-separated list of accepted domains
--exclude-domains=LIST : comma-separated list of excluded domains
--follow-ftp : follow FTP links in HTML documents
--follow-tags=LIST : comma-separated list of HTML tags to follow
-G, --ignore-tags=LIST : comma-separated list of HTML tags to ignore
-H, --span-hosts : go to other hosts when recursing
-L, --relative : follow only relative links
-I, --include-directories=LIST : list of directories to include
-X, --exclude-directories=LIST : list of directories to exclude
-np, --no-parent : do not ascend to the parent directory

Several of these options are combined in the example below.
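As a rough illustration of how the recursive and filtering options combine in practice (the URL, the excluded path, and the rate and wait values here are hypothetical placeholders), you might fetch only the PDF and ZIP files below a directory while skipping a temporary subtree and throttling the requests:

# Recursively fetch only .pdf and .zip files below /pub/path/,
# skip the /pub/path/tmp subtree, resume interrupted transfers,
# and be gentle with the server
wget -c -r -np -N \
     -A pdf,zip \
     -X /pub/path/tmp \
     --limit-rate=200k -w 2 --random-wait \
     -U "Mozilla/5.0" \
     -o fetch.log \
     http://www.xxx.org/pub/path/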
This is the end of this article about using wget to download an entire website (or an entire subdirectory) or a specific directory. For more information about using wget to download all the files in a directory, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
- Detailed explanation of wget command in Linux
- Introduction and comparison of curl command and wget command in Linux
- vbs combined with wget to download website pictures
- Configure a wget scheduled task script on Windows
- Using wget.exe from DOS to make antivirus software upgrades more automated
- Recursively mirror a website using wget
- Detailed Introduction to wget Command in Linux