How to split a large file into many smaller files using bash

1. To split a file, use split:

split -l 500 all all

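This splits the file all into several files of 500 lines each. The second argument is the output prefix, so the pieces are named allaa, allab, and so on. If you want to split the file into 4 files of roughly equal size, use something like this: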

split -l $(( $( wc -l < all ) / 4 + 1 )) all all
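One detail about the arithmetic above: when the line count is an exact multiple of 4, "/ 4 + 1" yields one line more per file than necessary (for 2000 lines it gives 501, so the pieces hold 501, 501, 501 and 497 lines). The following is a minimal sketch of the same idea using ceiling division instead. The scratch directory, the sample input generated with seq, and the part_ prefix are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory (hypothetical; any directory works).
workdir=$(mktemp -d)
cd "$workdir"

seq 2000 > all                 # sample input: 2000 lines

parts=4
total=$(wc -l < all)

# Ceiling division: the smallest per-file line count that fits into $parts files.
per_file=$(( (total + parts - 1) / parts ))

split -l "$per_file" all part_
wc -l part_*
```

With an input that is not a multiple of 4, the last piece is simply shorter than the others.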

2. Take a look at the split command; it should do what you want (and more):

$ split --help
Usage: split [OPTION]... [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is 'x'.  With no INPUT, or when INPUT
is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names.
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes[=FROM]  use numeric suffixes instead of alphabetic.
                                   FROM changes the start value (default 0).
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines per output file
  -n, --number=CHUNKS     generate CHUNKS output files.  See below
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit

SIZE is an integer and optional unit (example: 10M is 10*1024*1024).  Units
are K, M, G, T, P, E, Z, Y (powers of 1024) or KB, MB, ... (powers of 1000).

CHUNKS may be:
N       split into N files based on size of input
K/N     output Kth of N to stdout
l/N     split into N files without splitting lines
l/K/N   output Kth of N to stdout without splitting lines
r/N     like 'l' but use round robin distribution
r/K/N   likewise but only output Kth of N to stdout
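To make a few of these options concrete, here is a small sketch that combines -C, -d, -a and --additional-suffix, then checks that concatenating the pieces reproduces the input. It assumes GNU split (these long options are not available in every implementation); the scratch directory, the seq-generated input and the chunk_ prefix are made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory (hypothetical; any directory works).
workdir=$(mktemp -d)
cd "$workdir"

seq 2000 > all                 # sample input

# -C 2K                  at most 2048 bytes of whole lines per piece
# -d                     numeric instead of alphabetic suffixes
# -a 3                   three-character suffixes (000, 001, ...)
# --additional-suffix    append ".txt" to every output name
split -C 2K -d -a 3 --additional-suffix=.txt all chunk_

ls chunk_*

# The names sort in creation order, so a plain glob reassembles the input.
cat chunk_*.txt | cmp - all && echo "round trip OK"
```

Because -C never breaks a line, each piece is a valid standalone text file, which is usually what you want for logs or CSV data.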

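3. As others have already mentioned, you can use split. The command substitution used in the accepted answer is not necessary. For reference, I am adding commands that do almost exactly what was asked. Note that when -n is used on the command line to specify the number of chunks, the small* files produced by split do not contain exactly 500 lines each.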

$ seq 2000 > all
$ split -n l/4 --numeric-suffixes=1 --suffix-length=1 all small
$ wc -l small*
  583 small1
  528 small2
  445 small3
  444 small4
 2000 total
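The uneven counts come from l/4 dividing the input by byte size: the early lines of this file are shorter, so more of them fit into the first pieces. If equal line counts matter more than keeping neighbouring lines together, the round-robin mode from the help text is one option. A minimal sketch, assuming GNU split; the scratch directory and the rr prefix are hypothetical:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory (hypothetical; any directory works).
workdir=$(mktemp -d)
cd "$workdir"

seq 2000 > all

# r/4 deals lines out like cards: line 1 to the first file, line 2 to the
# second, and so on, wrapping around after the fourth file.
split -n r/4 --numeric-suffixes=1 --suffix-length=1 all rr

wc -l rr*
head -n 3 rr1                  # shows which lines landed in the first file
```

The trade-off is that each output file holds every fourth line rather than a contiguous block, so the pieces cannot be reassembled with a simple cat.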

Alternatively, you can use GNU parallel:

$ < all parallel -N500 --pipe --cat cp {} small{#}
$ wc -l small*
  500 small1
  500 small2
  500 small3
  500 small4
 2000 total

As you can see, this is quite an incantation. GNU parallel is actually most often used to parallelize pipelines.
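If GNU parallel is not installed, the same file names and line counts can also be obtained with split alone by combining -l with numeric suffixes. This is a sketch of that alternative, not the method from the answer above; it assumes GNU split, and the scratch directory is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory (hypothetical; any directory works).
workdir=$(mktemp -d)
cd "$workdir"

seq 2000 > all

# Exactly 500 lines per piece, named small1, small2, ...
# A one-character numeric suffix starting at 1 allows at most nine pieces.
split -l 500 --numeric-suffixes=1 --suffix-length=1 all small

wc -l small*
```

For inputs that need more than nine pieces, increase --suffix-length so that split does not run out of suffixes.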
