Categories:
合并、分割 PDF
比较简单的方法是 PyPDF2 。 网上有很多相关教程。
但是 PyPDF2 并没有对文件大小进行优化/压缩。
所以需要一种压缩 PDF 的方法(如果你以后用 Latex 编译 PDF, 也可以用这个 对 PDF 文件压缩,不然 Latex 的原生文件非常大)。
http://blog.sciencenet.cn/blog-467089-773990.htm
终于用latex写完博士论文第一稿了,编译后发现足足有94.1MB!这么大,怎么发给各位老师修改,免不了要压缩一下。稍加搜索,就找到了linux里面的ghostscript工具,可以实现pdf文件的压缩。
ghostscript -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
压缩后,图片分辨率明显变低了,有点看不清楚。如果需要清楚些,修改一个参数即可:-dPDFSETTINGS=/printer,修改后的文件大小为24MB。还可以使用其他命令:
https://www.ghostscript.com/doc/current/VectorDevices.htm
-dPDFSETTINGS=configuration
Presets the “distiller parameters” to one of four predefined settings: /screen selects low-resolution output similar to the Acrobat Distiller “Screen Optimized” setting.
/ebook selects medium-resolution output similar to the Acrobat Distiller “eBook” setting.
/printer selects output similar to the Acrobat Distiller “Print Optimized” setting.
/prepress selects output similar to Acrobat Distiller “Prepress Optimized” setting.
/default selects output intended to be useful across a wide variety of uses, possibly at the expense of a larger output file.
关于 Ghostscript 的提示:
http://milan.kupcevic.net/ghostscript-ps-pdf/
PDF Creation and Manipulation
Basic Usage
Convert PostScript to PDF:
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=fileout.pdf filein.ps
Merge/combine PDF and/or PostScript files:
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=fileout.pdf filein.ps filein2.pdf
Extract a page from a PostScript or a PDF document:
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFirstPage=3 -dLastPage=3 -sOutputFile=fileout.pdf filein.ps
Additional Options
PDF optimization level selection options
-dPDFSETTINGS=/screen (screen-view-only quality, 72 dpi images)
-dPDFSETTINGS=/ebook (low quality, 150 dpi images)
-dPDFSETTINGS=/printer (high quality, 300 dpi images)
-dPDFSETTINGS=/prepress (high quality, color preserving, 300 dpi imgs)
-dPDFSETTINGS=/default (almost identical to /screen)
Paper size selection options
-sPAPERSIZE=letter
-sPAPERSIZE=a4
-dDEVICEWIDTHPOINTS=w -dDEVICEHEIGHTPOINTS=h (point=1/72 of an inch)
-dFIXEDMEDIA (force paper size over the PostScript defined size)
Other options
-dEmbedAllFonts=true
-dSubsetFonts=false
-dFirstPage=pagenumber
-dLastPage=pagenumber
-dAutoRotatePages=/PageByPage
-dAutoRotatePages=/All
-dAutoRotatePages=/None
-r1200 (resolution for pattern fills and fonts converted to bitmaps)
-sPDFPassword=password
Embedding PDFmarks
PDFmarks Create a file named “pdfmarks” with this content:
[ /Title (Document title)
/Author (Author name)
/Subject (Subject description)
/Keywords (comma, separated, keywords)
/ModDate (D:20061204092842)
/CreationDate (D:20061204092842)
/Creator (application name or creator note)
/Producer (PDF producer name or note)
/DOCINFO pdfmark
then combine the file with a PostScript or a PDF file
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=withmarks.pdf \
nomarks.ps pdfmarks
You can also add a couple of named destinations to the “pdfmarks” file
[ /Dest /NamedDest1 /Page 1 /View [/XYZ 20 620 1.8] /DEST pdfmark
[ /Dest /NamedDest2 /Page 2 /View [/FitH 15] /DEST pdfmark
or a few bookmarks
[/Count -2 /Dest /NamedDest1 /Title (Preface) /OUT pdfmark
[ /Action /GoTo /Dest /NamedDest1 /Title (Audience) /OUT pdfmark
[ /Action /GoTo /Dest /NamedDest2 /Title (Content) /OUT pdfmark
[/Count 3 /Page 2 /View [/XYZ 10 160 1.0] /Title (Part 1) /OUT pdfmark
[ /Page 2 /View [/XYZ 10 160 1.0] /Title (A first one) /OUT pdfmark
[ /Page 3 /View [/XYZ 0 500 NULL] /Title (The second one) /OUT pdfmark
[ /Page 6 /View [/FitH 220] /Title (The third thing) /OUT pdfmark
[ /PageMode /UseOutlines /DOCVIEW pdfmark
For more information about pdfmarks see pdfmark Reference Manual.