Python: Converting PDF to Images (JPG, etc.)

December 10, 2016

After two days of searching, I ended up right back where I started:

Single-page PDF conversion:
You can use Wand (http://docs.wand-py.org/en/0.4.1/) for this:
from wand.image import Image

with Image(filename='filename.pdf') as pdf:
    with pdf.convert('jpeg') as image:
        image.save(filename='result.jpeg')
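
For a PDF with several pages, a similar Wand-based approach may work. The sketch below was not part of the original tests in this post: it assumes Wand's `Image.sequence` and the `resolution` argument, it needs ImageMagick and Ghostscript installed, and the names `pdf_to_images` and `page_output_name` are made up for illustration.

```python
import os


def page_output_name(pdf_path, index, ext="jpeg"):
    """Build an output name such as 'filename-0.jpeg' for page `index`."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return "%s-%d.%s" % (stem, index, ext)


def pdf_to_images(pdf_path, out_dir=".", ext="jpeg", resolution=150):
    """Save every page of `pdf_path` as a separate image file."""
    # Imported lazily so the helper above works without Wand installed.
    from wand.image import Image

    written = []
    # `resolution` is the DPI used when the PDF is rasterized on read.
    with Image(filename=pdf_path, resolution=resolution) as pdf:
        for index, page in enumerate(pdf.sequence):
            # Copy the single page into its own Image so it can be saved.
            with Image(image=page) as img:
                img.format = ext
                target = os.path.join(
                    out_dir, page_output_name(pdf_path, index, ext))
                img.save(filename=target)
                written.append(target)
    return written
```

Calling `pdf_to_images('filename.pdf')` should then write one file per page, named `filename-0.jpeg`, `filename-1.jpeg`, and so on.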

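Multi-page PDF conversion:

Recently at work I needed to convert PDF files to images and wanted to do it in Python, so I searched the web for a long while and eventually found some code.

1. The first piece of code I found turned out to do the reverse: it can only convert an image to a PDF, not a PDF to an image.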

http://zhidao.baidu.com/link?url=QUoPVmQTP9fXktULAjxLtjVx4NXju631yQNfs9nAsYe6iGfv8LwmAbWA8mjlFEkCbLb9HveeT-48QSxAWyxrsH6L25-LD6HsVjlYs2aUBeG

The code is as follows:


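#!/usr/bin/env python
import os
import sys
from reportlab.lib.pagesizes import A4, landscape
from reportlab.pdfgen import canvas

# Path of the input image, passed as the first command-line argument.
f = sys.argv[1]
# File name without its directory and without the 4-character extension.
filename = ''.join(f.split('/')[-1:])[:-4]
f_jpg = filename + '.jpg'
print(f_jpg)

def conpdf(f_jpg):
    # Draw the image onto a single landscape A4 page and save it as a PDF.
    f_pdf = filename + '.pdf'
    (w, h) = landscape(A4)
    c = canvas.Canvas(f_pdf, pagesize=landscape(A4))
    c.drawImage(f, 0, 0, w, h)
    c.save()
    print("okkkkkkkk.")

conpdf(f_jpg)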

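2. The second article was fairly detailed, but unfortunately its code targeted Linux, so it was still of no use to me.

3. The third article pointed out that the PythonMagick library can do this. It requires downloading a wheel, PythonMagick-0.9.10-cp27-none-win_amd64.whl, which is the 64-bit build.

I have to admit I made another mistake here. I had downloaded Python 2.7 from the official Python website and assumed it was the 64-bit version, but it was actually 32-bit. As a result, the Python build (32-bit) did not match the PythonMagick build I had downloaded (64-bit). It took me until past midnight to finally discover this.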
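
A quick way to avoid this kind of mismatch is to ask the interpreter itself whether it is a 32-bit or 64-bit build before picking a wheel. The snippet below is a minimal sketch using only the standard library; the function name `interpreter_bits` is made up for illustration.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python %s, %d-bit" % (platform.python_version(), interpreter_bits()))
    # sys.maxsize exceeds 2**32 only on 64-bit builds.
    print("64-bit build: %s" % (sys.maxsize > 2 ** 32))
```

A wheel tagged `win_amd64` needs a 64-bit interpreter, while one tagged `win32` needs a 32-bit interpreter; the `cp27` part means CPython 2.7.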

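4. Next I kept searching and found many Stack Overflow question threads, which yielded two pieces of code. Both require installing the PyPDF2 and ghostscript modules first.

First, install the PyPDF2, PythonMagick, and ghostscript modules with pip.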


C:\Users\Administrator>pip install PyPDF2
Collecting PyPDF2
  Using cached PyPDF2-1.25.1.tar.gz
Installing collected packages: PyPDF2
  Running setup.py install for PyPDF2
Successfully installed PyPDF2-1.25.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install C:\PythonMagick-0.9.10-cp27-none-win_amd64.whl
Processing c:\pythonmagick-0.9.10-cp27-none-win_amd64.whl
Installing collected packages: PythonMagick
Successfully installed PythonMagick-0.9.10
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install ghostscript
Collecting ghostscript
  Downloading ghostscript-0.4.1.tar.bz2
Requirement already satisfied (use --upgrade to upgrade): setuptools in c:\python27\lib\site-packages (from ghostscript)
Installing collected packages: ghostscript
  Running setup.py install for ghostscript
Successfully installed ghostscript-0.4.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

Here is the code.

Code 1:


import os
import ghostscript
from PyPDF2 import PdfFileReader, PdfFileWriter  # PyPDF2 1.25.1 API, as installed above
from tempfile import NamedTemporaryFile
from PythonMagick import Image

reader = PdfFileReader(open("C:/deep.pdf", "rb"))
for page_num in range(reader.getNumPages()):
    # Write the current page to a temporary single-page PDF.
    writer = PdfFileWriter()
    writer.addPage(reader.getPage(page_num))
    temp = NamedTemporaryFile(prefix=str(page_num), suffix=".pdf", delete=False)
    writer.write(temp)
    print(temp.name)
    tempname = temp.name
    temp.close()
    # Load the single-page PDF and save it as a PNG.
    im = Image(tempname)
    # im.density("3000")  # DPI, for better quality
    # im.read(tempname)
    im.write("some_%d.png" % (page_num))
    os.remove(tempname)

Code 2:


import sys
import PyPDF2
import PythonMagick
import ghostscript

pdffilename = r"C:\deep.pdf"
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
print('1')
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    im = PythonMagick.Image()
    im.density('300')
    # "file.pdf[p]" selects page p (zero-based) of the PDF.
    im.read(pdffilename + '[' + str(p) + ']')
    im.write('file_out-' + str(p) + '.png')
    # print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'

Both failed when I ran them. This is the error from Code 2:


Traceback (most recent call last):
  File "C:\c.py", line 15, in <module>
    im.read(pdffilename + '[' + str(p) + ']')
RuntimeError: pythonw.exe: PostscriptDelegateFailed `C:\DEEP.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/713

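The error always occurred on the line im.read(pdffilename + '[' + str(p) + ']').

I searched online for the error message but found nothing useful. It did seem related to Ghostscript, though, so I looked for an installer and found a download link on GitHub, but the page said the file could not be downloaded.

Finally I found the file on CSDN's download site: GhostScript_Windows_9.15_win32_win64. I installed the 64-bit version, ran the code above again, and both scripts worked.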
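
Since the missing piece was the Ghostscript program itself rather than the `ghostscript` Python module, it can help to check whether a Ghostscript executable is visible before running the conversion. The following is only a sketch under a few assumptions: it requires Python 3.3+ for `shutil.which`, the executable names are the ones Ghostscript commonly ships with, and `find_ghostscript` is a made-up helper name. A result of `None` does not prove the delegate is unusable, because the Windows installer may not add Ghostscript to PATH and ImageMagick may locate it by other means.

```python
import shutil

# Executable names commonly used by Ghostscript: the Windows console
# builds (64-bit, 32-bit) and the Unix-style `gs`.
GHOSTSCRIPT_NAMES = ("gswin64c", "gswin32c", "gs")


def find_ghostscript(names=GHOSTSCRIPT_NAMES):
    """Return the path of the first Ghostscript executable on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_ghostscript()
    print(found or "Ghostscript not found on PATH")
```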

Code 2 does need the following change, though; otherwise it still fails with the error No such file or directory @ error/pdf.c/ReadPDFImage/713:


# Code 2
import sys
import PyPDF2
import PythonMagick
import ghostscript

pdffilename = r"C:\deep.pdf"
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
print('1')
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    # Load page p through the constructor instead of calling im.read().
    im = PythonMagick.Image(pdffilename + '[' + str(p) + ']')
    im.density('300')
    # im.read(pdffilename + '[' + str(p) +']')
    im.write('file_out-' + str(p) + '.png')
    # print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'
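
Both versions of Code 2 pick a page by appending `[p]` to the file name, which is how ImageMagick addresses a single zero-based page of a multi-page file. The small sketch below (Python 3) only builds those strings so the naming scheme is easy to see; `page_spec` and `output_name` are hypothetical helper names, not part of PythonMagick.

```python
def page_spec(pdf_path, page_index):
    """Return an ImageMagick-style reference to one zero-based page."""
    if page_index < 0:
        raise ValueError("page_index must be >= 0")
    return "%s[%d]" % (pdf_path, page_index)


def output_name(prefix, page_index, ext="png"):
    """Return the output file name used for a given page."""
    return "%s-%d.%s" % (prefix, page_index, ext)


for p in range(3):
    print(page_spec("C:/deep.pdf", p), "->", output_name("file_out", p))
# First line printed: C:/deep.pdf[0] -> file_out-0.png
```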

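One thing this experience drove home: most of the time spent solving this problem went into looking up material and verifying whether it was actually useful. The ability to search for information matters a lot.

In practice, there are very few Chinese-language articles about PythonMagick. Most of the helpful articles I found were from abroad, yet even those did not solve my problem or give me a useful lead. In the end I solved it by thinking it through myself.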

http://blog.csdn.net/sqlserverdiscovery/article/details/51425543

2016-11-17: Tested and confirmed that everything runs smoothly after installing GhostScript_Windows_9.18_win32_win64. That finally settles conversion between PDF and JPG. Done!

############### Test: save a JPG directly as a PDF
# Image.save(newfilename,outfile, "PDF", resolution = 100.0)
############### Test: save a PDF directly as a JPG



# Code 2
import PyPDF2
import PythonMagick


pdf_file_name = 'zhishi0/zhishi.pdf'
pdf_im = PyPDF2.PdfFileReader(open(pdf_file_name, "rb"))
n_page = pdf_im.getNumPages()
print('Converting %d pages.' % n_page)
for p in range(n_page):
    im = PythonMagick.Image(pdf_file_name + '[' + str(p) + ']')
    im.density('300')
    # im.read(pdffilename + '[' + str(p) +']')
    im.write('zhishi/file_out-' + str(p) + '.png')
    # print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'
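
Because the rasterizing is ultimately done by Ghostscript, another option may be to call its command-line program directly. This is a sketch under assumptions, not something tested in this post: it assumes the 64-bit Windows console executable `gswin64c` is on PATH (on Linux it is usually `gs`), and `ghostscript_command` and `pdf_to_png` are made-up helper names. The switches themselves (`-sDEVICE`, `-r`, `-sOutputFile`, `-dBATCH`, `-dNOPAUSE`, `-dSAFER`) are standard Ghostscript options.

```python
import subprocess


def ghostscript_command(pdf_path, out_pattern, dpi=300, device="png16m",
                        executable="gswin64c"):
    """Build the argument list for rasterizing every page of a PDF."""
    return [
        executable,
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # run non-interactively
        "-sDEVICE=%s" % device,             # output format (png16m = 24-bit PNG)
        "-r%d" % dpi,                       # rasterization resolution in DPI
        "-sOutputFile=%s" % out_pattern,    # %03d becomes the page number
        pdf_path,
    ]


def pdf_to_png(pdf_path, out_pattern="file_out-%03d.png", **kwargs):
    """Run Ghostscript; raises CalledProcessError if it exits non-zero."""
    subprocess.check_call(ghostscript_command(pdf_path, out_pattern, **kwargs))
```

For example, `pdf_to_png('zhishi0/zhishi.pdf', 'zhishi/file_out-%03d.png')` would ask Ghostscript to write one PNG per page. Note that Ghostscript numbers these output files starting from 1, unlike the zero-based `[p]` index used above.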

Reposted from 演道. For more timely articles on internet product and technology trends, visit http://go2live.cn

About The Author

bjmayor

Programmer and coder (PHP, Python, iOS, Android, Go), product manager, entrepreneur.
