Batch-converting Word documents to image-only PDFs

Is there any software that can do this?

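An online site would work too. I found maipdf, which can convert a text PDF into an image-based PDF (turn PDF into images PDF, https://maipdf.com/est/maiconvert.html), but it only handles one file at a time. Batch-converting my Word files to PDF and then feeding them to this site one by one is not practical.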

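Is there a one-step tool that batch-converts Word files directly into image-only PDFs (that is, every page is an image)?
If there is no one-step tool, something that converts standard PDFs into image-only PDFs would also work.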

Or should I just write a program for this myself? Any ideas on how to approach it?

Apol1oBelvedere

2024-08-14 11:29:47 +08:00

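I tried it, and GPT can easily write a program like this. When the task is gluing several open-source file-conversion modules together into one script, GPT can handle it.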

jpyl0423

2024-08-14 11:35:13 +08:00

I did this conversion with VBS a while back, using a script I found online. A quick search will turn one up.
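
For anyone who would rather stay in Python than VBS, the same kind of Word automation can be sketched with pywin32. This is only a rough sketch under assumptions: it needs Windows with Microsoft Word installed, and the helper names `pdf_target` and `batch_word_to_pdf` are made up for illustration. It only covers the Word to PDF step, not the rasterizing.

```python
from pathlib import Path

WD_FORMAT_PDF = 17  # Word's WdSaveFormat value for PDF output


def pdf_target(docx_path, out_dir):
    """Hypothetical helper: report.docx -> <out_dir>/report.pdf."""
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def batch_word_to_pdf(src_dir, out_dir):
    """Open each .docx in src_dir with Word via COM and save it as PDF."""
    # Imported here because pywin32 only exists on Windows.
    import win32com.client

    out_dir = Path(out_dir).resolve()
    out_dir.mkdir(parents=True, exist_ok=True)

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word wants absolute paths, hence resolve().
        for docx in sorted(Path(src_dir).resolve().glob("*.docx")):
            if docx.name.startswith("~$"):
                continue  # skip Word's lock files
            doc = word.Documents.Open(str(docx))
            try:
                doc.SaveAs(str(pdf_target(docx, out_dir)), FileFormat=WD_FORMAT_PDF)
            finally:
                doc.Close(False)  # close without saving changes
    finally:
        word.Quit()
```

Calling `batch_word_to_pdf("docs", "pdf_out")` would then write one PDF per document; both folder names are placeholders.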

zhidazhai

2024-08-14 11:50:10 +08:00

Am I misunderstanding the question? Word can save directly to PDF, and a PDF can be exported or saved as JPG. Both steps can be batched and automated.

xHliu

2024-08-14 12:30:20 +08:00

word to pdf ，pdf to jpg ，jpg to pdf

If you need it in a hurry, just run the extra round of conversion.
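
The last two steps of that chain (pdf to images, then images back to pdf) can be sketched in Python roughly like this. It is a sketch under assumptions, not a finished tool: it relies on the pdf2image package (which needs the poppler utilities installed) and on img2pdf, and the `_img` output suffix and the function names are invented for illustration.

```python
import io
from pathlib import Path


def image_pdf_path(pdf_path, suffix="_img"):
    """Hypothetical naming rule: report.pdf -> report_img.pdf."""
    pdf_path = Path(pdf_path)
    return pdf_path.with_name(pdf_path.stem + suffix + ".pdf")


def rasterize_pdf(pdf_path, dpi=200):
    """Render every page to a PNG in memory and rebuild an image-only PDF."""
    # Imported lazily so the path helper works without the third-party packages.
    from pdf2image import convert_from_path
    import img2pdf

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    encoded = []
    for page in pages:
        buf = io.BytesIO()
        # Store the DPI in the PNG so img2pdf keeps the original page size.
        page.save(buf, format="PNG", dpi=(dpi, dpi))
        encoded.append(buf.getvalue())

    out_path = image_pdf_path(pdf_path)
    with open(out_path, "wb") as f:
        f.write(img2pdf.convert(encoded))
    return out_path


def rasterize_folder(folder):
    """Rasterize every PDF directly inside `folder`, skipping earlier outputs."""
    return [
        rasterize_pdf(p)
        for p in sorted(Path(folder).glob("*.pdf"))
        if not p.stem.endswith("_img")
    ]
```

Keeping the page images in memory avoids a scratch folder of PNG files; for very long documents, writing them to disk may be kinder on RAM.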

pililink

2024-08-14 12:49:35 +08:00

You could check whether WPS has this feature.

wu529778790

2024-08-14 15:24:52 +08:00

https://github.com/Stirling-Tools/Stirling-PDF
This one is open source and has the feature you are looking for.

zdl0929

2024-08-14 15:43:10 +08:00

@Apol1oBelvedere #1 Indeed, with GPT's help I got it done quickly.

kinkin666

2024-08-14 15:46:14 +08:00

com.luhuiguo:aspose-words:23.1
Use this one to convert Word to PDF.

org.apache.pdfbox:pdfbox-app:2.0.13
Use this one to convert the PDF to images.

zdl0929

2024-08-14 15:46:42 +08:00

I searched for ages without finding anything suitable, then got it done with GPT in half an hour 😂
------
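# Read every Word file in a folder, render each one to page images,
# then merge the images into an image-only PDF with the same name
import os
import shutil

from docx2pdf import convert
from pdf2image import convert_from_path
import img2pdf

TMP_DIR = "imagetmp"  # scratch folder for page images and intermediate PDFs


def word_to_pdf(word_file, pdf_file):
    convert(word_file, pdf_file)


def pdf_to_images(pdf_file, image_prefix):
    images = convert_from_path(pdf_file)
    # Create the per-document image folder if it does not exist
    image_dir = os.path.join(TMP_DIR, image_prefix)
    os.makedirs(image_dir, exist_ok=True)
    image_paths = []
    for i, image in enumerate(images):
        image_path = os.path.join(image_dir, f"page_{i + 1}.png")
        image.save(image_path, "PNG")
        image_paths.append(image_path)
    return image_paths


def images_to_pdf(images, pdf_file):
    with open(pdf_file, "wb") as f:
        f.write(img2pdf.convert([i for i in images if i.endswith(".png")]))


def convert_word_files_to_pdf(source_directory, target_directory):
    for root, dirs, files in os.walk(source_directory):
        for file in files:
            # Skip non-Word files and Word's "~$" lock files
            if not file.endswith(".docx") or file.startswith("~$"):
                continue
            name = os.path.splitext(file)[0]
            # Mirror the source folder layout under the target directory
            rel_dir = os.path.relpath(root, source_directory)
            target_dir = os.path.join(target_directory, rel_dir)
            os.makedirs(target_dir, exist_ok=True)
            prefix = os.path.normpath(os.path.join(rel_dir, name))
            work_dir = os.path.join(TMP_DIR, prefix)
            os.makedirs(work_dir, exist_ok=True)
            # Keep the intermediate text PDF separate from the final image-only PDF
            pdf_file = os.path.join(work_dir, name + ".pdf")
            image_pdf_file = os.path.join(target_dir, name + ".pdf")
            word_to_pdf(os.path.join(root, file), pdf_file)
            images = pdf_to_images(pdf_file, prefix)
            images_to_pdf(images, image_pdf_file)
    # Remove the scratch folder once everything is converted
    shutil.rmtree(TMP_DIR, ignore_errors=True)


if __name__ == "__main__":
    source_dir = "docs"    # placeholder: folder containing the .docx files
    dist_dir = "output"    # placeholder: folder for the image-only PDFs
    convert_word_files_to_pdf(source_dir, dist_dir)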
