PyPDF2 编码问题 PyPDF2.utils.PdfReadError Illegal character in Name Object

PyPDF2 编码问题 PyPDF2.utils.PdfReadError Illegal character in Name ObjectPyPDF2编码问题PyPDF2.utils.PdfReadErrorIllegalcharacterinNameObject参考资料:https://github.com/mstamy2/PyPDF2/issues/438使用PyPDF2做合并PDF文件时报错如下:Traceback(mostrecentcalllast):File”D:\pr…

大家好,又见面了,我是你们的朋友全栈君。

PyPDF2 编码问题 PyPDF2.utils.PdfReadError Illegal character in Name Object

参考资料:https://github.com/mstamy2/PyPDF2/issues/438

使用 PyPDF2 做合并 PDF 文件时报错如下:

Traceback (most recent call last):
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 484, in readFromStream
    return NameObject(name.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcb in position 8: invalid continuation byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\projects\myproject\apps\backstage\views\busi_contract_manage_view.py", line 703, in post
    merge_pdf_result = merge_pdf(final_files, pdf_path)
  File "D:\projects\myproject\apps\utils\doc_convert_util.py", line 86, in merge_pdf
    pdf_writer.write(new_file)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject
    retval = readObject(self.stream, self)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 66, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 579, in readFromStream
    value = readObject(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 60, in readObject
    return NameObject.readFromStream(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 492, in readFromStream
    raise utils.PdfReadError("Illegal character in Name Object")
PyPDF2.utils.PdfReadError: Illegal character in Name Object

找到对应的报错文件 

File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 484

第484行 原代码:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    # Name objects should represent irregular characters
    # with a '#' followed by the symbol's hex number
    if not pdf.strict:
        warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
        return NameObject(name)
    else:
        raise utils.PdfReadError("Illegal character in Name Object")

在 except 中加入代码 

return NameObject(name.decode('gbk'))

修改后

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    try:
        return NameObject(name.decode('gbk'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")

修改后仍会报错,需要修改修改另一处

Lib/site-packages/PyPDF2/utils.py 第238行

原代码

r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r

 修改后代码:

try:
    r = s.encode('latin-1')
except Exception as e:
    r = s.encode('utf-8')
if len(s) < 2:
    bc[s] = r
return r

 

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。

发布者:全栈程序员-用户IM,转载请注明出处:https://javaforall.cn/152402.html原文链接:https://javaforall.cn

【正版授权,激活自己账号】: Jetbrains全家桶Ide使用,1年售后保障,每天仅需1毛

【官方授权 正版激活】: 官方授权 正版激活 支持Jetbrains家族下所有IDE 使用个人JB账号...

(0)


相关推荐

  • 十大下载激活成功教程版最厉害的软件_pix4D激活成功教程

    十大下载激活成功教程版最厉害的软件_pix4D激活成功教程中国著名的D版和激活成功教程软件下载网站 (1)无忧软件网 – 不可多得的激活成功教程软件下载基地,附有无忧书库,无忧字体,代码基地,无忧教学,**园地,完全游戏http://www.51soft.com/ ;(2)精品软件秀 – 软件下载网页,可惜更新太慢!分类清楚,更新及时,也值得一看。http://www.ohsoft.com/ ;(3)163软件园 – 163软件园是国内著名的软件网站,网站定位是提供“提

    2022年10月13日
  • 线代复习知识点大纲

    线代复习知识点大纲

  • latex中希腊字母_LaTeX怎么念

    latex中希腊字母_LaTeX怎么念日常写论文,ppt作汇报等,经常需要编写公式,为方便查阅希腊字母对应latex表示,特写此表格,以便大家查阅。小写 Latex表示 大写 Latex表示 \alpha A A \beta B B \gamma \Gamma \delta \Delta \epsilon …

    2022年10月13日
  • java怎样编写程序_makefile编写实例

    java怎样编写程序_makefile编写实例最近准备花费很长一段时间写一些关于Java的从入门到进阶再到项目开发的教程,希望对初学Java的朋友们有所帮助,更快的融入Java的学习之中。主要内容包括JavaSE、JavaEE的基础知识以及如何

  • 断开和服务器共享连接的方法「建议收藏」

    断开和服务器共享连接的方法「建议收藏」断开和服务器共享连接的方法

  • display:flex垂直居中

    display:flex垂直居中布局说明:1.场次为一场比赛     2.比赛双方是交战的两个队伍        3.一场比赛可以有多种玩法,所以场的每个玩法的布局的高度都不确定。主要说下我学到的垂直居中的flex。1.第一次尝试。1divclass=”parent”>2h1>我是通过flex的水平垂直居中噢h1>3h1>我是通过fl

发表回复

您的电子邮箱地址不会被公开。

关注全栈程序员社区公众号