Newer magic-pdf versions do not work on H20 #1293
Comments
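@myhloli Hello, is there any conclusion on this issue? Could you provide a set of environment versions that works on H20?
Sorry, we do not have any H-series GPUs to test on. For now the only suggestion is to follow the case in #558 and try a paddle GPU build for a newer CUDA version. If compatibility problems remain, uninstall paddlepaddle and paddlepaddle-gpu, then reinstall paddlepaddle and run inference on the CPU.
We started with the paddle CPU build and only tried the paddle GPU build after the CPU build failed. The paddle GPU build for newer CUDA (CUDA 12.3) is incompatible with torch 2.3.1, and other libraries depend on torch 2.3.1, so the high-CUDA paddle GPU build cannot be installed. In short, we tried the paddle CPU build and the builds for each CUDA version, and none of them worked.
#558 (comment): someone in the comments of #558 ran into this problem as well.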
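The CPU build should not be incompatible. Based on user feedback, when the CPU build fails it is usually because the CPU does not support the AVX/AVX2 instruction sets, so that is worth checking as well.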
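One way to check the AVX/AVX2 point is to read the CPU feature flags that Linux exposes in `/proc/cpuinfo`. The sketch below is only an illustration: the helper names `cpu_flags` and `avx_support` are hypothetical, and it assumes a Linux host (or container) where `/proc/cpuinfo` is readable.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def avx_support(cpuinfo_text: str) -> dict:
    """Report whether the 'avx' and 'avx2' flags are present."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2")}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print(avx_support(path.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

If either value is `False`, the CPU build of paddle may be failing for that reason rather than because of the GPU.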
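You can update unimernet to 0.2.2, which removes the dependency on torchtext, so torch can then be upgraded beyond 2.3.1. If other packages complain about their torch version constraints, ignore that for now and force the torch upgrade manually (torchvision must be upgraded to the matching version at the same time).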
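Before and after changing packages, it can help to record which versions are actually installed. A minimal sketch using only the standard library follows; the package list is an assumption based on the names mentioned in this thread, and `installed_versions` is a hypothetical helper, not part of magic-pdf.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names mentioned in this thread (assumed to match the names on PyPI).
PACKAGES = (
    "magic-pdf",
    "unimernet",
    "torch",
    "torchvision",
    "torchtext",
    "paddlepaddle",
    "paddlepaddle-gpu",
)


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:18} {ver or 'not installed'}")
```

Pasting this output into the issue makes it easier to see whether torch and torchvision were upgraded together.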
OK, thanks for the suggestion. We will give it a try.
After updating the CUDA version on the host (H20) to 12.4, the Floating point exception was resolved.
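To compare a host against the 12.4 figure reported above, one option is to read the `CUDA Version:` field from the `nvidia-smi` header. This is a sketch under assumptions: the helper names are hypothetical, `nvidia-smi` must be on `PATH`, and the field reports the highest CUDA version the installed driver supports, which may differ from the toolkit version inside a container.

```python
import re
import shutil
import subprocess


def parse_cuda_version(smi_output: str):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def host_cuda_version():
    """Run nvidia-smi if it is available and parse its header."""
    if shutil.which("nvidia-smi") is None:
        return None
    proc = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(proc.stdout)


if __name__ == "__main__":
    ver = host_cuda_version()
    if ver is None:
        print("nvidia-smi not available or CUDA version not found")
    else:
        print(f"CUDA {ver[0]}.{ver[1]}, at least 12.4: {ver >= (12, 4)}")
```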
Description of the bug
magic-pdf 0.9.x and 0.10.x fail on H20 with the error shown below. In the same environment, magic-pdf==0.8.1 works without problems.
2024-12-11 14:34:18.129 | INFO | magic_pdf.model.pdf_extract_kit:call:184 - layout detection time: 0.33
2024-12-11 14:34:18.175 | INFO | magic_pdf.model.pdf_extract_kit:call:192 - mfd time: 0.04
2024-12-11 14:34:18.176 | INFO | magic_pdf.model.pdf_extract_kit:call:199 - formula nums: 0, mfr time: 0.0
2024-12-11 14:34:19.285 | INFO | magic_pdf.model.pdf_extract_kit:call:230 - ocr time: 1.11
2024-12-11 14:34:19.286 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:168 - -----page_id : 2, page total time: 1.48-----
2024-12-11 14:34:19.672 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:178 - gc time: 0.39
2024-12-11 14:34:19.673 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:182 - doc analyze time: 12.32, speed: 0.24 pages/second
C++ Traceback (most recent call last):
0 at::_ops::linear::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&)
1 at::native::linear(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&)
2 at::_ops::addmm::call(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::Scalar const&, c10::Scalar const&)
3 at::_ops::addmm::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::Scalar const&, c10::Scalar const&)
Error Message Summary:
FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: *** Aborted at 1733898861 (unix time) try "date -d @1733898861" if you are using GNU date ***]
[SignalInfo: *** SIGFPE (@0x7522f18fd914) received by PID 41 (TID 0x752433404480) from PID 18446744073467320596 ***]
Floating point exception (core dumped)
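The TimeInfo line gives the abort time as a Unix timestamp. The sketch below converts it so it can be lined up with the INFO log lines; the UTC+8 offset is an assumption about the machine's local time zone, not something stated in the log.

```python
from datetime import datetime, timedelta, timezone

ABORT_TS = 1733898861  # value from the TimeInfo line

utc = datetime.fromtimestamp(ABORT_TS, tz=timezone.utc)
local = utc.astimezone(timezone(timedelta(hours=8)))  # assumed UTC+8

print(utc.isoformat())    # 2024-12-11T06:34:21+00:00
print(local.isoformat())  # 2024-12-11T14:34:21+08:00
```

Under that assumption, the abort falls about two seconds after the last INFO line (14:34:19.673), that is, right after document analysis finished.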
How to reproduce the bug
The situation is similar to #908, but we tested several paddle builds (CPU, GPU, and others) and all of them failed.
Test results:
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.10.x
Device mode
cuda